The next Linguistics Circle Seminar will be held on Friday 23 November from 12:00 to 14:00 in Hall D1, Gateway Building (GW HD1).
Dr Teresa Lynn, University of Dublin, will be talking about 'Developing the Irish Dependency Treebank'.
Abstract
Syntactic parsing is concerned with analysing the linguistic structure of language in text. Statistical parsers are data-driven and rely on the availability of syntactically annotated corpora (known as treebanks), from which they learn the syntactic patterns of a given language. Treebanks are costly in terms of both development time and the skills required. For this reason, low-resourced languages often lack both treebanks and statistical parsers.
In this talk, Dr Lynn will report on the development of the first Irish dependency treebank and syntactic parser. She will discuss the linguistic structures of the Irish language (a low-resourced language) and the motivation behind the design of the final dependency annotation scheme. She will also show how her team examined methods such as Active Learning to semi-automate the treebank development, and will present empirical results on how the treebank's size and content affect parsing accuracy for Irish. Finally, she will briefly discuss the team's cross-lingual studies using a universal annotation scheme and its involvement in the Universal Dependencies Project.