Bootstrapping parsers via syntactic projection across parallel texts
Title | Bootstrapping parsers via syntactic projection across parallel texts |
Publication Type | Journal Articles |
Year of Publication | 2005 |
Authors | Hwa R, Resnik P, Weinberg A, Cabezas C, Kolak O |
Journal | Nat. Lang. Eng. |
Volume | 11 |
Issue | 3 |
Pagination | 311 - 325 |
Date Published | 2005/09 |
ISSN | 1351-3249 |
Abstract | Broad-coverage, high-quality parsers are available for only a handful of languages. A prerequisite for developing broad-coverage parsers for more languages is the annotation of text with the desired linguistic representations (also known as “treebanking”). However, syntactic annotation is a labor-intensive and time-consuming process, and it is difficult to find linguistically annotated text in sufficient quantities. In this article, we explore using parallel text to help solve the problem of creating syntactic annotation in more languages. The central idea is to annotate the English side of a parallel corpus, project the analysis to the second language, and then train a stochastic analyzer on the resulting noisy annotations. We discuss our background assumptions, describe an initial study on the “projectability” of syntactic relations, and then present two experiments in which stochastic parsers are developed with minimal human intervention via projection from English. |
URL | http://dx.doi.org/10.1017/S1351324905003840 |
DOI | 10.1017/S1351324905003840 |
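
The projection step described in the abstract can be illustrated with a minimal sketch. It is not the article's algorithm: it assumes syntactic relations are stored as (head, dependent) index pairs, assumes a strictly one-to-one word alignment, and simply drops any relation with an unaligned endpoint. The function name `project_dependencies` and the toy data are hypothetical.

```python
def project_dependencies(english_deps, alignment):
    """Project (head, dependent) pairs through a one-to-one word alignment.

    english_deps: set of (head_index, dependent_index) pairs on the English side.
    alignment: dict mapping an English token index to a target-language token index.
    Simplifying assumption: relations with an unaligned endpoint are dropped.
    """
    projected = set()
    for head, dep in english_deps:
        if head in alignment and dep in alignment:
            projected.add((alignment[head], alignment[dep]))
    return projected


# Hypothetical toy sentence pair: a three-word English sentence whose
# second word (index 1) heads the other two, aligned to a verb-final target.
english_deps = {(1, 0), (1, 2)}
alignment = {0: 0, 1: 2, 2: 1}
print(sorted(project_dependencies(english_deps, alignment)))
```

The projected pairs would then serve as the noisy training annotations for a target-language parser; handling unaligned and many-to-many alignments is left out of this sketch.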