Friday, February 22, 2008

Foreign Language Translation

METIS II is a European machine-translation project that has demonstrated an inexpensive technique for translating documents from Dutch, German, Greek, or Spanish into English. Machine translation currently works best for formal texts in specialized areas with unambiguous vocabulary and limited sentence patterns. The European Union has supported research in this field since the large Eurotra project in the 1980s, which took a rules-based approach: the computer was given the rules of syntax and applied them to translate texts from one language to another. Since the early 1990s, however, statistical translation has gained popularity. It replaces hand-written rules with statistical methods based on a text corpus, a large body of written material (up to tens of millions of words) that is intended to be representative of a language. A parallel corpus contains the same material in two or more languages, and the computer compares the versions to learn how words and expressions in one language translate into another. Parallel corpora are expensive and rare, and they exist for only a few languages. METIS II researchers perform statistical machine translation without a parallel corpus, relying instead on a monolingual corpus of the target language. Working from a single corpus requires a dictionary for the vocabulary and some way of analyzing syntax. METIS II matches patterns at the "chunk" level, that is, against phrases or fragments of a sentence rather than the entire sentence, which makes the pattern matching more efficient.
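To make the idea of chunk-level matching against a monolingual corpus more concrete, here is a minimal Python sketch. It is not the METIS II implementation: the tiny English corpus, the Dutch-to-English dictionary, and the function names `ngram_counts` and `translate_chunk` are all made up for illustration. The sketch assumes that a chunk is simply a short list of source words, that a dictionary supplies candidate translations for each word, and that the best candidate is the word sequence seen most often in the target-language corpus.

```python
from collections import Counter
from itertools import product

# Hypothetical toy data: a tiny monolingual corpus in the target language.
TARGET_CORPUS = [
    "the big house is old",
    "the old man lives in the big house",
    "a large dog sleeps in the house",
]

# Hypothetical dictionary: each source word maps to candidate translations.
DICTIONARY = {
    "het": ["the", "it"],
    "grote": ["big", "large", "great"],
    "huis": ["house", "home"],
}


def ngram_counts(sentences, max_n=3):
    """Count every word sequence of length 1..max_n in the corpus."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return counts


def translate_chunk(chunk, dictionary, counts):
    """Pick the candidate translation of a chunk that the corpus attests most.

    Unknown source words are passed through unchanged.
    """
    options = [dictionary.get(word, [word]) for word in chunk]
    candidates = product(*options)  # every combination of word choices
    return max(candidates, key=lambda cand: counts[cand])


COUNTS = ngram_counts(TARGET_CORPUS)
best = translate_chunk(["het", "grote", "huis"], DICTIONARY, COUNTS)
print(" ".join(best))  # the big house
```

Scoring a three-word chunk means looking up at most a dozen candidates here, whereas scoring whole sentences would multiply the choices for every additional word. That is one plausible reading of why matching at the chunk level is described as more efficient.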
