Issues

Corpus architecture

Why pair one language with 6 others? Why English as the target language? Why Japanese, Korean, Hindi, Arabic, French and Spanish as the source languages?

Actual corpus size

Apparently only about 5% of the corpus exists so far:

"Each corpus currently contains 5 source language news articles (125 by the end of the project), each with either two or three independently produced high-quality translations into English ...." (IAMTC 2004, "Goals")

Distributed corpora contain translations of 6 articles per source language, not 125. Evaluation included only 6 articles per source language.