Machine Translation
Previous research has shown that broad-coverage MT systems can be deployed either by developing large-scale hand-built knowledge sources, which are expensive, or by extracting information from very large corpora, which are typically unavailable for low-density languages. The Government is sponsoring the development of broad-coverage MT systems for low-density languages that use as few hand-built resources and as little corpus data as possible. One aspect of this research that is not well understood is the relative priority of different kinds of lexical and grammatical information for machine translation.

The UMES research effort will provide the basic information needed to assess the relative importance of different data inputs. The immediate practical outcome of that effort is the availability of several test components of a computer-based knowledge elicitation system that provides the knowledge bases needed for machine translation. By testing these components, UMES is advancing the state of the art in linguistic acquisition for MT and assessing the technical challenges that remain to be overcome. Through this process, we expect to be able to ramp up new machine translation systems based on the new knowledge bases.

UMES confers with the Government to specify a test plan that uses the training materials developed by the African Language Project. The languages covered include Igbo, Hausa, Yoruba, and Lingala. Other language materials may be supplied if both UMES and the Government determine that they can be used productively to build knowledge bases for other languages. Members of the UMES research team demonstrate capabilities in data acquisition, linguistic analysis, or elicitation technology. The participation of graduate students on the research team is encouraged as a way of promoting interest in the topic of this research. Participating students will be current graduate students in Computer Science.