Machine Translation


UMES has received a grant to establish a cross-disciplinary project for machine translation of African languages. The project, operated jointly by the UMES Computer Science graduate program and the African Language Project, involves testing large knowledge-based elicitation systems using African languages in electronic formats. The software, developed by New Mexico State University, expands available online tools such as the dictionary and a user glossary.

The Machine Translation component of the project is a three-year study in which UMES will test Government-supplied, experimental knowledge elicitation systems. The initial test cases will use the training material developed in the African Language Project to build knowledge bases.

Previous research has shown that broad-coverage MT systems can be deployed either by developing large-scale, hand-built knowledge sources, which are expensive, or by extracting information from very large corpora, which are typically unavailable for low-density languages. The Government is sponsoring the development of broad-coverage MT systems for low-density languages that use as few hand-built resources and as little corpus data as possible.

One aspect of the research that is not well understood is the relative value of different kinds of lexical and grammatical information for machine translation. The UMES research effort will provide basic information needed to assess the relative importance of each kind of input data.

The immediate practical outcome of that effort is the availability of several test components of a computer-based knowledge elicitation system that provides the knowledge bases needed for machine translation. By testing these components, UMES is advancing the state of the art in linguistic acquisition for MT and assessing the technical challenges that remain to be overcome. In the process, we expect to be able to ramp up new machine translation systems based on the new knowledge bases.

UMES confers with the Government to specify a test plan that uses the training materials developed by the African Language Project. These materials cover languages including Igbo, Hausa, Yoruba, and Lingala. Other language materials may be supplied if both UMES and the Government determine that they can be used productively to build knowledge bases for other languages.

Members of the UMES research team demonstrate capabilities in data acquisition, linguistic analysis, or elicitation technology. Graduate student participation on the research team is encouraged as a way of promoting interest in the topic of this research. Participating students will be current graduate students in Computer Science.

   
 
University of Maryland Eastern Shore
African Language Research Project
Department of English and Modern Languages
Princess Anne, MD 21853
Office: (410) 651-6909