LUCAS room, E:4130

Master Thesis Seminar: Phonetic text input for Indic languages

Date: December 12, 2008 (Friday) at 13:00

Magnus Höjer presents his Master's thesis "Phonetic text input for Indic languages".

The complicated structure of Indic languages makes them poorly suited to text input on a computer or, especially, a mobile phone. An alternative approach is to let users type in romanized versions of their languages and automatically convert (transliterate) the text into the native script. In this thesis we investigate transliteration models, based on decision trees and support vector machines, that are suitable for implementation on a mobile phone.

The model was found to be not quite flexible enough to handle all the spelling variations in the test set, although it should be good enough for simple use in, e.g., an SMS application. The SVM implementation ultimately proved superior to the decision tree, but the difference was small enough that the choice of model could be based mainly on computational grounds.

