LUCAS room E:4130

Master Thesis Seminar: Predictive text input engine for Indic scripts

Date: March 25, 2009 (Wednesday) at 13:00

Mitch Selander and Erik Svensson present their master's thesis "Predictive text input engine for Indic scripts".

Opponents:


Examiner:
Pierre Nugues

Abstract:
Languages with many letters pose a problem for text entry on reduced keyboards. Multitap is time-consuming, as there can be 6-9 characters per key on a mobile phone. For singletap methods, more letters per key result in more words per key sequence, i.e. greater ambiguity when selecting which word to present to the user. Today's singletap methods for mobile phones mostly rely on a dictionary and word frequencies, which works remarkably well with the Latin alphabet but is not enough when the number of letters per key increases.
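To illustrate the ambiguity described above, here is a minimal sketch of dictionary-based singletap lookup. It uses the standard Latin phone keypad for readability, not the Devanagari layout from the thesis. The word list and frequencies are made up for illustration, and the names `key_sequence` and `build_index` are hypothetical.

```python
from collections import defaultdict

# Standard Latin phone keypad (illustration only; the thesis targets Devanagari).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {ch: key for key, letters in KEYPAD.items() for ch in letters}


def key_sequence(word):
    """Return the single-press key sequence that types `word`."""
    return "".join(LETTER_TO_KEY[ch] for ch in word.lower())


def build_index(word_freqs):
    """Map each key sequence to its candidate words, most frequent first."""
    index = defaultdict(list)
    for word, freq in word_freqs.items():
        index[key_sequence(word)].append((freq, word))
    return {
        seq: [w for _, w in sorted(cands, key=lambda p: (-p[0], p[1]))]
        for seq, cands in index.items()
    }


# Made-up frequencies, purely for illustration.
WORD_FREQS = {"good": 50, "home": 40, "gone": 30, "hood": 5}
index = build_index(WORD_FREQS)

# All four words share one key sequence, so the engine must rank them.
candidates = index[key_sequence("home")]
print(key_sequence("home"), candidates)
```

With only three or four letters per key, four common English words already collide on one sequence. More letters per key make such collisions more frequent.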

In this master's thesis we investigated different methods to improve word disambiguation. These methods include word bigrams, part-of-speech n-grams and keypad remappings. We chose the Devanagari script for our implementation, as it is one of the scripts with this problem, and worked with Hindi for the language-specific data.
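As a rough sketch of how word bigrams can help disambiguation, the example below re-ranks the candidates for one key sequence using the previous word. It is not the thesis implementation. The scoring rule (bigram count first, unigram count as a tie-breaker), the function name `rank_candidates` and all counts are assumptions made for illustration.

```python
def rank_candidates(candidates, prev_word, unigram, bigram):
    """Order candidates by bigram count with prev_word, then by unigram count.

    Assumption: plain counts with unigram fallback; a real engine would
    probably use smoothed probabilities instead.
    """
    def score(word):
        return (bigram.get((prev_word, word), 0), unigram.get(word, 0))

    return sorted(candidates, key=score, reverse=True)


# Made-up counts, purely for illustration.
UNIGRAM = {"good": 50, "home": 40, "gone": 30, "hood": 5}
BIGRAM = {("go", "home"): 20, ("go", "good"): 1}

same_keys = ["good", "home", "gone", "hood"]  # all typed with the same keys

# Without context the most frequent word wins.
print(rank_candidates(same_keys, None, UNIGRAM, BIGRAM))
# After "go", the bigram evidence promotes "home".
print(rank_candidates(same_keys, "go", UNIGRAM, BIGRAM))
```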

We found that a dictionary-based solution with word bigrams combined with a remapped keypad layout gave the desired results. These techniques increased disambiguation accuracy from 77% to 94%. We also saw an improvement in KSPC (keystrokes per character), from 1.0856 to 1.0154.
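For readers unfamiliar with the two metrics, the sketch below shows one common way to compute them from a frequency-weighted word list. The cost model is an assumption and may differ from the one used in the thesis: each word costs one key press per letter plus one "next" press for every position it sits below the top of the candidate list, and spaces are ignored. The data are made up and do not reproduce the reported figures.

```python
def evaluate(entries):
    """Compute (KSPC, disambiguation accuracy) from (word, freq, rank) entries.

    rank is the word's 0-based position in the candidate list for its key
    sequence. Assumed cost model: len(word) key presses plus `rank` presses
    of a "next candidate" key; spaces are not counted.
    """
    keystrokes = chars = first_hits = total = 0
    for word, freq, rank in entries:
        keystrokes += (len(word) + rank) * freq
        chars += len(word) * freq
        first_hits += freq if rank == 0 else 0
        total += freq
    return keystrokes / chars, first_hits / total


# Made-up data: four words competing for one key sequence.
entries = [("good", 50, 0), ("home", 40, 1), ("gone", 30, 2), ("hood", 5, 3)]
kspc, accuracy = evaluate(entries)
print(f"KSPC = {kspc:.4f}, accuracy = {accuracy:.2%}")
```

A KSPC of exactly 1.0 would mean every word appears first in its candidate list, so values close to 1 indicate that little extra selection effort is needed.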

Room: E:4130

Last modified Dec 9, 2011 12:57 pm by Mikael.Antic@cs.lth.se
