CM9 ICS LatinIME buggy prediction for non-English Latin languages

LatinIME dictionary bug


#1 navdra


Posted 12 March 2012 - 12:13 AM

I couldn't find the right place to report this bug, so I'm posting it here for reference.
Hopefully someone will pick it up. I went through the code and found what I believe is the cause, but I couldn't fix it completely.

I believe the problem starts in native/src/char_utils.cpp, but changes are needed elsewhere too.
I uploaded a patch for native/src/char_utils.cpp here:
http://review.cyanogenmod.com/13521


The problem I noticed with the new ICS LatinIME is that it does not properly handle accented Unicode characters from languages such as Czech or Croatian, so word prediction returns the wrong suggestions.
Gingerbread LatinIME works fine.
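
To make the affected characters concrete, here is a small diagnostic sketch in Python (not LatinIME code). It prints the code point, Unicode name, and accent-stripped base letter for the two characters used in the examples below. Both are above U+00FF, in the Latin Extended-A block. Whether char_utils.cpp mishandles exactly this range is my assumption and is not confirmed here; the helper name describe is made up for illustration.

```python
import unicodedata

def describe(ch):
    """Return (code point, Unicode name, base letter with accents stripped)."""
    # Normalise first so a decomposed input (letter + combining mark)
    # is treated the same as the precomposed character.
    ch = unicodedata.normalize("NFC", ch)
    decomposed = unicodedata.normalize("NFD", ch)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    return "U+%04X" % ord(ch), unicodedata.name(ch), base

for ch in unicodedata.normalize("NFC", "šč"):
    code, name, base = describe(ch)
    # Assumption to verify against the native code: characters above
    # U+00FF may fall outside whatever lookup table is being used.
    print(ch, code, name, "base:", base, "above U+00FF:", ord(ch) > 0xFF)
```

If the keyboard treats "š" as plain "s" (or drops it) while the dictionary stores "š", prefix matching would fail in the way shown in the examples below.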


First, examples of expected behaviour:

1. CM9 ICS English - Start typing: "hom"

IME offers:
him, home, homogeneous, hometown, homemade, homosexual, homeland, homicide...


2. CM7 Croatian - Start typing: "šuš"

IME offers:
šuška, šušku, šuški, šušte, šuškom...



Here is what happens in CM9 ICS:

1. Croatian - Start typing: "šuš"

IME offers:
usput, sluša, šuma, slušanja, šumi, slušaš, usprokos, uskrsnuće, slušao, usluge...

IME should offer (from dictionary with frequency):
...
<w word="šuška" f="48"></w>
<w word="šuškanje" f="44"></w>
<w word="šuškalo" f="44"></w>
<w word="šuša" f="37"></w>
<w word="šuškanja" f="29"></w>
<w word="šuše" f="28"></w>
<w word="šuštanja" f="26"></w>
...
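
For reference, the expected behaviour can be sketched as a plain prefix lookup over the dictionary entries listed above, ordered by frequency. This is a Python illustration only, not how LatinIME is implemented: the `<wordlist>` wrapper element and the function name suggest are my assumptions, and only the entries quoted above are included.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Only the entries quoted in the post; the <wordlist> root is assumed.
DICT_XML = """<wordlist>
<w word="šuška" f="48"></w>
<w word="šuškanje" f="44"></w>
<w word="šuškalo" f="44"></w>
<w word="šuša" f="37"></w>
<w word="šuškanja" f="29"></w>
<w word="šuše" f="28"></w>
<w word="šuštanja" f="26"></w>
</wordlist>"""

def _nfc(text):
    # Compare precomposed forms so "š" always equals "š".
    return unicodedata.normalize("NFC", text)

def suggest(prefix, xml_text, limit=10):
    """Return words starting with prefix, highest frequency first."""
    prefix = _nfc(prefix).lower()
    matches = []
    for entry in ET.fromstring(xml_text).iter("w"):
        word = _nfc(entry.get("word"))
        # Exact match on accented letters: "š" must not match "s".
        if word.lower().startswith(prefix):
            matches.append((word, int(entry.get("f"))))
    # The sort is stable, so equal frequencies keep dictionary order.
    matches.sort(key=lambda item: -item[1])
    return [word for word, _ in matches[:limit]]

print(suggest("šuš", DICT_XML))
```

With the entries above this prints all seven words, starting with šuška, which is what I would expect the IME to offer.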


2. Czech - Start typing: "čer"

IME offers:
čele, včetně, ČSSR, českou, dcera, čemu, česky, erik, českým...


IME should offer (from dictionary with frequency):
...
<w word="červ" f="255"></w>
<w word="červen" f="254"></w>
<w word="čele" f="170"></w>
<w word="černá" f="138"></w>
<w word="červenec" f="127"></w>
<w word="červená" f="103"></w>
<w word="června" f="96"></w>
...
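
The observations above can also be turned into a quick check that counts how many of the offered words actually begin with what was typed. This is a Python sketch using only the suggestion lists quoted in this post; prefix_hits and OBSERVED are names made up for illustration.

```python
import unicodedata

def _norm(text):
    # Precomposed form plus lowercase, so "ČSSR" compares as "čssr".
    return unicodedata.normalize("NFC", text).lower()

def prefix_hits(typed, offered):
    """Return the offered words that start with the typed text."""
    typed = _norm(typed)
    return [word for word in offered if _norm(word).startswith(typed)]

# Suggestions as reported in this post for CM9 ICS.
OBSERVED = {
    "hom": ["him", "home", "homogeneous", "hometown", "homemade",
            "homosexual", "homeland", "homicide"],
    "šuš": ["usput", "sluša", "šuma", "slušanja", "šumi", "slušaš",
            "usprokos", "uskrsnuće", "slušao", "usluge"],
    "čer": ["čele", "včetně", "ČSSR", "českou", "dcera", "čemu",
            "česky", "erik", "českým"],
}

for typed, offered in OBSERVED.items():
    hits = prefix_hits(typed, offered)
    print(typed, len(hits), "of", len(offered))
# hom 7 of 8
# šuš 0 of 10
# čer 0 of 9
```

English keeps almost every suggestion on the typed prefix, while the Croatian and Czech lists contain no word that starts with the typed text.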