A method to improve the accuracy of suggested matching words in Chinese Input Method Editor (IME)
Publication Date: 2010-Jul-29
The IP.com Prior Art Database
This article discloses a way to improve the accuracy of suggested matches in a Chinese Input Method Editor.
Currently, popular Chinese Input Method Editors (IMEs) provide Suggested Match Words to reduce the amount of typing required from users. In Figure1, the IME gives 9 suggested words that match the abbreviation "hsr"; the user then presses a number from 1 to 9 to choose the exact word he/she wants to type.
There are more than 9 words that match "hsr", and the user can press PageUp and PageDown to see more suggested words. However, the more steps it takes for users to get what they want, the less satisfied they will be. The IME should try to figure out the right order for the suggested match words. Some IMEs try to sort the suggested match words based on an analysis of all users' typing behavior. The problem with this method is that a single sorted order cannot meet the user's typing needs in different applications.
Figure1. Suggested Match words for "hsr"
When a user is using a web application on a gourmet net (www.dianping.com) and types "hsr" in the search bar, he/she gets 9 suggested match words (Figure2).
Figure2. Suggested Match words for "hsr" on a gourmet net
The word numbered 3 is the name of a dish. It has the greatest possibility of matching the user's typing intention among all 9 suggested words, so it should rank highest among all suggested match words. Meanwhile, if there are any other words about dishes, they should be moved into the first 9 suggested words.
After the user presses 3 to choose the right word, some IMEs may move the word "红红红" ahead, so that the next time it has higher priority among the suggested match words. However, this does not solve the problem when the user types "hsr" again in another search bar on a movie net. There, the right match word becomes the one numbered 8 (Figure3).
Figure3. Suggested Match words for "hsr" on a movie net
In conclusion, there is a need to improve the sorting mechanism for the suggested match words of an IME under different contexts.
Our idea implements a scenario-specific order of suggested match words through the following steps:
1. Collect input records with application contexts from end users.
2. Upload the data to a central server.
3. The data processing server analyzes the uploaded typing records, categorizes the application contexts into different scenarios, and then generates a set of suggested match word data for each scenario.
4. Distribute the scenario-specific suggested match word data to end users.
5. Users are free to customize the downloaded scenario data. This step is optional.
6. The suggested match order is adjusted in the end user's IME based on the defined scenario.
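The steps above can be sketched roughly in Python. This is only a minimal illustration under stated assumptions, not the disclosed implementation: the record format (application context, abbreviation, chosen word), the host-to-scenario table, the host movies.example.com, the function names, and the ASCII placeholder words standing in for Chinese candidates are all hypothetical. Plain selection counts are used as one possible way to generate the per-scenario data in step 3.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse

# Hypothetical table mapping an application context (here a host name)
# to a scenario. Only www.dianping.com comes from the article.
SCENARIO_BY_HOST = {
    "www.dianping.com": "food",
    "movies.example.com": "movie",  # placeholder host, an assumption
}


def categorize(app_context):
    """Step 3 (part): map an application context URL to a scenario name."""
    host = urlparse(app_context).netloc
    return SCENARIO_BY_HOST.get(host, "default")


def build_scenario_data(records):
    """Step 3: count chosen words per (scenario, abbreviation).

    Each record is (app_context, abbreviation, chosen_word), i.e. the
    data assumed to be collected and uploaded in steps 1 and 2.
    """
    data = defaultdict(Counter)
    for app_context, abbreviation, chosen_word in records:
        data[(categorize(app_context), abbreviation)][chosen_word] += 1
    return data


def rank_candidates(candidates, abbreviation, app_context, data):
    """Step 6: reorder candidates using the scenario-specific counts.

    sorted() is stable, so words with equal counts keep the IME's
    default order.
    """
    counts = data.get((categorize(app_context), abbreviation), Counter())
    return sorted(candidates, key=lambda word: -counts[word])


# Example: the same abbreviation ranked differently per scenario.
records = [
    ("http://www.dianping.com/search", "hsr", "dish_word"),
    ("http://www.dianping.com/search", "hsr", "dish_word"),
    ("http://movies.example.com/search", "hsr", "movie_word"),
]
data = build_scenario_data(records)
default_order = ["other_word", "movie_word", "dish_word"]

food_order = rank_candidates(
    default_order, "hsr", "http://www.dianping.com/", data)
movie_order = rank_candidates(
    default_order, "hsr", "http://movies.example.com/", data)
unknown_order = rank_candidates(
    default_order, "hsr", "http://unknown.example.org/", data)
```

In this sketch the dish word moves to the front on the gourmet net, the movie word moves to the front on the movie net, and an unrecognized context falls back to the IME's default order. Step 5 (user customization) could be modeled by letting the user edit the downloaded counts before they are used for ranking.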
Before explaining step 1, let us have a look at a scenario definition and its data structure.
As we mentioned, the word that best matches the current scenario should rank highest among all suggested match words, and any other words related to that scenario should be moved into the first 9 suggested words....