I Implemented My Own Mobile Input Method

One day in 2007, I got fed up with how many candidates the Pinyin input method on an internet cafe computer made me wade through. On impulse, I decided to spend some time learning the Wubi input method.

Back then, even Symbian phones had a Baidu Wubi input method. One of its distinguishing features was that it allowed Wubi and Pinyin to be mixed in the same input.

But people are lazy. Because I started relying on mixed Wubi-Pinyin input too early, there are still many characters I can't type in Wubi, and I have completely forgotten the Wubi radical mnemonics. I became an odd casualty of the feature.

To this day I haven't been able to break the habit and free myself from Pinyin, even though I have forced myself many times to use either pure Wubi or pure Pinyin.

On mobile phones, my input method journey has been: Baidu Input Method → System Default Wubi Input Method → Trime Input Method → WeChat Input Method → Fcitx Input Method.

That’s right, I’ve been avoiding commercial input methods because they really do collect typing content for advertising.

My requirements:

Support Wubi-Pinyin mixed input.
Pure local functionality, no network dependency.
Support displaying Wubi radicals.
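
To make the first requirement concrete, here is a minimal sketch of what mixed lookup means: the same typed code is checked against both a Wubi table and a Pinyin table, and the results are merged. The tables are tiny toy data and the function name is made up for illustration. In the actual project this work is done by the Rime engine and its configuration, not by code like this.

```python
# Toy tables for illustration only; a real input method loads full dictionaries.
WUBI_TABLE = {
    "g": ["一"],
    "gggg": ["王"],
}
PINYIN_TABLE = {
    "wang": ["王", "网", "往"],
    "yi": ["一", "以", "已"],
}


def lookup(code: str) -> list[str]:
    """Return candidates for `code`: Wubi matches first, then Pinyin matches."""
    candidates: list[str] = []
    for table in (WUBI_TABLE, PINYIN_TABLE):
        for char in table.get(code, []):
            if char not in candidates:  # skip duplicates across the two tables
                candidates.append(char)
    return candidates


print(lookup("gggg"))  # looked up as a Wubi code
print(lookup("wang"))  # looked up as a Pinyin spelling
```

Putting Wubi matches first is just one possible ordering; a real engine would also rank candidates by frequency.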

Initial Idea

I currently use the Fcitx input method. My initial idea was to add a feature to Fcitx that displays a key's Wubi radicals when you swipe up on it. I was quite satisfied with Fcitx, and since it is open source, adding this feature didn't seem like it would be a big problem.

But then I happened to look through Fcitx's issues and found that keyboard customization had been in the works for three years and still wasn't finished. I also looked at the source code and found that it doesn't implement swipe-up on keys.

Trime is also open source. Could I modify it instead? I didn't look into this carefully, because Trime's input interface is too ugly for my taste. If I were to modify it, I would have to overhaul its UI first.

Since both use Rime as the input engine, why not write my own input method? An input method that only has to serve my own needs, with no regard for other people's preferences or configurations, wouldn't need to be very complex.

On both desktop and mobile, I have in fact always used Rime-based input method configurations, specifically a Wubi configuration that I forked from someone else's.

Implementation

So I created this input method project, which serves only me. It is also built on the Rime engine, and its default configuration is the one I have been using in Trime, Fcitx, and Squirrel.

Difficulties

In theory, to display Wubi radicals I only need to write the radicals into the configuration and show them on swipe-up. In practice, I can't type out all of the radicals as text. I went through a lot of material, but none of it explained how to type them. For now I can type only some of the radicals myself, and the WeChat input method can display a few more, but I still can't collect the full set.

This means I need to use design software to draw the radicals, because I can't very well just display the mnemonics instead:

王旁青头戋（兼）五一
土士二干十寸雨
...
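
As a rough sketch of how the pieces could fit together: a per-key table in which each radical is stored either as text (when it can be typed) or as the name of a drawn image, plus a check that tells a swipe from a tap. Everything here is an assumption made for illustration, including the `Radical` type, the asset name `g_qingtou.svg`, the 40-pixel threshold, and the guess that 青头 is one of the radicals that has to be drawn. It is not the project's actual code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Radical:
    label: str                   # human-readable name of the radical
    text: Optional[str] = None   # glyph to render as text, if one can be typed
    image: Optional[str] = None  # hypothetical asset name for a hand-drawn radical


# Illustrative subset for the G key, taken from the first mnemonic line.
KEY_RADICALS = {
    "g": [
        Radical("王", text="王"),
        Radical("戋", text="戋"),
        Radical("五", text="五"),
        Radical("一", text="一"),
        Radical("青头", image="g_qingtou.svg"),  # assumed to need a drawn asset
    ],
}

SWIPE_THRESHOLD_PX = 40  # assumed value; would need tuning per device


def is_vertical_swipe(dx: float, dy: float) -> bool:
    """Treat a touch as a swipe when it moves mostly vertically and far enough."""
    return abs(dy) >= SWIPE_THRESHOLD_PX and abs(dy) > abs(dx)


def radicals_for_gesture(key: str, dx: float, dy: float) -> list[Radical]:
    """Return the radicals to show for a swipe on `key`, or nothing for a tap."""
    if not is_vertical_swipe(dx, dy):
        return []
    return KEY_RADICALS.get(key, [])
```

The check accepts vertical movement in either direction, so the same table works whether the gesture ends up being a swipe up or a swipe down.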

Handwriting Input

I previously trained a handwriting recognition model and built a web demo: https://ochw.pages.dev/ (project: https://github.com/ximeiorg/ochw). For the sense of achievement, I made a point of integrating this model.

The model needs a very large inference runtime. After converting it to ONNX and integrating it, I found that the installation package had jumped to over 100 MB. I then switched to the ncnn framework for inference, but the package was still very large. That made me think seriously: do I really need handwriting input?

In the end, I decided to drop this feature.

Speech-to-Text

Speech-to-text is sometimes a truly necessary feature. The problem for open-source input methods is that they can't get good ASR capabilities, because the ASR models that work well are too large to deploy on the device.

The small ASR models that can run on a phone are not practical for someone with my level of Mandarin: the error rate is too high.

The best approach, therefore, is to call an external ASR model to get better recognition.

This feature has not been implemented yet.
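
Purely as a sketch of what the client side of such a call might look like, the snippet below builds an HTTP request carrying recorded audio and parses a JSON reply. The endpoint URL, the `audio/wav` upload format, and the `{"text": ...}` response shape are all hypothetical, since no specific ASR service has been chosen. Nothing is sent over the network here.

```python
import json
import urllib.request

# Hypothetical endpoint; a real service defines its own URL and request format.
ASR_ENDPOINT = "https://asr.example.com/transcribe"


def build_asr_request(audio: bytes, endpoint: str = ASR_ENDPOINT) -> urllib.request.Request:
    """Build a POST request carrying WAV audio; the request is not sent here."""
    return urllib.request.Request(
        endpoint,
        data=audio,
        headers={"Content-Type": "audio/wav"},
        method="POST",
    )


def parse_asr_response(body: bytes) -> str:
    """Extract the transcript from an assumed JSON reply like {"text": "..."}."""
    return json.loads(body.decode("utf-8")).get("text", "")
```

Sending would then be a matter of passing the request to `urllib.request.urlopen` and feeding the response body to `parse_asr_response`.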

Results

Effect of swiping down on g:

Other effects:

Summary

For me, this project delivers essentially all of the features I wanted.