I worked on improving the NLP features provided by the Fleksy mobile keyboard, such as auto-correction, auto-completion, next-word prediction, and swipe gesture recognition. I worked with Python and C++, and experimented with Rust.
Fleksy is a Spanish start-up, working fully remotely with a team scattered across all the continents.
They develop an SDK that allows developers to create their own fully-featured keyboard (with swipe typing, auto-correction, etc.) in 82 different languages.
My first task at Fleksy was to create a proper benchmark for the existing NLP features.
I achieved this by creating a standard test set for each language, then artificially introducing typos with fuzzy typing and generating human-like swipe gestures to score our models.
Examples of generated swipe gestures for the word "gives"
<aside> 🎯 Skills
Python
Benchmarking
CI/CD
Multiprocessing
</aside>
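To illustrate the typo-injection step described above, here is a minimal sketch in Python. It is not Fleksy's actual implementation: the adjacency map `NEIGHBORS`, the helper names `add_typo` and `make_noisy_test_set`, and the choice of edit types (neighbor substitution, deletion, transposition) are all assumptions made for the example.

```python
import random

# Assumed, partial QWERTY adjacency map, for illustration only.
NEIGHBORS = {
    "q": "wa", "w": "qes", "e": "wrd", "r": "etf", "t": "ryg",
    "a": "qsz", "s": "awdx", "d": "sefc", "g": "fhtb", "i": "uok",
    "v": "cbg", "o": "ipl", "n": "bmh",
}

def add_typo(word, rng):
    """Apply at most one random edit to `word` (hypothetical helper).

    The word may come back unchanged, e.g. when the chosen key has no
    entry in NEIGHBORS or when two identical letters are swapped.
    """
    if len(word) < 2:
        return word
    i = rng.randrange(len(word))
    kind = rng.choice(["substitute", "delete", "transpose"])
    if kind == "substitute":
        options = NEIGHBORS.get(word[i].lower())
        if options:
            return word[:i] + rng.choice(options) + word[i + 1:]
        return word
    if kind == "delete":
        return word[:i] + word[i + 1:]
    # transpose: swap character i with its right-hand neighbor
    i = min(i, len(word) - 2)
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]

def make_noisy_test_set(words, typo_rate=0.3, seed=0):
    """Return (typed, expected) pairs; a seeded RNG keeps runs reproducible."""
    rng = random.Random(seed)
    return [(add_typo(w, rng) if rng.random() < typo_rate else w, w)
            for w in words]

if __name__ == "__main__":
    for typed, expected in make_noisy_test_set(["gives", "keyboard", "swipe"], 1.0):
        print(f"{typed!r} should be corrected to {expected!r}")
```

Seeding the generator matters for a benchmark like this: scores are only comparable across model versions if every run sees the same noisy inputs.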
One important aspect of this task was accessibility. I integrated the benchmark into our CI/CD pipeline (using Bitrise), which uploads the results to a database, and created a dashboard so anyone in the company can check how well our models are performing:
The dashboard makes it easy to compare scores across versions for a given language. Numbers in this screenshot are faked for privacy reasons.
The benchmark was of course aggressively optimized through multiprocessing, to reduce the time needed to test the 82 languages supported.
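As a rough illustration of running per-language benchmarks in parallel, the sketch below uses Python's standard `concurrent.futures.ProcessPoolExecutor`. The test sets, the `correct` placeholder, and the function names are hypothetical stand-ins; the real benchmark calls the keyboard engine and covers all 82 languages.

```python
from concurrent.futures import ProcessPoolExecutor

# Hypothetical stand-in test sets: (typed word, expected word) pairs.
TEST_SETS = {
    "en": [("teh", "the"), ("gives", "gives"), ("wrld", "world")],
    "es": [("hola", "hola"), ("grcias", "gracias")],
    "de": [("danke", "danke"), ("bitte", "bitte")],
}

def correct(word):
    """Placeholder for the engine's auto-correction call (assumption)."""
    fixes = {"teh": "the", "grcias": "gracias"}
    return fixes.get(word, word)

def score_language(language):
    """Return (language, fraction of test pairs corrected properly)."""
    pairs = TEST_SETS[language]
    hits = sum(correct(typed) == expected for typed, expected in pairs)
    return language, hits / len(pairs)

def run_benchmark(languages, max_workers=None):
    """Score each language in its own worker process."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(score_language, languages))

if __name__ == "__main__":
    for language, accuracy in sorted(run_benchmark(list(TEST_SETS)).items()):
        print(f"{language}: {accuracy:.2f}")
```

Languages are independent of each other, so one task per language is a natural unit of work, and the wall-clock time then scales roughly with the number of available cores.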
Auto-correction and the other NLP capabilities of the keyboard rely heavily on a language model to decide word corrections. Creating a better language model is a sure way to improve the quality of the keyboard's NLP features.
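To make the role of the language model concrete, here is a toy sketch of how a model can rank correction candidates given the previous word. It assumes a bigram model with add-one smoothing trained on a tiny made-up corpus; the model type actually used in the Fleksy engine is not described here, and `bigram_prob` and `pick_correction` are hypothetical names.

```python
from collections import Counter

# Toy corpus, for illustration only.
CORPUS = "i love my dog . i love my cat . you love the dog .".split()

unigrams = Counter(CORPUS)
bigrams = Counter(zip(CORPUS, CORPUS[1:]))
VOCAB_SIZE = len(unigrams)

def bigram_prob(previous, word):
    """Estimate P(word | previous) with add-one smoothing."""
    return (bigrams[(previous, word)] + 1) / (unigrams[previous] + VOCAB_SIZE)

def pick_correction(previous, candidates):
    """Return the candidate the language model finds most likely."""
    return max(candidates, key=lambda w: bigram_prob(previous, w))

if __name__ == "__main__":
    # The user typed "my dig"; candidate words come from the typo model.
    print(pick_correction("my", ["dig", "dog", "dug"]))
```

In this setup the candidates are equally plausible as typos, so the context ("my") is what tips the decision, which is why a better language model translates directly into better corrections.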
I updated the language model pipeline with modern best practices and introduced open-source tooling. The first impact of these changes was to reduce the time needed to train a language model from 2 hours to less than 5 minutes.
<aside> 🎯 Skills
Python
NLP
Algorithms
Multiprocessing
C++
</aside>
With these changes, language models showed improvements for 97% of the languages supported.
For some languages, auto-correction was up to 28% better at correcting typos!
I also modified the language models to support additional features, such as auto-correcting casing errors (which is crucial in some languages such as German).
The current engine used by Fleksy is written in C++. I had the opportunity to rewrite a component of the engine, the swipe gesture recognizer, in a more modern and safer language: Rust.
The goal was to explore an alternative language and see whether a partial rewrite of the engine would be worthwhile.
<aside> 🎯 Skills
Rust
C++
Rewrite
Algorithms
</aside>