I worked on improving the NLP features provided by the Fleksy mobile keyboard, such as auto-correction, auto-completion, next-word prediction, and swipe gesture recognition. I worked with Python and C++, and experimented with Rust.


The company

Fleksy is a Spanish start-up, working fully remotely with a team scattered across all the continents.

They develop an SDK that allows developers to create their own fully featured keyboard (with swipe typing, auto-correction, etc.) in 82 different languages.

Projects

Standard NLP benchmark

My first task at Fleksy was to create a proper benchmark for the existing NLP features.

I achieved this by creating a standard test set for each language, then artificially introducing typos with fuzzy typing and generating human-like swipe gestures to score our models.
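To illustrate the idea of fuzzy typing, here is a minimal sketch that swaps characters for neighbouring keys at random. The neighbour map, the `add_typo` name, and the `error_rate` parameter are assumptions made for this example, not the actual benchmark code.

```python
import random

# Hypothetical, partial QWERTY neighbour map; a real layout would cover every key.
NEIGHBOURS = {
    "g": "fhtyvb",
    "i": "uojk",
    "v": "cbfg",
    "e": "wrsd",
    "s": "adwexz",
}


def add_typo(word: str, rng: random.Random, error_rate: float = 0.2) -> str:
    """Replace each character by a neighbouring key with probability `error_rate`."""
    chars = []
    for ch in word:
        neighbours = NEIGHBOURS.get(ch)
        if neighbours and rng.random() < error_rate:
            chars.append(rng.choice(neighbours))  # simulate hitting an adjacent key
        else:
            chars.append(ch)  # keep the intended character
    return "".join(chars)


rng = random.Random(0)  # fixed seed so the generated test set is reproducible
noisy = [add_typo("gives", rng) for _ in range(5)]
print(noisy)
```

Seeding the random generator keeps the noisy test set identical between runs, so scores stay comparable across model versions.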

Examples of generated swipe gestures for the word "gives"
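One simple way to produce such gestures is to interpolate between key centres and add noise. The sketch below does only that; the key coordinates, `generate_swipe`, and its parameters are hypothetical, and the real generator may well use a different, more human-like model.

```python
import random

# Hypothetical key centres (x, y) on a unit-spaced QWERTY grid; only the keys needed here.
KEY_CENTRES = {
    "g": (4.5, 1.0),
    "i": (7.0, 0.0),
    "v": (4.5, 2.0),
    "e": (2.0, 0.0),
    "s": (1.5, 1.0),
}


def generate_swipe(word, rng, points_per_segment=10, jitter=0.1):
    """Return a noisy polyline passing near each key of `word`, in order."""
    centres = [KEY_CENTRES[ch] for ch in word]
    path = []
    for (x0, y0), (x1, y1) in zip(centres, centres[1:]):
        for step in range(points_per_segment):
            t = step / points_per_segment
            # Linear interpolation between consecutive keys, plus Gaussian noise.
            x = x0 + t * (x1 - x0) + rng.gauss(0, jitter)
            y = y0 + t * (y1 - y0) + rng.gauss(0, jitter)
            path.append((x, y))
    # Finish on (or near) the last key of the word.
    x_end, y_end = centres[-1]
    path.append((x_end + rng.gauss(0, jitter), y_end + rng.gauss(0, jitter)))
    return path


gesture = generate_swipe("gives", random.Random(42))
print(len(gesture), gesture[0], gesture[-1])
```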

<aside> 🎯 Skills


Python

Benchmarking

CI/CD

Multiprocessing

</aside>

One important aspect of this task was accessibility. I integrated the benchmark into our CI/CD pipeline (using Bitrise), which uploads the results to a database, and created a dashboard so anyone in the company can check how well our models are performing:

The dashboard makes it easy to compare scores across versions for a given language.
Numbers in this screenshot are faked for privacy reasons.

The benchmark was aggressively optimized with multiprocessing to reduce the time needed to test all 82 supported languages.
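Because each language can be scored independently, the work parallelizes naturally. The following is a minimal sketch of that pattern with Python's `multiprocessing.Pool`; the language codes, `evaluate_language`, and the dummy score are placeholders assumed for illustration, not the real benchmark.

```python
from multiprocessing import Pool

# Hypothetical language codes; the real benchmark covers 82 languages.
LANGUAGES = ["en-US", "es-ES", "de-DE", "fr-FR"]


def evaluate_language(lang: str) -> tuple[str, float]:
    """Placeholder for scoring one language; returns (language, score)."""
    # A real implementation would load the test set and run the models here.
    score = len(lang) / 10  # dummy value standing in for a real metric
    return lang, score


def run_benchmark(languages: list[str], workers: int = 4) -> dict[str, float]:
    """Score every language in parallel, one task per language."""
    with Pool(processes=workers) as pool:
        # Results arrive in completion order; the dict restores lookup by language.
        return dict(pool.imap_unordered(evaluate_language, languages))


if __name__ == "__main__":
    print(run_benchmark(LANGUAGES))
```

`imap_unordered` is used here so that a slow language does not hold back results from faster ones.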

Language Model improvements

Auto-correction, like the other NLP capabilities of the keyboard, relies heavily on a Language Model to decide word corrections. Creating a better Language Model is a sure way to improve the quality of the keyboard's NLP features.
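To show how a language model can decide between candidate corrections, here is a toy sketch using a bigram model with add-one smoothing. The corpus, function names, and the choice of a bigram model are assumptions for this example; the source does not describe the actual model used in the engine.

```python
from collections import Counter

# Tiny toy corpus, purely illustrative.
CORPUS = "she gives him a gift . he gives her a book . they give us lives".split()

VOCAB = set(CORPUS)
context_counts = Counter(CORPUS[:-1])       # how often each word starts a bigram
bigrams = Counter(zip(CORPUS, CORPUS[1:]))  # counts of (previous word, word) pairs


def bigram_prob(prev: str, word: str) -> float:
    """P(word | prev) with add-one (Laplace) smoothing."""
    return (bigrams[(prev, word)] + 1) / (context_counts[prev] + len(VOCAB))


def rank_candidates(prev: str, candidates: list[str]) -> list[str]:
    """Order candidate corrections from most to least likely after `prev`."""
    return sorted(candidates, key=lambda w: bigram_prob(prev, w), reverse=True)


# A typo such as "gibes" could plausibly be any of these words; context picks one.
print(rank_candidates("she", ["lives", "gives", "give"]))
```

In this toy corpus "gives" follows "she" once while the other candidates never do, so it ranks first.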

I updated the language model pipeline with modern best practices and introduced open-source tooling into it. The first impact of these changes was to reduce the time needed to train a language model from 2 hours to less than 5 minutes.

<aside> 🎯 Skills


Python

NLP

Algorithms

Multiprocessing

C++

</aside>

With these changes, language models showed improvements for 97% of the languages supported.

For some languages, auto-correction was up to 28% better at correcting typos!

I also modified the language models to support additional features, such as auto-correcting casing errors (which is crucial in some languages such as German).
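As a rough illustration of casing correction, the sketch below maps a typed word to its most frequent cased variant in a case-sensitive vocabulary. The word counts and the `fix_casing` helper are hypothetical; a real system would likely also use the surrounding context, for example to tell the German verb "essen" from the city "Essen".

```python
from collections import Counter, defaultdict

# Toy case-sensitive word counts; hypothetical values for illustration only.
WORD_COUNTS = Counter({"Haus": 120, "haus": 2, "essen": 80, "Essen": 30, "und": 500})

# Index every cased variant under its lowercase form.
variants: dict[str, Counter] = defaultdict(Counter)
for word, count in WORD_COUNTS.items():
    variants[word.lower()][word] = count


def fix_casing(typed: str) -> str:
    """Return the most frequent cased variant of `typed`, or `typed` if unknown."""
    known = variants.get(typed.lower())
    if not known:
        return typed  # out-of-vocabulary words are left untouched
    return known.most_common(1)[0][0]


print(fix_casing("haus"))
```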

Engine rewrite in Rust

The current engine used by Fleksy is written in C++. I had the opportunity to rewrite one component of the engine, the swipe gesture recognizer, in a more modern and safer language: Rust.

The goal was to explore an alternative language and see whether a partial rewrite of the engine would be worthwhile.

<aside> 🎯 Skills


Rust

C++

Rewrite

Algorithms

</aside>