I worked on several NLP tasks as a Machine Learning Engineer at 42Maru, with a special focus on text summarization.
42Maru is a Korean start-up based in Seoul. They provide QA (Question Answering) solutions based on deep learning.
When I joined the company, my first assignments focused on research across several NLP tasks:
English paraphrasing: I learned and worked with Keras, and implemented a Pervasive Attention neural network.
Document Similarity: I started working on this project right after BERT came out, so I studied the architecture extensively. I used BERT as a service and implemented a Siamese network on top of it, using Keras. This architecture led to great results on the SICK and STS-B datasets.
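To give an idea of the setup (a minimal sketch, not the exact 42Maru architecture): two precomputed BERT sentence embeddings, such as the 768-dimensional vectors served by bert-as-service, go through a shared dense encoder in Keras, and the cosine similarity of the two encodings is regressed against the gold similarity score. Layer sizes and the loss are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import Model, layers

EMB_DIM = 768  # assumed embedding size (bert-as-service default)

def build_siamese(emb_dim: int = EMB_DIM) -> Model:
    # Shared encoder applied to both sentence embeddings (weights are tied)
    shared = tf.keras.Sequential([
        layers.Dense(256, activation="relu"),
        layers.Dense(128, activation="relu"),
    ])

    in_a = layers.Input(shape=(emb_dim,), name="sentence_a")
    in_b = layers.Input(shape=(emb_dim,), name="sentence_b")
    enc_a, enc_b = shared(in_a), shared(in_b)

    # Cosine similarity of the two encodings, rescaled from [-1, 1] to [0, 1]
    cosine = layers.Dot(axes=1, normalize=True)([enc_a, enc_b])
    score = layers.Lambda(lambda x: (x + 1.0) / 2.0)(cosine)

    model = Model(inputs=[in_a, in_b], outputs=score)
    model.compile(optimizer="adam", loss="mse")  # regress the similarity score
    return model
```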
Question Generation: In order to improve the MRC model through data augmentation, a coworker and I implemented a basic attentional encoder-decoder architecture (in PyTorch) for question generation. We then improved the performance by using BERT as the encoder (and an LSTM as the decoder) and achieved SOTA on the SQuAD dataset for QG (a sketch of this encoder-decoder idea follows the results table):
|  | Previous SOTA | Our Model |
| --- | --- | --- |
| BLEU 1 | 45.07 | 49.85 |
| BLEU 2 | 29.58 | 36.79 |
| BLEU 3 | 21.60 | 29.36 |
| BLEU 4 | 16.38 | 23.98 |
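As a rough illustration of the question-generation architecture described above (a BERT encoder feeding an attentional LSTM decoder), here is a minimal PyTorch sketch. The checkpoint name, hidden sizes, and the dot-product attention variant are assumptions, not the exact implementation.

```python
import torch
import torch.nn as nn
from transformers import BertModel

class QuestionGenerator(nn.Module):
    def __init__(self, vocab_size: int, hidden: int = 768):
        super().__init__()
        self.encoder = BertModel.from_pretrained("bert-base-uncased")
        self.embed = nn.Embedding(vocab_size, hidden)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden * 2, vocab_size)

    def forward(self, src_ids, src_mask, tgt_ids):
        # Contextual token representations of the source passage
        enc = self.encoder(input_ids=src_ids, attention_mask=src_mask).last_hidden_state

        # LSTM decoder over the (shifted) target question tokens
        dec, _ = self.decoder(self.embed(tgt_ids))        # (B, T, H)

        # Dot-product attention of each decoder step over the encoder states
        scores = torch.bmm(dec, enc.transpose(1, 2))      # (B, T, S)
        scores = scores.masked_fill(src_mask[:, None, :] == 0, -1e9)
        context = torch.bmm(torch.softmax(scores, dim=-1), enc)

        # Combine decoder state and attended context to predict the next token
        return self.out(torch.cat([dec, context], dim=-1))
```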
<aside> 🎯 Skills
Keras
BERT
Transformer
NLP
Pytorch
Tensorflow
Flask
</aside>
Machine Reading Comprehension: After trying to improve the score of the company's previous model (built with TensorFlow), we implemented two APIs (English and Korean) to make these MRC models available through a website (using Flask).
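A minimal sketch of what such a Flask endpoint can look like; the route, payload fields, and `answer_question` helper are hypothetical placeholders, not the actual 42Maru API.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def answer_question(question: str, context: str) -> str:
    """Placeholder for the loaded MRC model's inference call."""
    raise NotImplementedError

@app.route("/mrc/en", methods=["POST"])
def mrc_en():
    payload = request.get_json()
    answer = answer_question(payload["question"], payload["context"])
    return jsonify({"answer": answer})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```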
Text Summarization was my main task while working at 42Maru.
I developed several models beating the SOTA in English text summarization, and built a production-ready, controllable demonstration.
After learning the basics of text summarization, I used PreSumm as a starting point (because it was the SOTA at the time) and integrated XLNet, a newer model. I was able to improve the SOTA on the CNN/DM dataset in both extractive and abstractive summarization (a sketch of the extractive scoring idea follows the tables):
| Extractive | PreSumm (Previous SOTA) | Ours |
| --- | --- | --- |
| ROUGE 1 | 43.23 | 43.73 |
| ROUGE 2 | 20.24 | 20.50 |
| ROUGE L | 39.63 | 40.08 |
| Abstractive | PreSumm (Previous SOTA) | Ours |
| --- | --- | --- |
| ROUGE 1 | 42.13 | 42.60 |
| ROUGE 2 | 19.60 | 20.16 |
| ROUGE L | 39.18 | 39.67 |
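For context on the extractive side: PreSumm-style extractive models build one vector per sentence (from a [CLS]-like token inserted before each sentence) and train a small classifier to decide whether each sentence belongs in the summary; the work here swapped XLNet in as the encoder inside the PreSumm code base. Below is a minimal sketch of such a scoring head, with the hidden size and selection rule as assumptions.

```python
import torch
import torch.nn as nn

class ExtractiveHead(nn.Module):
    """Scores each sentence of a document for inclusion in the summary."""

    def __init__(self, hidden: int = 768):
        super().__init__()
        self.scorer = nn.Linear(hidden, 1)

    def forward(self, sent_vecs: torch.Tensor) -> torch.Tensor:
        # sent_vecs: (batch, n_sentences, hidden) per-sentence encoder vectors
        return torch.sigmoid(self.scorer(sent_vecs)).squeeze(-1)

def select_summary(sentences, scores, k: int = 3):
    # Keep the k highest-scoring sentences, in their original document order
    top = sorted(range(len(sentences)), key=lambda i: -scores[i])[:k]
    return [sentences[i] for i in sorted(top)]
```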
<aside> 🎯 Skills
Pytorch
BART
Transformers
Fairseq
Beam search
Tensorflow 2
</aside>
In addition to better scores, my contributions made it possible to generate summaries from documents of any length, whereas the original PreSumm code limits the length of the input document.
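The write-up does not detail how this length limitation was lifted, so the following is only a labelled assumption showing one generic workaround: split the document into overlapping, encoder-sized chunks, summarize each chunk, and join the partial summaries. The `summarize_chunk` callable is a hypothetical stand-in for a call into the summarization model.

```python
from typing import Callable, List

def chunk_tokens(tokens: List[str], max_len: int = 512, stride: int = 384) -> List[List[str]]:
    # Overlapping windows so content cut at a chunk boundary still appears in full once
    return [tokens[i:i + max_len] for i in range(0, max(1, len(tokens) - max_len + stride), stride)]

def summarize_long(tokens: List[str], summarize_chunk: Callable[[List[str]], str]) -> str:
    # `summarize_chunk` is a hypothetical stand-in for the per-chunk model call
    return " ".join(summarize_chunk(chunk) for chunk in chunk_tokens(tokens))
```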
Later, new models (BART, PEGASUS, etc.) were released, beating the SOTA by a large margin. I studied these models and their frameworks (transformers, fairseq).
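As a concrete example of what working with these models looks like, here is a small sketch of running the released BART summarization checkpoint through the transformers library with beam search; the generation settings are common defaults, not tuned values from this work.

```python
from transformers import BartForConditionalGeneration, BartTokenizer

tok = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

def summarize(document: str) -> str:
    inputs = tok(document, truncation=True, max_length=1024, return_tensors="pt")
    summary_ids = model.generate(
        inputs["input_ids"],
        num_beams=4,             # beam search decoding
        max_length=142,          # CNN/DM-style summary length
        no_repeat_ngram_size=3,  # avoid repeated trigrams in the output
    )
    return tok.decode(summary_ids[0], skip_special_tokens=True)
```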
I was able to improve the BART model by adding a mechanism that lets us control the abstractiveness of the generated summaries, without even re-training the model! This mechanism also gave better results:
|  | BART (Previous SOTA) | Ours |
| --- | --- | --- |
| ROUGE 1 | 44.16 | 44.86 |
| ROUGE 2 | 21.28 | 21.60 |
| ROUGE L | 40.90 | 41.77 |
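The actual control mechanism is not described here, so the sketch below is only a generic illustration of decode-time control: it pushes a pretrained BART toward more abstractive output, without any retraining, by banning trigrams copied verbatim from the source through `generate`'s `bad_words_ids` argument. This is an assumption for illustration, not the mechanism used at 42Maru.

```python
from transformers import BartForConditionalGeneration, BartTokenizer

tok = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

def source_trigrams(ids):
    # Every consecutive 3-token sequence appearing in the source document
    return [ids[i:i + 3] for i in range(len(ids) - 2)]

def more_abstractive_summary(document: str) -> str:
    enc = tok(document, truncation=True, max_length=1024, return_tensors="pt")
    banned = source_trigrams(enc["input_ids"][0].tolist())  # illustrative copy penalty
    summary_ids = model.generate(
        enc["input_ids"],
        num_beams=4,
        max_length=142,
        bad_words_ids=banned,  # forbid generating source trigrams verbatim
    )
    return tok.decode(summary_ids[0], skip_special_tokens=True)
```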
But BART is a big model, and the GPUs available at the company weren't big enough to train it.