I worked on several NLP tasks as a Machine Learning Engineer at 42Maru, with a special focus on text summarization.


The company

42Maru is a Korean start-up based in Seoul.

They provide QA (Question Answering) solutions based on deep learning.

Projects

NLP research

When I joined the company, my first assignments focused on research across several NLP tasks:

<aside> 🎯 Skills


Keras

BERT

Transformer

NLP

PyTorch

TensorFlow

Flask

</aside>

Text summarization

Text Summarization was my main task while working at 42Maru.

I developed several models beating the SOTA in English text summarization, and built a production-ready, controllable demonstration.


After learning the basics of text summarization, I used PreSumm as a starting point (it was the SOTA at the time) and integrated XLNet, a newer model. This let me improve on the previous SOTA on the CNN/DailyMail dataset in both extractive and abstractive summarization:

| Extractive | PreSumm (Previous SOTA) | Ours  |
| ---------- | ----------------------- | ----- |
| ROUGE-1    | 43.23                   | 43.73 |
| ROUGE-2    | 20.24                   | 20.50 |
| ROUGE-L    | 39.63                   | 40.08 |

| Abstractive | PreSumm (Previous SOTA) | Ours  |
| ----------- | ----------------------- | ----- |
| ROUGE-1     | 42.13                   | 42.60 |
| ROUGE-2     | 19.60                   | 20.16 |
| ROUGE-L     | 39.18                   | 39.67 |
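To give an idea of what this kind of integration looks like, here is a minimal sketch of using XLNet as the document encoder for extractive summarization with Hugging Face's transformers. The class name, the sentence-scoring head, and the way sentence positions are passed in are simplified assumptions, not the actual PreSumm-based code.

```python
# Minimal sketch (illustrative, not the actual PreSumm integration):
# XLNet encodes the document, and a small linear head scores each sentence
# for inclusion in the extractive summary.
import torch
import torch.nn as nn
from transformers import XLNetModel


class XLNetExtractiveSummarizer(nn.Module):
    def __init__(self, model_name="xlnet-base-cased"):
        super().__init__()
        self.encoder = XLNetModel.from_pretrained(model_name)
        self.classifier = nn.Linear(self.encoder.config.d_model, 1)

    def forward(self, input_ids, attention_mask, sent_start_positions):
        # Contextual token representations for the whole document.
        hidden_states = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state                              # (batch, seq, hidden)
        # One vector per sentence: the representation of its first token.
        batch_idx = torch.arange(hidden_states.size(0)).unsqueeze(1)
        sent_vectors = hidden_states[batch_idx, sent_start_positions]
        # Per-sentence relevance scores in [0, 1]; the top-scoring sentences
        # form the extractive summary.
        return torch.sigmoid(self.classifier(sent_vectors)).squeeze(-1)
```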

<aside> 🎯 Skills


PyTorch

BART

Transformers

Fairseq

Beam search

TensorFlow 2

</aside>

In addition to better scores, my contributions made it possible to generate summaries from documents of any length, whereas the original PreSumm code limits the length of the input document.
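The exact change that lifted the limit is not detailed here, so the snippet below only sketches one common workaround: split the document into encoder-sized chunks, summarize each chunk, and merge the partial summaries. `summarize_fn` is a hypothetical callable wrapping whichever summarization model is used.

```python
# Illustrative sketch only: chunk a long document so that each piece fits the
# encoder's token limit, summarize the pieces, and concatenate the results.
def summarize_long_document(text, summarize_fn, tokenizer,
                            max_tokens=512, stride=64):
    token_ids = tokenizer.encode(text, add_special_tokens=False)
    # Overlapping windows so no chunk starts completely without context.
    chunks = [token_ids[start:start + max_tokens]
              for start in range(0, len(token_ids), max_tokens - stride)]
    partial_summaries = [summarize_fn(tokenizer.decode(chunk))
                         for chunk in chunks]
    return " ".join(partial_summaries)
```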


Later, new models (BART, PEGASUS, etc.) were released, beating the SOTA by a large margin. I studied these models and their frameworks (transformers, fairseq).

I improved the BART model by adding a mechanism that lets us control the abstractiveness of the generated summaries, without even re-training the model! This mechanism also gave better results:

|         | BART (Previous SOTA) | Ours  |
| ------- | -------------------- | ----- |
| ROUGE-1 | 44.16                | 44.86 |
| ROUGE-2 | 21.28                | 21.60 |
| ROUGE-L | 40.90                | 41.77 |
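The mechanism itself is not spelled out here, so the sketch below shows only one plausible way to bias abstractiveness at decoding time without retraining: a custom Hugging Face `LogitsProcessor` that rescales the beam-search scores of tokens appearing in the source document. The `CopyPenaltyProcessor` class and its `alpha` knob are illustrative assumptions, not the actual mechanism.

```python
# Illustrative sketch only: bias beam search away from (or toward) copying
# source tokens, without any re-training of the BART model.
import torch
from transformers import (BartForConditionalGeneration, BartTokenizer,
                          LogitsProcessor, LogitsProcessorList)


class CopyPenaltyProcessor(LogitsProcessor):
    """Subtract `alpha` from the score of every token present in the source.

    alpha > 0 discourages copying (more abstractive summaries),
    alpha < 0 encourages copying (more extractive summaries).
    """
    def __init__(self, source_token_ids, alpha):
        self.source_token_ids = source_token_ids
        self.alpha = alpha

    def __call__(self, input_ids, scores):
        scores[:, self.source_token_ids] -= self.alpha
        return scores


tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

document = "Long news article to summarize ..."
inputs = tokenizer(document, return_tensors="pt", truncation=True)
# Unique token ids of the source (special tokens would ideally be excluded).
source_ids = torch.unique(inputs["input_ids"])

summary_ids = model.generate(
    **inputs,
    num_beams=4,
    max_length=142,
    logits_processor=LogitsProcessorList(
        [CopyPenaltyProcessor(source_ids, alpha=1.0)]
    ),
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

Raising `alpha` steers the beam away from source wording (more abstractive summaries); a negative value encourages copying (more extractive ones).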

But BART is a big model, and the GPUs available at the company weren't big enough to train it.