The 13th Belgium NLP Meetup surely lived up to its serial number, because it took a bit longer than expected to plan. However, we’re now more than ready to kick off 2020 in great fashion. Our next event will bring you news about the latest Dutch (Ro)BERT(a) model, insights into NLP for intellectual property and the state of the art in medical records management. It will be hosted by DigitYser in Brussels on March 12th. Doors open at 7pm, talks start around 7.30pm. We hope to see you all there!

RobBERT: A Dutch RoBERTa-based Language Model
Pieter Delobelle, KU Leuven
Pre-trained language models have been dominating the field of natural language processing in recent years, and have led to significant performance gains for various complex natural language tasks. One of the most prominent pre-trained language models is BERT. Although the multilingual version of BERT performs well on many tasks, recent studies showed that BERT models trained on a single language significantly outperform the multilingual version.
For this reason, we present a Dutch model based on RoBERTa, which we call RobBERT. We show that RobBERT improves state-of-the-art results in Dutch-specific language tasks.

Natural Language Processing for Intellectual Property
Vignesh Baskaran, Darts-IP
From analyzing complex judicial documents across multiple languages to finding trademarks that look alike among millions of registered trademarks in just a couple of seconds, algorithms at Darts-IP assist lawyers every single day to make their work more interesting. Vignesh, the first data scientist at Darts-IP, will walk us through their history of writing algorithms and demonstrate them at work!

Turning Medical Records into Actionable Knowledge
Brice de Behault, EarlyTracks
Medical records represent an obvious mine of information for patient treatment. Both quality of care and operational excellence can benefit significantly from continuously improving data-driven applications. Unfortunately, practitioners do not perceive the importance of these developments, and most medical records are plain text, unfit for direct use by machines. These millions of documents in every single hospital therefore represent an untapped source of information. To turn this data into actionable knowledge, EarlyTracks uses tailored named entity recognition and semantic technologies (medical ontologies) to provide practitioners with automatically generated patient summaries. EarlyTracks now works with 15 hospitals in Belgium and is a catalyst for innovation in these institutions.


