Geoffrey Scoutheeten
Engineer, published NLP researcher, and builder. I have spent twenty years where mathematics meets software — pricing derivatives, building search engines, doing research on language models — and I now teach companies how AI actually works, so they can use it with method and confidence.
Why this matters for your training
Plenty of AI trainers discovered the field in 2023. I co-authored peer-reviewed research on controlling hallucinations in text generation back when "hallucination" was a research term, not a headline. When your team asks why the model invents facts — and how much they can trust it — they get an answer from someone who studied the problem at the source.
Career
- Today — Founder: building PanPerfect, an AI-powered thermal cooking assistant. Hands-on daily with LLMs, agents, and AI-assisted development.
- 2022 — Alma, Senior Staff Engineer: French buy-now-pay-later fintech scale-up.
- 2016 – 2022 — BNP Paribas CIB, Senior Data Scientist: internal analytics consulting for the whole corporate & investment bank. Led the design and delivery of an internal full-text search engine ranking results across heterogeneous sources while enforcing access rights. Co-supervised three CIFRE PhD theses on natural language processing.
- 2008 – 2016 — BNP Paribas CIB, Quantitative Analyst: equity derivatives pricing engine — performance, parallelization, and cluster scheduling.
- 2006 – 2008 — Altran, Software Engineer: C++ on a WiMAX system-on-chip at SEQUANS Communications, then Java tooling for BNP Paribas equity derivatives.
Research & publications
Through the CIFRE PhD theses I co-supervised at BNP Paribas, I co-authored research in natural language processing, published at EMNLP, INLG, and ECIR and in the peer-reviewed journal Data Mining and Knowledge Discovery:
- Controlling hallucinations at word level in data-to-text generation — Data Mining and Knowledge Discovery, 2021
- Data-QuestEval: A Referenceless Metric for Data-to-Text Semantic Evaluation — EMNLP 2021
- PARENTing via Model-Agnostic Reinforcement Learning to Correct Pathological Behaviors in Data-to-Text Generation — INLG 2020
- Let's Stop Incorrect Comparisons in End-to-end Relation Extraction! — EMNLP 2020
- Separating Retention from Extraction in the Evaluation of End-to-end Relation Extraction — EMNLP 2021
- A Hierarchical Model for Data-to-Text Generation — ECIR 2020
Full list on ACL Anthology and Semantic Scholar.
Teaching
- Designed and delivered generative AI training sessions for non-technical professional audiences — live demos, hallucination spotting, tool panoramas, societal stakes.
- Co-supervised three PhD students (CIFRE) on NLP research topics.
- Mathematics mock examiner ("colleur") in MP* preparatory classes — teaching has been part of the journey from the start.
Education
- École Polytechnique (X2001), mathematics program
- Master M2, ENS Cachan — Mathematics, Vision, Learning (MVA)
- École Nationale Supérieure des Télécommunications, Paris
Languages
French (native), English (fluent), German (intermediate).