Lydia Nishimwe

AI Research Scientist | PhD

Biography

AI Research Scientist working on large-scale Transformer models, with a focus on how they behave under distribution shift and how to make them more robust to real-world data.

I completed my PhD at Inria Paris and Sorbonne Université, where I studied the interaction between data, scale, and model behaviour in neural systems, from representation learning to generation.

My work sits at the intersection of model design, evaluation, and real-world reliability, with a particular interest in understanding and improving how AI models generalise beyond clean benchmarks.

Beyond research, I enjoy writing, public speaking, and exploring languages and cultures.

Currently exploring research scientist and applied AI roles.

Nishimwe (/niːʃiːm^ŋé/) is a Rwandan name meaning ‘Thanks be to God.’
Fun fact: there is another Lydia Nishimwe who is a singer—we are not related.

Interests

LLMs & Foundation Models
Representation Learning
Robustness, Evaluation & Distribution Shift
Multimodality & Multilinguality

Education

PhD in Computer Science, 2021-2025
Inria Paris, Sorbonne Université

MEng in Mathematics and Computer Science, 2017-2021
École Centrale de Nantes

BSc in Mathematics and Computer Science, 2014-2017
Université Grenoble Alpes

Languages

English: Native
French: Native
Spanish: Advanced
Swahili: Intermediate
German: Intermediate
Kinyarwanda: Elementary

Experience

AI Research Scientist (PhD Candidate)

ALMAnaCH Team, Inria

Oct 2021 – Jun 2025 | Paris, France

Topic: Robust Neural Machine Translation of User-Generated Content
Supervised by Benoît SAGOT and Rachel BAWDEN, defended on June 18, 2025

Designed and trained large-scale Transformer models (100M–1.3B parameters) for multilingual representation learning and generation.
Studied how performance varies with data scale, noise, and distribution shift, using datasets ranging from thousands to hundreds of millions of samples.
Developed robust representation learning approaches, aligning noisy and standard inputs to improve generalisation.
Explored trade-offs between normalisation, robustness, and semantic preservation using BERT-style models and LLMs.
Fine-tuned multilingual models on real-world data, demonstrating robustness gains without sacrificing standard performance.
Analysed scaling effects, showing that larger models reduce sensitivity to surface noise but introduce new distribution-related failure modes.
Framed robustness as learning across heterogeneous data distributions, with implications for real-world and multimodal AI systems.

Tech stack: Python, PyTorch (Fairseq, Hugging Face Transformers), Pandas, Scikit-Learn, SLURM

AI Research Intern

Orange Labs

Jun 2020 – Dec 2020 | Lannion, France

Studied sequence generation strategies, analysing trade-offs between quality, latency, and search complexity.
Designed controlled experiments to isolate decoding behaviour from model architecture.
Identified key failure modes in neural generation, shaping an interest in evaluation and reliability of generative models.

Tech stack: Python, TensorFlow, Keras, Pandas, Scikit-Learn

Software Development Intern

Mean-In-Full

May 2017 – Jul 2017 | Meylan, France

Integrated an external LMS (Opencast) into a production platform, focusing on API reliability and data flow across services.
Worked on backend systems in Erlang, gaining early exposure to distributed system interactions.

Tech stack: Erlang, HTTP

Assembly Programming Intern

Laboratoire TIMA

May 2016 – Jun 2016 | Grenoble, France

Topic: Functional verification of an ARM7 microprocessor

Verified and debugged an ARM7 microprocessor using VHDL simulations and assembly-level test programs.
Investigated execution behaviour (pipeline hazards, instruction decoding), developing an early focus on system correctness and reliability.

Tech stack: VHDL, C, ARM Assembly, ModelSim

🏆Stage d’excellence (Excellence Internship Program) - Université Grenoble Alpes🏆

Functional Programming Intern

Laboratoire VERIMAG

Jun 2015 – Jun 2015 | Grenoble, France

Modelled and simulated GPS trajectories using synchronous functional languages (Lustre, Lutin).
Worked on temporal system behaviour, introducing a formal and structured approach to modelling and reasoning.

Tech stack: Lutin, Lustre

🏆Stage d’excellence (Excellence Internship Program) - Université Grenoble Alpes🏆