Lydia Nishimwe
When the Gold Standard Isn't Necessarily Standard: Challenges of Evaluating the Translation of User-Generated Content
Preprint
Lydia Nishimwe, Benoît Sagot, Rachel Bawden
Robust Neural Machine Translation of User-Generated Content
🎓 PhD Thesis 🎓
Lydia Nishimwe
Making Sentence Embeddings Robust to User-Generated Content
LREC-COLING 2024
Lydia Nishimwe, Benoît Sagot, Rachel Bawden
Your Fairseq-trained model might have more embedding parameters than it should.
How a bug in reading SentencePiece vocabulary files causes some Fairseq-trained models to have up to 3k extra parameters in the embedding layer.
Lydia Nishimwe, posted on Mar 16, 2024
Last updated on Jan 8, 2026
Fairseq Bug Fix
A bug in reading SentencePiece vocabulary files causes models to have 3k extra params in the embedding layer.
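The "~3k extra parameters" figure follows directly from how embedding layers are sized: any spurious entries in the vocabulary each add a full embedding row. A minimal sketch of that arithmetic, with an assumed embedding dimension and vocabulary size (the concrete numbers here are illustrative, not taken from the post):

```python
# Sketch: a few spurious vocabulary entries inflate the embedding layer
# by (extra entries) x (embedding dimension) parameters.
def embedding_params(vocab_size: int, embed_dim: int) -> int:
    # An embedding matrix holds one embed_dim-sized row per vocabulary entry.
    return vocab_size * embed_dim

embed_dim = 512                  # assumed model dimension
true_vocab = 8000                # assumed SentencePiece vocabulary size
misread_vocab = true_vocab + 6   # e.g. 6 entries added by a vocab-reading bug

extra = embedding_params(misread_vocab, embed_dim) - embedding_params(true_vocab, embed_dim)
print(extra)  # 6 * 512 = 3072, i.e. ~3k extra parameters
```

With a larger embedding dimension or more spurious entries, the waste scales linearly in both factors.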
RoLASER
Making LASER sentence embeddings robust to user-generated content via Knowledge Distillation and Data Augmentation.
Normalisation lexicale de contenus générés par les utilisateurs sur les réseaux sociaux (Lexical Normalisation of User-Generated Content on Social Media)
🏆 Prix du Meilleur Article (Best Paper Award) - RÉCITAL 2023 🏆
Lydia Nishimwe
Inria-ALMAnaCH at the WMT 2022 shared task: Does Transcription Help Cross-Script Machine Translation?
Jesujoba O. Alabi, Lydia Nishimwe, Benjamin Muller, Camille Rey, Benoît Sagot, Rachel Bawden