Hugging Face Introduces mmBERT, a Multilingual Encoder for 1,800+ Languages

Wait 5 sec.

Hugging Face has released mmBERT, a new multilingual encoder trained on more than 3 trillion tokens across 1,833 languages. The model builds on the ModernBERT architecture and is the first to significantly improve upon XLM-R, a long-time baseline for multilingual understanding tasks. By Robert Krzaczyński