
LLäMmlein 1B and 120M – German-only decoder models


We created two German-only decoder models, LLäMmlein 120M and 1B, from scratch. The project involved several key steps, including extensive data preprocessing, the creation of a custom tokenizer, and optimization of training settings to make effective use of the available hardware. Throughout training, checkpoints were saved and analyzed to monitor the models' learning dynamics. LLäMmlein 1B also achieved results comparable to those of larger models, with no significant performance difference observed.


