Can small language models revitalize Indigenous languages?
Brooke Tanner and Cameron Kerry discuss how small language models can offer a practical solution for low-resource communities.
As the release of DeepSeek-R1 demonstrated, techniques often used in SLMs, such as distillation and reinforcement learning, can yield models that train faster, consume less energy, and run more efficiently on devices with limited computational power and low-bandwidth connectivity. SLMs can also be trained on much smaller, language-specific datasets, facilitating the development of tailored language tools that support preservation and revitalization efforts through applications like spellcheckers, word predictors, machine translation systems, and digital documentation platforms. In northern Europe, research groups such as Divvun and Giellatekno have received funding from the Norwegian government to develop digital support for the Indigenous Sámi languages, including spellcheckers, predictive text systems, and machine translation.
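To make the distillation idea concrete: at its core, distillation trains a small "student" model to match the temperature-softened output distribution of a larger "teacher" model. The sketch below computes the standard distillation loss (KL divergence between softened distributions) on toy logits; all values and names are illustrative, not taken from DeepSeek-R1 or any particular system.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature softens the distribution."""
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions,
    scaled by T^2 as in the standard knowledge-distillation formulation."""
    p = softmax(teacher_logits, temperature)  # soft teacher targets
    q = softmax(student_logits, temperature)  # student predictions
    return float(temperature ** 2 * np.sum(p * (np.log(p) - np.log(q))))

# Toy logits over a 3-word vocabulary (illustrative values only)
teacher = [2.0, 1.0, 0.1]
student = [1.5, 1.2, 0.3]
loss = distillation_loss(student, teacher)
```

In practice this loss is combined with an ordinary cross-entropy term on labeled data, letting a compact student inherit much of the teacher's behavior at a fraction of the compute, which is what makes the approach attractive for low-resource settings.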