What even is a small language model now?


If you asked someone in 2018 what a "small model" was, they'd probably say something with a few million parameters that ran on a Raspberry Pi or your phone. Fast-forward to today, and we're calling 30B-parameter models "small" because they only need ...

Back in the early days of machine learning, a "small model" might've been a decision tree or a basic neural net that could run on a laptop CPU.

Examples: Meta Llama 3 70B (quantized), MPT-30B

Use cases: internal RAG pipelines, chatbot endpoints, summarizers, code assistants

In 2016, Google switched to a neural machine translation system, GNMT, which uses an encoder-decoder architecture with long short-term memory (LSTM) layers and attention mechanisms.
