
Why are your models so big? (2023)


I don’t understand why today’s LLMs are so large. Some of the smallest models getting coverage sit at 2.7B parameters, but even this seems pretty big to me. If you need generalizability, I totally get it. Things like chat applications require a high level of semantic awareness, and the model has to respond in a manner that’s convincing enough to its users. In cases where you want the LLM to produce something human-like, it makes sense that the brains would need to be a little juiced up.



