Why are your models so big? (2023)
I don’t understand why today’s LLMs are so large. Some of the smallest models getting coverage sit at 2.7B parameters, but even that seems pretty big to me. If you need generalizability, I totally get it. Things like chat applications require a high level of semantic awareness, and the model has to respond in a way its users find convincing. In cases where you want the LLM to produce something human-like, it makes sense that the brains would need to be a little juiced up.
