Train a 70b language model at home
We’re releasing an open source system, based on FSDP and QLoRA, that can train a 70b model on two 24GB GPUs.
Today, we’re releasing Answer.AI’s first project: a fully open source system that, for the first time, can efficiently train a 70b large language model on a regular desktop computer with two or more standard gaming GPUs (RTX 3090 or 4090). Figuring out how to make large model training inexpensive and accessible is just the kind of thing Eric Ries and Jeremy Howard hoped we’d be able to do when the organization was launched at NeurIPS last year.

Hugging Face played a key role here, creating the PEFT library, which made LoRA training far simpler, and integrating it directly with bitsandbytes so that anyone can use QLoRA with just a few lines of code.
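To make that concrete, here is a minimal sketch of what "a few lines of code" looks like when combining PEFT with bitsandbytes. The model name, LoRA hyperparameters, and target modules are illustrative assumptions, and this plain single-process setup is not the FSDP-sharded training system being released here; it only shows the QLoRA building block.

```python
# A minimal QLoRA setup sketch using PEFT + bitsandbytes.
# Model id and hyperparameters below are illustrative assumptions,
# not the exact configuration used by this project.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "meta-llama/Llama-2-70b-hf"  # hypothetical choice of base model

# Quantize the frozen base weights to 4-bit NF4 (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Attach small trainable LoRA adapters; the 4-bit base stays frozen.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of params train
```

The 4-bit quantization and the LoRA adapters compose through standard library calls; the hard part this project tackles is making that combination work under FSDP, so the quantized model can be sharded across two or more 24GB gaming GPUs.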