LLM from scratch, part 28 – training a base model from scratch on an RTX 3090
I felt like it should be possible to train a GPT-2 small-level model from scratch on my own hardware, using modern tools and open datasets. It was!