LLM from scratch, part 28 – training a base model from scratch on an RTX 3090


I felt it should be possible to train a GPT-2-small-level model from scratch on my own hardware, using modern tools and open datasets. It was!
