
SmolGPT: A minimal PyTorch implementation for training a small LLM from scratch


Source: the Om-Alve/smolGPT repository on GitHub.

Designed for educational purposes and simplicity, featuring efficient training, flash attention, and modern sampling techniques.

Minimal Codebase: Pure PyTorch implementation with no abstraction overhead.

Modern Architecture: GPT model with:
- Flash Attention (when available)
- RMSNorm and SwiGLU
- Efficient top-k/p/min-p sampling

Note: This implementation is inspired by modern LLM training practices and adapted for educational purposes.
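To make the top-k/p/min-p sampling mentioned above concrete, here is a minimal sketch of how the three filters are commonly defined. It is not taken from the smolGPT repository: the function names `softmax` and `filter_probs` are hypothetical, and plain Python is used instead of PyTorch so the example runs without dependencies.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities, with temperature scaling."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def filter_probs(probs, top_k=None, top_p=None, min_p=None):
    """Zero out tokens rejected by any active filter, then renormalize.

    Hypothetical helper for illustration; not smolGPT's actual API.
    """
    # Token indices sorted from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order)

    if top_k is not None:
        # top-k: keep only the k most probable tokens.
        keep &= set(order[:top_k])

    if top_p is not None:
        # top-p (nucleus): keep the smallest sorted prefix whose
        # cumulative probability reaches top_p.
        cumulative, nucleus = 0.0, set()
        for i in order:
            nucleus.add(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        keep &= nucleus

    if min_p is not None:
        # min-p: keep tokens whose probability is at least
        # min_p times the probability of the most likely token.
        threshold = min_p * probs[order[0]]
        keep &= {i for i in keep if probs[i] >= threshold}

    kept = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(kept)
    return [p / total for p in kept]
```

A next token would then be drawn from the filtered distribution, for example with `random.choices(range(len(probs)), weights=filtered)`. A PyTorch version would typically apply the same masks to a logits tensor before sampling.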
