SmolGPT: A minimal PyTorch implementation for training a small LLM from scratch
Repository: Om-Alve/smolGPT on GitHub.
Designed for educational purposes and simplicity, featuring efficient training, Flash Attention, and modern sampling techniques.

- Minimal Codebase: pure PyTorch implementation with no abstraction overhead
- Modern Architecture: GPT model with:
  - Flash Attention (when available)
  - RMSNorm and SwiGLU
  - Efficient top-k/top-p/min-p sampling

Note: This implementation is inspired by modern LLM training practices and adapted for educational purposes.
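To make the architecture bullet concrete, here is a minimal sketch of a Transformer block combining the pieces the README names: RMSNorm, a SwiGLU feed-forward, and causal self-attention routed through `torch.nn.functional.scaled_dot_product_attention`, which dispatches to Flash Attention kernels when they are available. Class and parameter names here are illustrative assumptions, not taken from the smolGPT source.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RMSNorm(nn.Module):
    def __init__(self, dim: int, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x):
        # Normalize by the root-mean-square of the features, then rescale.
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * rms * self.weight


class SwiGLU(nn.Module):
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.w_gate = nn.Linear(dim, hidden, bias=False)
        self.w_up = nn.Linear(dim, hidden, bias=False)
        self.w_down = nn.Linear(hidden, dim, bias=False)

    def forward(self, x):
        # Gated feed-forward with a SiLU-activated gate (SwiGLU).
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))


class Block(nn.Module):
    def __init__(self, dim: int, n_heads: int):
        super().__init__()
        assert dim % n_heads == 0
        self.n_heads = n_heads
        self.attn_norm = RMSNorm(dim)
        self.ffn_norm = RMSNorm(dim)
        self.qkv = nn.Linear(dim, 3 * dim, bias=False)
        self.proj = nn.Linear(dim, dim, bias=False)
        self.ffn = SwiGLU(dim, 4 * dim)

    def forward(self, x):
        b, t, d = x.shape
        q, k, v = self.qkv(self.attn_norm(x)).chunk(3, dim=-1)
        # (b, t, d) -> (b, n_heads, t, head_dim)
        q, k, v = (
            z.view(b, t, self.n_heads, d // self.n_heads).transpose(1, 2)
            for z in (q, k, v)
        )
        # Uses Flash Attention / memory-efficient kernels when available.
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        y = y.transpose(1, 2).contiguous().view(b, t, d)
        x = x + self.proj(y)
        return x + self.ffn(self.ffn_norm(x))
```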
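And a rough sketch of combined top-k / top-p (nucleus) / min-p filtering over next-token logits, in the spirit of the sampling the README mentions. The function name and default thresholds are assumptions for illustration, not the repo's API.

```python
import torch


def sample_next_token(logits, temperature=1.0, top_k=50, top_p=0.9, min_p=0.05):
    # logits: (batch, vocab_size) for the next token.
    logits = logits / max(temperature, 1e-8)
    probs = torch.softmax(logits, dim=-1)

    # min-p: drop tokens whose probability is below min_p * max probability.
    if min_p is not None:
        floor = min_p * probs.max(dim=-1, keepdim=True).values
        probs = probs.masked_fill(probs < floor, 0.0)

    # top-k: keep only the k most likely tokens.
    if top_k is not None:
        kth = torch.topk(probs, top_k, dim=-1).values[..., -1, None]
        probs = probs.masked_fill(probs < kth, 0.0)

    # top-p: keep the smallest set of tokens whose cumulative mass covers top_p.
    if top_p is not None:
        sorted_probs, sorted_idx = torch.sort(probs, descending=True, dim=-1)
        cumulative = torch.cumsum(sorted_probs, dim=-1)
        outside_nucleus = cumulative - sorted_probs >= top_p
        sorted_probs = sorted_probs.masked_fill(outside_nucleus, 0.0)
        probs = torch.zeros_like(probs).scatter_(-1, sorted_idx, sorted_probs)

    # Renormalize the surviving tokens and sample one.
    probs = probs / probs.sum(dim=-1, keepdim=True)
    return torch.multinomial(probs, num_samples=1)
```

Applying the filters in sequence keeps only tokens that pass all three criteria, which is one common way to combine them; the repo may order or implement the filters differently.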