llm.c – LLM training in simple, pure C/CUDA
For the repo, I'd like to maintain clean, simple reference implementations alongside much more optimized versions that can come close to PyTorch, but in a tiny fraction of the code and dependencies.

This script will download the GPT-2 (124M) model, overfit a single batch of data for 10 iterations, run a few steps of generation, and most importantly it will save two files: 1) gpt2_124M.bin, which contains the raw model weights for loading in C, and 2) gpt2_124M_debug_state.bin, which also contains more debug state: the inputs, targets, logits and loss.

Simply, there are implementations of the forward and backward pass of all the layers, and they get strung together into a large, manual, forward/backward/update loop.
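That pattern is easiest to see in miniature. Below is a minimal, self-contained C sketch of a manual forward/backward/update loop that overfits a single batch for 10 iterations, in the spirit of the description above. It uses a toy two-parameter linear model with plain SGD rather than llm.c's actual GPT-2 layers or optimizer, and every name in it is illustrative, not taken from the repo.

```c
// Toy illustration of a manual forward/backward/update training loop.
// None of these names come from llm.c; this is a sketch of the pattern only.
#include <stdio.h>

int main(void) {
    // a single "batch" of data, which we deliberately overfit
    const float x[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    const float y[4] = {3.0f, 5.0f, 7.0f, 9.0f}; // targets for y = 2x + 1
    float w = 0.0f, b = 0.0f;                    // model parameters
    const float lr = 0.05f;                      // learning rate

    for (int step = 0; step < 10; step++) {
        float loss = 0.0f, dw = 0.0f, db = 0.0f;
        for (int i = 0; i < 4; i++) {
            // forward pass: prediction and mean squared error loss
            float pred = w * x[i] + b;
            float err = pred - y[i];
            loss += err * err / 4.0f;
            // backward pass: accumulate gradients of the loss by hand
            dw += 2.0f * err * x[i] / 4.0f;
            db += 2.0f * err / 4.0f;
        }
        // update: plain SGD step on the raw parameters
        w -= lr * dw;
        b -= lr * db;
        printf("step %d: loss %f\n", step, loss);
    }
    return 0;
}
```

In llm.c the same three phases exist, just at GPT-2 scale: each layer contributes a forward and a backward function, their activations and gradients live in flat arrays, and a top-level loop calls them in sequence before applying the parameter update.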