karpathy/nano-llama31
A nanoGPT-style version of Llama 3.1.
Getting the models to run turns out not to be trivial, because Meta's official repo does not seem to include documentation or instructions on how to actually use the models once you download them. You'll see that this prints the same result as the reference code above, giving us confidence that this single file of PyTorch is a bug-free adaptation. Training currently requires quite a bit of VRAM; e.g. training only the RMSNorm parameters still takes up a good chunk of my 80GB GPU.
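The "same result as the reference" check can be illustrated in miniature. The sketch below is not code from the repo: it is a toy, standard-library-only example that writes RMSNorm (the layer mentioned above) two ways and confirms the outputs agree within a tolerance. The function names `rmsnorm_reference`, `rmsnorm_compact`, and `outputs_match`, the `eps` default, and the sample values are all assumptions made for illustration.

```python
import math


def rmsnorm_reference(x, weight, eps=1e-5):
    """Step-by-step RMSNorm: scale x by 1/sqrt(mean(x^2) + eps), then by weight."""
    mean_sq = 0.0
    for v in x:
        mean_sq += v * v
    mean_sq /= len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    out = []
    for v, w in zip(x, weight):
        out.append(v * scale * w)
    return out


def rmsnorm_compact(x, weight, eps=1e-5):
    """The same computation rewritten compactly (the 'adaptation' under test)."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]


def outputs_match(a, b, tol=1e-9):
    """True if both outputs have equal length and agree element-wise within tol."""
    return len(a) == len(b) and all(
        math.isclose(p, q, rel_tol=tol, abs_tol=tol) for p, q in zip(a, b)
    )


# Hypothetical sample input and per-element weights.
x = [0.5, -1.25, 2.0, 0.0]
weight = [1.0, 1.0, 0.5, 2.0]

ref = rmsnorm_reference(x, weight)
new = rmsnorm_compact(x, weight)
print(outputs_match(ref, new))
```

A tolerance is used here because the two versions order their floating-point operations differently. Comparing generated text against a reference, as the excerpt describes, is a coarser end-to-end form of the same idea.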