karpathy/nano-llama31

nanoGPT-style version of Llama 3.1.

This turns out to be non-trivial because Meta's official repo does not seem to include documentation or instructions on how to actually use the models once you download them. You'll see that this prints the same result as the reference code above, giving us confidence that this single file of PyTorch is a bug-free adaptation. It currently requires quite a bit of VRAM, e.g. training only the RMSNorm parameters still takes up a good chunk of my 80GB GPU.
