Show HN: NanoEuler – GPT-2 scale model in pure C/CUDA from scratch
GPT-2-style LLM built from scratch in C/CUDA with hand-written backprop, a BPE tokenizer, FlashAttention, pretraining, and SFT. Repository: JustVugg/nanoeuler