GateGPT: 56k tokens per second Transformer (KV cache) on FPGA at 80 MHz

56,000+ tokens/sec at just 80 MHz. 🤯 I burned a full Transformer with KV cache into a custom chip. Designed gate by gate as a 100% digital integrated circuit. Prototyped on an FPGA. No GPU. No CPU. Just pure digital silicon running @karpathy's microGPT, spelling out names on a https://t.co/hXX8kKIxiA
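
To put the two headline figures in relation, here is a rough back-of-the-envelope sketch of the average clock-cycle budget per token. It uses only the 80 MHz clock and the 56,000 tokens/sec lower bound quoted in the post. The function name `cycles_per_token` is made up for illustration, and the result says nothing about how the actual design spends those cycles.

```python
def cycles_per_token(clock_hz: float, tokens_per_sec: float) -> float:
    """Average number of clock cycles elapsed per generated token."""
    return clock_hz / tokens_per_sec


CLOCK_HZ = 80e6          # 80 MHz clock, as stated in the post
TOKENS_PER_SEC = 56_000  # lower bound from the post ("56,000+")

# Because 56,000 is a lower bound on throughput, this is an upper
# bound on the average cycles spent per token.
budget = cycles_per_token(CLOCK_HZ, TOKENS_PER_SEC)
print(f"at most about {budget:.0f} clock cycles per token")
```

In other words, the quoted numbers imply that each token takes on the order of 1,400 clock cycles on average, assuming the throughput is sustained at the stated clock.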
