Data movement bottlenecks to large-scale model training: Scaling past 1e28 FLOP
Data movement bottlenecks limit LLM training scaling beyond roughly 2e28 FLOP under current hardware and interconnects, with a hard "latency wall" at about 2e31 FLOP. At current rates of frontier compute growth, the first limit could be reached in roughly three years. Aggressive scaling of batch sizes could potentially push these limits further out.
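A quick back-of-envelope sketch of where the "~3 years" figure comes from: if frontier training compute keeps growing geometrically, the time to reach a threshold is log(threshold / current) / log(growth rate). The inputs below are illustrative assumptions, not figures from the article: a current largest training run of ~1e26 FLOP and ~4x/year growth in frontier training compute.

```python
import math

# Assumed inputs (illustrative, not from the article):
current_flop = 1e26      # rough scale of today's largest training runs
growth_per_year = 4.0    # assumed frontier compute growth factor per year

# Thresholds from the article's summary
limits = {
    "data movement bottleneck": 2e28,
    "latency wall": 2e31,
}

for label, limit in limits.items():
    # Years until compute grows from current_flop to limit at the assumed rate
    years = math.log(limit / current_flop) / math.log(growth_per_year)
    print(f"{label} (~{limit:.0e} FLOP): ~{years:.1f} years away")
```

With these assumed inputs the bottleneck lands roughly 3-4 years out, in line with the summary's estimate; the result is sensitive to both the assumed current frontier run size and the growth rate.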