Zen5's AVX512 Teardown and More


By Alexander J. Yee (@Mysticial) (Last updated: August 7, 2024)

This article was supposed to be published all at once on July 30th.

In fact, the AVX512 improvement on Zen5 created a memory bottleneck so large that it became the primary reason why I promoted the BBP mini-program from a tool for verifying Pi records to a formal benchmark.

Rather than stalling the execution for the ~50,000 cycles needed to do this transition, Intel CPUs will break up the wider instructions and "multi-pump" them into the hardware that is already powered on and ready (and safe) to use at the current clock speed.

Nevertheless, it does become a problem in heavily optimized code that saturates the 4 x 512-bit EUs, as memory accesses will easily push the port requirements above 10 and cause pipeline bubbles.
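As a rough picture of the "multi-pump" idea mentioned above, the toy model below runs a 512-bit lane-wise add through a narrower 256-bit unit in two passes and checks that the result matches a single full-width pass. This is only an illustration under stated assumptions: the 32-bit lane size, the function names, and the pass counter are all hypothetical, and nothing here reflects how Intel hardware actually schedules or splits these operations.

```python
LANE_BITS = 32                  # assumed lane width for this toy model
MASK = (1 << LANE_BITS) - 1     # wrap results to 32 bits, like a packed integer add


def add_lanes(a, b):
    """Lane-wise wrapping add over two equal-length lists of lanes."""
    return [(x + y) & MASK for x, y in zip(a, b)]


def multi_pumped_add(a, b, unit_lanes=8):
    """Push a wide add through a narrower unit, one chunk of lanes per pass.

    unit_lanes=8 models a 256-bit unit (8 x 32-bit lanes), which is a
    hypothetical stand-in for the hardware that is already powered on.
    """
    result = []
    passes = 0
    for start in range(0, len(a), unit_lanes):
        chunk_a = a[start:start + unit_lanes]
        chunk_b = b[start:start + unit_lanes]
        result.extend(add_lanes(chunk_a, chunk_b))
        passes += 1
    return result, passes


# A 512-bit vector is 16 lanes of 32 bits in this model.
a = list(range(16))
b = [MASK] * 16

wide = add_lanes(a, b)                   # one full-width pass
pumped, passes = multi_pumped_add(a, b)  # two half-width passes
```

The point of the sketch is only that the result is the same while the number of passes through the unit doubles, which is the trade being made to avoid stalling execution.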
