Execution units are often pipelined
In the context of out-of-order microarchitectures, I was under the impression that execution units remain occupied until the µop they’re processing is complete. This is often not the case.
With my original understanding of how execution units work, a sequence of independent instructions like […]

cycle  EU 1  EU 2  completed
0      [a]   [ ]
1      [a]   [ ]
2      [a]   [ ]
3      [b]   [ ]   a
4      [b]   [ ]   a
5      [b]   [ ]   a
6      [c]   [ ]   a, b
7      [c]   [ ]   a, b
8      [c]   [ ]   a, b
9      [d]   [ ]   a, b, c
10     [d]   [ ]   a, b, c
11     [d]   [ ]   a, b, c
12     [ ]   [ ]   a, b, c, d

Knowing this, I finally get why instruction latency and bandwidth tables specify reciprocal throughput: it’s equivalent to cycles per instruction!
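To make the difference concrete, here is a minimal sketch that computes when each µop finishes on a single execution unit, with and without pipelining. It is an illustration rather than a model of any real core: the function name `completion_cycles`, the 3-cycle latency (taken from the schedule above), and the assumption that a fully pipelined unit accepts one new µop every cycle are all hypothetical choices for this example.

```python
def completion_cycles(n_uops, latency, pipelined):
    """Return the cycle at which each of n independent µops completes on one EU.

    Assumption: a pipelined unit accepts a new µop every cycle
    (reciprocal throughput of 1), while an unpipelined unit stays
    busy for the full latency of each µop.
    """
    # Cycles between successive issues, i.e. the reciprocal throughput.
    issue_interval = 1 if pipelined else latency
    # µop i issues at i * issue_interval and completes `latency` cycles later.
    return [i * issue_interval + latency for i in range(n_uops)]


# Four independent µops (a, b, c, d), each with a latency of 3 cycles.
unpipelined = completion_cycles(4, 3, pipelined=False)
pipelined = completion_cycles(4, 3, pipelined=True)

print("unpipelined:", unpipelined)  # [3, 6, 9, 12]
print("pipelined:  ", pipelined)    # [3, 4, 5, 6]
```

The unpipelined result matches the schedule above, where d completes at cycle 12. In the pipelined case the latency of each µop is still 3 cycles, but in steady state one µop completes per cycle, which is the cycles-per-instruction figure that reciprocal throughput describes.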