
Counting bytes faster than you'd think possible


Summing ASCII Encoded Integers on Haswell at the Speed of memcpy turned out more popular than I expected, which inspired me to take on another challenge on HighLoad: Counting uint8s. I'm currently only #13 on the leaderboard, ~7% behind #1, but I have already learned some interesting things.

I was reading through the typo-ridden Intel Optimization Manual looking for anything memory related when, on page 788, I encountered a description of the 4 hardware prefetchers.

[Figure: sequential access vs. interleaved access (8 pages)]

The interleaved variant improves the score on HighLoad by some 15%. If your kernel is even more memory bound, for example one that just uses vpaddb to sum the bytes modulo 256, you can get up to a 30% gain with it.
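
To make the sequential versus interleaved distinction concrete, here is a minimal scalar sketch in C. It is not the author's solution, which presumably uses AVX2. The function names, the 4 KiB page size, the 64-byte step, and the choice of 127 as the counted value in the usage note are all assumptions made for illustration. The sketch only shows the access order: one cache line from each of 8 consecutive pages in turn, presumably so that several hardware prefetch streams stay active at once. Whether this is actually faster depends on the CPU and on how the compiler vectorizes the inner loop.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u /* assumed 4 KiB pages */
#define STREAMS   8u    /* pages walked in lockstep ("8 pages") */
#define CHUNK     64u   /* assumed step: one cache line per page visit */

/* Baseline: a single forward pass over the buffer. */
uint64_t count_sequential(const uint8_t *buf, size_t len, uint8_t target) {
    uint64_t count = 0;
    for (size_t i = 0; i < len; i++)
        count += (buf[i] == target);
    return count;
}

/* Interleaved: treat the buffer as blocks of STREAMS pages. Within a
 * block, read one cache line from each page before advancing the
 * in-page offset, so STREAMS pages are traversed at the same time. */
uint64_t count_interleaved(const uint8_t *buf, size_t len, uint8_t target) {
    const size_t block = (size_t)PAGE_SIZE * STREAMS;
    const size_t full = len - len % block; /* bytes covered by whole blocks */
    uint64_t count = 0;

    for (size_t base = 0; base < full; base += block) {
        for (size_t off = 0; off < PAGE_SIZE; off += CHUNK) {
            for (size_t p = 0; p < STREAMS; p++) {
                const uint8_t *line = buf + base + p * PAGE_SIZE + off;
                for (size_t i = 0; i < CHUNK; i++)
                    count += (line[i] == target);
            }
        }
    }
    /* Leftover bytes that do not fill a whole block are read sequentially. */
    return count + count_sequential(buf + full, len - full, target);
}
```

Both functions return the same count for any input; for example, `count_interleaved(buf, len, 127)` and `count_sequential(buf, len, 127)` differ only in the order in which memory is touched.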
