Summing ASCII encoded integers on Haswell at almost the speed of memcpy


“Print the sum of 50 million ASCII-encoded integers uniformly sampled from [0, 2³¹−1], separated by a single new line and sent to standard input.” On the surface, a trivial problem. But what if you wanted to go as fast as possible? I’m currently one of the top-ranked competitors in exactly that kind of challenge, and in this post I’ll show you a sketch of my best-performing solution.

I’ll leave out some of the µoptimizations and the look-up table generation to keep this post short and easier to understand, and to avoid completely obliterating the HighLoad leaderboard. We’ll iterate over 32-byte chunks of the input using SIMD, from back to front, keeping track of the sum of the digits in each decimal place.
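
To make the back-to-front, per-decimal-place bookkeeping concrete, here is a minimal scalar sketch in Python. It is not the author's SIMD solution: it handles one byte at a time rather than 32-byte chunks, and the function name `sum_ascii_integers` and the `place_sums` array are hypothetical names chosen for illustration. It assumes the input contains only ASCII digits and newline separators.

```python
def sum_ascii_integers(data: bytes) -> int:
    """Sum newline-separated decimal integers by scanning back to front."""
    # Values up to 2**31 - 1 have at most 10 decimal digits.
    place_sums = [0] * 10
    place = 0  # decimal place of the next digit (0 = ones, 1 = tens, ...)
    for byte in reversed(data):
        if byte == 0x0A:
            # Newline: the next digit we meet is the ones digit of a new number.
            place = 0
        else:
            # ASCII '0' is 0x30, so subtracting it yields the digit value.
            place_sums[place] += byte - 0x30
            place += 1
    # Apply the powers of ten once at the end instead of once per number.
    return sum(s * 10**i for i, s in enumerate(place_sums))


print(sum_ascii_integers(b"12\n345\n6\n"))  # 363
```

Scanning from the back means a newline always marks the ones place of the next number, so each digit's decimal place is known without first finding where the number starts. Deferring the multiplications to the end leaves only additions in the hot loop, which is the kind of work that maps naturally onto wide SIMD lanes.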
