tolower() with AVX-512
A couple of years ago I wrote about tolower() in bulk at speed using SWAR tricks. A couple of days ago my interest was caught by Olivier Giniaux’s article about "unsafe read beyond of death", an optimization for handling small strings with SIMD instructions, used in a fast hash function written in Rust.
Reading more around the topic, I learned that some SIMD instruction sets do, in fact, have masked loads and stores that are suitable for string processing: that is, they work at byte granularity.
Top tip: you can use * as a wildcard in the search box, so I made heavy use of mm512*epi8 to find byte-wise AVX-512 functions (epi8 is an obscure alias for byte).
The tolower64() kernel in the previous section needs to be wrapped up in more convenient functions, such as one that copies a string while converting it to lower case.
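To make the idea concrete, here is a minimal sketch of such a wrapper. It is not the article's own code. The body of tolower64() is an assumed reconstruction of the kernel, and copy_tolower(), copy_tolower_avx512(), copy_tolower_scalar() and the HAVE_AVX512_SKETCH macro are hypothetical names introduced for illustration. The runtime check and the scalar fallback are additions so that the sketch compiles with GCC or Clang without extra flags and still runs on machines that lack AVX-512BW.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define HAVE_AVX512_SKETCH 1 /* hypothetical macro, used only in this sketch */

/* Assumed shape of the kernel: lower-case 64 bytes at once by adding
 * 'a' - 'A' to exactly those lanes that hold 'A'..'Z'. */
__attribute__((target("avx512bw")))
static inline __m512i tolower64(__m512i c) {
    __m512i A = _mm512_set1_epi8('A');
    __m512i Z = _mm512_set1_epi8('Z');
    __m512i delta = _mm512_set1_epi8('a' - 'A');
    /* The compares are signed, so bytes >= 0x80 are negative and never match. */
    __mmask64 is_upper = _mm512_cmpge_epi8_mask(c, A) & _mm512_cmple_epi8_mask(c, Z);
    return _mm512_mask_add_epi8(c, is_upper, c, delta);
}

__attribute__((target("avx512bw")))
static void copy_tolower_avx512(uint8_t *dst, const uint8_t *src, size_t len) {
    /* Whole 64-byte blocks use ordinary unaligned loads and stores. */
    while (len >= 64) {
        __m512i v = _mm512_loadu_si512((const void *)src);
        _mm512_storeu_si512((void *)dst, tolower64(v));
        dst += 64;
        src += 64;
        len -= 64;
    }
    if (len > 0) {
        /* Set the low `len` bits; len is 1..63 here, so the shift is defined. */
        __mmask64 tail = ~0ULL >> (64 - len);
        /* Byte-granular masked load and store: only the selected bytes are
         * read and written, so nothing past the end of either buffer is touched. */
        __m512i v = _mm512_maskz_loadu_epi8(tail, src);
        _mm512_mask_storeu_epi8(dst, tail, tolower64(v));
    }
}
#endif

/* Portable fallback with the same ASCII-only behaviour. */
static void copy_tolower_scalar(uint8_t *dst, const uint8_t *src, size_t len) {
    for (size_t i = 0; i < len; i++) {
        uint8_t c = src[i];
        dst[i] = (c >= 'A' && c <= 'Z') ? (uint8_t)(c + ('a' - 'A')) : c;
    }
}

/* Copy len bytes from src to dst, converting ASCII upper case to lower case. */
void copy_tolower(uint8_t *dst, const uint8_t *src, size_t len) {
#ifdef HAVE_AVX512_SKETCH
    if (__builtin_cpu_supports("avx512bw")) {
        copy_tolower_avx512(dst, src, len);
        return;
    }
#endif
    copy_tolower_scalar(dst, src, len);
}
```

The interesting part is the tail: instead of over-reading and hoping the extra bytes are mapped, the mask tells the load and the store exactly which of the 64 lanes exist, so a string of any length from 1 to 63 bytes takes a single load, a single kernel call, and a single store.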