Vector math library codegen in Debug
This post is about what happens when your C++ code has a “vector math library”, and how the choices of code style in there affect the performance of non-optimized builds.

Backstory

A month ago I went down the rabbit hole of trying to “sanitize” the various ways that images can be resized within the Blender codebase.
It is just number math, with very clear “four lanes” being operated on (maps perfectly to SSE or NEON registers), no complex cross-lane shuffles, packing or any of that stuff. Of course, as Matt Pharr writes in the excellent ISPC blog series, “Auto-vectorization is not a programming model” (post) (original quote by Theresa Foley). However, in the Debug build configuration, SIMD intrinsics incur a heavy performance cost, i.e. the code is way slower than the same math written in pure C scalar style.
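To make the two styles concrete, here is a minimal sketch of the same four-lane multiply-add written both ways. The names (Float4, madd_scalar, madd_simd, SKETCH_HAS_SSE) are hypothetical and are not taken from Blender's math library. The SSE path uses only standard intrinsics from xmmintrin.h and falls back to the scalar version on other targets.

```cpp
// Detect SSE support; on other targets the "SIMD" variant falls back to scalar.
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAS_SSE 1
#else
#define SKETCH_HAS_SSE 0
#endif

// Hypothetical four-lane value type (an assumption for illustration only).
struct Float4 {
    float v[4];
};

// Scalar style: plain per-lane arithmetic, r = a * b + c.
inline Float4 madd_scalar(const Float4 &a, const Float4 &b, const Float4 &c) {
    Float4 r;
    for (int i = 0; i < 4; ++i) {
        r.v[i] = a.v[i] * b.v[i] + c.v[i];
    }
    return r;
}

// Intrinsics style: the same math expressed through SSE load/mul/add/store.
inline Float4 madd_simd(const Float4 &a, const Float4 &b, const Float4 &c) {
#if SKETCH_HAS_SSE
    __m128 va = _mm_loadu_ps(a.v);  // unaligned loads, since Float4 has no alignment guarantee
    __m128 vb = _mm_loadu_ps(b.v);
    __m128 vc = _mm_loadu_ps(c.v);
    __m128 vr = _mm_add_ps(_mm_mul_ps(va, vb), vc);
    Float4 r;
    _mm_storeu_ps(r.v, vr);
    return r;
#else
    return madd_scalar(a, b, c);
#endif
}
```

Both functions compute the same result. The difference the post is concerned with is how each one compiles without optimizations, where the compiler generally keeps every named temporary in memory and may leave small wrapper functions as real calls. Comparing the disassembly of the two at -O0 (or /Od) is one way to see that for yourself.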