Why those particular integer multiplies?

The x86 instruction set has a somewhat peculiar set of SIMD integer multiply operations, and Intel’s particular implementation of several of these operations in their headline core designs ha…

In short, not only does PMADDWD let us make full use of both 32-bit results that we had already computed anyway, it also doesn’t touch the first 90% of the datapath at all, and it can be made to share plenty of logic with the regular path for the final 10% too, if desired.

The headline item for SSE was SIMD floating-point operations (not my subject today), but it also patched a hole in the original MMX design by adding PMULHUW (packed multiply high unsigned word).

For b), I have no idea whether this is the case or not; it’s just funny to me that AltiVec had these integer dot product instructions from the start, while x86 took forever to add them (after people had spent something like a decade using PMADDUBSW with a follow-up PMADDWD by an all-1’s vector, literally just to sum the pairs of words in a 32-bit lane together).
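To make the PMADDUBSW-then-PMADDWD trick concrete, here is a minimal scalar sketch in C of what one 32-bit lane computes. It is a model of the instruction semantics, not SIMD code; the function names (`pmaddwd_lane`, `pmaddubsw_lane`, `dot4_u8s8`) are made up for illustration and do not come from the article. In real code the two steps would correspond to the `_mm_maddubs_epi16` and `_mm_madd_epi16` intrinsics.

```c
#include <stdint.h>

/* Scalar model of one 32-bit lane of PMADDWD: two signed 16x16->32
   products are added together. The sum is formed in unsigned arithmetic
   so that the single overflowing case (all four inputs equal to -32768)
   wraps to INT32_MIN, as the instruction does, instead of being
   undefined behavior. The final conversion assumes the usual
   two's-complement behavior of mainstream compilers. */
static int32_t pmaddwd_lane(int16_t a0, int16_t a1, int16_t b0, int16_t b1)
{
    uint32_t p0 = (uint32_t)((int32_t)a0 * b0);
    uint32_t p1 = (uint32_t)((int32_t)a1 * b1);
    return (int32_t)(p0 + p1);
}

/* Scalar model of one 16-bit lane of PMADDUBSW: unsigned bytes from the
   first operand times signed bytes from the second, with adjacent
   products summed using signed saturation to 16 bits. */
static int16_t pmaddubsw_lane(uint8_t a0, uint8_t a1, int8_t b0, int8_t b1)
{
    int32_t s = (int32_t)a0 * b0 + (int32_t)a1 * b1;
    if (s > INT16_MAX) s = INT16_MAX;
    if (s < INT16_MIN) s = INT16_MIN;
    return (int16_t)s;
}

/* The two-step idiom: PMADDUBSW reduces four byte products to two
   16-bit sums, then PMADDWD against words that are all 1 adds that
   pair of words into the 32-bit lane. */
static int32_t dot4_u8s8(const uint8_t a[4], const int8_t b[4])
{
    int16_t lo = pmaddubsw_lane(a[0], a[1], b[0], b[1]);
    int16_t hi = pmaddubsw_lane(a[2], a[3], b[2], b[3]);
    return pmaddwd_lane(lo, hi, 1, 1);
}
```

Note that in this model the multiply by 1 in the second step contributes nothing beyond the pairwise add, which is the point of the remark above. The intermediate 16-bit saturation in the first step is still there, so large inputs can give a result that differs from an exact 4-element dot product.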
