An SVE backend for astcenc (Adaptive Scalable Texture Compression Encoder)
Recent Arm CPUs have provided a new SIMD instruction set, the Arm Scalable Vector Extension (SVE). SVE makes the ISA independent of vector length, allowing CPUs to provide different performance points without having to invent a new ISA each time.
The hardest part was getting a new enough compiler (Clang 17) to pick up a pre-packaged version of the NEON-SVE bridge header, which allows conversion between NEON and SVE data types. To benefit from SVE, the wrapper API abstraction needed raising to expose the higher-level concept of a narrowing store, emulated with a multi-instruction sequence on the older SIMD implementations that can’t do it natively. SVE also offers widening loads (svld1ub_s32()), which read narrow values from contiguous memory into the bottom N bits of each register lane, without needing manual post-load expansion.
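To make the two operations concrete, here is a minimal portable sketch of their semantics in scalar C++. It is not astcenc's wrapper API: the names `Lanes`, `kLanes`, `store_narrow_u8`, and `load_widen_u8` are hypothetical, and the fixed width of four lanes is an assumption made only for illustration, since SVE itself is vector-length agnostic. On SVE hardware each loop would correspond to a single intrinsic (`svst1b_s32()` for the store, `svld1ub_s32()` for the load).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed lane count for illustration only; real SVE code
// queries the vector length at run time instead of hard-coding it.
constexpr std::size_t kLanes = 4;

// Hypothetical stand-in for a vector of 32-bit integer lanes.
using Lanes = std::array<std::int32_t, kLanes>;

// Narrowing store: write only the low 8 bits of each 32-bit lane to
// contiguous bytes. SVE can do this in one instruction (svst1b_s32());
// older SIMD sets must pack the lanes first and then store.
void store_narrow_u8(const Lanes& v, std::uint8_t* dst) {
    for (std::size_t i = 0; i < kLanes; ++i) {
        dst[i] = static_cast<std::uint8_t>(v[i] & 0xFF);
    }
}

// Widening load: read contiguous bytes and zero-extend each one into a
// 32-bit lane. SVE can do this in one instruction (svld1ub_s32()),
// with no separate expansion step after the load.
Lanes load_widen_u8(const std::uint8_t* src) {
    Lanes v{};
    for (std::size_t i = 0; i < kLanes; ++i) {
        v[i] = static_cast<std::int32_t>(src[i]);
    }
    return v;
}
```

Exposing the operation at this level lets each backend choose its own implementation: a single instruction where the hardware supports it, and an emulation where it does not.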