neonsunset 2 days ago

.NET's compiler does not perform loop autovectorization, as it has not been as profitable an investment of compiler throughput as other optimizations. That said, it does apply many smaller optimizations that use SIMD operations, such as unrolling string and span comparisons, copies, moving large structs, and zeroing, and it also optimizes the SIMD operations themselves à la LLVM.

.NET does, however, offer a best-in-class portable SIMD API and a large API surface of platform intrinsics, both of which are heavily used by CoreLib and many performance-oriented libraries. You can usually port intrinsified implementations hand-written in C++ to C# while making the code more readable and portable without losing any performance (though sometimes you have to make different choices to keep the compiler happy).
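To give a rough idea of what the portable SIMD API looks like, here is a minimal sketch that sums a span of floats with `System.Numerics.Vector<T>`. The `SimdSketch` class and its `Sum` method are hypothetical names made up for illustration, not anything from CoreLib, and the sketch assumes .NET 6 or later (for `Vector.Sum`).

```csharp
using System;
using System.Numerics;

// Hypothetical helper for illustration only; not part of CoreLib.
static class SimdSketch
{
    public static float Sum(ReadOnlySpan<float> values)
    {
        // One accumulator holding Vector<float>.Count lanes.
        var acc = Vector<float>.Zero;
        int width = Vector<float>.Count;
        int i = 0;

        // Main loop: add one full vector of elements per iteration.
        for (; i <= values.Length - width; i += width)
        {
            acc += new Vector<float>(values.Slice(i, width));
        }

        // Horizontal add of all lanes (Vector.Sum assumes .NET 6+).
        float total = Vector.Sum(acc);

        // Scalar tail for the elements that did not fill a whole vector.
        for (; i < values.Length; i++)
        {
            total += values[i];
        }

        return total;
    }
}
```

The same source adapts to the vector width the runtime reports for the hardware, for example 128-bit on Arm64 with NEON or 256-bit on x64 with AVX2. Note that the lanes are summed in a different order than a plain scalar loop, so floating-point results can differ slightly in the last bits.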

https://github.com/dotnet/runtime/blob/main/docs/coding-guid...

physicsguy 2 days ago

Oh, that's surprising, I thought RyuJIT could do it with certain types!

neonsunset 2 days ago

If you're interested, here's the overview of planned compiler work for .NET 10: https://github.com/dotnet/runtime/issues/108988

Autovectorization is usually very fragile, and in the areas where you care about it, a hand-written implementation always provides much better results that will not randomly break on minor changes to the compiler version or the code, which is something you otherwise have to guard against carefully.

It would still be nice to have it eventually, and I was told that the JIT team actively discusses this, but there is simply much more low-hanging fruit that will light up in disproportionately more instances of user code.

If it's any consolation, Clang/LLVM is not a silver bullet either and you will find situations where .NET's compiler output is competitive or even better: https://godbolt.org/z/3aKnePaez