rbanffy a day ago

Ideally, a compiler would recognize the code can be turned into SIMD instructions and just issue those, but C doesn't have syntax for making it trivial. If C had a portable syntax for at least some vector operations, compilers could readily generate code with those instructions and we'd all be a lot happier.
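
As a minimal sketch of the kind of loop compilers can often turn into SIMD on their own today, assuming contiguous arrays that do not overlap (the name `saxpy` and its signature are illustrative here, not taken from any particular library):

```c
#include <stddef.h>

/* Hypothetical helper: computes y[i] = a * x[i] + y[i] for i in [0, n).
 * `restrict` promises the compiler that x and y do not alias, which
 * removes one common obstacle to auto-vectorization. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Whether this actually becomes vector instructions depends on the compiler, target, and flags (for example, higher optimization levels such as `-O3` in GCC or Clang), so it is worth checking the generated assembly rather than assuming.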

  • gary_0 19 hours ago

    Unfortunately a C compiler isn't going to know what to do with your probably-unaligned pointer to 5 floats that you're doing mostly horizontal arithmetic on. And SIMD instructions provide zero benefit if the data is spread out in memory, as is the case for most C/C++ application code.

    Otherwise, if you want to smack proper vectors and matrices together at high speed, libraries like Eigen or DXMath already abstract away the SIMD details and work great. For nitty-gritty stuff like codecs, that's always going to be handwritten with intrinsics (or ASM), and that's fine. And libc functions like memcpy already use the fastest, fanciest instructions. It's mostly a solved problem.

    Lastly, for a lot of tasks, regular math instructions are plenty fast. On modern CPUs, you need to be doing a lot of math before worrying about SIMD is worth it. And once your program becomes particularly math-heavy, you'll probably want to use the GPU instead anyway.

    • rbanffy 13 hours ago

      I completely agree - libraries are the way to deal with the problem, not only for C, but for any language that lacks syntax for array and matrix operations. Intrinsics are not a great solution because they aren't portable, even when the exact function is the same across different ISAs. GPUs are a different ball game entirely, and reality gets messy, especially if your code intends to be portable across GPU architectures.

zdw a day ago

That header image is some truly cursed AI abomination.