upvote
Almost literally what I stated. Consider a row in Postgres table or similar. Convert the entire WHERE clause across all columns in that table into a very short sequence of SIMD instructions against the same memory. All of the columns, regardless of type, are evaluated simultaneously using SIMD. For many complex constraints you can match rows in single digit clock cycles even across many unrelated types. This is much faster than using secondary indexes in many cases.

It isn’t hypothetical, I’ve shipped systems that worked this way. You can match search patterns across a random dozen columns across a schema of hundreds of columns at essentially full memory bandwidth.

reply
OK, I thought it couldn't be that, because that should be doable with std::simd or a SIMD abstraction. Well, unless you JIT it, in which case intrinsics wouldn't help either.

> You can match search patterns across a random dozen columns across a schema of hundreds of columns at essentially full memory bandwidth

Do I underatand it correctly, that this would only work, if you have multiple of the same comparisons (e.g. equality check with same sized data) in the WHERE clause and the relevant collumns are within one multiple of the SIMD width of each other?

reply
Every column has its own independent constraint: equality, order, range intersection, bit sets, etc that is evaluated concurrently in single operations. Independent per column in parallel. It does require handling the representation of columns to enable it but that isn’t onerous in practice.

It isn’t intuitive but it is one of those things that is obvious in hindsight once you see how it works. The gap is that people struggle to understand how to make this something SIMD native, especially in high-performance systems.

reply