Includes:
- AVX-512 SIMD path + scalar fallback
- Wait-free lookups with rebuild-and-swap dynamic FIB
- Benchmarks on synthetic data and real RIPE RIS BGP data (~254K prefixes)
Interesting result: on real BGP + uniform random lookups, a plain Patricia trie can sometimes match or beat the SIMD tree due to cache locality and early exits.
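To make the early-exit point concrete, here is a minimal sketch of a unibit trie lookup for IPv4 (path compression omitted, so not a true Patricia trie; `Node`, `insert`, and `lookup` are illustrative names, not the project's API). The walk ends as soon as a branch is missing, which on sparse real-world tables can happen after touching only a few cache lines:

```cpp
#include <cstdint>
#include <memory>
#include <optional>

// One node per prefix bit; next_hop is set where a prefix ends.
struct Node {
    std::optional<int> next_hop;
    std::unique_ptr<Node> child[2];
};

// Insert a prefix of `len` bits (MSB-first) mapping to `next_hop`.
void insert(Node* root, std::uint32_t prefix, int len, int next_hop) {
    Node* n = root;
    for (int i = 0; i < len; ++i) {
        int bit = (prefix >> (31 - i)) & 1;
        if (!n->child[bit]) n->child[bit] = std::make_unique<Node>();
        n = n->child[bit].get();
    }
    n->next_hop = next_hop;
}

// Longest-prefix match: remember the deepest next_hop seen, and stop
// early the moment the required child pointer is null.
std::optional<int> lookup(const Node* root, std::uint32_t addr) {
    std::optional<int> best;
    const Node* n = root;
    for (int i = 0; i < 32 && n; ++i) {
        if (n->next_hop) best = n->next_hop;
        n = n->child[(addr >> (31 - i)) & 1].get();
    }
    if (n && n->next_hop) best = n->next_hop;
    return best;
}
```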
Would love feedback, especially comparisons with PopTrie / CP-Trie.
[0] https://github.com/esutcu/planb-lpm/blob/748d19d5fbd945cefa3...
[1] https://github.com/esutcu/planb-lpm/blob/748d19d5fbd945cefa3...
__m512i vx = _mm512_set1_epi64(static_cast<long long>(x));               // broadcast the query key
__m512i vk = _mm512_load_si512(reinterpret_cast<const __m512i*>(base));  // load 8 node keys (64B-aligned)
__mmask8 m = _mm512_cmp_epu64_mask(vx, vk, _MM_CMPINT_GE);               // per-lane x >= key[i]
return static_cast<std::uint32_t>(__builtin_popcount(m));                // rank = number of keys <= x
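For readers without AVX-512 handy, a scalar version of the same rank step would look like this (hypothetical name `rank8_scalar`; assumes the node holds 8 sorted 64-bit keys, as the snippet above implies):

```cpp
#include <cstdint>

// Scalar equivalent of the AVX-512 snippet above: count how many of the
// node's 8 keys satisfy x >= key[i], i.e. the child slot to descend into.
static inline std::uint32_t rank8_scalar(const std::uint64_t* base, std::uint64_t x) {
    std::uint32_t cnt = 0;
    for (int i = 0; i < 8; ++i)
        cnt += static_cast<std::uint32_t>(x >= base[i]);
    return cnt;
}
```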
would be replaced with:

return __riscv_vcpop(__riscv_vmsgeu(__riscv_vle64_v_u64m1(base, FANOUT), x, FANOUT), FANOUT);

and you set FANOUT to __riscv_vsetvlmax_e64m1() at runtime (e64, since the keys are 64-bit). Alternatively, if you don't want a dynamic FANOUT, you keep FANOUT=8 (or another constant) and do a strip-mining loop:
size_t cnt = 0;
for (size_t vl, n = 8; n > 0; n -= vl, base += vl) {
vl = __riscv_vsetvl_e64m1(n);
cnt += __riscv_vcpop(__riscv_vmsgeu(__riscv_vle64_v_u64m1(base, vl), x, vl), vl);
}
return cnt;
This takes FANOUT/VLMAX iterations, and the branches will be almost perfectly predicted. If you know FANOUT is always 8 and you'll never want to change it, you could alternatively select the optimal LMUL:
size_t vl = __riscv_vsetvlmax_e64m1();
if (vl == 2) return __riscv_vcpop(__riscv_vmsgeu(__riscv_vle64_v_u64m4(base, 8), x, 8), 8);
if (vl == 4) return __riscv_vcpop(__riscv_vmsgeu(__riscv_vle64_v_u64m2(base, 8), x, 8), 8);
return __riscv_vcpop(__riscv_vmsgeu(__riscv_vle64_v_u64m1(base, 8), x, 8), 8);

It does look a bit AI generated though.
That goes for a lot of the comments here accusing each other of being bots.

I feel like we've seen internet trust at its peak, and it can only go one way from here.
These days, when I hear a project owner/manager describe the project as a "clean room reimplementation", I expect that they got an LLM [0] to extrude it. This expectation will not always be correct, but it'll be correct more likely than not.
[0] ...whose "training" data almost certainly contains at least one implementation of whatever it is that it's being instructed to extrude...