This is not new math: it's a memory-layout optimization of a classic segmented wheel sieve (wheel modulus H=30).
The speedup comes from packing the 8 residues coprime to 30 into one byte, so each byte covers a block of 30 integers. That reduces memory traffic and cache misses on ARM64 (tested on Apple M1).
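To make the layout concrete, here is a minimal Python sketch of the general mod-30 packing idea, not the code from this project. It is unsegmented and unoptimized, and the names `RESIDUES`, `BIT_OF`, `candidates`, and `sieve_mod30` are my own, chosen for illustration. Bit i of byte k stands for the integer 30*k + RESIDUES[i].

```python
# Residues modulo 30 that are coprime to 30; only these can hold primes > 5.
RESIDUES = (1, 7, 11, 13, 17, 19, 23, 29)
BIT_OF = {r: i for i, r in enumerate(RESIDUES)}  # residue -> bit position


def candidates(nbytes):
    """Yield (byte index, bit index, integer) for every packed candidate."""
    for k in range(nbytes):
        for i, r in enumerate(RESIDUES):
            yield k, i, 30 * k + r


def sieve_mod30(limit):
    """Return all primes <= limit, storing one byte per block of 30 integers."""
    nbytes = limit // 30 + 1
    flags = bytearray(b"\xff" * nbytes)  # every candidate starts as "maybe prime"
    flags[0] &= 0xFE                     # clear bit 0 of byte 0: 1 is not prime

    for k, i, p in candidates(nbytes):
        if p * p > limit:
            break
        if (flags[k] >> i) & 1:
            # Cross off odd multiples of p, starting at p*p.
            # Multiples divisible by 3 or 5 have no bit in the layout; skip them.
            for m in range(p * p, limit + 1, 2 * p):
                bit = BIT_OF.get(m % 30)
                if bit is not None:
                    flags[m // 30] &= ~(1 << bit) & 0xFF

    primes = [q for q in (2, 3, 5) if q <= limit]
    primes += [n for k, i, n in candidates(nbytes)
               if n <= limit and (flags[k] >> i) & 1]
    return primes
```

With this layout, sieving up to 10^6 takes about 33 KB of flags instead of the 125 KB a one-bit-per-integer bitmap would need, a ratio of 30/8 = 3.75. I am assuming that is the baseline behind the 3.75x figure in the title.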
Benchmarks compare against a standard wheel sieve.
I’m a graphic designer by background; code was written with AI help. Feedback welcome.
Show HN: ARM64-optimized prime sieve with 3.75x memory compression