The Problem #
Standard timer implementations (like BinaryHeap) suffer from O(log N) insertion and cancellation costs. In high-throughput networking (C10M), cancelling millions of timers creates a CPU bottleneck due to pointer chasing and cache misses.
The Solution #
I implemented the Varghese & Lauck (1987) paper in Rust, optimizing it for modern CPU architectures:
- Slab Allocation: Replaced heap pointers with a custom Slab allocator using
NonZeroU32handles to reduce struct size to 24 bytes. - Cache Locality: Ensured all timer nodes live in a contiguous memory region to maximize L1 cache hits.
- O(1) Cancellation: Achieved constant-time cancellation using an intrusive doubly-linked list strategy.
Benchmark Results #
| Operation | Heap (Std) | Wheel (My Implementation) | Improvement |
|---|---|---|---|
| Cancel (10k items) | 50.6ms | 0.029ms | 1,700x Faster |