Skip to main content

Sharded Timing Wheel

·119 words·1 min
Ankur Rathore
Author
Ankur Rathore
Senior Systems Engineer pivoting to High-Performance Infrastructure. Building zero-allocation network drivers and cache-friendly data structures.

The Problem
#

Standard timer implementations (like BinaryHeap) suffer from O(log N) insertion and cancellation costs. In high-throughput networking (C10M), cancelling millions of timers creates a CPU bottleneck due to pointer chasing and cache misses.

The Solution
#

I implemented the Varghese & Lauck (1987) paper in Rust, optimizing it for modern CPU architectures:

  1. Slab Allocation: Replaced heap pointers with a custom Slab allocator using NonZeroU32 handles to reduce struct size to 24 bytes.
  2. Cache Locality: Ensured all timer nodes live in a contiguous memory region to maximize L1 cache hits.
  3. O(1) Cancellation: Achieved constant-time cancellation using an intrusive doubly-linked list strategy.

Benchmark Results
#

Operation Heap (Std) Wheel (My Implementation) Improvement
Cancel (10k items) 50.6ms 0.029ms 1,700x Faster