Sharded Timing Wheel

Table of Contents

The Problem
#

Standard timer implementations (like BinaryHeap) suffer from O(log N) insertion and cancellation costs. In high-throughput networking (C10M), cancelling millions of timers creates a CPU bottleneck due to pointer chasing and cache misses.

The Solution
#

I implemented the Varghese & Lauck (1987) paper in Rust, optimizing it for modern CPU architectures:

Slab Allocation: Replaced heap pointers with a custom Slab allocator using NonZeroU32 handles to reduce struct size to 24 bytes.
Cache Locality: Ensured all timer nodes live in a contiguous memory region to maximize L1 cache hits.
O(1) Cancellation: Achieved constant-time cancellation using an intrusive doubly-linked list strategy.

Benchmark Results
#

Operation	Heap (Std)	Wheel (My Implementation)	Improvement
Cancel (10k items)	50.6ms	0.029ms	1,700x Faster

The Problem #

The Solution #

Benchmark Results #

The Problem
#

The Solution
#

Benchmark Results
#