Under the Hood: Building a High-Performance LRU Cache for Our Actor-Based Key-Value Store
In the world of high-performance systems, caching is king. For our open-source key-value store, a robust and efficient caching mechanism isn't just a feature, but a necessity. Today, I want to dive into the implementation of our Least Recently Used (LRU) cache, explore the design choices we made, and explain how it integrates seamlessly and safely with our actor-based architecture.
Why LRU?
The LRU cache eviction policy is a classic for a reason. It's simple, intuitive, and generally effective. When the cache reaches its capacity, the item that hasn't been accessed for the longest time is evicted, making space for new data. This policy assumes that data accessed recently is likely to be accessed again soon, which holds true for many real-world workloads.
Performance: O(1) for everything
Duva's LRU implementation combines a `HashMap` with a doubly linked list to provide O(1) performance for every operation:
1. GET
2. INSERT
3. REMOVE
4. UPDATE
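To make that combination concrete, here is a minimal sketch of the kind of layout involved; the field names are illustrative rather than Duva's exact definitions, and the raw-pointer links are the subject of the next section.

```rust
use std::collections::HashMap;

// A minimal sketch of the layout described above. Names like `CacheNode`,
// `map`, `head`, and `tail` are illustrative, not Duva's exact definitions.
// The HashMap gives O(1) lookup by key; the intrusive doubly linked list
// (raw-pointer links, covered in the next section) gives O(1) recency updates.
struct CacheNode {
    key: String,
    value: Vec<u8>,
    prev: *mut CacheNode, // toward the more recently used neighbor
    next: *mut CacheNode, // toward the less recently used neighbor
}

struct LRUCache {
    capacity: usize,
    // Key -> raw pointer to the node that also lives in the linked list.
    map: HashMap<String, *mut CacheNode>,
    head: *mut CacheNode, // most recently used
    tail: *mut CacheNode, // least recently used; evicted first when full
}
```

Every operation starts with a single `HashMap` lookup and ends with a constant number of pointer updates, which is where the O(1) bound comes from.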
The `unsafe` Reality: Raw Pointers for Performance
One of the most notable aspects of the implementation is the judicious use of `unsafe` Rust. Specifically, we use raw pointers (`*mut CacheNode`) so that three different pointers, the hash map entry and each neighbor's link, can refer to the same cache node at once.
Why `unsafe`?
In Rust, building a classic doubly linked list without `unsafe` is notoriously difficult due to strict ownership and borrowing rules. Each node needs mutable references to its neighbors, creating a cyclical reference problem that `Box` or `Rc<RefCell<...>>` can't solve efficiently.
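To see why, compare with what a fully safe version tends to require; this is an illustrative `SafeNode`, not anything from our codebase. Forward links become `Rc`, back links must be `Weak` to break the reference cycle, and every read or write goes through `RefCell`'s runtime borrow tracking.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Illustrative only: the shape a "safe" doubly linked node is usually forced into.
// Each clone/drop of an Rc bumps a reference count, and every access must go
// through RefCell::borrow / borrow_mut, which can panic at runtime on aliasing.
struct SafeNode {
    value: u32,
    next: Option<Rc<RefCell<SafeNode>>>,
    prev: Option<Weak<RefCell<SafeNode>>>, // Weak avoids an Rc cycle that would leak
}
```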
Using raw pointers, we take manual, and admittedly rather brave, control over memory and pointer management. This allows us to:
- Avoid `Rc` Overhead: We eliminate the runtime cost of reference counting.
- Prevent Runtime Borrow Check Panics: Unlike `RefCell`, raw pointers don't panic on aliasing, as long as we maintain our safety invariants.
- Achieve True O(1) Performance: Linked list updates are extremely efficient via raw pointer manipulation, as the sketch below shows.
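Here is a self-contained sketch of those list updates; the names are illustrative and this is not Duva's actual code, but it shows the two raw-pointer operations behind a constant-time recency update: detach a node from wherever it sits, then splice it back in at the head.

```rust
// Minimal intrusive doubly linked list manipulated through raw pointers.
struct Node {
    value: u32,
    prev: *mut Node,
    next: *mut Node,
}

struct List {
    head: *mut Node,
    tail: *mut Node,
}

impl List {
    fn new() -> Self {
        List { head: std::ptr::null_mut(), tail: std::ptr::null_mut() }
    }

    // SAFETY invariant assumed throughout: `node` came from Box::into_raw,
    // is non-null, and is currently linked into this list.
    unsafe fn detach(&mut self, node: *mut Node) {
        let prev = (*node).prev;
        let next = (*node).next;
        if prev.is_null() { self.head = next; } else { (*prev).next = next; }
        if next.is_null() { self.tail = prev; } else { (*next).prev = prev; }
        (*node).prev = std::ptr::null_mut();
        (*node).next = std::ptr::null_mut();
    }

    // SAFETY invariant assumed: `node` is non-null and not currently linked.
    unsafe fn push_front(&mut self, node: *mut Node) {
        (*node).next = self.head;
        (*node).prev = std::ptr::null_mut();
        if self.head.is_null() {
            self.tail = node;
        } else {
            (*self.head).prev = node;
        }
        self.head = node;
    }
}

fn main() {
    let mut list = List::new();
    // Heap-allocate two nodes and hand them to the list as raw pointers.
    let a = Box::into_raw(Box::new(Node { value: 1, prev: std::ptr::null_mut(), next: std::ptr::null_mut() }));
    let b = Box::into_raw(Box::new(Node { value: 2, prev: std::ptr::null_mut(), next: std::ptr::null_mut() }));
    unsafe {
        list.push_front(a);
        list.push_front(b);   // order: b, a
        list.detach(a);
        list.push_front(a);   // "touch" a: order is now a, b
        assert_eq!((*list.head).value, 1);
        assert_eq!((*list.tail).value, 2);
        // Reclaim the allocations so the example doesn't leak.
        drop(Box::from_raw(a));
        drop(Box::from_raw(b));
    }
}
```

In the real cache, "touching" a key is one hash-map lookup followed by exactly this detach-and-push-front pair, so the cost never depends on how many entries are cached.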
A Note on Safety: Every `unsafe` block in our code is reviewed and documented. We maintain the invariant that all raw pointers refer to valid, allocated nodes currently managed by the LRU cache. But the real safety guard is not this specific implementation; it is the architecture we chose: the actor model.
Safety in an Actor Model: The Game Changer
How do we reconcile `unsafe` internals with a safe user-facing API?
We use the Actor Model. Each actor owns its own local shards, each with a dedicated `LRUCache`, and processes messages serially, ensuring there is no concurrent access (a sketch of the pattern closes this section).
This gives us:
- Data Race Freedom: Only one thread touches a `LRUCache` at a time.
- Safe Mutability: Actor boundaries enforce exclusive `&mut self` access.
Thus, even though we use `unsafe` internally, our system as a whole remains safe and race-free: the actor model acts as our concurrency guardian.
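To make the pattern concrete, here is a rough, self-contained sketch, not Duva's actual actor runtime: a plain thread and channel play the role of the actor, and a `HashMap` stands in for the real `LRUCache`. The actor exclusively owns its cache and handles one message at a time, so nothing else can ever touch it concurrently.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

// Messages the cache actor understands. GET carries a reply channel so the
// caller can receive the result without sharing any state with the actor.
enum CacheMsg {
    Get { key: String, reply: mpsc::Sender<Option<String>> },
    Insert { key: String, value: String },
}

fn spawn_cache_actor() -> mpsc::Sender<CacheMsg> {
    let (tx, rx) = mpsc::channel::<CacheMsg>();
    thread::spawn(move || {
        // The actor is the sole owner of its cache: exclusive &mut access
        // is guaranteed by ownership, not by a lock.
        let mut cache: HashMap<String, String> = HashMap::new();
        // Messages are processed strictly one at a time.
        for msg in rx {
            match msg {
                CacheMsg::Get { key, reply } => {
                    let _ = reply.send(cache.get(&key).cloned());
                }
                CacheMsg::Insert { key, value } => {
                    cache.insert(key, value);
                }
            }
        }
    });
    tx
}

fn main() {
    let actor = spawn_cache_actor();
    actor.send(CacheMsg::Insert { key: "k1".into(), value: "v1".into() }).unwrap();

    let (reply_tx, reply_rx) = mpsc::channel();
    actor.send(CacheMsg::Get { key: "k1".into(), reply: reply_tx }).unwrap();
    assert_eq!(reply_rx.recv().unwrap(), Some("v1".to_string()));
}
```

Whatever runtime drives the real actors, the ownership story is the same: the cache never leaves its actor, so the `unsafe` internals are never exposed to concurrent access.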
Conclusion
By embracing `unsafe` Rust and using raw pointers with care, we've built a high-performance LRU cache that offers O(1) access patterns. Thanks to our actor-based architecture, we ensure safety without compromising speed. This fusion of performance and correctness is key to a scalable, reliable key-value store.
If you’re interested, dive into our codebase and join the project!