Consistent hashing maps keys to a changing set of nodes (shards, servers) so that when nodes join or leave, only a small fraction of keys move. It is used in distributed caches, databases, object stores, and load balancers to achieve scalability and high availability with minimal data reshuffling.
Common algorithms
- Consistent hashing (hash ring with virtual nodes)
- Rendezvous hashing
- Jump consistent hash
- Maglev hashing
- AnchorHash: A Scalable Consistent Hash
- DXHash
- JumpBackHash
where N is the number of nodes and R is the number of replicas.
| Algorithm | Lookup per key | Node add/remove | Memory | Replication support |
|---|---|---|---|---|
| Hash ring (with vnodes) | O(log N) binary search over N points; O(1) with specialized structures | O(log N) to insert/remove points | O(N) points | Yes: take next R distinct successors; O(log N + R) |
| Rendezvous | O(N) score per node; top-1 | O(1) (no state to rebalance) | O(N) node list | Yes: pick top R scores; O(N log R) |
| Jump consistent hash | O(log(N)) | O(1) | O(1) | Not native |
| AnchorHash | O(1) expected | O(1) expected/amortized | O(N) | Not native |
| DXHash | O(1) expected | O(1) expected | O(N) | Not native |
| JumpBackHash | O(1) | O(1) expected | O(1) | Not native |
Replication of keys
- Hash ring: replicate by walking clockwise to the next R distinct nodes. Virtual nodes help spread replicas evenly and avoid hotspots.
- Rendezvous hashing: replicate by selecting the top R nodes by score for the key. This naturally yields R distinct owners and supports weights.
- Jump consistent hash: the base function returns one bucket. Replication can be achieved by hashing (key, replica_index) and collecting R distinct buckets; this is simple but lacks the single-pass global ranking HRW provides.
Why replication matters
- Tolerates node failures and maintenance without data unavailability.
- Distributes read/write load across multiple owners, reducing hotspots.
- Enables fast recovery and higher tail-latency resilience.
We define the consistent n-choose-rk replication as follows:
- for a given number
nof nodes, choosekdistinct nodesS. - for a given
keythe chosen set of nodes must be uniformly chosen from all possible sets of sizek. - when
nincreases by one, exactly one node in the chosen set will be changed with probabilityk/(n+1).
For simplicity, nodes are represented by integers 0..n.
Given k independent consistent hash functions h_i(n) for a given key, the following algorithm will have the desired properties:
fn consistent_choose_k<Key>(key: Key, k: usize, n: usize) -> Vec<usize> {
(0..k).rev().scan(n, |n, k| Some(consistent_choose_next(key, k, n))).collect()
}
fn consistent_choose_next<Key>(key: Key, k: usize, n: usize) -> usize {
(0..k).map(|k| consistent_hash(key, k, n - k) + k).max()
}
fn consistent_hash<Key>(key: Key, k: usize, n: usize) -> usize {
// compute the k-th independent consistent hash for `key` and `n` nodes.
}