Consistent Hashing Explained: How Netflix Scales

Quick Answer: Consistent hashing is a distributed systems routing technique that places both servers and data on a conceptual circular ring. Unlike traditional modulo hashing, which requires rehashing almost all data when a server is added or removed, consistent hashing only moves the data adjacent to the new server. This prevents massive network spikes during scaling.
Imagine your team is tasked with building a video streaming architecture. Distributing a massive library of films takes the routing problem to another level. You have way too much data for one machine, so you need to spread those files out across a cluster. The core problem becomes: when a user hits play, how does your system instantly know which server holds that exact film?
Why is the modulo hashing approach bad for distributed caching?
Using a modulo operator for routing works perfectly right up until your cluster changes size. When you add or remove a server, the denominator in your modulo math changes, forcing you to re-evaluate and move the vast majority of your stored data.
Let's say we start with four servers, labeled 0 through 3. A naive routing approach takes a film's identifier, runs it through a hash function to get an integer, and performs a modulo operation against the total number of servers (hash(film) % 4). The resulting number dictates the server where the film lives.
That logic makes complete sense until traffic spikes and you need to spin up a fifth server. Your math is now hash(film) % 5. Because the divisor just changed, the mathematical output for almost every existing hash also changes. Suddenly, you have to re-route and physically move roughly 80% of your data across the network to align with the new modulo results. If those servers are holding massive video files, your network is going to melt under the load of moving all that data.
What is consistent hashing and how does it work?
Consistent hashing maps both your data and your servers onto a fixed, circular ring of integers. By evaluating proximity on this ring rather than performing fixed division, the system can route requests without tying the mathematical outcome directly to the total number of servers.
Instead of relying on modulo division, imagine a conceptual ring of integers. It starts at zero, climbs to a massive maximum integer, and then seamlessly loops back around to zero. You take your servers and assign them positions along this ring, spaced out roughly equally. Each server claims a specific integer as its home base.
When a request comes in for a film, you run the film's identifier through the same hash function to generate an integer. If that integer happens to land exactly on the integer a server occupies, the film lives there. If it lands on an empty integer—which is the most likely scenario—the algorithm simply walks forward around the ring, incrementing the integer, until it bumps into a server. That first server it hits is the designated storage location. Every server is essentially responsible for the films that hash to the integers immediately leading up to it.
How does consistent hashing handle adding new servers?
When you add a new server to a consistent hashing ring, it just claims a new empty spot between two existing servers. The only data that needs to be relocated is the small subset of hashes that fall strictly between the new server and the server immediately preceding it.
Let's go back to our scenario where we scale the cluster from four servers to five. We place the fifth server onto the ring. The original four servers haven't moved, so their relationship to the data on the ring behind them remains totally unchanged. The fifth server simply intercepts the hashes leading up to its specific new position.
Instead of shuffling 80% of your total data across the cluster, you only need to migrate the roughly 20% of data that belongs to the new server's slice of the ring. The rest of the cluster just keeps serving the same files they were already serving, avoiding a massive traffic spike.
Modulo Hashing vs Consistent Hashing
Here is a quick breakdown of how the two approaches compare when routing data across a distributed cluster:
| Feature | Modulo Hashing | Consistent Hashing |
|---|---|---|
| Routing Logic | hash(key) % server_count |
Hash proximity on a circular integer ring |
| Scaling Impact | High network overhead | Low network overhead |
| Data Moved on Scale | ~80% (when going 4 to 5 servers) | ~20% (when going 4 to 5 servers) |
| Best Use Case | Fixed-size server clusters | Elastic, highly scalable distributed systems |
Frequently Asked Questions
What happens when a server goes down in a consistent hashing system?
If a server crashes or is intentionally removed from the ring, the data it was responsible for is picked up by the next available server. The system simply routes the hashes past the missing server's integer until it hits the next active node. Only the data from the down server needs to be reassigned, leaving the rest of the cluster's routing unchanged.
How do you ensure data is evenly distributed on a consistent hash ring?
In practice, placing a physical server at a single point on the ring can lead to uneven data distribution. Systems usually solve this by implementing virtual nodes, where each physical server is assigned multiple random positions on the ring to balance the load more effectively.
What are common real-world use cases for consistent hashing?
Consistent hashing is heavily used in distributed caching layers, content delivery networks (CDNs), and distributed NoSQL databases. It is the standard routing mechanism anytime a system needs to scale stateful storage nodes dynamically without causing network bottlenecks.



