Consistent Hashing: How Do You Fairly Split the Crowd?
Imagine your stadium's online ticket booking system is getting slammed. Thousands of fans trying to book at the same time. One server can't keep up.
So you add more servers. Great. But now a new question shows up — which server handles which fan's request?
This is the load balancing problem. And the concept of consistent hashing is one of the smartest ways to solve it.
First, What's a Server?
Quick recap. When a fan opens the booking app, their phone sends a request — "I want 2 tickets for IND vs PAK, Section A." Your computer receives that, processes it, and sends back a response — "Here are your seats."
Your computer, in this setup, is the server. It serves requests.
When one server can't handle the load, you buy more servers. Now you have multiple servers ready to handle requests. But you need a way to decide — for each incoming request, which server should handle it?
That's load balancing.
The Simple Approach: Just Use a Hash
Every request that comes in has an ID — a number. You can take that number, run it through a hash function, and use the result to pick a server.
Here's how it works with 4 servers (S0, S1, S2, S3):
Hash(request_id) mod 4 = server index
Some examples:
Hash(10) = 3 → 3 mod 4 = 3 → goes to S3
Hash(20) = 15 → 15 mod 4 = 3 → goes to S3
Hash(35) = 12 → 12 mod 4 = 0 → goes to S0
Since request IDs are essentially random, the load spreads evenly. Each server gets roughly 25% of the traffic. Perfect.
With N servers, each handles roughly 1/N of the total load. Clean and even.
The Problem: What Happens When You Add a Server?
Your app goes viral. The stadium's booking system is getting hammered. You need to add a 5th server — S4.
Simple, right? Just change the formula:
Hash(request_id) mod 5 = server index
But look at what happens to the same requests:
Hash(10) = 3 → 3 mod 5 = 3 → still S3 ✓
Hash(20) = 15 → 15 mod 5 = 0 → now S0 ✗ (was S3)
Hash(35) = 12 → 12 mod 5 = 2 → now S2 ✗ (was S0)
Almost everything moved. Adding just one server reshuffled nearly all the requests across all servers.
Think of it like a pie split 4 ways — 25% each. Now you add a 5th person and have to re-cut the entire pie. Everyone's slice changes. The total reshuffling? Close to 100% of all requests.
Why Does This Actually Matter?
You might think — okay, requests moved around, so what?
Here's why it's a big deal. In real systems, the request ID isn't truly random. It usually contains the user's ID. This means the same user always gets sent to the same server.
That's actually useful. If fan Rahul always lands on S2, then S2 can cache Rahul's profile, booking history, and preferences locally. The next time Rahul makes a request, S2 already has everything ready — no need to fetch from the database again. Much faster.
But when you add a new server and everything reshuffles, Rahul might now go to S4. S4 knows nothing about him. All that cached data on S2? Useless now.
Every server loses its cache. Every user has to be re-fetched from the database. The whole system takes a performance hit — exactly when you're trying to scale up.
A full reshuffle means a full cache wipe. The worst time to lose your cache is when traffic is spiking and you're adding servers.
What You Actually Want
When you add a new server, you don't want everything to change. You want minimal disruption.
Ideally, you'd take a small slice from each existing server and hand it to the new one. Like this:
- S0 gives up a little
- S1 gives up a little
- S2 gives up a little
- S3 gives up a little
- All those small pieces together make up S4's 20% share
The total traffic that moves is just 20% — only what's needed. The remaining 80% of users stay exactly where they were. Their cached data is still valid.
That's the goal. Minimum change when servers are added or removed.
The standard hashing approach (mod N) can't do this. It reshuffles everything.
Consistent hashing is the technique that makes minimal reshuffling possible — and that's exactly what we'll cover next.
Quick Recap
| Concept | What It Means |
|---|---|
| Server | A computer that handles incoming requests |
| Load Balancing | Distributing requests evenly across multiple servers |
| Hash Function | Converts a request ID into a number to pick a server |
| The Problem | Adding a new server reshuffles almost all requests |
| Cache Invalidation | Reshuffling wipes useful cached data from every server |
| The Goal | Add servers with minimal disruption to existing assignments |