Consistent Hashing: How Do You Fairly Split the Crowd?

May 28, 2026•Venkata Lokesh P

System DesignLoad BalancingConsistent HashingBeginner

Imagine your stadium's online ticket booking system is getting slammed. Thousands of fans trying to book at the same time. One server can't keep up.

So you add more servers. Great. But now a new question shows up — which server handles which fan's request?

This is the load balancing problem. And the concept of consistent hashing is one of the smartest ways to solve it.

First, What's a Server?

Quick recap. When a fan opens the booking app, their phone sends a request — "I want 2 tickets for IND vs PAK, Section A." Your computer receives that, processes it, and sends back a response — "Here are your seats."

Your computer, in this setup, is the server. It serves requests.

When one server can't handle the load, you buy more servers. Now you have multiple servers ready to handle requests. But you need a way to decide — for each incoming request, which server should handle it?

That's load balancing.

The Simple Approach: Just Use a Hash

Every request that comes in has an ID — a number. You can take that number, run it through a hash function, and use the result to pick a server.

Here's how it works with 4 servers (S0, S1, S2, S3):

CODE

Hash(request_id) mod 4 = server index

Some examples:

CODE

Hash(10)  = 3  →  3 mod 4 = 3  →  goes to S3
Hash(20)  = 15 →  15 mod 4 = 3 →  goes to S3
Hash(35)  = 12 →  12 mod 4 = 0 →  goes to S0

Since request IDs are essentially random, the load spreads evenly. Each server gets roughly 25% of the traffic. Perfect.

With N servers, each handles roughly 1/N of the total load. Clean and even.

The Problem: What Happens When You Add a Server?

Your app goes viral. The stadium's booking system is getting hammered. You need to add a 5th server — S4.

Simple, right? Just change the formula:

CODE

Hash(request_id) mod 5 = server index

But look at what happens to the same requests:

CODE

Hash(10)  = 3  →  3 mod 5 = 3  →  still S3  ✓
Hash(20)  = 15 →  15 mod 5 = 0 →  now S0   ✗ (was S3)
Hash(35)  = 12 →  12 mod 5 = 2 →  now S2   ✗ (was S0)

Almost everything moved. Adding just one server reshuffled nearly all the requests across all servers.

Think of it like a pie split 4 ways — 25% each. Now you add a 5th person and have to re-cut the entire pie. Everyone's slice changes. The total reshuffling? Close to 100% of all requests.

Why Does This Actually Matter?

You might think — okay, requests moved around, so what?

Here's why it's a big deal. In real systems, the request ID isn't truly random. It usually contains the user's ID. This means the same user always gets sent to the same server.

That's actually useful. If fan Rahul always lands on S2, then S2 can cache Rahul's profile, booking history, and preferences locally. The next time Rahul makes a request, S2 already has everything ready — no need to fetch from the database again. Much faster.

But when you add a new server and everything reshuffles, Rahul might now go to S4. S4 knows nothing about him. All that cached data on S2? Useless now.

Every server loses its cache. Every user has to be re-fetched from the database. The whole system takes a performance hit — exactly when you're trying to scale up.

A full reshuffle means a full cache wipe. The worst time to lose your cache is when traffic is spiking and you're adding servers.

What You Actually Want

When you add a new server, you don't want everything to change. You want minimal disruption.

Ideally, you'd take a small slice from each existing server and hand it to the new one. Like this:

S0 gives up a little
S1 gives up a little
S2 gives up a little
S3 gives up a little
All those small pieces together make up S4's 20% share

The total traffic that moves is just 20% — only what's needed. The remaining 80% of users stay exactly where they were. Their cached data is still valid.

That's the goal. Minimum change when servers are added or removed.

The standard hashing approach (mod N) can't do this. It reshuffles everything.

Consistent hashing is the technique that makes minimal reshuffling possible — and that's exactly what we'll cover next.

Quick Recap

Concept	What It Means
Server	A computer that handles incoming requests
Load Balancing	Distributing requests evenly across multiple servers
Hash Function	Converts a request ID into a number to pick a server
The Problem	Adding a new server reshuffles almost all requests
Cache Invalidation	Reshuffling wipes useful cached data from every server
The Goal	Add servers with minimal disruption to existing assignments