Redis HyperLogLog: Counting Millions of Unique Users with Just 12 KB of Memory

When building at scale, answering a seemingly simple question like "How many unique users visited our site today?" can become a massive engineering challenge.

If you store every unique ID in a traditional Set, tracking millions of users can quickly consume gigabytes of RAM. This is where Redis HyperLogLog comes to the rescue.

What is HyperLogLog?

HyperLogLog (HLL) is a probabilistic data structure used to estimate the cardinality of a set (the number of unique elements). Instead of storing the actual elements, it hashes them and observes the patterns of the binary representation of the hash.

The magic of the Redis implementation of HyperLogLog is that it can estimate the cardinality of millions of unique items while using at most 12 KB of memory per key, with a standard error of just 0.81%.

How it works (The Math made simple)

Imagine you are flipping a coin. Any given run of five flips comes up all "Heads" only 1 time in 32, so if you have seen such a streak, you have probably been flipping that coin for a while.

HLL works on a similar principle using hashing:

Every item added to the HLL is hashed into a large binary string (e.g., 101100101...).
The algorithm looks for the longest sequence of leading zeros in these binary strings.
If the longest sequence of leading zeros across all hashes is N, the estimated number of unique elements is roughly 2^N.
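The three steps above can be sketched in a few lines of Python. This is only an illustration of the single-counter idea, not what Redis runs: SHA-256 truncated to 64 bits is an assumed stand-in for the hash function, and `hash64`, `leading_zeros`, and `naive_estimate` are hypothetical helper names.

```python
import hashlib

def hash64(item: str) -> int:
    """Map an item to a 64-bit integer (SHA-256 is an illustrative choice)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def leading_zeros(value: int, width: int = 64) -> int:
    """Count the leading zero bits of `value` viewed as a `width`-bit number."""
    return width - value.bit_length()

def naive_estimate(items) -> int:
    """Estimate the distinct count as 2^N, where N is the longest run of leading zeros."""
    longest = 0
    for item in items:
        longest = max(longest, leading_zeros(hash64(item)))
    return 2 ** longest

users = [f"user_{i}" for i in range(100_000)]
estimate = naive_estimate(users)
repeated = naive_estimate(users + users)  # duplicates hash identically

print(estimate)   # always a power of two, so only a coarse guess
print(repeated)   # same value as above: repeats never change the maximum
```

Because the result can only be a power of two and depends on a single lucky hash, this estimator is very noisy on its own, which is why the real algorithm spreads the work over many registers.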

To reduce variance and tighten the estimate, Redis spreads the hashes across 16,384 internal registers and combines the per-register results with a harmonic mean. At 6 bits per register, the registers occupy 16,384 × 6 bits = 98,304 bits = 12 KB.
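To make the register idea concrete, here is a minimal textbook-style sketch in Python. It is not Redis's actual code: SHA-256 stands in for the hash function, the top 14 bits are assumed to select the register, and the estimator uses the classic HyperLogLog bias constant with a linear-counting fallback for small sets. The function names are hypothetical.

```python
import hashlib
import math

P = 14          # hash bits used to pick a register
M = 1 << P      # 16,384 registers

def hash64(item: str) -> int:
    """64-bit hash of an item (SHA-256 is an illustrative stand-in)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def build_registers(items):
    """Each register keeps the largest (leading zeros + 1) seen for its bucket."""
    registers = [0] * M
    for item in items:
        h = hash64(item)
        index = h >> (64 - P)                     # top 14 bits choose the register
        rest = h & ((1 << (64 - P)) - 1)          # remaining 50 bits
        rank = (64 - P) - rest.bit_length() + 1   # leading zeros in `rest`, plus one
        registers[index] = max(registers[index], rank)
    return registers

def estimate_cardinality(registers) -> float:
    """Harmonic-mean estimate, with linear counting when many registers are empty."""
    alpha = 0.7213 / (1 + 1.079 / M)
    raw = alpha * M * M / sum(2.0 ** -r for r in registers)
    empty = registers.count(0)
    if raw <= 2.5 * M and empty:
        return M * math.log(M / empty)
    return raw

memory_kb = M * 6 / 8 / 1024        # 6 bits per register, converted to KB
std_error = 1.04 / math.sqrt(M)     # standard error formula for HyperLogLog

true_count = 200_000
approx = estimate_cardinality(build_registers(f"user_{i}" for i in range(true_count)))

print(memory_kb)                    # 12.0
print(f"{std_error:.2%}")           # 0.81%
print(round(approx), true_count)
```

The last line prints the estimate next to the true count, which should typically differ by around one percent or less, in line with the 0.81% standard error.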

Practical Usage in Redis

Using HLL in Redis is delightfully simple. It provides three main commands:

1. PFADD

Adds elements to the HyperLogLog.

```bash
PFADD website_visitors:2026-06-06 "user_102" "user_883" "user_911"
# Returns 1 if at least one internal register was altered, 0 otherwise
```
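The return value is easier to picture with a toy model. The Python sketch below mimics the "did any register change?" rule on a plain list of registers. It assumes SHA-256 as the hash and a hypothetical helper named `pfadd_sketch`, so treat it as an illustration of the semantics rather than Redis's implementation.

```python
import hashlib

P = 14          # hash bits used to pick a register
M = 1 << P      # 16,384 registers

def pfadd_sketch(registers, *items) -> int:
    """Update registers for each item; return 1 if any register changed, else 0."""
    changed = 0
    for item in items:
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        index = h >> (64 - P)                     # register chosen by the top bits
        rest = h & ((1 << (64 - P)) - 1)          # remaining bits
        rank = (64 - P) - rest.bit_length() + 1   # leading zeros in `rest`, plus one
        if rank > registers[index]:
            registers[index] = rank
            changed = 1
    return changed

registers = [0] * M
first = pfadd_sketch(registers, "user_102", "user_883", "user_911")
second = pfadd_sketch(registers, "user_102")   # already seen: nothing changes

print(first, second)   # 1 0
```

Re-adding an element that is already represented can never raise a register, which is why the second call reports no change.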

2. PFCOUNT

Returns the approximated cardinality.

```bash
PFCOUNT website_visitors:2026-06-06
# Returns: 3
```

3. PFMERGE

Merges multiple HyperLogLogs into a single one. The first key is the destination and the remaining keys are the sources. This is perfect for rolling up daily metrics into weekly or monthly metrics!

```bash
PFMERGE visitors:this_week visitors:monday visitors:tuesday visitors:wednesday
```
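Merging works because a HyperLogLog only remembers a maximum per register, so the merged structure is the register-wise maximum of its inputs. The Python sketch below illustrates that property with a deliberately tiny register count. The hash (SHA-256) and the helper names `sketch` and `merge` are assumptions made for the example.

```python
import hashlib

P = 4           # only 16 registers, to keep the demo small
M = 1 << P

def sketch(items):
    """Build a register list for a collection of items."""
    registers = [0] * M
    for item in items:
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        index = h >> (64 - P)                     # register chosen by the top bits
        rest = h & ((1 << (64 - P)) - 1)          # remaining bits
        rank = (64 - P) - rest.bit_length() + 1   # leading zeros in `rest`, plus one
        registers[index] = max(registers[index], rank)
    return registers

def merge(*sketches):
    """Register-wise maximum across all input sketches."""
    return [max(column) for column in zip(*sketches)]

monday = sketch(["user_1", "user_2", "user_3"])
tuesday = sketch(["user_3", "user_4"])           # user_3 visited on both days
week = merge(monday, tuesday)

# The merged sketch is identical to one built from the union of both days,
# so a user seen on several days is still counted once.
print(week == sketch(["user_1", "user_2", "user_3", "user_4"]))   # True
```

This is also why weekly or monthly rollups do not double-count returning visitors, unlike summing the daily PFCOUNT results.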

When to use HyperLogLog

HyperLogLog is not a silver bullet. You should use it when:

You need to count massive amounts of unique items (IP addresses, user IDs, search queries).
You care about memory efficiency more than absolute 100% precision.
You do not need to retrieve the actual items back from the data structure.

If you need to list the users, or if absolute precision is required for billing purposes, stick to a standard Redis Set or an SQL database. But for analytics dashboards at scale, HyperLogLog is an absolute superpower.
