When building at scale, answering a seemingly simple question like "How many unique users visited our site today?" can become a massive engineering challenge.
If you store every unique ID in a traditional Set, memory grows linearly with the number of users, and tracking millions of them can quickly consume gigabytes of RAM. This is where Redis HyperLogLog comes to the rescue.
What is HyperLogLog?
HyperLogLog (HLL) is a probabilistic data structure used to estimate the cardinality of a set (the number of unique elements). Instead of storing the actual elements, it hashes them and observes patterns in the binary representation of each hash.
The magic of Redis' implementation of HyperLogLog is that it can estimate the cardinality of millions of unique items while using at most 12 KB of memory per key, with a standard error of just 0.81%.
How it works (The Math made simple)
Imagine you are flipping a coin. If you flip it and get "Heads" 5 times in a row, you'd intuitively know you've probably been flipping that coin for a while.
HLL works on a similar principle using hashing:
- Every item added to the HLL is hashed into a large binary string (e.g., 101100101...).
- The algorithm looks for the longest sequence of leading zeros in these binary strings.
- If the longest sequence of leading zeros across all hashes is N, the estimated number of unique elements is roughly 2^N.
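The three steps above can be sketched in a few lines of Python. This is only an illustration of the leading-zeros idea, not Redis code: the choice of SHA-256 truncated to 64 bits and the helper names `hash64`, `leading_zeros`, and `naive_estimate` are assumptions made for the example.

```python
import hashlib

HASH_BITS = 64  # illustrative width; any well-mixed fixed-width hash would do


def hash64(item: str) -> int:
    """Map an item to a 64-bit integer (first 8 bytes of SHA-256)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def leading_zeros(value: int, width: int = HASH_BITS) -> int:
    """Count the zero bits before the first 1 bit of a fixed-width integer."""
    return width - value.bit_length()


def naive_estimate(items) -> int:
    """Return 2^N, where N is the longest run of leading zeros seen."""
    longest = 0
    for item in items:
        longest = max(longest, leading_zeros(hash64(item)))
    return 2 ** longest


if __name__ == "__main__":
    users = [f"user_{i}" for i in range(10_000)]
    # Duplicates hash to the same value, so adding them again changes nothing.
    print(naive_estimate(users), naive_estimate(users + users))
```

A single estimator like this can only ever answer with a power of two, and one lucky hash can throw it far off, which is why it is too noisy to use on its own.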
To reduce variance and make the estimate far more accurate, Redis splits the hashes across 16,384 internal registers and combines the per-register results with a harmonic mean. Each register needs only 6 bits, so the registers take 16,384 × 6 bits = 98,304 bits = 12 KB.
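To make the register idea concrete, here is a toy HyperLogLog in Python. It is a sketch under stated assumptions, not Redis's implementation: the `ToyHLL` class, the SHA-256 hash, the bias constant, and the small-range correction follow the textbook algorithm, while Redis uses its own hash function and bias handling. The constants at the top also reproduce the 12 KB and 0.81% figures, the latter from the standard HLL error formula 1.04 / sqrt(m).

```python
import hashlib
import math

P = 14                      # index bits: 2**14 = 16,384 registers
M = 1 << P
REST_BITS = 64 - P          # bits left over for counting leading zeros
ALPHA = 0.7213 / (1 + 1.079 / M)   # textbook bias constant for large M

REGISTER_KB = M * 6 / 8 / 1024     # 6 bits per register -> 12.0 KB
STD_ERROR = 1.04 / math.sqrt(M)    # about 0.0081, i.e. 0.81%


class ToyHLL:
    """Toy HyperLogLog (hypothetical class, for illustration only)."""

    def __init__(self):
        self.registers = [0] * M

    def add(self, item: str) -> None:
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        index = h >> REST_BITS                    # first 14 bits pick a register
        rest = h & ((1 << REST_BITS) - 1)         # remaining 50 bits
        rank = REST_BITS - rest.bit_length() + 1  # leading zeros, plus one
        self.registers[index] = max(self.registers[index], rank)

    def count(self) -> int:
        # Harmonic-mean style combination of the per-register estimates.
        estimate = ALPHA * M * M / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if estimate <= 2.5 * M and zeros:
            # Small-range correction (linear counting) from the textbook algorithm.
            estimate = M * math.log(M / zeros)
        return round(estimate)

    def merge(self, other: "ToyHLL") -> None:
        # The union of two sketches is the register-wise maximum.
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]


if __name__ == "__main__":
    hll = ToyHLL()
    for i in range(200_000):
        hll.add(f"user_{i}")
    print(hll.count(), REGISTER_KB, STD_ERROR)
```

Because each register only ever keeps a maximum, adding the same item twice never changes the sketch, and two sketches can be combined without revisiting the original items.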
Practical Usage in Redis
Using HLL in Redis is delightfully simple. It provides three main commands:
1. PFADD
Adds elements to the HyperLogLog.
```bash
PFADD website_visitors:2026-06-06 "user_102" "user_883" "user_911"
# Returns 1 if at least one internal register was altered, 0 otherwise
```
2. PFCOUNT
Returns the approximated cardinality.
```bash
PFCOUNT website_visitors:2026-06-06
# Returns: 3
```
3. PFMERGE
Merges multiple HyperLogLogs into a single one. This is perfect for rolling up daily metrics into weekly or monthly metrics!
```bash
PFMERGE visitors:this_week visitors:monday visitors:tuesday visitors:wednesday
```
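From application code, the same three commands might be wrapped as follows. This is a minimal sketch that assumes a redis-py style client exposing `pfadd`, `pfcount`, and `pfmerge`. The helper names (`visitors_key`, `record_visit`, `unique_visitors`, `roll_up`) are hypothetical, and the key layout simply follows the `website_visitors:<date>` pattern used above.

```python
from datetime import date


def visitors_key(day: date) -> str:
    """Build the per-day key, e.g. website_visitors:2026-06-06."""
    return f"website_visitors:{day.isoformat()}"


def record_visit(client, day: date, user_id: str) -> bool:
    """PFADD the user into that day's HLL; True if a register changed."""
    return client.pfadd(visitors_key(day), user_id) == 1


def unique_visitors(client, day: date) -> int:
    """PFCOUNT the day's HLL to get the approximate unique-visitor count."""
    return client.pfcount(visitors_key(day))


def roll_up(client, dest_key: str, days) -> int:
    """PFMERGE several daily HLLs into dest_key and return the merged count."""
    client.pfmerge(dest_key, *(visitors_key(d) for d in days))
    return client.pfcount(dest_key)
```

With redis-py installed and a server running locally, `client` could be created with `redis.Redis()`. If the merged result does not need to be stored, PFCOUNT also accepts several keys at once and returns the cardinality of their union.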
When to use HyperLogLog
HyperLogLog is not a silver bullet. You should use it when:
- You need to count massive amounts of unique items (IP addresses, user IDs, search queries).
- You care about memory efficiency more than absolute 100% precision.
- You do not need to retrieve the actual items back from the data structure.
If you need to list the users, or if absolute precision is required for billing purposes, stick to a standard Redis Set or an SQL database. But for analytics dashboards at scale, HyperLogLog is an absolute superpower.