Why is HyperLogLog its own data structure?

What is the technical explanation for HLLs being a distinct data structure that is nonetheless encoded as a Redis string?

Just like each of the 10 Redis data structures, HLL has unique functionality, so even though it may accept or return a string, it behaves differently internally and serves a different purpose. To learn how to use Redis HyperLogLog, check out this 5-minute video from Redis University: Redis HyperLogLog Explained

I don’t know the initial motivation behind HLL being a string, but it does mean it has some very interesting properties that wouldn’t be directly available if it wasn’t.

For example, it's possible to GET and SET it, so you can store it in another medium. You can use PFADD to create the HLL as normal, then GET the value of that key, store it in a file on the client, and later use SET to recreate the HLL and run PFCOUNT on it.
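A rough sketch of that round trip in Python is below. It assumes `client` is a redis-py style object exposing `get`, `set` and `pfcount`; the helper names `export_hll` and `import_hll` and the file path are made up for illustration.

```python
from pathlib import Path


def export_hll(client, key, path):
    """Save the raw string encoding of the HLL stored at `key` to a local file."""
    blob = client.get(key)  # GET returns the HLL's serialized bytes
    Path(path).write_bytes(blob)


def import_hll(client, key, path):
    """Recreate an HLL under `key` from an exported file and return its count."""
    client.set(key, Path(path).read_bytes())  # SET restores the serialized HLL
    return client.pfcount(key)


# Hypothetical usage against a real server (requires the redis package):
#   import redis
#   r = redis.Redis()  # keep byte responses; the HLL encoding is binary
#   r.pfadd("visitors", "alice", "bob", "carol")
#   export_hll(r, "visitors", "visitors.hll")
#   import_hll(r, "visitors:restored", "visitors.hll")
```

The client should return raw bytes rather than decoded text, since the HLL encoding is binary and is unlikely to survive being decoded as UTF-8.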

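What makes an HLL unique is that its memory use is not proportional to the number of items counted. Instead, it uses a bounded, effectively constant amount of memory (at most 12 KB per key in Redis), in exchange for returning an approximate count (standard error of about 0.81%).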
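To see why the memory stays constant, here is a toy HyperLogLog in pure Python. It is only a sketch of the general technique, not Redis's actual implementation or encoding; the class name `TinyHLL`, the choice of SHA-1 as the hash, and the default of 1024 registers are all assumptions made for illustration.

```python
import hashlib
import math


class TinyHLL:
    """Toy HyperLogLog: 2**p one-byte registers, however many items are added."""

    def __init__(self, p=10):
        self.p = p
        self.m = 1 << p                    # number of registers
        self.registers = bytearray(self.m)

    def add(self, item):
        # Take 64 bits of a hash of the item.
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)   # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank          # keep only the maximum rank

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)   # bias constant, valid for m >= 128
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return round(self.m * math.log(self.m / zeros))
        return round(raw)


hll = TinyHLL()
for i in range(50_000):
    hll.add(f"user-{i}")
    hll.add(f"user-{i}")   # duplicates never change a register
print(len(hll.registers), hll.count())
```

Adding an item only ever updates one register in place, so the structure stays at 1024 bytes whether it has seen ten items or ten million. The estimate is approximate, and more registers give a smaller error.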

Although HLLs in Redis are a distinct data structure, they are encoded as Redis strings, so the GET command can be used to serialize an HLL and SET to deserialize it back onto the server.

What is the use case of the HyperLogLog data structure?

Counting really big things and ignoring duplicates. Say, how many unique users watched a video. Add each user to the HLL when they watch the video. Or how many unique visits a website has. Add the IP address for each visit to the site to the HLL.
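The video example might look something like this in Python. This is only a sketch: `client` is assumed to be a redis-py style object with `pfadd` and `pfcount`, and the helper names and the `video:<id>:viewers` key pattern are invented for illustration.

```python
def record_view(client, video_id, user_id):
    """Add a viewer to the video's HLL; repeat views by the same user do not add to the count."""
    client.pfadd(f"video:{video_id}:viewers", user_id)


def unique_viewers(client, video_id):
    """Return the approximate number of distinct viewers of the video."""
    return client.pfcount(f"video:{video_id}:viewers")


# Hypothetical usage against a real server (requires the redis package):
#   import redis
#   r = redis.Redis()
#   record_view(r, 42, "alice")
#   record_view(r, 42, "alice")   # duplicate, ignored by the HLL
#   record_view(r, 42, "bob")
#   unique_viewers(r, 42)
```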