HyperLogLog (HLL) Intersections

HyperLogLog intersections are interesting because they make it possible to derive more information than is available from union or cardinality alone. Based on intersections, the relationship between two sets can be described quantitatively.

The image above shows what this means. In this example, two HLL sets are built from the ids of posts that relate to different terms used on social media. Based on their intersection, it is possible to return the percentage of posts that the two lists of words have in common.

Each of the two lists of words can first be assembled by a union of the individual HLL sets, one per term, and only afterwards intersected. This makes it possible to study arbitrary semantic relationships.
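The union-then-intersect idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation behind the figure: the HLL is a bare-bones version written for this example (SHA-1 as hash, 4096 registers, no bias correction beyond linear counting), the terms and post id ranges are synthetic, and all function names are hypothetical. Since HLL has no native intersection operation, the overlap is estimated with the inclusion-exclusion principle, |A ∩ B| ≈ |A| + |B| − |A ∪ B|.

```python
import hashlib
import math

P = 12          # precision (assumption): 2**12 = 4096 registers
M = 1 << P

def new_hll():
    return [0] * M

def add(registers, item):
    # Derive a 64-bit hash from the first 8 bytes of SHA-1.
    h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
    idx = h >> (64 - P)                       # first P bits select the register
    rest = h & ((1 << (64 - P)) - 1)          # remaining 52 bits
    rank = (64 - P) - rest.bit_length() + 1   # position of the leftmost 1-bit
    registers[idx] = max(registers[idx], rank)

def union(a, b):
    # HLL union is lossless: take the register-wise maximum.
    return [max(x, y) for x, y in zip(a, b)]

def cardinality(registers):
    alpha = 0.7213 / (1 + 1.079 / M)
    raw = alpha * M * M / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * M and zeros:
        return M * math.log(M / zeros)        # linear counting for small sets
    return raw

def intersection(a, b):
    # Inclusion-exclusion; clamp at 0 because estimation noise can go negative.
    return max(0.0, cardinality(a) + cardinality(b) - cardinality(union(a, b)))

def hll_from(post_ids):
    registers = new_hll()
    for post_id in post_ids:
        add(registers, post_id)
    return registers

# Synthetic post ids per term (made up for illustration only).
terms = {
    "sunrise": range(0, 40000),
    "sunset": range(20000, 60000),
    "beach": range(30000, 70000),
    "ocean": range(50000, 90000),
}
hlls = {term: hll_from(ids) for term, ids in terms.items()}

# Step 1: union the per-term HLL sets of each word list.
list_a = union(hlls["sunrise"], hlls["sunset"])
list_b = union(hlls["beach"], hlls["ocean"])

# Step 2: intersect the two lists and express the overlap as a share.
overlap = intersection(list_a, list_b)
share = overlap / cardinality(union(list_a, list_b))
print(f"estimated common posts: {overlap:.0f} ({share:.1%} of all posts)")
```

One design caveat worth keeping in mind: the error of each of the three estimates enters the result, so the intersection is much less reliable when the overlap is small relative to the sets being compared.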

I’ve discussed and shown HLL intersections on several occasions. Here is a list of links with further information: