Count-Min Sketch Explained: Trillions of Counters
Quick Answer: A Count-Min Sketch is a probabilistic data structure used to estimate the frequency of events in massive datasets. By running items through multiple hash functions and storing counts in

Articles tagged with #datastructures
TL;DR: Git is a content-addressable filesystem that stores project states as full snapshots rather than incremental deltas. Every object—blobs, trees, and commits—is identified by a unique SHA-1 hash

TL;DR: Dating apps avoid the architectural nightmare of joining millions of left-swipe records by using Bloom filters. By hashing user IDs into a bit array, they get a 100% guarantee that a '0' means

TL;DR: Scaling unique view counts for millions of posts requires more than just a COUNT(DISTINCT) query. Modern platforms use HyperLogLog, a probabilistic data structure that estimates cardinality usi

TL;DR: A production-ready profanity filter isn't just a list of banned words; it's a pipeline. You start with sanitization to normalize character substitutions, followed by a Trie for efficient prefix

TL;DR: When I'm building high-traffic chat systems, a standard list lookup for profanity is too slow because search time grows with the size of the dictionary. I use a Trie (prefix tree) to move to O(
