Algorithm dossier

  • Category: Compression
  • Worst-case complexity: O(n log n)
  • Approach: Greedy
  • Data structure: Heap
  • First formalised: 1950s

Why this snippet is Huffman Coding

Why Huffman. The snippet repeatedly removes the two lowest-frequency entries and merges them into one, until a single entry is left. Each merge prepends 0 to the codes in one subtree and 1 to the codes in the other, which builds up a *prefix-free* binary code in which rarer symbols never get shorter codes than more frequent ones.

Optimality. For a given frequency distribution, this is provably the best prefix code that gives each symbol its own whole-bit codeword. It can be beaten by methods such as arithmetic coding, which can effectively spend a fractional number of bits per symbol.
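
The merge loop described above can be sketched in Python as follows. This is a minimal illustration, not the redacted snippet itself: the function name huffman_codes, the tie-breaking counter, and the choice to give a lone symbol the code "0" are all assumptions made for this example. To stay close to the "prepend a bit on every merge" description, it carries a symbol-to-code dictionary in each heap entry instead of building an explicit tree, which is simpler to read but does more work per merge than a tree-based version.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return a dict mapping each symbol in `text` to a bit string.

    Hypothetical helper for illustration; not taken from the original snippet.
    """
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # Assumption of this sketch: a lone symbol still gets a 1-bit code.
        return {next(iter(freq)): "0"}

    # Each heap entry is (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)

    while len(heap) > 1:
        # Pop the two lowest-frequency entries.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Prepend 0 to one subtree's codes and 1 to the other's.
        merged = {sym: "0" + code for sym, code in left.items()}
        merged.update({sym: "1" + code for sym, code in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1

    return heap[0][2]


if __name__ == "__main__":
    for symbol, code in sorted(huffman_codes("aaaabbc").items()):
        print(symbol, code)
```

In the sample input "aaaabbc", the most frequent symbol ends up with a 1-bit code and the two rarer symbols with 2-bit codes, and no code is a prefix of another.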

How to read a redacted algorithm

Algodle strips identifier names so the snippet has to be read for its shape: the control flow, the data structures it manipulates, the order in which it visits its input. Loops with two pointers crawling toward each other are usually search or partition. A recursion that splits its input in half and recurses on both halves is divide-and-conquer. A priority queue plus graph traversal is almost certainly Dijkstra, Prim, or A*. The hint columns (category, complexity, approach, data structure, era) let you triangulate even when the snippet itself is opaque.