Algorithm dossier

  • Category: Hashing
  • Worst-case complexity: O(n)
  • Approach: Randomised
  • Data structure: Array
  • First formalised: 1980s

Why this snippet is Reservoir Sampling

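Why reservoir sampling. The snippet fills a fixed-size buffer with the first k items. For each subsequent item at 0-based index i (i ≥ k), it then overwrites a uniformly chosen slot with that item with probability k / (i+1). The invariant is that once n items have been processed, every item from the stream sits in the reservoir with the same probability k/n: a later item at index m evicts any given slot with probability 1/(m+1), so the survival probabilities m/(m+1) telescope to exactly k/n.

Why useful. It works on streams of *unknown length*: you don't need to know n ahead of time, and a single pass with O(k) memory is enough.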
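The procedure described above can be sketched in Python roughly as follows. This is a minimal illustration rather than the redacted snippet itself. The function name reservoir_sample and its parameters (stream, k, rng) are assumptions chosen for readability. Drawing one index j from [0, i] and replacing slot j only when j < k is one common way to get both the k / (i+1) acceptance probability and a uniformly chosen slot from a single random draw.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    stream: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Return up to k items drawn uniformly from a stream of unknown length."""
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(stream):  # i is the 0-based index of the item
        if i < k:
            # Fill phase: the first k items go straight into the buffer.
            reservoir.append(item)
        else:
            # Draw j uniformly from 0..i inclusive. The event j < k has
            # probability k / (i + 1), and j is then a uniform slot index.
            j = rng.randrange(i + 1)
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Sample 5 values from a generator without ever materialising it.
    print(reservoir_sample((x * x for x in range(1_000)), 5))
```

If the stream holds fewer than k items, this sketch simply returns all of them in arrival order.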

How to read a redacted algorithm

Algodle strips identifier names so the snippet has to be read for its shape: the control flow, the data structures it manipulates, the order in which it visits its input. Loops with two pointers crawling toward each other are usually search or partition. A recursion that splits its input in half and recurses on both halves is divide-and-conquer. A priority queue plus graph traversal is almost certainly Dijkstra, Prim, or A*. Five hint columns (category, complexity, approach, data structure, era) let you triangulate even when the snippet itself is opaque.
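As a hypothetical illustration of reading for shape, the sketch below is written the way a redacted snippet might look. It is not taken from Algodle, and the one-letter names f, a, t, i, j, s are deliberately meaningless. The giveaway is structural: two indices start at opposite ends of a list and move toward each other until they meet, which points to a two-pointer search (here, finding a pair with a given sum, assuming the list is sorted in ascending order).

```python
def f(a, t):
    # Two indices start at opposite ends of the (assumed sorted) list.
    i, j = 0, len(a) - 1
    while i < j:  # the loop ends once the pointers meet
        s = a[i] + a[j]
        if s == t:
            return i, j
        if s < t:
            i += 1  # sum too small: advance the left pointer
        else:
            j -= 1  # sum too large: retreat the right pointer
    return None  # pointers met without finding a match
```

Even without names, the single loop, the pair of converging indices, and the absence of any auxiliary structure narrow the candidates considerably before the hint columns are consulted.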