notes

♫ Note

HNSW (graph ANN) & the `hnsw_rs` crate

What it is

Multi-layer proximity graph; search greedily descends layers, then does a best-first walk on layer 0 with a candidate set of size ef. Recall ↔ latency is tuned by ef_search; graph quality/memory by M (neighbors per node) and ef_construction. Sub-linear in N, but each visit is a full-precision distance and traversal is random-access (cache-hostile), and results are approximate (recall plateaus below 1.0).

Why tracked / how it relates to this engine

HNSW is the standard ANN index and the usual above-us layer (the coarse router over all N), which is out of scope per [[history 042]]. We benchmarked it as a within-cell competitor to the binary funnel in history 043: at high dimension (1024-D embeddings) the funnel wins on recall/latency/QPS for all realistic cell sizes and costs ~100–160× less to build; HNSW-in-cell only pays at low-D (≤128) large cells — i.e. when it should just be the index. So HNSW lives above the cell, not inside it. Evolving OSS (active crate; hnswlib is the moving baseline other ANN libs benchmark against).

Related