HNSW (Hierarchical Navigable Small World) is the default or a first-class vector index in Qdrant, Weaviate, pgvector, and FAISS. Three parameters determine its behaviour: M (graph connectivity), ef_construction (build-time accuracy), and ef_search (query-time accuracy); the exact parameter names vary by engine. Defaults work for many cases; tuning matters for specific workloads.
Typical defaults (M=16, ef_construction=128, ef_search=64; exact values vary by engine) work well for most general RAG. For high-recall production, raise M to 32-64 and ef_construction to 256-512. Tune ef_search at query time to trade cost against quality. Raising any of the three improves recall; the cost is memory and build time for M, build time for ef_construction, and query latency for ef_search. Most teams bump ef_search to 100-200 for production and keep the defaults elsewhere.
Parameters
- M (typically 8-64, default 16): max connections per node in the HNSW graph. Higher = better recall, more memory, slower build.
- ef_construction (typically 64-512, default 128): build-time accuracy. Higher = better graph quality, slower build.
- ef_search (typically 32-512, default 64): query-time accuracy. Higher = better recall, slower query.
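
To make the memory cost of M concrete, here is a rough back-of-the-envelope sketch. It assumes float32 vectors, 4-byte neighbour ids, and up to 2*M links per node on the bottom layer (the convention in the HNSW paper and in hnswlib); it ignores the sparse upper layers and any engine-specific overhead, so treat the result as a lower bound rather than what a particular database will report. The function name `hnsw_memory_bytes` is made up for this illustration.

```python
def hnsw_memory_bytes(
    num_vectors: int,
    dim: int,
    m: int,
    bytes_per_float: int = 4,  # assumption: float32 vectors
    bytes_per_link: int = 4,   # assumption: 32-bit neighbour ids
) -> int:
    """Rough lower-bound estimate of HNSW index size in bytes."""
    # Raw vector storage per node.
    vector_bytes = dim * bytes_per_float
    # Assumption: the bottom layer allows up to 2*M links per node.
    # Upper layers hold only a small fraction of nodes and are ignored here.
    link_bytes = 2 * m * bytes_per_link
    return num_vectors * (vector_bytes + link_bytes)


if __name__ == "__main__":
    for m in (16, 32, 64):
        size_gb = hnsw_memory_bytes(1_000_000, 768, m) / 1e9
        print(f"M={m}: ~{size_gb:.2f} GB")
```

Under these assumptions, 1 million 768-dimensional vectors come to about 3.2 GB at M=16 and about 3.58 GB at M=64. For high-dimensional embeddings the vectors dominate; the graph links matter more when the vectors are small or quantised.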
Trade-offs
| Setting | Build time | Memory | Query latency | Recall@10 |
|---|---|---|---|---|
| Defaults (M=16, ef_search=64) | Fast | Low | Fast | ~95% |
| Tuned for production (M=32, ef_search=200) | Medium | Medium | Medium | ~99% |
| Maximum quality (M=64, ef_search=500) | Slow | Highest | Slowest | ~99.9% |
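
The Recall@10 column compares the ids returned by the approximate index against the true nearest neighbours from an exact (brute-force) search. A minimal sketch of that metric follows; the function name `recall_at_k` and the list-of-ids inputs are assumptions made for illustration, not any engine's API.

```python
from typing import Hashable, Sequence


def recall_at_k(
    approx_ids: Sequence[Hashable],
    exact_ids: Sequence[Hashable],
    k: int = 10,
) -> float:
    """Fraction of the true top-k neighbours that the approximate search returned."""
    truth = set(exact_ids[:k])
    if not truth:
        return 0.0
    # Order within the top k does not matter for recall, only membership.
    hits = len(truth.intersection(approx_ids[:k]))
    return hits / len(truth)
```

Averaging this value over a representative query set gives the single recall figure to tune against.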
Recipe
For most production RAG:
- Start with defaults; measure recall on eval set
- If recall < 95% on representative queries: bump ef_search to 128-200 first (no re-index needed)
- If still low: increase M to 32 and ef_construction to 256 (requires rebuild)
- Above this rarely worth it for general RAG; specialised IR workloads may push higher
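
The first two steps of the recipe can be sketched as a small sweep that returns the lowest ef_search meeting the recall target, or `None` when a rebuild is the next thing to try. The `search` callable is a hypothetical wrapper around whatever vector store is in use, taking `(query, ef_search, k)` and returning ids; the function name, candidate values, and signature are assumptions for illustration.

```python
from typing import Callable, Hashable, Optional, Sequence


def smallest_ef_search(
    search: Callable[[object, int, int], Sequence[Hashable]],  # hypothetical wrapper
    queries: Sequence[object],
    ground_truth: Sequence[Sequence[Hashable]],  # exact top-k ids per query
    candidates: Sequence[int] = (64, 128, 200),
    target_recall: float = 0.95,
    k: int = 10,
) -> Optional[int]:
    """Return the lowest candidate ef_search whose mean recall@k meets the target."""
    if not queries or len(queries) != len(ground_truth):
        raise ValueError("queries and ground_truth must be non-empty and equal length")
    for ef_search in sorted(candidates):
        total = 0.0
        for query, exact_ids in zip(queries, ground_truth):
            # Assumes every ground-truth list is non-empty.
            truth = set(exact_ids[:k])
            found = set(search(query, ef_search, k)[:k])
            total += len(truth & found) / len(truth)
        if total / len(queries) >= target_recall:
            return ef_search
    # No candidate reached the target: consider rebuilding with higher
    # M / ef_construction, as the recipe suggests.
    return None
```

Trying candidates in ascending order keeps query latency as low as the recall target allows, which matches the "tune ef_search first" advice.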
Verdict
HNSW defaults work for most RAG. Tune ef_search first (cheap, no rebuild). Rebuild with higher M / ef_construction only when measured recall regression justifies it. Most production deployments: defaults + slightly raised ef_search. Don't over-tune; the gains diminish above 99% recall.
Bottom line
Tune ef_search first; rebuild rarely. See vector store comparison.