
Vector Search Tuning: HNSW Parameters

HNSW is the default index in many vector stores. Three parameters matter in production: M, ef_construction, and ef_search. Here are the trade-offs.

HNSW (Hierarchical Navigable Small World) is the default vector index in Qdrant and Weaviate, and is available as an index type in pgvector and FAISS. Three parameters determine its behaviour: M (graph connectivity), ef_construction (build-time accuracy), and ef_search (query-time accuracy). The defaults work for many cases; tuning matters when a workload has specific recall, latency, or memory targets.

TL;DR

Typical defaults (M=16, ef_construction=128, ef_search=64; exact values vary by engine) work well for most general RAG. For high-recall production, raise M to 32-64 and ef_construction to 256-512. Tune ef_search at query time to trade cost against quality: higher values give better recall at the price of higher latency, and a higher M also costs more memory. Most teams should raise ef_search to 100-200 for production and leave the other defaults alone.

Parameters

  • M (typically 8-64, default 16): maximum connections per node in the HNSW graph. Higher = better recall, more memory, slower build.
  • ef_construction (typically 64-512, default 128): build-time accuracy, i.e. the size of the candidate list considered when each vector is inserted. Higher = better graph quality, slower build.
  • ef_search (typically 32-512, default 64): query-time accuracy, i.e. the size of the candidate list explored per query. Higher = better recall, slower query.
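As a rough illustration of where each parameter is set, the sketch below generates pgvector SQL: M and ef_construction are fixed when the index is created, while ef_search is a session setting that can change per query. The table name `items`, the column name `embedding`, the cosine operator class, and the helper function names are assumptions made for this example; other engines expose the same three knobs under different names.

```python
def create_hnsw_index_sql(table: str, column: str, m: int, ef_construction: int) -> str:
    """Build-time parameters: changing them later means rebuilding the index."""
    m, ef_construction = int(m), int(ef_construction)
    if m <= 0 or ef_construction <= 0:
        raise ValueError("m and ef_construction must be positive")
    # Identifiers are assumed to be trusted constants, not user input.
    return (
        f"CREATE INDEX ON {table} USING hnsw ({column} vector_cosine_ops) "
        f"WITH (m = {m}, ef_construction = {ef_construction});"
    )


def set_ef_search_sql(ef_search: int) -> str:
    """Query-time parameter: can be changed per session without a rebuild."""
    ef_search = int(ef_search)
    if ef_search <= 0:
        raise ValueError("ef_search must be positive")
    return f"SET hnsw.ef_search = {ef_search};"


if __name__ == "__main__":
    # Values taken from the defaults quoted in this article.
    print(create_hnsw_index_sql("items", "embedding", m=16, ef_construction=128))
    print(set_ef_search_sql(64))
```

The split into two functions mirrors the practical difference: the first statement is expensive to re-run on a large table, the second is free.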

Trade-offs

Setting | Build time | Memory | Query latency | Recall@10
Defaults (M=16, ef_search=64) | Fast | Low | Fast | ~95%
Tuned for production (M=32, ef_search=200) | Medium | Medium | Medium | ~99%
Maximum quality (M=64, ef_search=500) | Slow | Highest | Slowest | ~99.9%
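To put rough numbers on the memory column, here is a back-of-envelope estimate of index size as a function of M. It is a sketch under stated assumptions, not a measurement: it assumes float32 vectors, 4-byte neighbour ids, and up to 2*M links per node on the bottom layer (a common implementation choice), and it ignores the upper layers and any per-engine overhead. The function name and the one-million-vector, 768-dimension example are hypothetical.

```python
def estimate_hnsw_bytes(num_vectors: int, dim: int, m: int,
                        bytes_per_float: int = 4, bytes_per_link: int = 4) -> int:
    """Rough lower-bound estimate of HNSW memory: raw vectors plus layer-0 links."""
    vector_bytes = dim * bytes_per_float
    # Assumption: layer 0 holds up to 2*M neighbour ids per node;
    # upper layers are comparatively small and ignored here.
    link_bytes = 2 * m * bytes_per_link
    return num_vectors * (vector_bytes + link_bytes)


if __name__ == "__main__":
    for m in (16, 32, 64):
        total = estimate_hnsw_bytes(num_vectors=1_000_000, dim=768, m=m)
        print(f"M={m}: ~{total / 2**30:.2f} GiB")
```

With high-dimensional embeddings the raw vectors dominate, so under these assumptions raising M grows the graph overhead linearly while the total grows more slowly.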

Recipe

For most production RAG:

  • Start with the defaults and measure recall on an eval set.
  • If recall is below 95% on representative queries, raise ef_search to 128-200 first (no re-index needed).
  • If recall is still low, increase M to 32 and ef_construction to 256 (requires a rebuild).
  • Going beyond this is rarely worth it for general RAG; specialised IR workloads may push higher.
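The first two steps of the recipe can be sketched as a small sweep: compute recall@k against exact (brute-force) neighbours, then pick the smallest ef_search that meets the target. The `search` callable is a placeholder for whatever client you use; its `(query, ef_search, k)` signature, the function names, and the candidate values are assumptions for illustration.

```python
from typing import Callable, Optional, Sequence, Tuple


def recall_at_k(retrieved: Sequence[int], exact: Sequence[int], k: int = 10) -> float:
    """Fraction of the true top-k neighbours that appear in the retrieved top-k."""
    truth = set(exact[:k])
    if not truth:
        return 0.0
    return len(truth.intersection(retrieved[:k])) / len(truth)


def smallest_ef_meeting_target(
    search: Callable[[object, int, int], Sequence[int]],  # hypothetical: (query, ef_search, k) -> ids
    queries: Sequence[object],
    ground_truth: Sequence[Sequence[int]],  # exact neighbour ids per query, e.g. from brute force
    ef_candidates: Sequence[int],
    target: float = 0.95,
    k: int = 10,
) -> Optional[Tuple[int, float]]:
    """Return (ef_search, mean recall) for the lowest candidate meeting the target, else None."""
    if not queries:
        raise ValueError("need at least one evaluation query")
    for ef in sorted(ef_candidates):
        recalls = [recall_at_k(search(q, ef, k), gt, k)
                   for q, gt in zip(queries, ground_truth)]
        mean_recall = sum(recalls) / len(recalls)
        if mean_recall >= target:
            return ef, mean_recall
    return None  # no candidate reached the target: consider rebuilding with higher M
```

A `None` result corresponds to the third step of the recipe: ef_search alone was not enough, so a rebuild with higher M and ef_construction is the next thing to try.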

Verdict

HNSW defaults work for most RAG. Tune ef_search first: it is cheap and needs no rebuild. Rebuild with higher M and ef_construction only when a measured recall shortfall justifies it. Most production deployments run the defaults with a slightly raised ef_search. Don't over-tune; the gains diminish above 99% recall.

Bottom line

Tune ef_search first; rebuild rarely. See vector store comparison.


gigagpu

We benchmark, deploy, and optimise GPU infrastructure for AI workloads. All data in our guides comes from real-world testing on our UK-based dedicated GPU servers.
