Embedding model choice affects RAG quality more than chunking does. Here's the comparison.
Default: BGE-large-en-v1.5 for English; BGE-m3 for multilingual. Nomic-embed-v1.5 when cost and latency matter. Jina-embeddings-v3 for long-context inputs (8K tokens). ColBERT for late-interaction precision.
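The late-interaction idea behind ColBERT differs from the dense models above: instead of one vector per passage, it keeps one vector per token and scores with MaxSim. A minimal numpy sketch of the MaxSim scoring rule (toy random vectors stand in for real token embeddings; this is the scoring step only, not ColBERT's training or indexing):

```python
import numpy as np

def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Late-interaction (ColBERT-style) MaxSim: for each query token
    embedding, take the max similarity over all document token
    embeddings, then sum over query tokens."""
    # Normalize rows so dot products are cosine similarities.
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sim = q @ d.T                       # shape: (num_q_tokens, num_d_tokens)
    return float(sim.max(axis=1).sum())

# Toy per-token "embeddings": 4 query tokens, two candidate documents.
rng = np.random.default_rng(0)
query = rng.normal(size=(4, 8))
doc_a = rng.normal(size=(12, 8))                      # unrelated tokens
doc_b = np.vstack([query + 0.01 * rng.normal(size=(4, 8)),
                   rng.normal(size=(8, 8))])          # contains near-matches

# doc_b should win: every query token finds a close match in it.
print(maxsim_score(query, doc_a), maxsim_score(query, doc_b))
```

The per-token index is larger than a single-vector index, which is the usual trade for the extra precision.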
## Models
| Model | Size | Languages | Best for |
|---|---|---|---|
| BGE-large-en-v1.5 | 335M | English | Default English RAG |
| BGE-m3 | 568M | Multilingual | Multilingual RAG |
| Nomic-embed-v1.5 | 137M | English | Cost-anchored, fast |
| Jina-embeddings-v3 | 570M | Multilingual | Long-context (8K input) |
| GTE-large | 330M | English | Strong on technical content |
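All of the dense models above slot into the same retrieval loop: embed the corpus once, embed the query, rank by cosine similarity. A minimal sketch; the `embed` stub here (hashed bag-of-words) is a stand-in so the example runs offline, not a real model — in practice you would swap in something like `SentenceTransformer("BAAI/bge-large-en-v1.5").encode(texts)` from sentence-transformers:

```python
import hashlib
import numpy as np

def embed(texts, dim=64):
    """Stand-in embedder: hashed bag-of-words, L2-normalized.
    Replace with a real embedding model in production."""
    vecs = np.zeros((len(texts), dim))
    for i, text in enumerate(texts):
        for tok in text.lower().split():
            tok = tok.strip(".,?!")
            if not tok:
                continue
            h = int(hashlib.md5(tok.encode()).hexdigest(), 16)
            vecs[i, h % dim] += 1.0
    norms = np.maximum(np.linalg.norm(vecs, axis=1, keepdims=True), 1e-9)
    return vecs / norms

def top_k(query, corpus, k=2):
    """Rank corpus passages by cosine similarity to the query."""
    q = embed([query])[0]
    scores = embed(corpus) @ q          # cosine: both sides normalized
    order = np.argsort(-scores)[:k]
    return [(corpus[i], float(scores[i])) for i in order]

corpus = [
    "BGE-large-en-v1.5 is a strong default English embedding model.",
    "Chunking splits documents into passages before embedding.",
    "GPUs accelerate batch embedding of large corpora.",
]
best, score = top_k("which embedding model is a good English default?", corpus, k=1)[0]
print(best)
```

The top hit should be the BGE sentence, since it shares the most query terms. Note that the BGE model cards document a query-side instruction prefix for retrieval; check the card before swapping in the real encoder.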
## Benchmarks
MTEB English retrieval scores (higher is better):
- BGE-large-en-v1.5: ~54.3
- Nomic-embed-v1.5: ~52.8
- GTE-large: ~53.5
- Jina-embeddings-v3: ~53.6
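A one-point MTEB gap rarely predicts which model wins on your own corpus; the cheap sanity check is recall@k on a handful of labeled query-document pairs. A sketch with a hypothetical `recall_at_k` helper and made-up doc ids:

```python
def recall_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of queries whose relevant doc id appears in the top-k."""
    hits = sum(1 for ranked, rel in zip(ranked_ids, relevant_ids)
               if rel in ranked[:k])
    return hits / len(relevant_ids)

# Per-query ranked doc ids from two hypothetical embedding models.
relevant = ["d1", "d7", "d3"]
model_a = [["d1", "d2"], ["d9", "d7"], ["d4", "d5"]]   # misses d3
model_b = [["d1", "d2"], ["d7", "d0"], ["d3", "d8"]]   # finds all three

print(recall_at_k(model_a, relevant, k=2))  # 2/3
print(recall_at_k(model_b, relevant, k=2))  # 1.0
```

Fifty labeled pairs from your own data is usually more decisive than a benchmark delta this small.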
## Verdict
BGE-large-en-v1.5 is the safe default. Nomic-embed-v1.5 when cost and latency matter. Jina-embeddings-v3 for long-context. GTE-large if your corpus is heavy on technical content.
## Bottom line
Embedding choice matters for RAG quality, often more than chunking. See best GPU for embeddings.