MixedBread AI’s mxbai-embed-large-v1 is a 335M-parameter English embedder that competes with the top of the MTEB leaderboard. On our dedicated GPU hosting it fits on the smallest card and is a solid choice when you specifically want strong English retrieval.
VRAM
~670 MB for weights at FP16. Batch activations bring the total to roughly 1-3 GB. Runs on any GPU.
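As a back-of-the-envelope check (illustrative arithmetic, not a measurement), the weight footprint follows from parameter count times bytes per parameter:

params = 335_000_000              # mxbai-embed-large-v1 parameter count
bytes_per_param = 2               # FP16
weight_mb = params * bytes_per_param / 1e6
print(f"{weight_mb:.0f} MB")      # ~670 MB; activations and batching account for the rest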
Deployment
docker run --gpus all -p 8080:80 \
  ghcr.io/huggingface/text-embeddings-inference:1.5 \
  --model-id mixedbread-ai/mxbai-embed-large-v1
Client usage requires a specific instruction prefix for queries:
query = "Represent this sentence for searching relevant passages: " + user_query
Documents do not need a prefix.
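A minimal client sketch against the container started above, using TEI's /embed endpoint. The helper name and example texts are illustrative; localhost:8080 matches the port mapping in the docker command, so adjust it to your deployment.

import requests

TEI_URL = "http://localhost:8080/embed"  # port mapped in the docker run above
PREFIX = "Represent this sentence for searching relevant passages: "

def embed(texts):
    # TEI's /embed accepts a string or a list of strings under "inputs"
    resp = requests.post(TEI_URL, json={"inputs": texts})
    resp.raise_for_status()
    return resp.json()  # list of 1024-dim vectors

query_vec = embed([PREFIX + "how do I rotate API keys?"])[0]          # query: prefixed
doc_vecs = embed(["Rotate keys from the dashboard settings page."])   # documents: no prefix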
Matryoshka
mxbai-embed-large supports Matryoshka truncation of its native 1024-dimensional output to 512, 256, or 128 dimensions with minor quality degradation. Quality loss at 512 is negligible (<1%); at 128 expect a 3-5% drop.
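A minimal sketch of Matryoshka truncation, continuing the client example above: keep the first N components and re-normalize before cosine comparisons (re-normalization is assumed here because truncation breaks unit length; the truncate helper is illustrative).

import numpy as np

def truncate(vec, dims=512):
    v = np.asarray(vec, dtype=np.float32)[:dims]  # keep the first `dims` components
    return v / np.linalg.norm(v)                  # re-normalize so cosine = dot product

q512 = truncate(query_vec, 512)
d512 = truncate(doc_vecs[0], 512)
score = float(q512 @ d512)  # cosine similarity on 512-dim vectors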
When to Pick
Pick mxbai when:
- Your corpus is English-only and you want a top-ranking MTEB English performer
- You need a simple single-output embedder (vs BGE-M3’s multi-output complexity)
- You want a small, efficient model with a commercial-friendly licence (Apache 2.0)
Skip mxbai when you need multilingual (pick BGE-M3), when you need the open-training-data story (pick Nomic), or when you are already invested in the BAAI stack.