Pixtral 12B from Mistral is a vision-language model built on top of Mistral Nemo 12B, supporting variable image resolutions rather than fixed tiling. On our dedicated GPU hosting, FP16 weights alone occupy roughly 24 GB, so a 24 GB card is a tight fit; a 32 GB card leaves room for reasonable concurrency, and quantised builds fit smaller cards.
VRAM
| Precision | Weights | Fits On |
|---|---|---|
| FP16 | ~24 GB | 24 GB card tight, 32 GB comfortable |
| FP8 | ~12 GB | 16 GB+ card |
| AWQ INT4 | ~7 GB | Any 12 GB+ card |
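The weight figures above follow directly from parameter count times bytes per parameter. A minimal sketch of that arithmetic (weight-only; it ignores activations, KV cache, and runtime overhead, which add several GB on top):

```python
# Rough weight-only VRAM estimate: parameters x bits per parameter.
# Ignores activations, KV cache, and framework overhead.
def weight_vram_gb(params_billion: float, bits_per_param: float) -> float:
    return params_billion * bits_per_param / 8  # 1B params at 8 bits ~= 1 GB

# AWQ INT4 stores ~4.5 effective bits per weight once scales/zeros are counted.
for name, bits in [("FP16", 16), ("FP8", 8), ("AWQ INT4", 4.5)]:
    print(f"{name}: ~{weight_vram_gb(12, bits):.0f} GB")
```

For 12B parameters this reproduces the table: ~24 GB at FP16, ~12 GB at FP8, ~7 GB at INT4.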
Deployment
python -m vllm.entrypoints.openai.api_server \
--model mistralai/Pixtral-12B-2409 \
--max-model-len 32768 \
--limit-mm-per-prompt 'image=4' \
--tokenizer-mode mistral \
--config-format mistral
--limit-mm-per-prompt 'image=4' allows up to four images per request – Pixtral handles multi-image reasoning well.
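Multi-image requests use the standard OpenAI chat format, with each image as a separate content part alongside the text. A sketch of the payload shape (the URLs are placeholders, and the model value must match whatever --model the server was launched with; actually sending it requires the server above to be running):

```python
# Sketch of a multi-image chat payload for the OpenAI-compatible endpoint.
# The --limit-mm-per-prompt flag above caps image_urls at four entries.
def build_request(model: str, question: str, image_urls: list[str]) -> dict:
    content = [{"type": "text", "text": question}]
    content += [{"type": "image_url", "image_url": {"url": u}} for u in image_urls]
    return {
        "model": model,  # must match the server's --model value
        "messages": [{"role": "user", "content": content}],
    }

req = build_request(
    "mistralai/Pixtral-12B-2409",  # placeholder; use your served model id
    "What differs between these photos?",
    ["https://example.com/a.jpg", "https://example.com/b.jpg"],  # placeholders
)
print(len(req["messages"][0]["content"]))  # 1 text part + 2 image parts
```

POST this body to /v1/chat/completions with any OpenAI-compatible client.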
Variable Resolution
Unlike many VLMs that downsample images to a fixed tile grid, Pixtral handles the input at its native resolution (up to the model’s context budget). Small images stay small; large images get more visual tokens. Practical impact:
- Better detail recognition on high-resolution photos
- Lower cost on simple images (no forced 336×336 tiling)
- Context usage varies with image size – budget KV cache accordingly
A 1024×1024 image consumes roughly 4× the visual tokens of a 512×512 image, since token count scales with pixel count. For high-volume deployments, normalise input resolution before submission.
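That scaling can be sketched directly. Assuming one visual token per 16×16-pixel patch (Pixtral's patch size; real counts also add a few row-break and end tokens), a rough estimator plus a resolution-normalisation step:

```python
# Rough visual-token estimate: one token per 16x16 pixel patch (assumption).
import math

PATCH = 16

def visual_tokens(width: int, height: int) -> int:
    return math.ceil(width / PATCH) * math.ceil(height / PATCH)

def clamp_resolution(width: int, height: int, max_side: int = 1024) -> tuple[int, int]:
    # Downscale proportionally so the longer side is at most max_side.
    scale = min(1.0, max_side / max(width, height))
    return round(width * scale), round(height * scale)

print(visual_tokens(512, 512))     # 1024 tokens
print(visual_tokens(1024, 1024))   # 4096 tokens -> ~4x the 512x512 cost
print(clamp_resolution(4000, 3000))  # (1024, 768)
```

Clamping inputs this way keeps per-request KV-cache usage predictable at high volume.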
Vision-Language Model Hosting
Pixtral 12B preconfigured on UK dedicated GPUs with appropriate VRAM sizing.
Browse GPU Servers
Compare against Llama 3.2 Vision 11B and Qwen2-VL.