Why Training Time Varies
Fine-tuning time on a dedicated GPU server depends on four main factors: GPU speed (TFLOPS and memory bandwidth), model size, dataset size, and fine-tuning method. A run that takes 45 minutes on an RTX 4060 Ti might complete in 7 minutes on an RTX 6000 Pro. This benchmark provides real training times across nine GPU configurations so you can estimate costs and plan your experiments.
All benchmarks use QLoRA (rank 16) unless noted, with sequence length 512, effective batch size 32 via gradient accumulation, and 3 epochs. Measured on GigaGPU servers with PyTorch and the Hugging Face PEFT library. For VRAM requirements, see our fine-tuning VRAM calculator.
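Because the effective batch size is fixed at 32 via gradient accumulation, the number of optimizer updates (and therefore wall-clock time) grows linearly with dataset size. A minimal sketch of the step count implied by the hyperparameters above (the function name and its defaults are illustrative, not from any measured run):

```python
import math

def optimizer_steps(num_examples: int, effective_batch: int = 32, epochs: int = 3) -> int:
    """Total optimizer updates for a fine-tuning run.

    Wall-clock time scales roughly linearly with this number, since each
    update processes effective_batch examples (reached via gradient
    accumulation when the per-device batch is smaller).
    """
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return steps_per_epoch * epochs

# 10K examples at effective batch 32 for 3 epochs:
print(optimizer_steps(10_000))  # 313 steps/epoch * 3 = 939 steps
```

Doubling the dataset doubles the step count, which is why the table columns below scale almost proportionally from 1K to 50K examples.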
7-8B Model Training Times
Using LLaMA 3 8B as the representative 7-8B model with QLoRA (r=16, INT4 base).
| GPU | VRAM | 1K Examples | 5K Examples | 10K Examples | 50K Examples |
|---|---|---|---|---|---|
| RTX 4060 (8 GB) | 8 GB | ~55 min | ~4.5 hrs | ~9 hrs | ~46 hrs |
| RTX 4060 Ti (16 GB) | 16 GB | ~42 min | ~3.5 hrs | ~7 hrs | ~35 hrs |
| RTX 3090 (24 GB) | 24 GB | ~24 min | ~2 hrs | ~4 hrs | ~20 hrs |
| RTX 5080 (16 GB) | 16 GB | ~20 min | ~1.7 hrs | ~3.3 hrs | ~17 hrs |
| RTX 5090 (32 GB) | 32 GB | ~12 min | ~1 hr | ~2 hrs | ~10 hrs |
| RTX 6000 Pro (80 GB) | 80 GB | ~7 min | ~35 min | ~1.2 hrs | ~6 hrs |
The RTX 5090 is 3.5x faster than the RTX 4060 Ti — a significant gap that can save hours on large datasets. For model-specific details see our LLaMA 3 8B fine-tuning guide and Mistral 7B hardware guide.
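Because throughput per example is roughly constant for a given GPU and model, you can extrapolate from the 1K column to dataset sizes not listed in the table. A hedged sketch, assuming per-example cost stays constant (it drifts slightly in practice due to warm-up and checkpointing overhead):

```python
def estimate_hours(examples: int, hours_per_1k: float) -> float:
    """Extrapolate training time, assuming constant cost per example.

    hours_per_1k: measured wall-clock time for a 1K-example run
    on the target GPU (e.g. ~0.2 hr for LLaMA 3 8B on an RTX 5090).
    """
    return examples / 1_000 * hours_per_1k

# RTX 5090: ~12 min (0.2 hr) per 1K examples
print(round(estimate_hours(50_000, 0.2), 1))  # ~10 hours, matching the table
```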
13-14B Model Training Times
Using Qwen 2.5 14B with QLoRA (r=16). Requires 24+ GB VRAM.
| GPU | VRAM | 1K Examples | 5K Examples | 10K Examples |
|---|---|---|---|---|
| RTX 3090 (24 GB) | 24 GB | ~48 min | ~4 hrs | ~8 hrs |
| RTX 5090 (32 GB) | 32 GB | ~22 min | ~1.8 hrs | ~3.7 hrs |
| RTX 6000 Pro (80 GB) | 80 GB | ~14 min | ~1.2 hrs | ~2.3 hrs |
Training time roughly doubles from 7B to 14B on the same hardware, since compute per token scales with parameter count. The RTX 5090 remains the best consumer option, finishing 10K examples in under 4 hours.
70B Model Training Times
Using LLaMA 3 70B with QLoRA (r=16). Requires at least ~64 GB of VRAM, so expect multi-GPU setups or a single 80 GB-class card.
| GPU Config | Total VRAM | 1K Examples | 5K Examples | 10K Examples |
|---|---|---|---|---|
| 2x RTX 5090 (64 GB) | 64 GB | ~1.5 hrs | ~7.5 hrs | ~15 hrs |
| 4x RTX 3090 (96 GB) | 96 GB | ~1.2 hrs | ~6 hrs | ~12 hrs |
| RTX 6000 Pro (80 GB) | 80 GB | ~50 min | ~4.2 hrs | ~8.3 hrs |
| 2x RTX 6000 Pro (160 GB) | 160 GB | ~30 min | ~2.5 hrs | ~5 hrs |
70B model fine-tuning is measured in hours even on premium hardware. Plan for overnight runs on consumer GPUs. For deployment after fine-tuning, quantise with GPTQ or AWQ and serve via vLLM.
Cost Per Experiment
Costs below are per training run on a 10K-example dataset, based on approximate GigaGPU hourly rates.
| GPU | Hourly Rate | 7B / 10K | 14B / 10K | 70B / 10K |
|---|---|---|---|---|
| RTX 4060 Ti | ~£0.10/hr | ~£0.70 | N/A | N/A |
| RTX 3090 | ~£0.15/hr | ~£0.60 | ~£1.20 | N/A |
| RTX 5090 | ~£0.35/hr | ~£0.70 | ~£1.30 | N/A |
| 2x RTX 5090 | ~£0.70/hr | – | – | ~£10.50 |
| RTX 6000 Pro (80 GB) | ~£1.20/hr | ~£1.44 | ~£2.76 | ~£9.96 |
Fine-tuning a 7B model costs well under £1 on consumer GPUs. Even 70B models cost roughly £10 per experiment — far cheaper than API-based fine-tuning services. For GPU selection guidance, see our best GPU for fine-tuning LLMs guide. For method comparisons, read LoRA vs QLoRA vs full fine-tuning. Browse all results in the Benchmarks category.
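The figures in the cost table are simply hourly rate multiplied by run time. A small helper for estimating your own experiments (rates and times are the approximate values from the tables above, not live pricing):

```python
def run_cost(hourly_rate_gbp: float, hours: float) -> float:
    """Cost of one training run at a given hourly rental rate, in GBP."""
    return round(hourly_rate_gbp * hours, 2)

# RTX 3090 at ~£0.15/hr for a ~4-hour 7B/10K run:
print(run_cost(0.15, 4.0))  # 0.6

# 2x RTX 5090 at ~£0.70/hr for a ~15-hour 70B/10K run:
print(run_cost(0.70, 15.0))  # 10.5
```

Remember to budget for failed or repeated runs; two or three attempts per experiment is common while tuning hyperparameters.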
Conclusion
GPU choice has a dramatic impact on fine-tuning speed: an RTX 5090 is 3-4x faster than budget cards, and an RTX 6000 Pro is 6-8x faster. For most users, the RTX 3090 offers the best cost efficiency — fast enough to avoid wasting time, affordable enough to keep per-experiment costs under £1 for 7B models. Scale to multi-GPU for 70B models, and always use QLoRA to maximise your VRAM budget. For pricing details, check our cost analysis tools.
Fine-Tune at the Speed You Need
From budget RTX 4060 to RTX 6000 Pro clusters. Dedicated GPU servers with PyTorch, CUDA, and PEFT pre-installed.
Browse GPU Servers