165 frames per second. That is fast enough to process five 30-FPS camera streams simultaneously from a single GPU. We benchmarked Ultralytics YOLOv8m (25.9M parameters) on the NVIDIA RTX 5090 (32 GB VRAM) using our standard test harness on a GigaGPU dedicated server, and the results confirm this is the fastest consumer-class detection card we have tested.
Detection Speed at a Glance
| Metric | Value |
|---|---|
| FPS (640×640) | 165 FPS |
| Latency per frame | 6.1 ms |
| Precision | FP16 |
| Input resolution | 640×640 (COCO) |
| Performance rating | Excellent |
Benchmark conditions: FP16 inference, batch size 1, YOLOv8m model on COCO-format input at 640×640.
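The latency and stream-count figures follow directly from the measured throughput. Here is a minimal sketch of that arithmetic, assuming throughput divides evenly across streams and ignoring video decode and pre/post-processing overhead. The function names are illustrative and not part of our test harness.

```python
import math


def frame_latency_ms(fps: float) -> float:
    """Average time per frame in milliseconds at a given throughput."""
    return 1000.0 / fps


def max_realtime_streams(fps: float, stream_fps: float = 30.0) -> int:
    """Whole camera streams that fit inside the measured throughput."""
    return math.floor(fps / stream_fps)


latency = frame_latency_ms(165)
streams = max_realtime_streams(165)
print(f"{latency:.1f} ms per frame, {streams} streams at {30} FPS")
```

At 165 FPS this gives roughly 6.1 ms per frame and five full 30-FPS streams, with half a stream of slack left over.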
VRAM — Barely a Dent
| Component | VRAM |
|---|---|
| Model weights (FP16) | 1.8 GB |
| Processing buffer | ~0.5 GB |
| Total RTX 5090 VRAM | 32 GB |
| Free headroom | ~29.7 GB |
With nearly 30 GB free after loading YOLOv8, the 5090 is practically begging you to add more models. You could run a full 8B-parameter LLM in FP16 (about 16 GB of weights) alongside YOLO and still have room for a speech model. For teams building compound AI systems (detection plus reasoning plus alerting), this card is the foundation.
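The headroom figure is simple subtraction, and the same budget shows whether a second model fits. Below is a rough sketch using the table's numbers. It assumes 2 bytes per parameter for FP16 weights and ignores KV cache, activations, and framework overhead, all of which a real LLM deployment would add on top.

```python
TOTAL_VRAM_GB = 32.0
YOLO_WEIGHTS_GB = 1.8   # from the table above
YOLO_BUFFER_GB = 0.5    # from the table above (approximate)


def fp16_weights_gb(params_billions: float) -> float:
    """Weights only: 2 bytes per parameter, so 1B parameters is about 2 GB."""
    return params_billions * 2.0


def headroom_gb(*loads_gb: float) -> float:
    """VRAM left on the card after the given loads are resident."""
    return TOTAL_VRAM_GB - sum(loads_gb)


yolo_only = headroom_gb(YOLO_WEIGHTS_GB, YOLO_BUFFER_GB)
with_llm = headroom_gb(YOLO_WEIGHTS_GB, YOLO_BUFFER_GB, fp16_weights_gb(8))
print(f"After YOLOv8: {yolo_only:.1f} GB free")
print(f"After YOLOv8 + 8B FP16 LLM weights: {with_llm:.1f} GB free")
```

With these inputs the card keeps about 29.7 GB free after YOLOv8 and about 13.7 GB after adding 8B parameters of FP16 weights, which is the space left for the LLM's runtime memory and any further model.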
Cost Breakdown
| Cost Metric | Value |
|---|---|
| Server cost | £1.50/hr (£299/mo) |
| Cost per 1M frames | £2.53 |
| Frames per £1 | ~396,000 |
Yes, the RTX 5090 costs more per hour than a 5080. But it also delivers 43% more FPS. If your workload is throughput-limited (large-scale video surveillance, say, or batch processing of dashcam footage), the per-frame economics are competitive. See all benchmarks for the full comparison.
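The per-frame numbers can be reproduced from the hourly rate and the measured FPS. The sketch below assumes the GPU is kept fully busy for every billed hour. The break-even helper is our own illustration: it takes the 43% FPS advantage as a 1.43x speedup and asks how cheap a slower card must be per hour to match the 5090 per frame.

```python
SECONDS_PER_HOUR = 3600


def cost_per_million_frames(hourly_rate_gbp: float, fps: float) -> float:
    """Cost in GBP to process one million frames at full utilisation."""
    frames_per_hour = fps * SECONDS_PER_HOUR
    return hourly_rate_gbp / frames_per_hour * 1_000_000


def frames_per_pound(hourly_rate_gbp: float, fps: float) -> float:
    """Frames processed for each pound spent at full utilisation."""
    return fps * SECONDS_PER_HOUR / hourly_rate_gbp


def break_even_hourly_rate(hourly_rate_gbp: float, speedup: float) -> float:
    """Hourly rate at which a card `speedup` times slower costs the same per frame."""
    return hourly_rate_gbp / speedup


print(f"£{cost_per_million_frames(1.50, 165):.2f} per 1M frames")
print(f"{frames_per_pound(1.50, 165):,.0f} frames per £1")
print(f"£{break_even_hourly_rate(1.50, 1.43):.2f}/hr break-even for the slower card")
```

This yields about £2.53 per million frames and 396,000 frames per £1 from the unrounded figures (dividing one million by the rounded £2.53 gives roughly 395,000 instead). On these assumptions, a card 1.43x slower only wins on per-frame cost if it rents for less than about £1.05/hr.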
The Verdict
The RTX 5090 is overkill if you only need YOLOv8. Where it truly shines is when you use that massive VRAM surplus for multi-model pipelines. Run detection, feed bounding-box crops into a vision-language model for classification, and trigger alerts through an LLM, all on one card and all within 32 GB. That is the real value proposition here.
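To make the detect, classify, alert flow concrete, here is a minimal sketch of how the three stages might be wired together. Everything in it is hypothetical: `run_pipeline`, `Alert`, and the stage callables are illustrative names, the frame is a toy nested list, and the lambdas stand in for the real detector, vision-language model, and LLM so the example runs without a GPU.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Sequence, Tuple

Box = Tuple[int, int, int, int]   # (x1, y1, x2, y2) in pixel coordinates
Frame = Sequence[Sequence[int]]   # toy frame: rows of pixel values


@dataclass
class Alert:
    box: Box
    label: str
    message: str


def run_pipeline(
    frame: Frame,
    detect: Callable[[Frame], List[Box]],      # stand-in for the detector
    classify: Callable[[Frame], str],          # stand-in for the vision-language model
    write_alert: Callable[[str, Box], str],    # stand-in for the LLM
    watch_labels: FrozenSet[str],
) -> List[Alert]:
    """Detect boxes, classify each crop, and build alerts for watched labels."""
    alerts = []
    for box in detect(frame):
        x1, y1, x2, y2 = box
        crop = [row[x1:x2] for row in frame[y1:y2]]
        label = classify(crop)
        if label in watch_labels:
            alerts.append(Alert(box, label, write_alert(label, box)))
    return alerts


# Stub stages: the top-left quadrant of the toy frame holds an "object".
frame = [[1 if (x < 4 and y < 4) else 0 for x in range(8)] for y in range(8)]
alerts = run_pipeline(
    frame,
    detect=lambda f: [(0, 0, 4, 4), (4, 4, 8, 8)],
    classify=lambda crop: "person" if crop[0][0] == 1 else "background",
    write_alert=lambda label, box: f"{label} detected at {box}",
    watch_labels=frozenset({"person"}),
)
for alert in alerts:
    print(alert.message)
```

Passing the stages in as callables keeps the control flow independent of any particular model, so each stub can be swapped for a real model call while all three stay resident on the same card.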
Quick deploy:
docker run --gpus all -p 8080:8080 ultralytics/ultralytics:latest yolo detect predict model=yolov8m.pt imgsz=640 half=True
See our YOLOv8 hosting guide, best GPU for object detection, and all benchmark results. Related: LLaMA 3 8B on RTX 5090 benchmark.
Deploy YOLOv8 on RTX 5090
Order this exact configuration. UK datacenter, full root access.