Real performance data, not marketing claims. Our benchmarks test every GPU we offer across LLM inference, image generation, OCR, and TTS workloads on dedicated GPU servers. See our tokens/sec benchmark for the latest results.
Gemma 2 9B benchmarked on RTX 4060 Ti: 23.6 tok/s at 4-bit GGUF Q4_K_M, VRAM usage, cost per 1M tokens, and deployment configuration.
Gemma 2 9B benchmarked on RTX 3090: 52.0 tok/s at FP16, VRAM usage, cost per 1M tokens, and deployment configuration.
Phi-3 Mini benchmarked on RTX 5080: 82 tok/s at FP16, VRAM usage, cost per 1M tokens, and deployment configuration.
Phi-3 Mini benchmarked on RTX 5090: 100 tok/s at FP16, VRAM usage, cost per 1M tokens, and deployment configuration.
Gemma 2 9B benchmarked on RTX 5080: 48.8 tok/s at 4-bit GGUF Q4_K_M, VRAM usage, cost per 1M tokens, and deployment configuration.
Gemma 2 9B benchmarked on RTX 5090: 112.3 tok/s at FP16, VRAM usage, cost per 1M tokens, and deployment configuration.
LLaMA 3 70B benchmarked on RTX 3090: 5.2 tok/s at 4-bit GGUF Q4_K_M, VRAM usage, cost per 1M tokens, and deployment configuration.
LLaMA 3 70B benchmarked on RTX 5090: 12.8 tok/s at 4-bit GGUF Q4_K_M, VRAM usage, cost per 1M tokens, and deployment configuration.
Mixtral 8x7B benchmarked on RTX 3090: 18 tok/s at 4-bit GGUF Q4_K_M, VRAM usage, cost per 1M tokens, and deployment configuration.
Mixtral 8x7B benchmarked on RTX 5080: 32 tok/s at 4-bit GGUF Q4_K_M, VRAM usage, cost per 1M tokens, and deployment configuration.
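The mix of FP16 and 4-bit GGUF Q4_K_M entries above comes down to weight memory: a model's parameters have to fit in VRAM before throughput matters. A rough back-of-envelope sketch (the bytes-per-parameter figures are common approximations, not our measured numbers, and this ignores KV cache and activation memory):

```python
# Rough weight-memory estimate per precision.
# 0.56 bytes/param for Q4_K_M is an approximate average, an assumption here.

def weight_vram_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate weight memory in GB (excludes KV cache and activations)."""
    return params_billions * 1e9 * bytes_per_param / 1e9

# FP16 stores 2 bytes per parameter; GGUF Q4_K_M averages roughly 0.56.
for model, params in [("Gemma 2 9B", 9), ("LLaMA 3 70B", 70)]:
    fp16 = weight_vram_gb(params, 2.0)
    q4 = weight_vram_gb(params, 0.56)
    print(f"{model}: ~{fp16:.0f} GB at FP16, ~{q4:.0f} GB at Q4_K_M")
```

This is why the 70B entries run only at Q4_K_M: roughly 140 GB of FP16 weights cannot fit on a single 24 GB RTX 3090, while around 39 GB of Q4_K_M weights can be split or offloaded far more practically.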
From the blog to your next deployment — pick the right platform for your workload.
Real-world tokens per second data across every GPU we offer, tested on popular LLMs. View Benchmarks
Time-to-first-audio for Coqui, Bark, Kokoro, and XTTS-v2 across GPU tiers. View TTS Benchmarks
Pages per second for PaddleOCR and Tesseract across our GPU server lineup. View OCR Benchmarks
What does it cost to process a million tokens on each GPU? Interactive calculator. Calculate Cost
Bare-metal servers with a dedicated GPU, NVMe, full root access, and 1Gbps networking from our UK datacenter. Browse GPU Servers
Deploy LLaMA, Mistral, DeepSeek, and more on dedicated hardware with no per-token API fees. Explore LLM Hosting
Dedicated GPU servers from our UK datacenter. NVMe storage, 1Gbps networking, full root access.
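The cost-per-1M-tokens figure behind the calculator reduces to one line of arithmetic: divide the server's hourly price by the tokens it can generate in an hour. A minimal sketch, with a hypothetical hourly price (the $0.50/hr and the 52 tok/s pairing are illustrative assumptions, not our pricing):

```python
def cost_per_million_tokens(hourly_price_usd: float, tokens_per_sec: float) -> float:
    """Server cost to generate 1M tokens at a sustained throughput."""
    tokens_per_hour = tokens_per_sec * 3600
    return hourly_price_usd / tokens_per_hour * 1_000_000

# Hypothetical example: a $0.50/hr server sustaining 52 tok/s.
print(f"${cost_per_million_tokens(0.50, 52.0):.2f} per 1M tokens")
```

Because the price is fixed per hour rather than per token, cost per 1M tokens falls linearly as throughput rises, which is why the faster GPUs above can be cheaper per token despite a higher hourly rate.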