NVIDIA’s GeForce flagship and datacenter cards now overlap on raw FP16 TFLOPS in ways that weren’t true a few years ago. The RTX 4090’s ~165 TFLOPS dense FP16 lands in the same neighbourhood as the A100 SXM4 (312 TFLOPS) and well above older datacenter parts. This page maps where it sits.
RTX 4090 = ~165 TFLOPS FP16 dense, ~330 with sparsity. That is roughly 53% of A100 SXM4 at under 5% of the price. For inference workloads the gap shrinks further in practice, since real-world throughput depends on memory bandwidth and software as much as peak TFLOPS. For training, the A100 still wins decisively.
## The TFLOPS class
| GPU | FP16 TFLOPS (dense) | FP16 TFLOPS (sparse) | FP8 TFLOPS (sparse) | Mem BW (GB/s) |
|---|---|---|---|---|
| RTX 3090 | ~36 | ~71 | No native | 936 |
| RTX 4090 | ~165 | ~330 | ~661 | 1,008 |
| RTX 5080 | ~75 | ~150 | ~600 | 960 |
| RTX 5090 | ~210 | ~420 | ~838 | 1,792 |
| RTX 6000 Pro | ~234 | ~468 | ~936 | 1,792 |
| A100 80 GB SXM4 | ~312 | ~624 | No native | 2,039 |
| H100 80 GB SXM5 | ~989 | ~1,979 | ~3,958 | 3,350 |
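To make the relative positioning concrete, the dense FP16 column can be collapsed into a ratio against the A100 SXM4. A quick sketch using the approximate table values (`FP16_DENSE_TFLOPS` and `fraction_of_a100` are just names for this sketch, not any real API):

```python
# Dense FP16 tensor TFLOPS from the table above (approximate vendor figures).
FP16_DENSE_TFLOPS = {
    "RTX 3090": 36, "RTX 4090": 165, "RTX 5080": 75,
    "RTX 5090": 210, "RTX 6000 Pro": 234,
    "A100 SXM4": 312, "H100 SXM5": 989,
}

def fraction_of_a100(gpu: str) -> float:
    """Dense FP16 throughput relative to A100 SXM4."""
    return FP16_DENSE_TFLOPS[gpu] / FP16_DENSE_TFLOPS["A100 SXM4"]

for gpu in FP16_DENSE_TFLOPS:
    print(f"{gpu:14s} {fraction_of_a100(gpu):5.0%}")
```

The 4090 comes out at ~53% of the A100 on paper; the benchmark section below shows why the practical gap is often smaller.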
## Where the 4090 sits
The 4090 sits in the "upper-mid" tier of NVIDIA’s lineup for FP16 inference: not as fast as current Hopper or Blackwell datacenter cards, but well ahead of older datacenter SKUs (V100, T4) and competitive with the A100 PCIe.
## Real-world benchmarks vs theoretical
Theoretical TFLOPS rarely match real-world tokens per second. Memory bandwidth, kernel maturity, and software stack all matter:
| Workload | RTX 4090 (real) | A100 80 GB (real) | 4090 % of A100 |
|---|---|---|---|
| Mistral 7B FP16 aggregate | 950 tok/s | 1,310 tok/s | 73% |
| Llama 3.1 8B FP16 aggregate | 910 tok/s | 1,200 tok/s | 76% |
| SDXL 1024² (s/image; lower is better) | 8 s | 9 s | 112% |
| BF16 fine-tuning (8B model) | ~12 hours/epoch | ~6 hours/epoch | 50% |
| Training (50B token corpus) | 24 hours | 11 hours | 46% |
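One way to see why real-world numbers diverge from peak TFLOPS: single-stream LLM decoding is usually memory-bandwidth-bound, so a crude upper bound on tokens per second is bandwidth divided by the bytes of weights streamed per token. A back-of-envelope sketch (the `decode_tokens_per_s` formula is an illustrative assumption; batching, KV-cache traffic, and kernel efficiency all move real numbers):

```python
def decode_tokens_per_s(bandwidth_gbps: float, params_b: float,
                        bytes_per_weight: int = 2) -> float:
    """Rough upper bound for bandwidth-bound single-stream decoding:
    each generated token streams all model weights from VRAM once."""
    model_bytes = params_b * 1e9 * bytes_per_weight  # FP16 = 2 bytes/weight
    return bandwidth_gbps * 1e9 / model_bytes

# 7B-parameter model in FP16 (~14 GB of weights):
print(round(decode_tokens_per_s(1008, 7)))  # RTX 4090 -> 72
print(round(decode_tokens_per_s(2039, 7)))  # A100 80 GB -> 146
```

By this bound the single-stream ratio tracks the bandwidth ratio (~49%); the aggregate rows above are batched, which shifts work toward compute and narrows the gap to 70–80%.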
For inference, the 4090 delivers 70–80% of A100 throughput at less than 5% of the A100’s effective monthly cost on AWS. For training, the A100 still wins, because the 4090’s lack of NVLink and ECC matters for multi-week jobs.
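The cost argument can be framed as tokens per dollar. The hourly rates below are placeholder assumptions purely for illustration (substitute current pricing before drawing conclusions); only the throughput figures come from the benchmark table above:

```python
# Hypothetical hourly rates -- illustrative placeholders, NOT real quotes.
HOURLY_USD = {"RTX 4090 (rented)": 0.40, "A100 80 GB (cloud)": 4.00}
# Mistral 7B FP16 aggregate throughput, from the benchmark table.
REAL_TOKS_PER_S = {"RTX 4090 (rented)": 950, "A100 80 GB (cloud)": 1310}

for gpu in HOURLY_USD:
    tokens_per_dollar = REAL_TOKS_PER_S[gpu] * 3600 / HOURLY_USD[gpu]
    print(f"{gpu}: {tokens_per_dollar:,.0f} tokens/$")
```

Under these placeholder rates the 4090 generates several times more tokens per dollar, which is the whole argument for it as an inference card.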
## Verdict
The RTX 4090’s TFLOPS class is "A100-light" for inference and "solid mid-tier" for training. Worth the price for inference; not worth it for training serious models. For training, look at multi-GPU clusters or A100.
## Bottom line
The 4090 punches well above its weight for inference. Treat it as a budget A100 for FP16 chatbot work; treat it as obsolete for training. For broader tier comparison see RTX 5090 vs RTX 3090.