Tensor cores are the units that make GPU AI cheap. Understanding the generations explains why one card is faster than another at the same TFLOPS rating.
Tensor cores accelerate matrix multiplications, which dominate LLM inference compute. 3rd gen (Ampere): FP16/BF16/INT8. 4th gen (Ada and Hopper): adds native FP8. 5th gen (Blackwell): adds native FP4. Each generation roughly doubles useful tensor throughput.
What tensor cores do
Tensor cores are specialised matrix-multiplication accelerators. A single tensor core performs one small fused multiply-accumulate, D = A×B + C, on 4×4 matrices per clock cycle — far faster than running the same maths on CUDA cores.
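The operation above can be sketched in plain Python for illustration (this is not GPU code; real kernels issue these ops through libraries such as cuBLAS, and the hardware does the whole tile in one cycle):

```python
def tensor_core_op(A, B, C):
    """Compute D = A @ B + C for 4x4 matrices given as lists of lists.

    This is the fused multiply-accumulate a single tensor core performs
    in one clock cycle; here it is spelled out loop by loop.
    """
    n = 4
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = C[i][j]          # start from the accumulator tile C
            for k in range(n):
                acc += A[i][k] * B[k][j]  # dot product of row i and column j
            D[i][j] = acc
    return D

# Quick check: identity times identity plus zeros gives the identity back.
I = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
Z = [[0.0] * 4 for _ in range(4)]
print(tensor_core_op(I, I, Z)[0][0])  # -> 1.0
```

The point of the hardware is that all 64 multiplies and the additions happen at once, which is where the large speedup over general-purpose CUDA cores comes from.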
Generations
| Gen | Cards | Native precisions | Notes |
|---|---|---|---|
| 3rd (Ampere) | A100, RTX 30-series | FP16, BF16, INT8 | 2:4 structured sparsity supported |
| 4th (Ada) | RTX 40-series, L40S | FP16, BF16, INT8, FP8 | FP8 is native per NVIDIA's Ada whitepaper |
| 4th (Hopper) | H100 | FP16, BF16, FP8 | Datacenter only; Transformer Engine |
| 5th (Blackwell) | RTX 50-series, RTX 6000 Pro | FP16, BF16, FP8, FP4 | FP4 is the new headline |
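The table can be turned into a small lookup, handy when deciding whether a quantised model will run in hardware on a given card. This is an illustrative sketch, not an NVIDIA API; the architecture names and precision sets are taken from the table above:

```python
# Native tensor-core precisions per architecture, per the table above.
# (FP8 on Ada is listed as native per NVIDIA's Ada whitepaper.)
NATIVE_PRECISIONS = {
    "ampere":    {"fp16", "bf16", "int8"},
    "ada":       {"fp16", "bf16", "int8", "fp8"},
    "hopper":    {"fp16", "bf16", "fp8"},
    "blackwell": {"fp16", "bf16", "fp8", "fp4"},
}

def runs_natively(arch: str, dtype: str) -> bool:
    """True if the tensor cores of `arch` accelerate `dtype` in hardware.

    Unknown architectures return False rather than raising, since the
    safe assumption for an unrecognised card is a software fallback.
    """
    return dtype in NATIVE_PRECISIONS.get(arch.lower(), set())

print(runs_natively("ada", "fp4"))        # -> False
print(runs_natively("blackwell", "fp4"))  # -> True
```

A dtype that is absent from the set still runs, but through a slower software path — which is exactly the gap the generational comparison is about.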
Verdict
Tensor-core generation matters more for AI workloads than raw CUDA core count. Native FP8 and FP4 hardware is the practical AI advantage of newer cards.
Bottom line
Pick by tensor-core generation, not just TFLOPS. See Blackwell architecture overview.