After provisioning a new RTX 5060 Ti 16GB server on our dedicated hosting, run a sanity test before putting a workload on it. Fifteen minutes of validation catches hardware or driver issues before they surprise you mid-deployment.
Hardware
nvidia-smi
# Expect: RTX 5060 Ti, 16 GB, driver 570+ (Blackwell support starts in the 570 branch)
nvidia-smi --query-gpu=name,memory.total,memory.used,temperature.gpu,power.draw --format=csv
# Baseline: ~15 MB used, ~35 °C, ~15 W at idle
If nvidia-smi returns “No devices found”, the driver is not loaded; reboot or reinstall the driver before going any further.
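The baseline check above can be automated so it runs on every fresh server. A minimal sketch, assuming the exact query field order shown above; the function names and the slack in the thresholds (500 MiB, 50 °C, 30 W) are our own choices, not fixed values:

```python
import subprocess

def parse_gpu_csv(csv_text):
    """Parse one data row of `nvidia-smi ... --format=csv,noheader` output.

    Assumed field order (matching the query above):
    name, memory.total, memory.used, temperature.gpu, power.draw
    """
    fields = [f.strip() for f in csv_text.strip().splitlines()[-1].split(",")]
    return {
        "name": fields[0],
        "mem_used_mib": float(fields[2].split()[0]),
        "temp_c": float(fields[3]),
        "power_w": float(fields[4].split()[0]),
    }

def gpu_idle_ok(info):
    """Idle thresholds loosely derived from the baselines above."""
    return ("5060 Ti" in info["name"]
            and info["mem_used_mib"] < 500   # ~15 MB expected at idle
            and info["temp_c"] < 50          # ~35 °C expected at idle
            and info["power_w"] < 30)        # ~15 W expected at idle

if __name__ == "__main__":
    query = "--query-gpu=name,memory.total,memory.used,temperature.gpu,power.draw"
    try:
        out = subprocess.run(
            ["nvidia-smi", query, "--format=csv,noheader"],
            capture_output=True, text=True, check=True).stdout
        info = parse_gpu_csv(out)
        print(info, "PASS" if gpu_idle_ok(info) else "FAIL")
    except (FileNotFoundError, subprocess.CalledProcessError):
        print("nvidia-smi not available; run this on the GPU host")
```

Drop it into your provisioning checklist and fail the host on anything but PASS.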
CUDA and PyTorch
python3 -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))"
# Expect: True NVIDIA GeForce RTX 5060 Ti
python3 -c "import torch; a = torch.randn(1024, 1024).cuda(); b = torch.randn(1024, 1024).cuda(); print((a @ b).sum().item())"
# Any finite number; tests matmul on GPU
Stress Test
Load the card briefly to check thermals:
python3 -c "
import torch, time
a = torch.randn(8192, 8192).cuda().half()
b = torch.randn(8192, 8192).cuda().half()
start = time.time()
for _ in range(200):
    c = a @ b
torch.cuda.synchronize()
print(f'200 iters: {time.time()-start:.1f}s')
"
Expected: 200 iterations in ~8-12 seconds. Monitor nvidia-smi during the run – temperature should stay under 80°C, power near 170 W.
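If you want a pass/fail verdict instead of eyeballing nvidia-smi, collect (temperature, power) samples during the run and evaluate them afterwards. A sketch; the 80 °C ceiling is the limit stated above, while the 180 W cap and the function name are assumptions of ours:

```python
def thermals_ok(samples, temp_limit_c=80.0, power_limit_w=180.0):
    """Evaluate (temperature_C, power_W) samples from the stress run.

    The 80 °C ceiling comes from the guidance above; the 180 W limit is
    an assumption (the card's rated board power).
    """
    if not samples:
        return False
    return (max(t for t, _ in samples) < temp_limit_c
            and max(p for _, p in samples) <= power_limit_w)

# Collect samples in a second terminal while the stress loop runs:
#   nvidia-smi --query-gpu=temperature.gpu,power.draw --format=csv,noheader -l 1
```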
Baseline Inference
Quick vLLM smoke test:
pip install vllm
python -m vllm.entrypoints.openai.api_server \
--model neuralmagic/Meta-Llama-3.1-8B-Instruct-FP8 \
--quantization fp8 --max-model-len 4096 &
sleep 120 # wait for model load
curl http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"neuralmagic/Meta-Llama-3.1-8B-Instruct-FP8","messages":[{"role":"user","content":"Hello, say GigaGPU."}],"max_tokens":50}'
Expected: response with generated text in under 3 seconds.
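For a repeatable version of the curl check, the same request can be scripted with the standard library so you can time it and assert on the 3-second budget. A sketch; `chat_once` and `extract_text` are hypothetical helper names, and the endpoint URL and budget come from the steps above:

```python
import json, time, urllib.request

def extract_text(body):
    """Pull the generated text out of an OpenAI-style chat completion."""
    return body["choices"][0]["message"]["content"]

def chat_once(url, model, prompt, max_tokens=50):
    """POST one chat completion; return (generated_text, latency_seconds)."""
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }).encode()
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"})
    start = time.time()
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return extract_text(body), time.time() - start
```

Call it with the same model name you passed to the server, e.g. `text, latency = chat_once("http://localhost:8000/v1/chat/completions", model_name, "Hello, say GigaGPU.")`, and flag the host if `latency` exceeds the 3-second budget.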
VRAM Ceiling
python3 -c "
import torch
total = torch.cuda.get_device_properties(0).total_memory / 2**30
reserved = torch.cuda.memory_reserved(0) / 2**30
print(f'Total VRAM: {total:.1f} GiB, Reserved: {reserved:.1f} GiB')
"
# Expect Total ~15.9-16.0 GiB (divide by 2**30 for GiB; dividing by 1e9 would show ~17 GB for the same card)
If the reported total is under 15.5 GiB, check that you did not receive a different card variant, and contact support.
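That threshold is easy to fold into an automated check. A sketch; `vram_ok` is a hypothetical helper, and the 15.5 floor is the threshold stated above:

```python
def vram_ok(total_bytes, min_gib=15.5):
    """True if reported VRAM is consistent with a 16 GB card.

    The 15.5 floor mirrors the threshold above; a healthy 16 GB card
    reports just under 16 GiB. Divide bytes by 2**30 for GiB, not 1e9.
    """
    return total_bytes / 2**30 >= min_gib

# usage on the GPU host:
#   import torch
#   assert vram_ok(torch.cuda.get_device_properties(0).total_memory)
```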
Validated Blackwell 16GB
Every server we ship from our UK dedicated hosting passes this test before handoff.
Order the RTX 5060 Ti 16GB
Once sanity passes, run the full benchmark script to establish your baseline throughput numbers.