Your GPU server runs a 70B parameter model serving production inference. It has a public IP. Within 30 minutes of provisioning, automated scanners begin brute-forcing SSH on port 22 — typically 3,000 to 10,000 attempts per day. Default SSH configuration with password authentication is an open invitation. One compromised credential and an attacker has root access to your model weights, inference data, and GPU resources (which are worth significant money for cryptocurrency mining). This guide covers SSH hardening for dedicated GPU servers running AI workloads.
Key-Only Authentication
Disable password authentication entirely. SSH keys (Ed25519, or RSA at 3072 bits or more) are computationally infeasible to brute-force:
# Generate Ed25519 key on your local machine
ssh-keygen -t ed25519 -C "your-name@your-org" -f ~/.ssh/gpu_server
# Copy public key to server (do this while password auth is still enabled)
ssh-copy-id -i ~/.ssh/gpu_server.pub root@your-gpu-server
# Then on the server, edit /etc/ssh/sshd_config:
PasswordAuthentication no
PubkeyAuthentication yes
# Named KbdInteractiveAuthentication on OpenSSH 8.7+; the old name still works
ChallengeResponseAuthentication no
# Some distros rely on PAM for session setup; if you keep UsePAM yes,
# password logins remain blocked by the settings above
UsePAM no
PermitRootLogin prohibit-password
AuthenticationMethods publickey
Validate the configuration with sshd -t, then restart: systemctl restart sshd. Test with a new terminal before closing your existing session; locking yourself out of a remote GPU server is painful to recover from.
Port and Network Configuration
Moving SSH off port 22 eliminates most automated scans. It is not a security measure in itself; it is noise reduction that makes your logs readable:
| Configuration | Default | Hardened | Impact |
|---|---|---|---|
| SSH port | 22 | Custom (e.g., 2222) | Avoids most automated scans |
| ListenAddress | 0.0.0.0 | VPN interface only | No public SSH exposure |
| MaxAuthTries | 6 | 3 | Faster lockout on failure |
| LoginGraceTime | 120s | 30s | Shorter window for unauthenticated connections |
| MaxSessions | 10 | 3 | Limits concurrent sessions |
| ClientAliveInterval | 0 | 300 | Drops idle connections |
If you configured a VPN per the private infrastructure recommendations, bind SSH to the VPN interface only: ListenAddress 10.100.0.1. Zero public SSH exposure.
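Applied to sshd_config, the hardened column above looks like this (the port, VPN address, and ClientAliveCountMax value are examples; substitute your own):

```
# /etc/ssh/sshd_config: hardened values from the table above
Port 2222
ListenAddress 10.100.0.1
MaxAuthTries 3
LoginGraceTime 30
MaxSessions 3
ClientAliveInterval 300
# Drop the connection after 2 missed keepalives (example value)
ClientAliveCountMax 2
```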
Fail2Ban for Automated Blocking
Even with key-only auth, failed connection attempts consume resources. Fail2Ban bans IPs after repeated failures:
# /etc/fail2ban/jail.local
[sshd]
enabled = true
port = 2222
filter = sshd
logpath = /var/log/auth.log
maxretry = 3
bantime = 3600
findtime = 600
banaction = iptables-multiport
For GPU servers running vLLM or Ollama, also create Fail2Ban jails for your inference API ports to catch brute-force attempts against API authentication.
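As a sketch, assuming the inference API sits behind nginx with failed requests logged as 401s to /var/log/nginx/access.log (both assumptions; the filter name is hypothetical and the regex must match your actual log format):

```
# /etc/fail2ban/filter.d/inference-api.local (hypothetical filter)
# Matches repeated 401 responses in a default-format nginx access log
[Definition]
failregex = ^<HOST> .* "(GET|POST) .*" 401

# Added to /etc/fail2ban/jail.local
[inference-api]
enabled  = true
port     = 8000
filter   = inference-api
logpath  = /var/log/nginx/access.log
maxretry = 5
bantime  = 3600
findtime = 600
```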
Jump Host Architecture
For teams accessing multiple GPU servers, use a jump host (bastion) architecture. Team members SSH to a hardened bastion with no GPU resources, then proxy through to GPU servers on the private network. The GPU servers have no public SSH access at all:
# ~/.ssh/config on team member's machine
Host gpu-bastion
HostName bastion.your-domain.com
User deploy
Port 2222
IdentityFile ~/.ssh/gpu_server
Host gpu-inference-1
HostName 10.0.1.10
User deploy
ProxyJump gpu-bastion
IdentityFile ~/.ssh/gpu_server
Running ssh gpu-inference-1 transparently hops through the bastion. The GPU server’s IP is private and unreachable from the internet. This pattern scales well for teams managing multiple model deployments across several GPU servers.
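You can confirm the ProxyJump chain without opening a connection: ssh -G prints the fully resolved client configuration for a host alias. A minimal sketch, using -F to point at a throwaway copy of the config (the demo path is an assumption):

```shell
# Write a demo config mirroring the example above (assumed temp path)
cat > /tmp/ssh_config_demo <<'EOF'
Host gpu-bastion
    HostName bastion.your-domain.com
    User deploy
    Port 2222

Host gpu-inference-1
    HostName 10.0.1.10
    User deploy
    ProxyJump gpu-bastion
EOF

# ssh -G resolves the config without connecting; check the hop is applied
ssh -F /tmp/ssh_config_demo -G gpu-inference-1 | grep -E '^(hostname|proxyjump|user) '
```

If the output shows `proxyjump gpu-bastion` alongside the private hostname, the bastion hop is wired up correctly before you ever attempt a real connection.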
Session Controls and Auditing
Log all SSH sessions for compliance and incident response. Enable verbose logging in sshd_config: LogLevel VERBOSE. This records key fingerprints used for each login, making it possible to identify which team member accessed the server. For stricter environments, deploy session recording that captures terminal output. Restrict what users can do after connecting — AI operators should not need root. Create a dedicated deploy user with sudo access only to restart inference services, not modify system configuration.
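One way to scope that sudo access, assuming the inference service runs as a systemd unit named vllm (the unit name and file path are assumptions; always check the file with visudo -cf before installing it):

```
# /etc/sudoers.d/deploy (hypothetical): deploy may restart the
# inference service and check its status, nothing else
Cmnd_Alias INFERENCE = /usr/bin/systemctl restart vllm, /usr/bin/systemctl status vllm
deploy ALL=(root) NOPASSWD: INFERENCE
```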
Review who has SSH access monthly. Remove keys for departed team members immediately. Maintain an authorised key inventory mapped to real identities. Teams operating chatbot services, document processing, or computer vision should apply these controls uniformly across all GPU servers in their fleet. See infrastructure guides and GDPR compliance documentation for complementary hardening.
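A minimal sketch of that inventory check, assuming ssh-keygen is available on the machine doing the audit (the path in the usage comment is an example):

```shell
#!/bin/sh
# Print fingerprint + comment for each key in an authorized_keys file,
# so every key can be mapped to a named team member during review.
list_key_identities() {
    while IFS= read -r line; do
        # Skip blank lines and comments
        case "$line" in ''|'#'*) continue ;; esac
        printf '%s\n' "$line" | ssh-keygen -lf /dev/stdin
    done < "$1"
}

# Example: list_key_identities /home/deploy/.ssh/authorized_keys
```

Keys whose comments do not map to a current team member are the ones to remove.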
Hardened GPU Servers
Dedicated GPU servers with full root access, static IPs, and network-level isolation for secure AI deployments. UK data centres.
Browse GPU Servers