SSH Port Forwarding Workflow for GPU Development

Forward ports from your dedicated GPU server to your laptop over SSH - no VPN needed. The fastest way to access Jupyter, vLLM, and Grafana privately.

Tutorials | April 23, 2026 | 1 min read | admin

Before reaching for Tailscale or WireGuard, check whether simple SSH port forwarding solves your problem. For an individual developer accessing services on a dedicated GPU server, it usually does.

Local forwarding
Persistent tunnels
Multiple services
When to move beyond

Local Forwarding

To reach a vLLM server listening on port 8000 of the GPU server from your laptop:

ssh -L 8000:localhost:8000 user@gpu-server

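While the SSH session is open, connections to localhost:8000 on your laptop are carried over SSH to localhost:8000 on the server. The second localhost in the -L argument is resolved on the server side, so the service does not need to listen on a public interface. Point your browser or API client at http://localhost:8000.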
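
Before pointing a client at the forwarded port, it can help to confirm that something actually answers on it. The sketch below is one way to do that, assuming curl is installed and that the vLLM process is the OpenAI-compatible server started without an API key (which serves /v1/models); tunnel_up is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# Sketch: probe the forwarded port before using it.
# Assumptions: curl is available; vLLM runs as the OpenAI-compatible
# server without an API key, so GET /v1/models returns 200.

tunnel_up() {
    # $1: URL to probe through the tunnel.
    # -s hides progress output, -f makes HTTP errors a non-zero exit,
    # --max-time bounds the wait when nothing answers.
    curl -sf --max-time 3 -o /dev/null "$1"
}

if tunnel_up "http://localhost:8000/v1/models"; then
    echo "tunnel is up"
else
    echo "nothing answering on localhost:8000 (is the ssh session still open?)"
fi
```

If the probe fails while the SSH session is open, the usual suspects are a service that is not running on the server or a local port that was already taken when ssh started.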

Persistent Tunnels

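For always-on forwarding, use autossh, which restarts ssh whenever it exits. With -M 0, autossh's own monitoring port is disabled, so the ServerAlive options are what let ssh notice a dead connection and exit:

sudo apt install autossh
autossh -M 0 -N -o ServerAliveInterval=30 -o ServerAliveCountMax=3 \
    -o ExitOnForwardFailure=yes -L 8000:localhost:8000 user@gpu-server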

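Or run it as a systemd user unit so the tunnel starts automatically when you log in. Nothing can answer a password prompt here, so the unit needs key-based login that works non-interactively:

[Unit]
Description=SSH tunnel to GPU
[Service]
# Keep retrying even if the first connection attempt fails
Environment=AUTOSSH_GATETIME=0
ExecStart=/usr/bin/autossh -M 0 -N -o ServerAliveInterval=30 -o ExitOnForwardFailure=yes -L 8000:localhost:8000 user@gpu-server
Restart=always
RestartSec=10
[Install]
WantedBy=default.target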
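
To put the unit above into effect it has to be saved where the user-level systemd instance looks for it, then enabled. A minimal sketch follows. The unit name gpu-tunnel.service and the helper names are assumptions made for this example; the article does not name the file.

```shell
#!/bin/sh
# Sketch: install a user-level systemd unit and start it.
# UNIT_NAME is a hypothetical name chosen for this example.
UNIT_NAME="gpu-tunnel.service"

unit_path() {
    # User units live under $XDG_CONFIG_HOME/systemd/user (default ~/.config).
    printf '%s/systemd/user/%s\n' "${XDG_CONFIG_HOME:-$HOME/.config}" "$UNIT_NAME"
}

install_tunnel_unit() {
    # $1: file containing the unit text from this section.
    dest="$(unit_path)"
    mkdir -p "$(dirname "$dest")" &&
        cp "$1" "$dest" &&
        systemctl --user daemon-reload &&
        systemctl --user enable --now "$UNIT_NAME"
}

echo "unit file destination: $(unit_path)"
```

Calling install_tunnel_unit ./gpu-tunnel.service copies the file and starts the tunnel. Afterwards, systemctl --user status gpu-tunnel.service shows whether autossh is running.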

Multiple Services

Repeat the -L option once per service:

ssh -L 8000:localhost:8000 \
    -L 8888:localhost:8888 \
    -L 3000:localhost:3000 \
    user@gpu-server

This forwards vLLM (8000), Jupyter (8888), and Grafana (3000) over a single SSH session.
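
When the list of ports changes often, the repeated -L flags can be generated instead of typed. The sketch below assumes every service is reached on the same port number locally and on the server's localhost, as in the command above; forward_flags is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: build one -L flag per port, mapping each local port to the same
# port on the server's localhost. forward_flags is a hypothetical helper.
forward_flags() {
    flags=""
    for port in "$@"; do
        flags="$flags -L $port:localhost:$port"
    done
    # Drop the leading space left by the first iteration.
    printf '%s\n' "${flags# }"
}

# Print the command rather than running it, so it can be inspected first.
echo "ssh -N $(forward_flags 8000 8888 3000) user@gpu-server"
```

To run it directly, leave the substitution unquoted so the shell splits it into separate arguments: ssh -N $(forward_flags 8000 8888 3000) user@gpu-server.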

Or define the forwards once in ~/.ssh/config:

Host gpu
    HostName gpu-server.gigagpu.com
    User yourname
    LocalForward 8000 localhost:8000
    LocalForward 8888 localhost:8888
    LocalForward 3000 localhost:3000

After that, running ssh gpu activates all three forwards.
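
Keepalive and failure handling can live in the same file, so that both plain ssh gpu and autossh -M 0 -N gpu pick them up. A sketch of the extra options, reusing the Host gpu alias from above; the interval values are illustrative assumptions, not tested recommendations:

```text
Host gpu
    # Send a keepalive probe every 30 s; give up after 3 unanswered probes
    ServerAliveInterval 30
    ServerAliveCountMax 3
    # Exit instead of continuing if a forward cannot be set up
    # (for example, local port 8000 is already in use)
    ExitOnForwardFailure yes
```

With these in place, ssh -N gpu opens the forwards without starting a remote shell, which suits a terminal tab that exists only to hold the tunnel.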

When to Move Beyond

SSH forwarding stops scaling when:

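Multiple team members need access, so you have to manage SSH keys for each of them
Non-SSH clients need to connect (mobile apps, for example)
The server needs to reach services on your machine; SSH remote forwarding (-R) can do this, but only while a session opened from your side is up

At that point, move to Tailscale or WireGuard.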

Dev-Ready GPU Hosting

UK dedicated GPU servers with SSH configured for easy port forwarding.

Browse GPU Servers

See also: Tailscale and remote VS Code.

admin

We benchmark, deploy, and optimise GPU infrastructure for AI workloads. All data in our guides comes from real-world testing on our UK-based dedicated GPU servers.
