Ollama Behind Cloudflare Tunnel – Secure Remote Access

Cloudflare Tunnel exposes your local Ollama server on a public URL without opening ports. A clean, free, TLS-terminated setup.

You have Ollama running on a dedicated GPU server and want to reach it from your laptop, a mobile app, or partner integrations without opening inbound ports or putting the raw Ollama API on the open internet. Cloudflare Tunnel is one of the simplest ways to do that.

Why Tunnel

  • No open inbound ports on the GPU server
  • Automatic TLS via Cloudflare’s edge
  • Free for most use cases
  • Cloudflare Access adds auth without you writing it

Setup

curl -L --output cloudflared.deb https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64.deb
sudo dpkg -i cloudflared.deb
cloudflared tunnel login
cloudflared tunnel create gigagpu-ollama

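The login step authorises cloudflared against your Cloudflare account. The create step registers a named tunnel and writes its credentials file to ~/.cloudflared/<TUNNEL-UUID>.json. Note the UUID it prints, because the config below needs that file path.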
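
To confirm which credentials file was written, something like the following may help. It is a minimal sketch: `list_tunnel_creds` is a hypothetical helper name, and it assumes the default ~/.cloudflared directory.

```shell
# Print the tunnel credentials files found in a cloudflared directory.
# Assumption: credentials live in ~/.cloudflared unless a directory is passed.
list_tunnel_creds() {
  local dir="${1:-$HOME/.cloudflared}"
  # Credentials files are JSON; cert.pem from `tunnel login` is not listed.
  ls -1 "$dir"/*.json 2>/dev/null || echo "no credentials files in $dir"
}
```

Running `list_tunnel_creds` after `cloudflared tunnel create` should show one JSON file per tunnel. `cloudflared tunnel list` shows each tunnel's UUID next to its name, which tells you which file belongs to gigagpu-ollama.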

Config

~/.cloudflared/config.yml:

tunnel: gigagpu-ollama
# The credentials file is named after the tunnel UUID, not the tunnel name.
credentials-file: /home/ubuntu/.cloudflared/<TUNNEL-UUID>.json

ingress:
  - hostname: ollama.yourdomain.com
    service: http://localhost:11434
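    originRequest:
      # Ollama rejects requests whose Host header is not local, so rewrite it.
      httpHostHeader: "localhost:11434"
      # noTLSVerify is omitted: it has no effect on a plain http:// origin.
      connectTimeout: 60s
      tcpKeepAlive: 30s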
  - service: http_status:404
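
Before routing traffic, it can be worth checking that the ingress rules parse and that the hostname matches the rule you expect. The sketch below wraps two cloudflared subcommands in a hypothetical helper, `check_ingress`, and assumes the config sits at the default ~/.cloudflared/config.yml.

```shell
# Validate the ingress section, then show which rule a hostname matches.
# $1 is the public hostname, e.g. ollama.yourdomain.com.
check_ingress() {
  cloudflared tunnel ingress validate &&
    cloudflared tunnel ingress rule "https://$1"
}
```

Calling `check_ingress ollama.yourdomain.com` should report the http://localhost:11434 service for that hostname. Any other hostname should fall through to the catch-all 404 rule.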

Then route DNS:

cloudflared tunnel route dns gigagpu-ollama ollama.yourdomain.com

Run as a service:

sudo cloudflared service install
sudo systemctl start cloudflared
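
Once the service is up, a quick end-to-end check is to ask Ollama for its model list through the public hostname. This is a sketch: `check_tunnel` and the `OLLAMA_URL` variable are illustrative names, and /api/tags is Ollama's model listing endpoint.

```shell
# Public URL of the tunnel; override OLLAMA_URL for your own hostname.
OLLAMA_URL="${OLLAMA_URL:-https://ollama.yourdomain.com}"

# Request the model list through the tunnel and report the HTTP status.
check_tunnel() {
  local status
  status=$(curl -s -o /dev/null -w '%{http_code}' --max-time 15 "$OLLAMA_URL/api/tags")
  if [ "$status" = "200" ]; then
    echo "tunnel OK"
  else
    echo "tunnel check failed: HTTP $status"
    return 1
  fi
}
```

If the check fails, `sudo systemctl status cloudflared` and `sudo journalctl -u cloudflared` are the first places to look. Once Cloudflare Access is enabled on the hostname, an unauthenticated request like this one will likely be redirected or denied rather than return 200.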

Access

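Ollama has no authentication of its own, so without a policy anyone who knows the hostname can use the API. Cloudflare Access puts an authentication layer in front of it. In the Cloudflare Zero Trust dashboard, create a self-hosted Access application for ollama.yourdomain.com and attach a policy that requires a Google, GitHub, or email one-time PIN login. Requests then have to pass Cloudflare's auth check at the edge before they reach the tunnel.

For API access, create a service token and add a policy with the Service Auth action, so programmatic clients authenticate with the CF-Access-Client-Id and CF-Access-Client-Secret headers instead of an interactive login.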
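
A programmatic client sends the service token as two request headers, CF-Access-Client-Id and CF-Access-Client-Secret. The sketch below assumes the token values are exported as `CF_ACCESS_CLIENT_ID` and `CF_ACCESS_CLIENT_SECRET` (hypothetical variable names) and that the Access application has a Service Auth policy for that token. `ollama_generate` is an illustrative helper, not part of any tool.

```shell
# Call Ollama's /api/generate through the tunnel using an Access service token.
# $1 = model name, $2 = prompt (must not contain characters needing JSON escaping).
ollama_generate() {
  local model="$1" prompt="$2"
  curl -s "https://ollama.yourdomain.com/api/generate" \
    -H "CF-Access-Client-Id: ${CF_ACCESS_CLIENT_ID}" \
    -H "CF-Access-Client-Secret: ${CF_ACCESS_CLIENT_SECRET}" \
    -H "Content-Type: application/json" \
    -d "{\"model\": \"${model}\", \"prompt\": \"${prompt}\", \"stream\": false}"
}
```

For example, `ollama_generate llama3 "Say hello"` returns a single JSON response if a model named llama3 has been pulled on the server (the model name is a placeholder). For prompts with quotes or newlines, build the JSON body with a tool such as jq instead of string interpolation.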

See Caddy reverse proxy and Tailscale private AI network.

admin

We benchmark, deploy, and optimise GPU infrastructure for AI workloads. All data in our guides comes from real-world testing on our UK-based dedicated GPU servers.
