
Connect Snowflake to AI Analytics on GPU

Connect Snowflake to your GPU-hosted AI for intelligent data analytics. This guide covers calling your self-hosted LLM from Snowflake external functions, enriching warehouse data with AI-generated insights, and building natural language query interfaces over your data.

What You’ll Connect

After this guide, your Snowflake warehouse will call your GPU-hosted AI directly from SQL queries — classifying records, generating summaries, extracting entities, and answering natural language questions about your data. Snowflake external functions connect to your vLLM endpoint on dedicated GPU hardware through an API gateway, letting analysts run AI operations on warehouse data without leaving SQL.

The integration uses Snowflake’s external function feature to call your OpenAI-compatible API via an API integration and proxy service. A Python middleware translates Snowflake’s batch function call format into individual or batched inference requests to your GPU endpoint, returning results that Snowflake inserts back into your query results.

Prerequisites

  • A GigaGPU server running a self-hosted LLM (setup guide)
  • A Snowflake account with ACCOUNTADMIN or CREATE INTEGRATION privileges
  • An API gateway (AWS API Gateway, Azure API Management, or a simple proxy)
  • Python 3.10+ with fastapi and requests for the middleware proxy

Integration Steps

Set up a middleware proxy that accepts Snowflake’s external function request format and forwards inference requests to your GPU endpoint. Snowflake sends batch rows as a JSON array, and your proxy maps each row to an inference call, batches them efficiently, and returns results in Snowflake’s expected response format.
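The translation itself is small. A minimal sketch (the function name is illustrative) of mapping Snowflake's batch format through a per-row inference callable:

```python
def translate(body: dict, infer_fn) -> dict:
    """Map Snowflake's batch request through infer_fn and rebuild the response.

    Snowflake POSTs {"data": [[row_number, arg1, ...], ...]} and expects
    {"data": [[row_number, result], ...]} back, with every row accounted for.
    """
    return {"data": [[row_num, infer_fn(text)] for row_num, text in body["data"]]}

# Example with a stand-in for the real GPU call:
body = {"data": [[0, "great service"], [1, "terrible delay"]]}
out = translate(body, lambda t: "positive" if "great" in t else "negative")
print(out)  # {'data': [[0, 'positive'], [1, 'negative']]}
```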

Create an API integration in Snowflake that points to your proxy endpoint. Then create external functions that analysts can call in SQL queries. Each function maps to a specific AI capability — ai_classify(text) for classification, ai_summarise(text) for summarisation, ai_extract(text, entity_type) for entity extraction. Analysts use these functions like any built-in Snowflake function.
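The two-argument ai_extract follows the same pattern: Snowflake sends every argument in the row array, so the proxy receives [[row_number, text, entity_type], ...]. A sketch of the DDL, assuming the same gpu_ai_integration and a hypothetical /ai/extract route on the proxy:

```sql
CREATE OR REPLACE EXTERNAL FUNCTION ai_extract(text VARCHAR, entity_type VARCHAR)
  RETURNS VARCHAR
  API_INTEGRATION = gpu_ai_integration
  AS 'https://your-proxy.example.com/ai/extract';

-- Example usage (table and column names are illustrative):
-- SELECT ai_extract(ticket_body, 'organisation') FROM support_tickets LIMIT 10;
```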

Build a natural language query interface by creating a function that accepts a question in English and returns a SQL query. The LLM generates SQL based on your schema metadata, and a wrapper function executes the generated query. This lets non-technical users query Snowflake by asking questions rather than writing SQL.
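A minimal sketch of the text-to-SQL step with a SELECT-only guardrail. The schema string and helper names are illustrative, and `complete` stands in for a call to your vLLM endpoint:

```python
import re

# Schema metadata included in the prompt so the model generates valid SQL
# (table and columns are illustrative)
SCHEMA_HINT = "customer_feedback(customer_id INT, feedback_text VARCHAR, created_at DATE)"

def is_safe_select(sql: str) -> bool:
    """Guardrail: accept a single SELECT statement, reject anything mutating."""
    stripped = sql.strip().rstrip(";")
    if ";" in stripped:  # multiple statements
        return False
    if not re.match(r"(?is)^\s*select\b", stripped):
        return False
    banned = r"(?i)\b(insert|update|delete|merge|drop|alter|create|truncate|grant|call)\b"
    return re.search(banned, stripped) is None

def question_to_sql(question: str, complete) -> str:
    """complete(prompt) -> model text, e.g. a thin wrapper around the GPU endpoint."""
    prompt = (f"Schema: {SCHEMA_HINT}\n"
              f"Write one Snowflake SELECT query answering: {question}\n"
              f"Return only the SQL.")
    sql = complete(prompt).strip().rstrip(";")
    if not is_safe_select(sql):
        raise ValueError("Generated SQL failed the SELECT-only guardrail")
    return sql
```

The wrapper that executes the returned query runs the guardrail before execution, never after.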

Code Example

Middleware proxy and Snowflake SQL for AI analytics from your self-hosted models:

# middleware_proxy.py — Translates Snowflake format to GPU API
from fastapi import FastAPI, Request
import requests

app = FastAPI()
GPU_URL = "http://gpu-server:8000/v1/chat/completions"
GPU_KEY = "your-api-key"

@app.post("/ai/classify")
async def classify(request: Request):
    body = await request.json()
    rows = body["data"]  # [[row_num, text], ...]

    results = []
    for row_num, text in rows:
        # One inference call per row — sequential and blocking is fine for a
        # proof of concept; batch or parallelise for production volumes
        resp = requests.post(GPU_URL, json={
            "model": "meta-llama/Llama-3-8b-chat-hf",
            "messages": [{"role": "user",
                "content": f"Classify into one category "
                           f"(positive/negative/neutral): {text}"}],
            "max_tokens": 10, "temperature": 0.1
        }, headers={"Authorization": f"Bearer {GPU_KEY}"}, timeout=60)
        resp.raise_for_status()
        label = resp.json()["choices"][0]["message"]["content"].strip()
        results.append([row_num, label])

    return {"data": results}

# --- Snowflake SQL setup ---
# CREATE OR REPLACE API INTEGRATION gpu_ai_integration
#   API_PROVIDER = aws_api_gateway
#   API_AWS_ROLE_ARN = 'arn:aws:iam::<aws-account-id>:role/snowflake-gpu'
#   ENABLED = TRUE
#   API_ALLOWED_PREFIXES = ('https://your-proxy.example.com/');
#
# CREATE OR REPLACE EXTERNAL FUNCTION ai_classify(text VARCHAR)
#   RETURNS VARCHAR
#   API_INTEGRATION = gpu_ai_integration
#   AS 'https://your-proxy.example.com/ai/classify';
#
# CREATE OR REPLACE EXTERNAL FUNCTION ai_summarise(text VARCHAR)
#   RETURNS VARCHAR
#   API_INTEGRATION = gpu_ai_integration
#   AS 'https://your-proxy.example.com/ai/summarise';
#
# -- Usage in SQL queries:
# SELECT customer_id, feedback_text,
#        ai_classify(feedback_text) AS sentiment,
#        ai_summarise(feedback_text) AS summary
# FROM customer_feedback
# WHERE created_at > DATEADD(day, -7, CURRENT_DATE);

Testing Your Integration

Deploy the middleware proxy and test it directly with curl using Snowflake’s request format. Create the API integration and external function in Snowflake. Run a simple query: SELECT ai_classify('Great product, fast delivery') and verify it returns a classification. Then test on a table with 100 rows to verify batch processing works correctly.
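Before involving Snowflake at all, you can exercise the proxy with a hand-built payload in Snowflake's request format (the URL below is a placeholder for your gateway):

```python
import json

# Snowflake's external-function request body: one array entry per row,
# with a 0-based row number followed by the function arguments
payload = {"data": [[0, "Great product, fast delivery"], [1, "Arrived broken"]]}
body = json.dumps(payload)
print(body)

# Against the deployed proxy (uncomment once it is live):
# import requests
# resp = requests.post("https://your-proxy.example.com/ai/classify",
#                      headers={"Content-Type": "application/json"}, data=body)
# print(resp.json())  # expect {"data": [[0, "<label>"], [1, "<label>"]]}
```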

Measure query execution time to establish baseline latency. External function calls add network round-trip and GPU inference time to query execution. For large result sets, test with LIMIT clauses first, then scale up gradually. Verify that the middleware handles concurrent requests from multiple Snowflake queries.

Production Tips

Cache AI results in Snowflake by writing enrichment outputs to a dedicated table rather than calling the external function in every query. Run a scheduled enrichment job that processes new rows nightly, storing AI classifications, summaries, and entities alongside the source data. Analysts query the pre-computed results without triggering live GPU inference.
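The nightly enrichment job can be sketched as a Snowflake task; the warehouse, table, and column names below are assumptions:

```sql
-- Hypothetical nightly task: enrich new feedback rows and store the results
CREATE OR REPLACE TASK enrich_feedback_nightly
  WAREHOUSE = analytics_wh
  SCHEDULE = 'USING CRON 0 2 * * * UTC'
AS
  INSERT INTO feedback_enriched
  SELECT f.customer_id,
         f.feedback_text,
         ai_classify(f.feedback_text)  AS sentiment,
         ai_summarise(f.feedback_text) AS summary,
         CURRENT_TIMESTAMP()           AS enriched_at
  FROM customer_feedback f
  LEFT JOIN feedback_enriched e ON e.customer_id = f.customer_id
  WHERE e.customer_id IS NULL;

-- Tasks are created suspended; resume to start the schedule
ALTER TASK enrich_feedback_nightly RESUME;
```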

For the natural language query interface, provide schema metadata and sample queries in the LLM prompt so it generates accurate SQL. Restrict the interface to SELECT queries only — never let AI-generated SQL modify data. Add guardrails that validate generated SQL before execution.
