What You’ll Build
In 30 minutes, you will have a production sentiment analysis API that classifies text as positive, negative, or neutral with confidence scores, detects fine-grained emotions (joy, anger, frustration, satisfaction), and performs aspect-level sentiment extraction from reviews. Running on a dedicated GPU server, your API analyses 5,000 texts per second — processing an entire quarter’s customer feedback in minutes, not hours.
Cloud sentiment APIs charge $0.25-$1.00 per 1,000 requests. At 500,000 customer reviews, support tickets, and social mentions monthly, that is $125-$500 in API fees for basic positive/negative classification alone. Self-hosted sentiment analysis on open-source models delivers deeper analysis — aspect-level sentiment, emotion detection, sarcasm handling — at unlimited volume with zero per-request cost.
Architecture Overview
The API offers two analysis paths. A specialised sentiment classifier (fine-tuned RoBERTa or DeBERTa) handles high-throughput basic sentiment at maximum speed. An LLM through vLLM handles nuanced analysis — aspect extraction, context-dependent sentiment, sarcasm detection, and multilingual text. Requests route automatically based on the analysis depth requested.
The API layer accepts single texts, batches of up to 1,000 texts, and streaming inputs from message queues. Output includes overall sentiment label, confidence distribution across classes, detected emotions, and for reviews, aspect-level sentiment breaking down how the customer feels about specific product features. Pair with an AI chatbot to let teams query sentiment trends conversationally.
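A sketch of what one deep-analysis result could look like: the field names mirror the JSON schema requested from the LLM in the build step, and all values here are illustrative.

```python
# Illustrative only: one deep-analysis result for a single review.
sample_result = {
    "text": "Battery life is great but the camera is disappointing.",
    "sentiment": "mixed",
    "confidence": 0.87,
    "emotions": ["satisfaction", "frustration"],
    "aspects": [
        {"feature": "battery", "sentiment": "positive",
         "detail": "long battery life praised"},
        {"feature": "camera", "sentiment": "negative",
         "detail": "image quality below expectations"},
    ],
    "sarcasm_detected": False,
}
print(sample_result["aspects"][1]["sentiment"])  # → negative
```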
GPU Requirements
| Analysis Depth | Recommended GPU | VRAM | Throughput |
|---|---|---|---|
| Basic sentiment (classifier) | RTX 5090 | 32 GB | ~5,000 texts/sec |
| Aspect + emotion (8B LLM) | RTX 5090 | 32 GB | ~200 texts/sec |
| Full analysis (70B LLM, quantised) | RTX 6000 Pro | 96 GB | ~80 texts/sec |
The specialised classifier model uses under 1 GB of VRAM, so it co-hosts easily with an LLM on the same GPU. Route high-volume streams through the classifier for real-time monitoring, and batch deeper analysis through the LLM for detailed reports. See our self-hosted LLM guide for model pairing strategies.
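As a rough sanity check before co-hosting, you can budget VRAM explicitly. The helper and the model sizes below are assumptions for illustration; measure real usage with `nvidia-smi`.

```python
def fits_on_gpu(vram_gb: float, model_sizes_gb: list[float],
                headroom_gb: float = 2.0) -> bool:
    """Rough check: do the models' weights, plus headroom for
    activations and KV cache, fit in the card's VRAM?"""
    return sum(model_sizes_gb) + headroom_gb <= vram_gb

# ~1 GB classifier alongside an 8B LLM (~16 GB in fp16) on a 32 GB card
print(fits_on_gpu(32, [1.0, 16.0]))  # → True
# a 70B LLM in fp16 (~140 GB) does not fit on a single 96 GB card
print(fits_on_gpu(96, [140.0]))      # → False
```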
Step-by-Step Build
Deploy both a sentiment classifier and an LLM on your GPU server. Build the API with automatic routing based on analysis depth.
```python
import requests
from fastapi import FastAPI
from transformers import pipeline

app = FastAPI()

# Fast classifier for basic sentiment (fits in under 1 GB of VRAM)
classifier = pipeline(
    "sentiment-analysis",
    model="cardiffnlp/twitter-roberta-base-sentiment-latest",
    device=0,
)

VLLM_URL = "http://localhost:8000/v1/chat/completions"

# Plain `def` endpoints run in FastAPI's threadpool, so the blocking
# `requests` call to vLLM does not stall the event loop.
@app.post("/v1/sentiment")
def analyse(texts: list[str], depth: str = "basic"):
    if depth == "basic":
        results = classifier(texts, batch_size=64)
        return [
            {"text": t, "sentiment": r["label"], "confidence": r["score"]}
            for t, r in zip(texts, results)
        ]
    # Deep analysis via the LLM served by vLLM
    analyses = []
    for text in texts:
        resp = requests.post(VLLM_URL, json={
            "model": "meta-llama/Meta-Llama-3-8B-Instruct",
            "messages": [{"role": "user", "content": f"""Analyse sentiment of this text.
Text: {text}
Return JSON: {{sentiment, confidence, emotions: [string],
aspects: [{{feature, sentiment, detail}}],
sarcasm_detected: bool}}"""}],
            "max_tokens": 300,
            "temperature": 0.1,
        })
        analyses.append(resp.json()["choices"][0]["message"]["content"])
    return analyses

@app.post("/v1/sentiment/stream")
def stream_analyse(text: str):
    result = classifier([text])[0]
    return {"sentiment": result["label"], "confidence": result["score"]}
```
Add webhook support for real-time monitoring — trigger alerts when negative sentiment spikes above a threshold. The OpenAI-compatible endpoint handles the LLM path for deep analysis. See production setup for batching configuration across high-volume streams.
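One simple way to implement that alert trigger is a sliding-window check over recent classifier labels. A minimal sketch; the class name, window size, and threshold are hypothetical choices, not part of the API above.

```python
from collections import deque

class NegativeSentimentAlert:
    """Fire when the share of "negative" labels in the last
    `window` observed texts exceeds `threshold`."""

    def __init__(self, window: int = 100, threshold: float = 0.3):
        self.labels = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, label: str) -> bool:
        self.labels.append(label)
        negative_share = self.labels.count("negative") / len(self.labels)
        return negative_share > self.threshold

alert = NegativeSentimentAlert(window=10, threshold=0.3)
fired = [alert.observe(label)
         for label in ["positive"] * 6 + ["negative"] * 4]
print(fired[-1])  # → True (negative share reaches 4/10 > 0.3)
```

Each `observe` call would sit in the request path (or a queue consumer); when it returns `True`, post to your webhook.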
Monitoring and Dashboards
Build real-time sentiment dashboards that track customer satisfaction trends across products, channels, and time periods. Aggregate aspect-level sentiment to identify which product features drive positive and negative feedback. Set up alerting for sentiment drops that may indicate product issues, service outages, or PR crises.
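Aggregating aspect-level results for such a dashboard can be as simple as tallying sentiment per feature. A minimal sketch, assuming results shaped like the deep-analysis output with an `aspects` list:

```python
from collections import Counter, defaultdict

def aggregate_aspects(results: list[dict]) -> dict:
    """Tally aspect-level sentiment across analysed reviews so a
    dashboard can show which features drive feedback."""
    tally = defaultdict(Counter)
    for result in results:
        for aspect in result.get("aspects", []):
            tally[aspect["feature"]][aspect["sentiment"]] += 1
    return {feature: dict(counts) for feature, counts in tally.items()}

results = [
    {"aspects": [{"feature": "battery", "sentiment": "positive"}]},
    {"aspects": [{"feature": "battery", "sentiment": "positive"},
                 {"feature": "camera", "sentiment": "negative"}]},
]
print(aggregate_aspects(results))
# → {'battery': {'positive': 2}, 'camera': {'negative': 1}}
```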
Track classifier accuracy against a labelled validation set from your domain. Customer service language, product reviews, and social media posts each have different sentiment patterns — a model fine-tuned on your specific text types outperforms generic classifiers by 5-15% on accuracy.
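A minimal sketch of that accuracy check, assuming you have a hand-labelled gold set drawn from your own texts:

```python
def accuracy(predictions: list[str], gold: list[str]) -> float:
    """Share of texts where the classifier's label matches the hand label."""
    assert len(predictions) == len(gold), "mismatched validation set sizes"
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)

# `predictions` would come from classifier(validation_texts); these are mock values
predictions = ["positive", "negative", "neutral", "negative"]
gold        = ["positive", "negative", "positive", "negative"]
print(accuracy(predictions, gold))  # → 0.75
```

Run the same gold set through the classifier after each fine-tune to confirm the domain-specific gain before promoting a new model.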
Deploy Your Sentiment API
A self-hosted sentiment API powers real-time customer insight at unlimited volume without per-request fees or data leaving your infrastructure. Monitor brand perception, route support tickets, and quantify product feedback. Launch on GigaGPU dedicated GPU hosting and start analysing. Browse more API use cases and tutorials in our library.