
Feedback Analyser with LLM and Embeddings

Build a feedback analysis pipeline that clusters customer feedback using embeddings, identifies themes with an LLM, and surfaces actionable insights on a dedicated GPU server.

You will build a pipeline that takes thousands of customer feedback entries (surveys, support tickets, app reviews), converts them to embeddings for clustering, uses an LLM to label each cluster with a human-readable theme, and produces a prioritised report of customer concerns. The end result: instead of reading 5,000 feedback entries manually, your product team gets a "Top 10 customer themes this month" report with representative quotes and trend data. Here is how to build the pipeline on dedicated GPU infrastructure.

Pipeline Architecture

Stage                  Tool                Purpose
1. Embedding           BGE-large-en-v1.5   Convert feedback to vectors
2. Clustering          HDBSCAN             Group similar feedback
3. Theme labelling     LLaMA 3.1 8B        Name each cluster
4. Insight extraction  LLaMA 3.1 8B        Prioritised recommendations

Stage 1: Feedback Embedding

from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("BAAI/bge-large-en-v1.5", device="cuda")

def embed_feedback(feedback_list: list) -> np.ndarray:
    embeddings = model.encode(
        feedback_list, batch_size=64,
        show_progress_bar=True, normalize_embeddings=True
    )
    return embeddings

# Embed 5,000 feedback entries (~30 seconds on GPU)
feedback_texts = load_feedback_from_db()  # your own data-access helper
embeddings = embed_feedback(feedback_texts)

GPU-accelerated embedding processes thousands of entries in seconds. Store embeddings in ChromaDB or Qdrant for persistent vector storage and retrieval.
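Storing the vectors also enables similarity search later. Because `normalize_embeddings=True` L2-normalises each vector, cosine similarity reduces to a plain dot product; a minimal sketch with random stand-in vectors (no vector database required):

```python
import numpy as np

def top_k_similar(query_vec: np.ndarray, embeddings: np.ndarray, k: int = 3):
    """Return indices and scores of the k most similar normalised vectors."""
    scores = embeddings @ query_vec          # dot product == cosine similarity here
    order = np.argsort(scores)[::-1][:k]     # highest score first
    return order, scores[order]

# Toy normalised vectors standing in for real feedback embeddings
rng = np.random.default_rng(0)
vecs = rng.normal(size=(100, 8))
vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)

idx, scores = top_k_similar(vecs[42], vecs)
# The query vector itself ranks first with similarity ~1.0
```

A vector database applies the same idea at scale with approximate-nearest-neighbour indexes, which is why storing normalised embeddings pays off.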

Stage 2: Semantic Clustering

import hdbscan
from umap import UMAP  # UMAP lives in the umap-learn package, not scikit-learn

# Reduce dimensions for clustering
reducer = UMAP(n_components=15, metric="cosine")
reduced = reducer.fit_transform(embeddings)

# Cluster similar feedback
clusterer = hdbscan.HDBSCAN(min_cluster_size=10, min_samples=5)
labels = clusterer.fit_predict(reduced)

# Group feedback by cluster
clusters = {}
for idx, label in enumerate(labels):
    if label == -1:  # Noise
        continue
    if label not in clusters:
        clusters[label] = []
    clusters[label].append(feedback_texts[idx])

print(f"Found {len(clusters)} distinct themes from {len(feedback_texts)} entries")
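Before labelling, it helps to pick cluster samples near each cluster's centroid rather than arbitrary entries, so the LLM sees the most typical feedback. A small helper, assuming the `embeddings` and `labels` arrays from the stages above:

```python
import numpy as np

def representative_indices(embeddings: np.ndarray, labels: np.ndarray,
                           cluster_id: int, n: int = 15) -> np.ndarray:
    """Indices of the n cluster members closest to the cluster centroid."""
    member_idx = np.where(labels == cluster_id)[0]
    members = embeddings[member_idx]
    centroid = members.mean(axis=0)
    dists = np.linalg.norm(members - centroid, axis=1)
    return member_idx[np.argsort(dists)[:n]]

# Usage with the Stage 2 outputs:
# samples = [feedback_texts[i] for i in representative_indices(embeddings, labels, 0)]
```

Centroid-nearest samples give the labelling prompt a cleaner signal than whichever 15 entries happen to come first.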

Stage 3: Theme Labelling

import json
import re

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")

def parse_json(text: str) -> dict:
    # Pull the first JSON object out of the model's reply
    match = re.search(r"\{.*\}", text, re.DOTALL)
    return json.loads(match.group(0)) if match else {}

def label_cluster(feedback_samples: list) -> dict:
    sample_text = "\n- ".join(feedback_samples[:15])
    response = client.chat.completions.create(
        model="meta-llama/Llama-3.1-8B-Instruct",
        messages=[{
            "role": "system",
            "content": """Analyse these customer feedback entries that share a common theme.
Return JSON: {"theme": "short descriptive name",
"description": "2-sentence description of the theme",
"sentiment": "positive|negative|mixed",
"severity": "critical|high|medium|low",
"representative_quote": "best example from the samples",
"recommendation": "suggested action"}"""
        }, {"role": "user", "content": f"Feedback entries:\n- {sample_text}"}],
        max_tokens=300, temperature=0.1
    )
    return parse_json(response.choices[0].message.content)

A vLLM server running locally (the OpenAI-compatible endpoint above) serves LLaMA 3.1 8B and labels each cluster. The LLM understands context better than keyword extraction, producing specific themes like "Mobile checkout timeout on slow connections" rather than a generic "checkout issues".
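The report stage below sorts on a `count` field, so each cluster's label needs its size attached. A minimal glue function, where `labeller` is expected to behave like the `label_cluster()` function above:

```python
def analyse_clusters(clusters: dict, labeller) -> list:
    """Label every cluster and attach its size for prioritisation."""
    themes = []
    for cluster_id, entries in clusters.items():
        theme = labeller(entries)        # e.g. label_cluster from Stage 3
        theme["count"] = len(entries)    # cluster size, used by the report sort
        themes.append(theme)
    return themes

# Usage: themes = analyse_clusters(clusters, label_cluster)
```

Passing the labeller in as a parameter also makes the loop easy to test with a stub before pointing it at the live LLM server.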

Stage 4: Insight Report Generation

def generate_report(themes: list) -> dict:
    # Sort by severity first, then by cluster size (largest first)
    prioritised = sorted(themes, key=lambda t: (
        {"critical": 0, "high": 1, "medium": 2, "low": 3}[t["severity"]],
        -t["count"]  # number of feedback entries in the cluster
    ))
    report = {
        "total_feedback": len(feedback_texts),
        "themes_found": len(themes),
        "top_issues": prioritised[:10],
        "positive_themes": [t for t in themes if t["sentiment"] == "positive"],
        # compare_with_previous_period: your own helper for period-over-period trends
        "trend_comparison": compare_with_previous_period(themes)
    }
    return report

Production Deployment

For production: schedule weekly analysis runs; track theme trends over time to measure whether fixes reduced complaint volume; integrate with product management tools to auto-create tickets from critical themes; and add RAG search so team members can ask natural language questions about feedback. Deploy on private infrastructure to keep customer feedback confidential. See model options for larger models, chatbot hosting for feedback Q&A, more tutorials, and analytics use cases.
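As one way to schedule the weekly run, a cron entry could trigger the pipeline (script path and log location are hypothetical):

```shell
# crontab -e — run the full analysis every Monday at 06:00
0 6 * * 1  /usr/bin/python3 /opt/feedback/run_pipeline.py >> /var/log/feedback.log 2>&1
```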

Analytics GPU Servers

Dedicated GPU servers for feedback analysis and embedding pipelines. Process customer data on isolated UK infrastructure.

Browse GPU Servers
