The Challenge: 600 Sellers, 4,000 Categories, 23% Mis-Categorisation
A UK-based multi-vendor marketplace for craft supplies and handmade goods hosts 600 active sellers uploading approximately 15,000 new product listings every week. Sellers self-assign categories from a 4,000-node taxonomy spanning materials, techniques, occasions, and product types. The problem is consistency: one seller lists a set of polymer clay earrings under “Jewellery > Earrings > Drop,” another puts an identical product under “Craft Supplies > Polymer Clay > Finished Items,” and a third categorises theirs as “Gifts > For Her > Under £20.” A manual audit reveals that 23% of listings are mis-categorised or placed in suboptimal categories, making them invisible to buyers browsing the correct taxonomy path. The marketplace estimates this costs £130,000 per month in lost discoverability and reduced seller GMV.
A content moderation team of four spends 60% of their time re-categorising listings. Scaling this manually is unsustainable as the marketplace grows toward 1,000 sellers.
AI Solution: Multi-Modal Product Classification
Multi-modal classification combines product images and text descriptions to predict the correct taxonomy path. A vision model such as CLIP or SigLIP encodes the product image, while a text encoder processes the title and description. The fused representation feeds into a hierarchical classifier that predicts the category at each taxonomy level: first “Jewellery,” then “Earrings,” then “Drop,” followed by material and technique tags.
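The level-by-level prediction described above can be sketched as a top-down walk over the taxonomy. This is a minimal illustration, not the marketplace's actual model: the toy `TAXONOMY`, the `predict_path` helper, and the raw scores are all hypothetical, and the scores stand in for the outputs of per-level classifier heads applied to the fused image and text embedding.

```python
import math

# Toy taxonomy (hypothetical): each node maps to its child category names.
TAXONOMY = {
    "ROOT": ["Jewellery", "Craft Supplies", "Gifts"],
    "Jewellery": ["Earrings", "Necklaces"],
    "Earrings": ["Drop", "Stud"],
}

def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_path(logits_by_node, taxonomy=TAXONOMY, root="ROOT"):
    """Walk the taxonomy top-down, picking the most probable child at each level.

    logits_by_node maps a parent node to one raw score per child (assumed to
    come from a classifier head for that node). Returns the predicted path and
    the product of the per-level probabilities as an overall confidence.
    """
    path, confidence, node = [], 1.0, root
    while node in taxonomy and node in logits_by_node:
        children = taxonomy[node]
        probs = softmax(logits_by_node[node])
        best = max(range(len(children)), key=probs.__getitem__)
        path.append(children[best])
        confidence *= probs[best]
        node = children[best]
    return path, confidence

# Made-up scores for a polymer clay drop-earring listing.
example_logits = {
    "ROOT": [2.0, 1.0, 0.5],
    "Jewellery": [3.0, 0.2],
    "Earrings": [1.5, 0.1],
}
path, confidence = predict_path(example_logits)
print(" > ".join(path), round(confidence, 3))
```

Multiplying the per-level probabilities means confidence falls as the path gets deeper, which is one simple way to decide how far down the taxonomy a suggestion should go.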
Running the pipeline on a dedicated GPU server enables real-time classification as sellers upload listings. Products are auto-tagged within seconds, and sellers see a suggested category they can accept or override. The override data feeds back into periodic model retraining, so accuracy improves as more corrections accumulate.
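One way to picture the accept-or-override loop is as a log of suggestion events. The sketch below is an assumption about how such a log might look; `SuggestionEvent`, `acceptance_rate`, and `retraining_examples` are hypothetical names, not part of any existing system.

```python
from dataclasses import dataclass

@dataclass
class SuggestionEvent:
    """One suggestion shown to a seller (hypothetical record layout)."""
    listing_id: str
    suggested_path: str
    final_path: str  # the category the seller ended up choosing

    @property
    def accepted(self) -> bool:
        return self.suggested_path == self.final_path

def acceptance_rate(events):
    """Share of suggestions the seller kept unchanged (0.0 if there are no events)."""
    if not events:
        return 0.0
    return sum(e.accepted for e in events) / len(events)

def retraining_examples(events):
    """Overrides become (listing_id, corrected_label) pairs for the next retrain."""
    return [(e.listing_id, e.final_path) for e in events if not e.accepted]

events = [
    SuggestionEvent("A1", "Jewellery > Earrings > Drop", "Jewellery > Earrings > Drop"),
    SuggestionEvent(
        "A2",
        "Craft Supplies > Polymer Clay > Finished Items",
        "Jewellery > Earrings > Drop",
    ),
]
print(acceptance_rate(events), retraining_examples(events))
```

Treating every override as a corrected label assumes sellers override accurately; in practice a moderation spot-check on overrides may be worthwhile before they enter the training set.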
GPU Requirements
Multi-modal classification uses both a vision encoder and a text encoder simultaneously. CLIP ViT-L/14 requires approximately 3 GB of VRAM, and the hierarchical classifier head adds another 1–2 GB. Real-time processing of concurrent seller uploads requires sustained inference throughput.
| GPU Model | VRAM | Classifications per Second | Weekly Batch (15K listings) |
|---|---|---|---|
| NVIDIA RTX 5090 | 32 GB | ~85 | ~3 minutes |
| NVIDIA RTX 6000 Pro | 48 GB | ~70 | ~3.5 minutes |
| NVIDIA RTX 6000 Pro | 48 GB | ~95 | ~2.6 minutes |
| NVIDIA RTX 6000 Pro 96 GB | 96 GB | ~130 | ~1.9 minutes |
The classification workload is lightweight: even the slowest configuration in the table clears a full week of listings in under four minutes, leaving room for additional models. The private AI hosting option keeps seller product data and marketplace taxonomy within UK infrastructure.
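The weekly batch times in the table follow directly from the throughput column. The short calculation below reproduces them from the approximate classifications-per-second figures; `batch_minutes` is a hypothetical helper, and the result closely matches the table's rounded values.

```python
WEEKLY_LISTINGS = 15_000  # new listings per week, from the article

def batch_minutes(classifications_per_second, listings=WEEKLY_LISTINGS):
    """Minutes needed to classify `listings` items at a sustained rate."""
    return listings / classifications_per_second / 60

# Approximate throughput figures taken from the table above.
for rate in (70, 85, 95, 130):
    print(f"{rate} per second: {batch_minutes(rate):.1f} minutes")
```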
Recommended Stack
- CLIP or SigLIP for multi-modal encoding of product images and text.
- Custom hierarchical classifier trained on the marketplace’s specific 4,000-node taxonomy.
- FastAPI microservice accepting image + text input and returning predicted category path with confidence scores.
- Active learning loop capturing seller overrides to retrain the classifier monthly.
- Optional: an LLM via vLLM for extracting structured attributes (material, colour, dimensions) from unstructured descriptions.
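As an illustration of what "predicted category path with confidence scores" might look like on the wire, here is a sketch of a response payload built with the standard library only. The field names, the `build_response` helper, and the 0.5 confidence cut-off are assumptions made for this example; a FastAPI service could return an equivalent structure.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class LevelPrediction:
    """Prediction for one taxonomy level (hypothetical schema)."""
    level: int
    category: str
    confidence: float

def build_response(listing_id, predictions, min_confidence=0.5):
    """Assemble a JSON-serialisable response for one listing.

    Levels are kept only up to the first one below min_confidence, so the
    seller is shown a shorter but more reliable path.
    """
    kept = []
    for p in sorted(predictions, key=lambda p: p.level):
        if p.confidence < min_confidence:
            break
        kept.append(p)
    return {
        "listing_id": listing_id,
        "category_path": " > ".join(p.category for p in kept),
        "levels": [asdict(p) for p in kept],
    }

response = build_response("A1", [
    LevelPrediction(1, "Jewellery", 0.97),
    LevelPrediction(2, "Earrings", 0.91),
    LevelPrediction(3, "Drop", 0.42),
])
print(json.dumps(response, indent=2))
```

Truncating at the first low-confidence level is one possible design choice: a correct but shallower suggestion is usually easier for a seller to refine than a deep but wrong one.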
Marketplaces handling supplier catalogues in PDF format can add document AI to extract product data automatically. For generating missing product images, pair with an AI image generator.
Cost Analysis
The current manual re-categorisation process costs approximately £6,500 per month in staff time (four moderators spending 60% of their hours on categorisation). Third-party auto-tagging APIs charge £0.01–£0.03 per classification, totalling £600–£1,800 monthly at 60,000 monthly listings (15,000 per week). Self-hosting on a dedicated GPU replaces per-classification charges with a fixed server cost, and the throughput figures above leave ample headroom over current volume.
The real value is in discoverability. Reducing mis-categorisation from 23% to under 3% moves roughly 20% of the catalogue back onto the taxonomy paths buyers actually browse. At average order values of £18 and current conversion rates, the marketplace projects an additional £26,000 in monthly GMV from improved product placement, generating roughly £3,900 in additional commission revenue (an implied commission rate of 15%).
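The figures in this section can be checked with a few lines of arithmetic. The inputs below are the article's own numbers; the derived values (implied commission rate and implied extra orders) are back-calculated for illustration and are not stated by the marketplace.

```python
MONTHLY_LISTINGS = 15_000 * 4        # 15,000 per week, four weeks
API_PRICE_RANGE = (0.01, 0.03)       # GBP per classification

# Monthly third-party API cost at the low and high price points.
api_cost = tuple(round(MONTHLY_LISTINGS * price, 2) for price in API_PRICE_RANGE)

EXTRA_GMV = 26_000                   # GBP per month, projected
EXTRA_COMMISSION = 3_900             # GBP per month, projected
AVERAGE_ORDER_VALUE = 18             # GBP

implied_commission_rate = EXTRA_COMMISSION / EXTRA_GMV
implied_extra_orders = EXTRA_GMV / AVERAGE_ORDER_VALUE
miscategorisation_drop = 23 - 3      # percentage points

print(api_cost, implied_commission_rate, round(implied_extra_orders))
```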
Getting Started
Export your existing product catalogue with current category assignments and seller-provided images and descriptions. Manually audit 5,000 listings to create a gold-standard training set with correct taxonomy labels. Train the hierarchical classifier, starting with the top two taxonomy levels and expanding to full depth as accuracy improves. Deploy in suggestion mode, measuring seller acceptance rate as the primary quality metric.
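Starting with the top two taxonomy levels amounts to truncating each gold-standard label before training. A minimal sketch, assuming category paths are stored as strings separated by " > " as in the examples earlier; `truncate_path` and `coarse_labels` are hypothetical helpers.

```python
def truncate_path(path, depth, sep=" > "):
    """Keep only the first `depth` levels of a category path."""
    return sep.join(path.split(sep)[:depth])

def coarse_labels(gold_rows, depth=2):
    """Map (listing_id, full_path) rows to labels at the chosen depth."""
    return [(listing_id, truncate_path(path, depth)) for listing_id, path in gold_rows]

gold_rows = [
    ("A1", "Jewellery > Earrings > Drop"),
    ("B7", "Craft Supplies > Polymer Clay > Finished Items"),
]
print(coarse_labels(gold_rows))
```

Raising `depth` in later training rounds reuses the same audited labels, so the 5,000-listing gold set only needs to be labelled once at full depth.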
GigaGPU provides UK-based dedicated GPU servers ready for classification workloads. Add an AI chatbot for seller onboarding assistance, or expand into visual search to help buyers find products by image.