Two Thousand Identity Documents Per Day
A UK-regulated payments company onboards approximately 2,000 new customers per day across its consumer and business accounts. Each application requires identity verification: extracting data from passports, driving licences, and proof-of-address documents, then cross-checking extracted fields against the application form. The current third-party KYC API costs £1.80 per verification, which at 2,000 verifications per day comes to roughly £108,000 per 30-day month, and it sends customer identity documents to external servers outside the company’s direct control. The compliance team wants verification infrastructure that keeps all identity documents within UK-hosted systems.
GPU-accelerated document AI processes identity documents in under 2 seconds per document: PaddleOCR extracts text fields, a vision model classifies the document type and detects manipulation artefacts, and an LLM cross-validates extracted data against the application. All processing happens on a dedicated GPU server within private UK infrastructure with no external API calls for document data.
AI Architecture for ID Verification
The pipeline handles four input categories:
- Passports: MRZ (machine-readable zone) parsing combined with OCR of the visual inspection zone to extract name, nationality, date of birth, document number, and expiry date.
- Driving licences: layout-aware OCR maps text to the specific fields on UK DVLA, EU, and international formats.
- Proof of address: utility bills, council tax statements, and bank statements are parsed for name, address, and document date, with validation that the document date falls within the required three-month recency window.
- Selfie matching: a face embedding model compares the applicant’s selfie against the photo extracted from the identity document.
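The MRZ step for passports can be partly validated without any model, because the machine-readable zone carries check digits defined in ICAO Doc 9303. The following is a minimal sketch, assuming a TD3 (passport booklet) MRZ whose second line has already been read by OCR; the names `MrzLine2` and `parse_td3_line2` are hypothetical, and the sample line is the ICAO specimen rather than real customer data.

```python
from dataclasses import dataclass

WEIGHTS = (7, 3, 1)  # repeating weights used by ICAO Doc 9303 check digits


def char_value(ch: str) -> int:
    """Map an MRZ character to its numeric value: 0-9, A-Z -> 10-35, '<' -> 0."""
    if "0" <= ch <= "9":
        return ord(ch) - ord("0")
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10
    if ch == "<":
        return 0
    raise ValueError(f"invalid MRZ character: {ch!r}")


def check_digit(field: str) -> int:
    """Weighted sum of character values, modulo 10."""
    return sum(char_value(c) * WEIGHTS[i % 3] for i, c in enumerate(field)) % 10


def digit_matches(field: str, digit: str) -> bool:
    """True when the printed check digit agrees with the computed one."""
    return digit in "0123456789" and check_digit(field) == int(digit)


@dataclass
class MrzLine2:  # hypothetical container for this sketch
    document_number: str
    nationality: str
    date_of_birth: str  # YYMMDD as printed in the MRZ
    sex: str
    expiry_date: str  # YYMMDD as printed in the MRZ
    checks_ok: bool


def parse_td3_line2(line: str) -> MrzLine2:
    """Split the second TD3 MRZ line into fields and verify its check digits."""
    if len(line) != 44:
        raise ValueError("a TD3 MRZ line must be exactly 44 characters")
    document_number, dob, expiry = line[0:9], line[13:19], line[21:27]
    # The composite check covers the fields together with their own check digits.
    composite = line[0:10] + line[13:20] + line[21:43]
    checks_ok = (
        digit_matches(document_number, line[9])
        and digit_matches(dob, line[19])
        and digit_matches(expiry, line[27])
        and digit_matches(composite, line[43])
    )
    return MrzLine2(
        document_number=document_number.rstrip("<"),
        nationality=line[10:13].rstrip("<"),
        date_of_birth=dob,
        sex=line[20],
        expiry_date=expiry,
        checks_ok=checks_ok,
    )


if __name__ == "__main__":
    specimen = "L898902C36UTO7408122F1204159ZE184226B<<<<<10"
    print(parse_td3_line2(specimen))
```

A failed check digit usually means an OCR misread or a tampered MRZ; either way it is a reasonable trigger for re-scanning or human review before the fields are compared with the visual inspection zone.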
A cross-validation layer powered by the LLM server checks for consistency: does the name on the passport match the application? Is the date of birth consistent across documents? Does the address on the utility bill match the stated residential address? Inconsistencies are flagged with confidence scores and routed to the human review queue.
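The consistency questions above can be illustrated with a deterministic check that might run before, or alongside, the LLM. This is a minimal sketch and not the LLM-based layer itself: it swaps in Python's standard `difflib.SequenceMatcher` for fuzzy matching of free-text fields and requires exact matches for dates. The field names, the `EXACT_FIELDS` set, and the 0.9 threshold are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher

# Assumption: fields that must match exactly; a near-match on a date is still a mismatch.
EXACT_FIELDS = {"date_of_birth"}


def normalise(text: str) -> str:
    """Uppercase and collapse whitespace so formatting differences are ignored."""
    return " ".join(text.upper().split())


def similarity(field: str, expected: str, found: str) -> float:
    """Return a 0.0-1.0 score: exact comparison for strict fields, fuzzy otherwise."""
    a, b = normalise(expected), normalise(found)
    if field in EXACT_FIELDS:
        return 1.0 if a == b else 0.0
    return SequenceMatcher(None, a, b).ratio()


@dataclass
class FieldCheck:  # hypothetical result record for this sketch
    field: str
    score: float
    flagged: bool


def cross_validate(application: dict, extracted: dict, threshold: float = 0.9) -> list:
    """Compare each application field with the value extracted from a document."""
    checks = []
    for field, expected in application.items():
        score = similarity(field, expected, extracted.get(field, ""))
        checks.append(FieldCheck(field, round(score, 2), score < threshold))
    return checks


def needs_human_review(checks: list) -> bool:
    """Route the case to the review queue if any field is flagged."""
    return any(check.flagged for check in checks)


if __name__ == "__main__":
    application = {"name": "Anna Maria Eriksson", "date_of_birth": "1974-08-12"}
    passport = {"name": "ANNA MARIA  ERIKSSON", "date_of_birth": "1974-08-21"}
    results = cross_validate(application, passport)
    print(results, needs_human_review(results))
```

Treating dates as exact-match fields matters here: two dates that differ by a transposed digit are highly similar as strings, so a fuzzy score alone would let them through.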
GPU Requirements for KYC Processing
| GPU Model | VRAM | Verifications/Hour | Best For |
|---|---|---|---|
| RTX 5090 | 32 GB | ~400 | Under 1,000 verifications/day |
| RTX 6000 Pro | 48 GB | ~900 | 1,000–5,000 verifications/day |
| RTX 6000 Pro 96 GB | 96 GB | ~1,600 | High-volume platforms, 5,000+ daily |
The payments company processing 2,000 verifications daily completes the entire day’s volume in roughly 2.2 hours on an RTX 6000 Pro (2,000 ÷ ~900 per hour), leaving capacity for batch re-verification and periodic re-screening of existing customers.
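The sizing arithmetic can be reproduced for any daily volume. The sketch below simply divides daily volume by the approximate hourly throughput figures from the table; the function names are illustrative, and real throughput will depend on document mix and model configuration.

```python
# Approximate verifications per hour, copied from the table above.
GPU_THROUGHPUT_PER_HOUR = {
    "RTX 5090": 400,
    "RTX 6000 Pro": 900,
    "RTX 6000 Pro 96 GB": 1600,
}


def hours_for_daily_volume(daily_volume: int, per_hour: int) -> float:
    """Hours of continuous processing needed to clear one day's verifications."""
    if per_hour <= 0:
        raise ValueError("throughput must be positive")
    return daily_volume / per_hour


def daily_utilisation(daily_volume: int, per_hour: int) -> float:
    """Fraction of a 24-hour day the GPU spends on the daily volume."""
    return hours_for_daily_volume(daily_volume, per_hour) / 24


if __name__ == "__main__":
    for gpu, rate in GPU_THROUGHPUT_PER_HOUR.items():
        hours = hours_for_daily_volume(2000, rate)
        share = daily_utilisation(2000, rate)
        print(f"{gpu}: {hours:.1f} h/day ({share:.0%} of a 24 h day)")
```

Onboarding traffic tends to arrive in bursts rather than evenly, so peak-hour arrivals, not only the daily average, should be compared against the hourly throughput when choosing a tier.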
Recommended Software Stack
- OCR Engine: PaddleOCR (PP-OCRv4 models) with document-specific preprocessing for passport MRZ and licence formats
- Document Classification: Fine-tuned EfficientNet for ID type identification (passport, licence, utility bill)
- Face Matching: ArcFace embedding model for selfie-to-document photo comparison
- Fraud Detection: Image forensics pipeline for splice detection, font inconsistency, and digital manipulation
- Cross-Validation: Llama 3 8B for multi-document field consistency checking
- Integration: REST API for onboarding platform, webhook callbacks for verification status
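For the face matching component, the comparison step usually reduces to a similarity score between two embedding vectors. The sketch below assumes the embeddings have already been produced by the face model and shows only a cosine-similarity comparison in plain Python; `MATCH_THRESHOLD` is a hypothetical value that would need calibrating against labelled match and non-match pairs.

```python
import math
from typing import Sequence

# Hypothetical decision threshold; calibrate on labelled pairs before relying on it.
MATCH_THRESHOLD = 0.5


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def faces_match(
    selfie_embedding: Sequence[float],
    document_embedding: Sequence[float],
    threshold: float = MATCH_THRESHOLD,
) -> bool:
    """True when the selfie and document photo embeddings are similar enough."""
    return cosine_similarity(selfie_embedding, document_embedding) >= threshold


if __name__ == "__main__":
    # Toy 4-dimensional vectors; real face embeddings have far more dimensions.
    selfie = [0.12, 0.80, 0.35, 0.44]
    document_photo = [0.10, 0.78, 0.40, 0.41]
    print(round(cosine_similarity(selfie, document_photo), 3))
    print(faces_match(selfie, document_photo))
```

Scores close to the threshold are good candidates for the human review queue rather than an automatic accept or reject.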
Regulatory Compliance and Cost Analysis
The UK Money Laundering Regulations 2017 require firms to apply customer due diligence measures, including verifying customer identity. The FCA expects firms to maintain systems that can verify customer identity reliably while keeping appropriate records. Processing identity documents on GDPR-compliant dedicated infrastructure gives the firm direct control over data retention, access logging, and deletion schedules, which is harder to achieve when documents pass through a third-party API.
| Approach | Annual Cost (2,000/day) | Data Sovereignty |
|---|---|---|
| Third-party KYC API | ~£1,314,000 (£1.80 × 2,000/day × 365) | Data leaves your control |
| Cloud GPU processing | £24,000–£48,000 | Shared cloud infrastructure |
| GigaGPU RTX 6000 Pro Dedicated | From £6,588/yr | Full UK sovereignty |
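The per-verification economics can be checked from the figures stated in this article. The sketch below takes the £1.80 API fee, the 2,000 daily verifications, and the "From £6,588/yr" server price at face value; amounts are held in pence to avoid floating-point rounding, and the server figure is a starting price that excludes staffing, engineering, and review costs.

```python
DAYS_PER_YEAR = 365


def api_cost_pence(fee_pence: int, daily_volume: int, days: int) -> int:
    """Total per-verification fees over a period, in pence."""
    return fee_pence * daily_volume * days


def cost_per_verification_pence(annual_cost_pence: int, daily_volume: int) -> float:
    """Spread a fixed annual cost over a year of verifications."""
    return annual_cost_pence / (daily_volume * DAYS_PER_YEAR)


if __name__ == "__main__":
    fee_pence = 180          # £1.80 per verification, as stated in the article
    daily_volume = 2000      # verifications per day, as stated in the article
    server_pence = 658_800   # "From £6,588/yr" starting price from the table

    monthly = api_cost_pence(fee_pence, daily_volume, 30)
    yearly = api_cost_pence(fee_pence, daily_volume, DAYS_PER_YEAR)
    per_check = cost_per_verification_pence(server_pence, daily_volume)

    print(f"API, 30-day month: £{monthly / 100:,.0f}")
    print(f"API, full year:    £{yearly / 100:,.0f}")
    print(f"Dedicated server:  {per_check:.2f}p per verification")
```

Because the server cost is fixed, its per-verification figure falls as volume grows, whereas the API fee scales linearly with volume.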
Getting Started
Collect 500 sample identity documents (with appropriate consent) spanning the document types your applicants submit most frequently. Benchmark OCR accuracy per document type against manually verified ground truth. Target 97%+ field extraction accuracy on passports and 94%+ on utility bills before connecting to the live onboarding flow. Run in shadow mode alongside the existing KYC provider for 30 days, comparing results. Organisations also running document AI for other use cases can share the GPU server.
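The accuracy benchmark can be scripted so that it is rerun whenever the OCR configuration changes. The sketch below assumes predictions and manually verified ground truth are available as parallel lists of field dictionaries, one pair per document; the function names and the whitespace and case normalisation rule are assumptions, and the 97% and 94% targets are the ones given above.

```python
from collections import Counter

# Targets from the text above: 97%+ on passports, 94%+ on utility bills.
TARGETS = {"passport": 0.97, "utility_bill": 0.94}


def normalise(value: str) -> str:
    """Ignore case and spacing differences when comparing field values."""
    return " ".join(value.upper().split())


def field_accuracy(predictions: list, ground_truth: list) -> dict:
    """Per-field share of documents where the extracted value matches the truth."""
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground truth must be the same length")
    totals, correct = Counter(), Counter()
    for predicted, truth in zip(predictions, ground_truth):
        for field, expected in truth.items():
            totals[field] += 1
            if normalise(predicted.get(field, "")) == normalise(expected):
                correct[field] += 1
    return {field: correct[field] / totals[field] for field in totals}


def meets_target(accuracy: dict, target: float) -> bool:
    """True only when every field reaches the target accuracy."""
    return all(score >= target for score in accuracy.values())


if __name__ == "__main__":
    truth = [
        {"name": "Anna Eriksson", "document_number": "L898902C3"},
        {"name": "John Smith", "document_number": "X1234567"},
    ]
    predicted = [
        {"name": "ANNA ERIKSSON", "document_number": "L898902C3"},
        {"name": "John Smyth", "document_number": "X1234567"},
    ]
    accuracy = field_accuracy(predicted, truth)
    print(accuracy, meets_target(accuracy, TARGETS["passport"]))
```

The same comparison can be reused in shadow mode by treating the existing provider's output as the reference, keeping in mind that the provider itself may be wrong on some documents.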