The Challenge: Millions of Documents, Weeks to Review
A Magic Circle firm acting for a FTSE 100 client in major commercial litigation faces a disclosure obligation covering 2.3 million documents: emails, board minutes, contracts, internal memos, spreadsheets, and instant messages spanning a six-year period. Traditional review by teams of contract lawyers costs £25-£60 per reviewer-hour. At an average rate of 50 documents per hour, the exercise would require approximately 46,000 reviewer-hours: nearly £2 million in review costs and roughly four months of elapsed time that the court timetable does not allow.
Technology-assisted review (TAR) has been accepted by English courts since the landmark Pyrrho decision in 2016, but the firm's clients, particularly those in regulated industries, increasingly insist that privileged and confidential material must not pass through US-headquartered cloud platforms. A banking client's compliance team recently vetoed a proposed e-discovery platform specifically because document processing occurred in AWS's US-East region, raising GDPR data-transfer concerns and questions about whether legal professional privilege would survive exposure to cross-border subpoena powers.
AI Solution: LLM-Powered Relevance and Privilege Classification
An open-source LLM fine-tuned on legal document review decisions can classify documents across multiple dimensions simultaneously: relevance to pleaded issues, legal professional privilege, without-prejudice status, and regulatory sensitivity. Unlike keyword search or simple TAR classifiers, an LLM reads the document in context, understanding that a discussion of “the deal” in a 2019 email chain refers to the specific transaction at issue rather than an unrelated matter.
The pipeline begins by ingesting the 2.3 million documents in native format, with Apache Tika extracting text and metadata from each file. PaddleOCR handles scanned correspondence and handwritten notes. The LLM then processes each document, generating relevance scores, privilege flags, and issue codes mapped to the case's list of issues. Senior associates review only the borderline documents rather than the full population, cutting human review to perhaps 5% of the original volume.
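The per-document output and the borderline-routing step can be sketched as follows. This is a minimal illustration, not any particular platform's schema: the field names, the 0.2/0.8 confidence thresholds, and the rule that privilege flags always go to a human are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewDecision:
    """One document's AI classification (illustrative schema)."""
    doc_id: str
    relevance: float          # model confidence the doc is relevant to pleaded issues
    privileged: bool          # legal professional privilege flag
    without_prejudice: bool   # without-prejudice communication flag
    issue_codes: list[str] = field(default_factory=list)


def route(decision: ReviewDecision, low: float = 0.2, high: float = 0.8) -> str:
    """Send only borderline documents to human review.

    Privilege and without-prejudice flags are always human-checked;
    confidently scored documents are auto-coded; the uncertain middle
    band (low <= relevance <= high) goes to associates.
    """
    if decision.privileged or decision.without_prejudice:
        return "privilege-review"
    if decision.relevance < low:
        return "auto-not-relevant"
    if decision.relevance > high:
        return "auto-relevant"
    return "human-review"
```

With thresholds like these, only the middle band plus all privilege-flagged material reaches a reviewer, which is where the "perhaps 5%" figure comes from.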
GPU Requirements: Processing Millions of Documents
Document review inference involves feeding the LLM each document’s text (typically 200-2,000 tokens per email, up to 10,000 tokens for contracts) and generating a structured classification output. The sheer volume — 2.3 million documents — demands sustained high throughput over several days.
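Concretely, each inference call wraps the document text in a classification prompt and parses a structured reply. A sketch under stated assumptions: the prompt wording is illustrative (a real template would reference the matter's pleaded issues and coding manual), and the ~4-characters-per-token truncation is a rough heuristic for the 10,000-token ceiling mentioned above.

```python
import json

# Illustrative prompt; the real template would cite the matter's
# pleaded issues and coding manual.
CLASSIFY_PROMPT = """You are assisting with disclosure review.
Read the document below and reply with JSON only, using the keys:
"relevance" (0.0-1.0), "privileged" (true/false), "issue_codes" (list of strings).

Document:
{text}
"""


def build_prompt(text: str, max_chars: int = 40_000) -> str:
    """Truncate very long documents (roughly 10,000 tokens at ~4 chars
    per token) so the prompt stays within the model's context window."""
    return CLASSIFY_PROMPT.format(text=text[:max_chars])


def parse_classification(raw: str) -> dict:
    """Parse the model's reply, tolerating a Markdown code fence around
    the JSON, which instruction-tuned models often emit."""
    raw = raw.strip()
    if raw.startswith("```"):
        raw = raw.strip("`").removeprefix("json").strip()
    return json.loads(raw)
```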
| GPU Model | VRAM | Documents per Hour | 2.3M Documents |
|---|---|---|---|
| NVIDIA RTX 5090 | 32 GB | ~1,800 | ~53 days |
| NVIDIA RTX 6000 Pro | 48 GB | ~2,500 | ~38 days |
| NVIDIA RTX 6000 Pro 96 GB | 96 GB | ~4,200 | ~23 days |
| 2x NVIDIA RTX 6000 Pro 96 GB | 192 GB | ~8,000 | ~12 days |
For the litigation timetable, a dual RTX 6000 Pro configuration on GigaGPU dedicated hosting completes the full review in under two weeks — comparable to managed e-discovery platforms but with all documents remaining on UK infrastructure the firm controls.
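The elapsed-time column follows directly from throughput, assuming the run continues around the clock:

```python
def days_to_review(total_docs: int, docs_per_hour: float,
                   hours_per_day: float = 24) -> float:
    """Elapsed days at a sustained throughput, assuming 24/7 operation."""
    return total_docs / docs_per_hour / hours_per_day


# 2.3M documents at the dual-GPU rate of ~8,000 docs/hour:
# 2,300,000 / 8,000 / 24 ≈ 12 days
```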
Recommended Stack
- LLaMA 3 8B or Mistral 7B-Instruct fine-tuned on labelled disclosure review decisions from prior matters.
- vLLM for high-throughput batch inference with continuous batching — critical for sustained multi-day processing runs.
- Apache Tika for document format extraction (emails, PDFs, Office documents, archives).
- PaddleOCR for scanned and image-embedded documents.
- Elasticsearch for keyword and metadata search alongside AI classification, enabling hybrid workflows.
- Streamlit or Retool review interface for associates to validate AI classifications and log review decisions.
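For a run measured in days, the inference loop must be resumable. A minimal sketch of an append-only JSONL checkpoint (the file format is an assumption, and the model name in the comment is illustrative); vLLM's continuous batching handles GPU-side scheduling once prompts are submitted:

```python
import json
import os


def pending_docs(all_docs, checkpoint_path):
    """Yield (doc_id, text) pairs not yet present in the checkpoint file,
    so an interrupted multi-day run resumes where it stopped."""
    done = set()
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = {json.loads(line)["doc_id"] for line in f if line.strip()}
    for doc_id, text in all_docs:
        if doc_id not in done:
            yield doc_id, text


def record_result(result: dict, checkpoint_path: str) -> None:
    """Append one classification as a JSON line; append-only writes keep
    the checkpoint recoverable even if the process dies mid-run."""
    with open(checkpoint_path, "a") as f:
        f.write(json.dumps(result) + "\n")


# The GPU-side call (illustrative model name) sits between these two steps:
#   from vllm import LLM, SamplingParams
#   llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.3")
#   outs = llm.generate(prompts, SamplingParams(temperature=0.0, max_tokens=256))
```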
An AI chatbot interface allows the litigation team to interrogate the document population in natural language: “Show me all board meeting minutes from 2020 where the acquisition was discussed and at least one external advisor was present.”
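Behind such a question, the chatbot compiles a structured Elasticsearch filter over indexed metadata and the classifier's issue codes. A sketch of what the example query above might compile to; every field name here is an assumption about the index mapping, not a fixed schema:

```python
# Hypothetical index fields: doc_type, doc_date, participant_roles,
# and ai_issue_codes written back by the LLM classifier.
nl_question = ("Show me all board meeting minutes from 2020 where the "
               "acquisition was discussed and at least one external "
               "advisor was present.")

es_query = {
    "bool": {
        "filter": [
            {"term": {"doc_type": "board_minutes"}},
            {"range": {"doc_date": {"gte": "2020-01-01", "lte": "2020-12-31"}}},
            {"term": {"ai_issue_codes": "acquisition"}},
            {"term": {"participant_roles": "external_advisor"}},
        ]
    }
}
```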
Cost vs. Alternatives
Managed e-discovery platforms (Relativity, Everlaw, Reveal) charge per-GB-per-month hosting fees plus processing charges that typically total £150,000-£400,000 for a 2.3 million document matter. Contract lawyer review at £35/hour for 46,000 hours adds another £1.6 million. A self-hosted LLM approach on dedicated GPU reduces human review to approximately 115,000 documents (the 5% borderline set), saving over £1.2 million in review costs alone.
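The headline saving reduces to simple arithmetic on the figures above (50 documents per reviewer-hour at £35/hour):

```python
def review_cost_gbp(docs: int, docs_per_hour: int = 50,
                    rate_gbp: int = 35) -> float:
    """Human review cost at a given throughput and hourly rate."""
    return docs / docs_per_hour * rate_gbp


full = review_cost_gbp(2_300_000)      # 46,000 hours -> £1,610,000
borderline = review_cost_gbp(115_000)  # the 5% set   -> £80,500
saving = full - borderline             # £1,529,500, comfortably over £1.2M
```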
The privilege protection argument may matter even more than cost. With self-hosted inference, privileged material never enters a third party’s infrastructure. The firm maintains an unbroken chain of custody that is straightforward to evidence in any subsequent privilege dispute.
Getting Started
Take a sample of 5,000 documents from a recently completed disclosure exercise where human review decisions are available. Benchmark the LLM classifier against those decisions. Measure precision and recall for relevance and privilege classifications separately. Most firms find that two rounds of fine-tuning on their own review data achieve recall exceeding 90%, sufficient for defensible TAR under English law.
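The benchmark needs nothing more than standard precision and recall over the human decisions; a dependency-free sketch:

```python
def precision_recall(predicted: list[bool], actual: list[bool]) -> tuple[float, float]:
    """Compare model flags against human review decisions.

    For privilege, recall is the critical number: a false negative
    means a privileged document risks being disclosed.
    """
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Run it once per dimension (relevance, privilege) rather than pooling the labels, since the acceptable error profile differs between the two.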
GigaGPU offers private AI hosting with the sustained compute that large-scale document review demands. Every client document stays within UK data centres, no privileged material leaves British soil, and the firm retains complete control of the processing infrastructure.
See Dedicated GPU Plans