For production AI deployments, log retention matters: incident response needs hot logs, analytics needs warm logs, compliance needs cold logs. Three-tier storage is the cost-efficient pattern.
Hot (~30 days): Loki / Elasticsearch for incident-response query speed. Warm (~1 year): Parquet on S3 + ClickHouse external tables for analytics. Cold (~7 years): Glacier-tier object storage for compliance-only access. A daily retention job moves logs between tiers. Total storage cost is roughly 10% of keeping everything hot.
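The cost comparison can be checked against your own prices with a few lines of arithmetic. The following is a rough sketch, not a pricing model: the function name and parameters are hypothetical, it assumes warm and cold data are both stored in compressed Parquet, and it ignores query, request, and retrieval charges.

```python
def tiered_cost_ratio(
    hot_price: float,
    warm_price: float,
    cold_price: float,
    compression: float = 10.0,  # assumed: low end of the 10-20x Parquet saving
    hot_days: int = 30,
    warm_days: int = 335,       # day 31 through day 365
    cold_days: int = 2190,      # year 2 through year 7
) -> float:
    """Steady-state storage cost of tiered retention relative to all-hot.

    Prices are per GB per month in each tier (hypothetical inputs; supply
    your own). `compression` is how much smaller warm/cold data is than hot.
    """
    if min(hot_price, warm_price, cold_price, compression) <= 0:
        raise ValueError("prices and compression must be positive")
    total_days = hot_days + warm_days + cold_days
    # Daily log volume cancels out, so cost is proportional to days held x price.
    tiered = (
        hot_days * hot_price
        + warm_days * warm_price / compression
        + cold_days * cold_price / compression
    )
    all_hot = total_days * hot_price
    return tiered / all_hot
```

The ratio you get depends heavily on what the hot tier really costs per GB once compute and replicas are included, so treat the result as an estimate to compare against the ~10% figure rather than a guarantee.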
Tiers
- Hot (0-30 days): Loki / Elasticsearch / OpenSearch. Sub-second query response. Used for incident response, recent debugging, dashboards.
- Warm (30-365 days): Parquet files on S3-compatible storage. Queryable via ClickHouse external tables, DuckDB, or Athena. Slower but much cheaper. Used for cost analytics, trend analysis, eval drift detection.
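- Cold (1-7 years): S3 Glacier / Azure Archive / GCS Coldline. Retrieval from S3 Glacier and Azure Archive takes minutes to hours (up to 48 hours for S3 Glacier Deep Archive bulk retrievals); GCS Coldline reads are immediate but incur retrieval fees. Used for compliance audit retrieval; never queried in normal ops.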
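The tier boundaries above can be expressed as a small lookup on log age. This is a minimal sketch: the constant and function names are illustrative, and it assumes a log that is exactly 30 (or 365) days old still belongs to the younger tier.

```python
from datetime import date

# Tier boundaries in days, taken from the list above.
HOT_MAX_DAYS = 30
WARM_MAX_DAYS = 365
COLD_MAX_DAYS = 7 * 365


def tier_for(log_date: date, today: date) -> str:
    """Return the tier a log written on `log_date` belongs in as of `today`."""
    age_days = (today - log_date).days
    if age_days < 0:
        raise ValueError("log_date is in the future")
    if age_days <= HOT_MAX_DAYS:
        return "hot"
    if age_days <= WARM_MAX_DAYS:
        return "warm"
    if age_days <= COLD_MAX_DAYS:
        return "cold"
    # Past the cold window: eligible for deletion.
    return "expired"
```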
Compliance
Retention windows by domain:
- SOC 2: typically 90 days hot, 1 year accessible
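- PCI-DSS: 1 year audit logs minimum, with the most recent 3 months immediately available
- HIPAA: 6 years (the documentation retention period, commonly applied to audit logs as well)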
- UK financial services (FCA): 5-7 years for client communications
- NHS DSPT: depends on data type; 8-30 years for clinical records
Set retention per-deployment based on regulated-data scope. GDPR right-to-erasure requires being able to delete subject data — verify your tiered storage supports targeted deletion at all tiers.
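One way to set retention per deployment is to take the longest window among the frameworks in scope. The sketch below assumes the figures from the list above; the table, keys, and function name are hypothetical, and the numbers are starting points to confirm with your compliance team rather than legal guidance. NHS DSPT has no single value, so it is supplied through `overrides`.

```python
# Minimum retention in days per framework, taken from the list above.
RETENTION_DAYS = {
    "soc2": 365,
    "pci-dss": 365,
    "hipaa": 6 * 365,
    "fca": 7 * 365,  # upper end of the 5-7 year range
}


def required_retention_days(frameworks, overrides=None):
    """Return the longest retention window among the frameworks in scope."""
    table = dict(RETENTION_DAYS)
    # Overrides add or replace entries, e.g. a data-type-specific NHS window.
    table.update({k.lower(): v for k, v in (overrides or {}).items()})
    names = [f.lower() for f in frameworks]
    unknown = [n for n in names if n not in table]
    if unknown:
        raise KeyError(f"no retention window defined for: {unknown}")
    return max((table[n] for n in names), default=0)
```

For example, a deployment in scope for both SOC 2 and HIPAA would call `required_retention_days(["soc2", "hipaa"])` and size its cold tier to the HIPAA window.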
Ops
- Daily retention job: moves logs older than 30 days from hot to warm, and logs older than 365 days from warm to cold
- Format conversion: JSON in hot, Parquet in warm (10-20× smaller, queryable)
- Encryption at rest: server-side encryption for all tiers
- Indexing: hot has full inverted index; warm has columnar Parquet; cold has only object metadata
- GDPR deletion: tooling to find + delete subject data across all tiers
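The planning step of the daily retention job can be sketched as a pure function that decides which daily partitions need to move. This assumes logs are partitioned by day and that some catalogue records each partition's current tier; `Partition` and `plan_moves` are hypothetical names, and the actual copy, Parquet conversion, and delete steps are left out.

```python
from datetime import date
from typing import NamedTuple


class Partition(NamedTuple):
    day: date  # the day the logs were written
    tier: str  # current tier: "hot", "warm", or "cold"


def plan_moves(partitions, today, hot_days=30, warm_days=365):
    """Return (partition, target_tier) pairs for partitions that have aged out."""
    moves = []
    for p in partitions:
        age = (today - p.day).days
        if p.tier == "hot" and age > hot_days:
            # Hot data always goes through warm so it gets converted to Parquet.
            moves.append((p, "warm"))
        elif p.tier == "warm" and age > warm_days:
            moves.append((p, "cold"))
    return moves
```

Because the plan is derived from age and current tier rather than from "what ran yesterday", rerunning the job after a missed day simply picks up everything that is overdue.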
Verdict
Three-tier log storage is the right pattern for production AI: hot for ops, warm for analytics, cold for compliance. Total cost is roughly 10% of all-hot, and a 7-year cold tier covers most of the frameworks listed above; longer windows, such as clinical records under NHS DSPT, need an extended cold tier. Set this up on day one of production logging.
Bottom line
Hot / warm / cold tiered storage. See structured logging.