June 05, 2024 · 4 min read

Migrating from Redshift to Apache Iceberg: What We Learned

A detailed account of migrating a 50TB Redshift warehouse to an open lakehouse on Apache Iceberg + S3, saving roughly $800K/year and improving query performance 3–6x.

Apache Iceberg · Lakehouse · AWS · Data Architecture

We migrated a 50TB Redshift warehouse to an Apache Iceberg lakehouse on S3. Here's the honest story — what worked, what didn't, and what we'd do differently.

Why We Migrated

The business case was clear: Redshift was costing $85K/month. The technical case was compelling: we were hitting sort key limitations, vacuum times were growing, and cross-team data sharing required expensive Redshift data shares.

Apache Iceberg offered open table format (no vendor lock-in), hidden partitioning, schema evolution, time-travel, and storage decoupled from compute — all things Redshift couldn't match.

The Architecture

Our target architecture:

  • Storage: S3 (Parquet files managed by Iceberg)
  • Catalog: AWS Glue Data Catalog (Iceberg-native metadata)
  • Compute (batch): AWS EMR with Spark
  • Compute (interactive): Amazon Athena v3 (native Iceberg support)
  • Orchestration: Apache Airflow on EKS
  • Transformation: dbt with the dbt-spark adapter

Iceberg's table format manages file manifests, partition specs, and snapshots entirely in S3. The Glue Catalog stores the table metadata. Any engine that speaks Iceberg (Spark, Athena, Trino, Flink) can read and write the same tables.
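A toy sketch of that metadata tree can make the commit model concrete. This is illustrative Python only, not the real layout (which is defined by the Iceberg table spec as JSON and Avro files in S3); the class and field names here are our own:

```python
from dataclasses import dataclass, field

# Simplified model of Iceberg's metadata tree (illustration only; the real
# format is defined by the Iceberg table spec and stored as JSON/Avro in S3).
@dataclass
class ManifestFile:
    path: str        # s3://.../manifest-*.avro
    data_files: list # Parquet files this manifest tracks

@dataclass
class Snapshot:
    snapshot_id: int
    timestamp_ms: int
    manifests: list  # manifest files making up this table version

@dataclass
class TableMetadata:
    location: str       # table root in S3
    partition_spec: str # e.g. "days(event_ts), bucket(16, user_id)"
    snapshots: list = field(default_factory=list)
    current_snapshot_id: int = -1

    def commit(self, snapshot: Snapshot) -> None:
        # A commit appends a new snapshot and swaps the current pointer;
        # the catalog (Glue, in our case) stores only this pointer.
        self.snapshots.append(snapshot)
        self.current_snapshot_id = snapshot.snapshot_id

meta = TableMetadata("s3://lake/events", "days(event_ts), bucket(16, user_id)")
meta.commit(Snapshot(1, 1717200000000,
                     [ManifestFile("s3://lake/events/m1.avro", ["f1.parquet"])]))
meta.commit(Snapshot(2, 1717286400000,
                     [ManifestFile("s3://lake/events/m2.avro", ["f1.parquet", "f2.parquet"])]))
print(meta.current_snapshot_id)  # 2; readers always resolve through this pointer
```

Because every reader resolves the table through that single pointer, any Iceberg-aware engine sees the same consistent snapshot.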

Migration Strategy: Dual-Write

We used a dual-write strategy to migrate with zero downtime:

1. Shadow tables — created Iceberg equivalents for each Redshift table
2. Dual-write pipelines — all new data written to Redshift AND Iceberg in parallel
3. Backfill — migrated historical data in partitioned chunks (30-day windows)
4. Validation — compared row counts and aggregates between systems for 2 weeks
5. Cutover — redirected BI tools to Athena/Trino queries, shut down Redshift writes
6. Decommission — final Redshift shutdown after 30-day rollback window
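The validation step amounts to a comparison loop over daily partitions. A minimal sketch, with hypothetical fetch callables standing in for the real Redshift and Athena queries:

```python
# Hypothetical sketch of step 4: compare per-day row counts and aggregates
# between the two systems. The fetch callables stand in for real queries.
from datetime import date, timedelta

def compare_partitions(fetch_redshift, fetch_iceberg, start: date, days: int,
                       tolerance: float = 0.0) -> list:
    """Return the list of dates where the two systems disagree."""
    mismatches = []
    for i in range(days):
        d = start + timedelta(days=i)
        rs = fetch_redshift(d)  # e.g. (row_count, sum_of_amount)
        ib = fetch_iceberg(d)
        if rs[0] != ib[0] or abs(rs[1] - ib[1]) > tolerance:
            mismatches.append(d)
    return mismatches

# Toy stand-ins for the two warehouses:
rs_data = {date(2024, 5, 1): (100, 5000.0), date(2024, 5, 2): (200, 9000.0)}
ib_data = {date(2024, 5, 1): (100, 5000.0), date(2024, 5, 2): (199, 9000.0)}
bad = compare_partitions(rs_data.get, ib_data.get, date(2024, 5, 1), 2)
print(bad)  # [datetime.date(2024, 5, 2)]
```

In practice the mismatch list feeds an alert, and any flagged partition gets re-backfilled before cutover.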

The dual-write phase ran for 6 weeks total. This was longer than planned — schema inconsistencies between Redshift and Iceberg types required careful mapping (especially DECIMAL precision and TIMESTAMP timezone handling).
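A hedged sketch of the kind of type-mapping table that bites here (the mapping function is our own illustration, not part of any Iceberg tooling): DECIMAL must carry its precision and scale across, and zone-naive TIMESTAMP maps to a different Iceberg type than TIMESTAMPTZ.

```python
import re

# Illustrative Redshift -> Iceberg type mapping. The tricky cases called out
# above: DECIMAL keeps explicit precision/scale, and Redshift TIMESTAMP
# (no zone) vs TIMESTAMPTZ land on different Iceberg types.
def map_redshift_type(rs_type: str) -> str:
    t = rs_type.strip().upper()
    m = re.match(r"(?:DECIMAL|NUMERIC)\((\d+),\s*(\d+)\)", t)
    if m:
        return f"decimal({m.group(1)},{m.group(2)})"  # precision must carry over
    fixed = {
        "SMALLINT": "int", "INTEGER": "int", "BIGINT": "long",
        "REAL": "float", "DOUBLE PRECISION": "double",
        "BOOLEAN": "boolean", "DATE": "date",
        "TIMESTAMP": "timestamp",      # zone-naive -> Iceberg timestamp
        "TIMESTAMPTZ": "timestamptz",  # zone-aware -> Iceberg timestamptz
    }
    if t in fixed:
        return fixed[t]
    if t.startswith("VARCHAR") or t.startswith("CHAR"):
        return "string"  # Iceberg has no length-bounded string types
    raise ValueError(f"unmapped Redshift type: {rs_type}")

print(map_redshift_type("DECIMAL(18,4)"))  # decimal(18,4)
print(map_redshift_type("TIMESTAMPTZ"))    # timestamptz
```

Failing loudly on unmapped types turned out to matter: silently defaulting to string hides precision loss until validation catches it weeks later.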

Iceberg Features That Changed How We Work

Hidden partitioning is transformative. In Redshift, analysts had to remember WHERE event_date = ... to avoid full scans. In Iceberg, you define the partition spec once on the table, and the engine automatically prunes — analysts never think about it.

```sql
-- Partition spec defined once at table creation
CREATE TABLE events (
    event_id   BIGINT,
    user_id    BIGINT,
    event_type STRING,
    event_ts   TIMESTAMP,
    properties MAP<STRING, STRING>
)
PARTITIONED BY (days(event_ts), bucket(16, user_id))
```
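The "hidden" part is that partition values are derived from column values by transforms. A rough Python sketch of the two transforms above; note the bucket hash here is an illustrative stand-in, since real Iceberg specifies a 32-bit Murmur3 hash:

```python
from datetime import datetime, timezone
import zlib

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def days_transform(ts: datetime) -> int:
    """days(event_ts): whole days since the Unix epoch."""
    return (ts - EPOCH).days

def bucket_transform(value: int, n_buckets: int = 16) -> int:
    """bucket(16, user_id): hash the value into one of n buckets
    (illustrative hash, not Iceberg's Murmur3)."""
    return zlib.crc32(str(value).encode()) % n_buckets

ts = datetime(2024, 6, 1, 12, 30, tzinfo=timezone.utc)
print(days_transform(ts))       # 19875
print(bucket_transform(12345))  # a stable bucket in [0, 16)
```

Because the engine knows the transform, a predicate on raw event_ts is enough for it to prune partitions; analysts never filter on a derived partition column.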

Schema evolution with no rewrites. Adding a nullable column in Redshift requires an ALTER TABLE that locks large tables. In Iceberg, column additions are instant metadata-only operations.

Time travel has become a production debugging tool. When an analyst reports "the numbers changed overnight", we can query the table as of any snapshot:

```sql
-- As-of timestamp
SELECT * FROM events
FOR TIMESTAMP AS OF TIMESTAMP '2024-06-01 00:00:00'

-- As-of snapshot ID
SELECT * FROM events
FOR VERSION AS OF 5432198765
```
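Under the hood, an as-of query just resolves against the snapshot log: the engine picks the latest snapshot committed at or before the requested time. A toy version of that lookup (our own sketch, not engine code):

```python
# Sketch of the snapshot resolution behind "AS OF": pick the latest snapshot
# whose commit time is at or before the requested timestamp.
def snapshot_as_of(snapshots, ts_ms):
    """snapshots: list of (snapshot_id, commit_ts_ms) pairs, oldest first."""
    candidates = [s for s in snapshots if s[1] <= ts_ms]
    if not candidates:
        raise ValueError("no snapshot exists at or before that time")
    return max(candidates, key=lambda s: s[1])[0]

log = [(101, 1_717_200_000_000),   # committed first
       (102, 1_717_286_400_000),
       (103, 1_717_372_800_000)]   # committed last
print(snapshot_as_of(log, 1_717_300_000_000))  # 102
```

This is also why EXPIRE_SNAPSHOTS limits how far back you can travel: once a snapshot is expired, it drops out of the log and its data files become eligible for deletion.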

Performance Results

After tuning, query performance improved significantly:

| Query Type | Redshift | Iceberg (Athena) | Improvement |
|---|---|---|---|
| Single-day event scan | 8.2s | 1.4s | 5.8x |
| 90-day aggregation | 42s | 11s | 3.8x |
| Cross-table join (large) | 95s | 28s | 3.4x |
| Point lookup by user_id | 3.1s | 0.9s | 3.4x |

Key tuning decisions: Z-ordering by user_id + event_type, target_file_size_bytes = 256MB, Snappy compression, and regular OPTIMIZE + EXPIRE_SNAPSHOTS maintenance jobs.
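The compaction side of those maintenance jobs is essentially bin-packing small files toward the target file size. A greedy sketch of the planning step (an illustration of the idea; in practice Iceberg's rewrite_data_files procedure does the real work):

```python
# Greedy compaction planning: pack small data files into rewrite groups of
# roughly target_file_size_bytes each.
TARGET = 256 * 1024 * 1024  # 256 MB, matching target_file_size_bytes

def plan_compaction(file_sizes, target=TARGET):
    """Group file sizes (bytes) so each rewrite group is about target bytes."""
    groups, current, current_size = [], [], 0
    for size in sorted(file_sizes, reverse=True):
        if current and current_size + size > target:
            groups.append(current)        # close the full group
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        groups.append(current)
    return groups

sizes_mb = [200, 90, 60, 40, 30, 10]  # stated in MB for readability
groups = plan_compaction([s * 1024 * 1024 for s in sizes_mb])
print([[s // (1024 * 1024) for s in g] for g in groups])  # [[200], [90, 60, 40, 30, 10]]
```

Running this kind of planning on a schedule, rather than after small files pile up, is exactly the lesson we learned the hard way (see below).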

Cost Results

Monthly compute + storage cost went from $85K (Redshift) to $18K (S3 + EMR + Athena). The $67K/month (~$800K/year) saving came from:

  • S3 storage at ~$0.02/GB vs Redshift's ~$0.25/GB managed storage
  • Athena pay-per-query with aggressive data pruning
  • EMR Spot instances for Spark batch jobs (70% discount vs On-Demand)
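The storage line item is easy to sanity-check from the approximate per-GB rates above, applied to the 50TB warehouse (rates are the article's rough figures, not exact AWS pricing):

```python
# Back-of-envelope check on the storage portion of the saving.
TB = 1024  # GB per TB
warehouse_gb = 50 * TB

s3_monthly = warehouse_gb * 0.02        # ~$0.02/GB-month on S3
redshift_monthly = warehouse_gb * 0.25  # ~$0.25/GB-month managed storage

print(round(s3_monthly))        # 1024
print(round(redshift_monthly))  # 12800
# Storage alone accounts for roughly $11.8K/month of the saving;
# the remainder comes from the compute changes above.
```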

What We'd Do Differently

Catalog choice matters more than we thought. Glue Catalog is convenient on AWS but has limitations: no multi-region replication, API rate limits at scale, and vendor lock-in. Next time we'd evaluate Project Nessie or Apache Polaris for catalog portability.

Decide your compaction strategy upfront. Small file accumulation is Iceberg's #1 operational problem. We retrofitted compaction jobs after launch. Plan your OPTIMIZE maintenance schedule before go-live.

Test Trino vs Athena. We went all-in on Athena but later found Trino on EMR Serverless was cheaper for long-running analytical queries. Benchmark your actual query patterns before committing.

Overall: the migration was worth it. Lower cost, better performance, and the open table format gives us engine flexibility we didn't have with Redshift.