Client Work
Every project below delivered a measurable business outcome: cost savings, hours recovered, or revenue enabled. This is what "Future-Proof" data looks like in practice.
Featured Engagements
10M events/day Kafka → Spark Streaming Pipeline
Exactly-once · Late-event watermarks · Sub-5s end-to-end latency · AWS EMR
Production Kafka → Spark Structured Streaming pipeline at 10M+ events/day with exactly-once delivery to Delta Lake. Implemented watermark-based late-event handling (30-min tolerance), idempotent MERGE upserts, and a dead-letter queue with automatic replay. Reduced end-to-end data latency from 8 minutes to under 5 seconds.
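A representative sketch of the core pattern, combining the 30-minute watermark with an idempotent MERGE inside foreachBatch. Broker, topic, schema, and table names here are illustrative placeholders, not the client's:

```python
from pyspark.sql import SparkSession, functions as F
from delta.tables import DeltaTable

spark = SparkSession.builder.appName("events-pipeline").getOrCreate()

schema = "event_id STRING, ts TIMESTAMP, payload STRING"  # assumed event shape

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # placeholder broker
    .option("subscribe", "events")                      # placeholder topic
    .load()
    .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
    .select("e.*")
    .withWatermark("ts", "30 minutes")                  # 30-min late-event tolerance
    .dropDuplicates(["event_id", "ts"])                 # bounded-state streaming dedup
)

def upsert(batch_df, batch_id):
    # MERGE on the event key makes DLQ replays and retries idempotent:
    # re-delivered events update in place instead of duplicating rows.
    target = DeltaTable.forName(batch_df.sparkSession, "bronze.events")
    (target.alias("t")
        .merge(batch_df.alias("s"), "t.event_id = s.event_id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute())

(events.writeStream
    .foreachBatch(upsert)
    .option("checkpointLocation", "s3://bucket/checkpoints/events")  # placeholder path
    .trigger(processingTime="2 seconds")
    .start())
```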
Enterprise Lakehouse — Databricks Medallion Architecture
50+ siloed AWS Glue jobs → Bronze/Silver/Gold · Unity Catalog · 60% faster pipelines
Replaced a fragmented multi-warehouse topology (50+ isolated AWS Glue jobs, 8 engineering teams, no shared catalog) with a unified Delta Lake medallion architecture on Databricks. Unity Catalog for governance, automated schema contracts via dbt, Photon-powered Gold layer for BI and ML. Pipeline runtime dropped 60%, schema conflicts eliminated.
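A minimal sketch of a Bronze → Silver promotion job in this layout, assuming Unity Catalog three-level naming; the catalog, schema, and column names are hypothetical:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Bronze: raw ingested records, schema-on-read.
bronze = spark.read.table("main.bronze.orders_raw")

# Silver: enforce the dbt-style contract (required key, typed columns, deduped).
silver = (
    bronze
    .filter(F.col("order_id").isNotNull())
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .dropDuplicates(["order_id"])
)

silver.write.format("delta").mode("overwrite").saveAsTable("main.silver.orders")
```

Gold-layer aggregates then read only from Silver, so every team consumes one governed, contract-checked version of each entity.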
More Client Work
100TB Warehouse Migration — Redshift & Oracle → Snowflake + BigQuery
Dual-write validation · Zero downtime · p95 query time 42s → 11s
Led the migration of 100+ TB from on-premises Oracle and legacy AWS Redshift to Snowflake and BigQuery using a dual-write validation strategy. Re-modelled the physical layer (micro-partition clustering, column ordering, incremental ELT with dbt). Achieved a 70% query-performance improvement and 40% cost reduction with a zero-downtime cutover.
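The dual-write approach works because both warehouses receive every write during the transition, so they can be fingerprinted against each other before cutover. A hedged sketch of that parity check (connection details, table list, and the `amount` checksum column are all placeholders):

```python
import psycopg2                      # legacy Redshift (Postgres wire protocol)
import snowflake.connector           # target warehouse

PARITY_SQL = "SELECT COUNT(*), COALESCE(SUM(amount), 0) FROM {table}"

def fingerprint(conn, table):
    with conn.cursor() as cur:
        cur.execute(PARITY_SQL.format(table=table))
        count, total = cur.fetchone()
        return int(count), round(float(total), 2)   # normalise driver-specific types

legacy = psycopg2.connect(host="redshift.internal", dbname="dw",
                          user="etl", password="...")
target = snowflake.connector.connect(account="acme", user="etl",
                                     password="...", database="DW", schema="PUBLIC")

for table in ("orders", "payments", "customers"):
    if fingerprint(legacy, table) != fingerprint(target, table):
        raise SystemExit(f"Parity check failed on {table}; cutover blocked")
```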
ML Feature Store — 1,000+ Features, p99 < 8ms Online Serving
Dual-mode offline/online · Point-in-time correct · Zero training-serving skew
Centralised dual-mode feature platform on Databricks: Delta Lake offline store (point-in-time correct for training) and Redis online store (p99 < 8ms for inference) backed by identical feature definitions. Eliminated training-serving skew across 4 ML teams, cut feature engineering time from days to hours.
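A simplified sketch of the dual-publish step that keeps both stores consistent: the same computed rows are appended to the Delta offline store with their event timestamps, while only the latest value per entity is pushed to Redis. Table names and the Redis key scheme are illustrative:

```python
import json
import redis
from pyspark.sql import SparkSession, Window, functions as F

spark = SparkSession.builder.getOrCreate()

# entity_id, event_ts, plus the computed feature columns.
features = spark.read.table("staging.customer_features")

# Offline store: append full history so training joins can be replayed
# point-in-time correctly against any label timestamp.
features.write.format("delta").mode("append").saveAsTable("fs.customer_features")

# Online store: latest value per entity only, keyed for O(1) lookup at inference.
latest = (features
    .withColumn("rn", F.row_number().over(
        Window.partitionBy("entity_id").orderBy(F.col("event_ts").desc())))
    .filter("rn = 1").drop("rn"))

r = redis.Redis(host="redis.internal", port=6379)
for row in latest.toLocalIterator():        # a bulk pipeline writer belongs here at scale
    r.set(f"features:{row['entity_id']}", json.dumps(row.asDict(), default=str))
```

Because both stores are derived from the same feature definitions in one job, training and serving can never drift apart, which is what eliminates the skew.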
Real-Time Credit Decisioning — 48h Batch → < 2min Streaming
Kafka + PySpark micro-batch feature engineering · 100K+ apps/day · 95%+ accuracy
Replaced overnight batch credit scoring with a Kafka-driven real-time pipeline. PySpark micro-batch feature engineering computes 200+ credit risk signals in real time, integrated with a REST model serving layer. Reduced decisioning latency from 48 hours to under 2 minutes while maintaining 95%+ model accuracy at 100K+ applications/day.
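An illustrative micro-batch skeleton of this flow: each trigger computes risk signals for applications that arrived since the last batch and posts them to the serving layer. The topic, schema, single example signal, and `/score` endpoint are assumptions, not the production definitions:

```python
import requests
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

apps = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # placeholder broker
    .option("subscribe", "loan-applications")           # placeholder topic
    .load()
    .select(F.from_json(F.col("value").cast("string"),
        "app_id STRING, income DOUBLE, debt DOUBLE, ts TIMESTAMP").alias("a"))
    .select("a.*")
)

def score_batch(batch_df, batch_id):
    # One illustrative signal; the production job derives 200+ of these.
    signals = (batch_df
        .withColumn("debt_to_income", F.col("debt") / F.col("income"))
        .withColumn("ts", F.col("ts").cast("string")))   # JSON-safe timestamp
    for row in signals.toLocalIterator():
        requests.post("https://models.internal/score",   # placeholder endpoint
                      json=row.asDict(), timeout=2)

(apps.writeStream
    .foreachBatch(score_batch)
    .option("checkpointLocation", "s3://bucket/chk/credit-decisioning")
    .trigger(processingTime="10 seconds")
    .start())
```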
Data Platform Cost Governance — −40% Spend in 90 Days
Isolation Forest anomaly detection · Terraform auto-remediation · 20+ Databricks workspaces
Automated cost governance framework using Isolation Forest anomaly detection to flag runaway Spark clusters, enforce S3 → Glacier storage tiering, and rightsize compute across 20+ Databricks workspaces. Terraform-driven auto-remediation halts anomalous jobs and routes Slack/PagerDuty alerts with per-team chargeback attribution.
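A minimal sketch of the anomaly-detection core, assuming an hourly per-cluster usage export; the file path and feature columns are hypothetical stand-ins for the billing schema:

```python
import pandas as pd
from sklearn.ensemble import IsolationForest

# Hourly per-cluster billing/utilisation export (column names assumed).
usage = pd.read_parquet("cluster_hourly_usage.parquet")
X = usage[["dbu_per_hour", "cpu_utilisation", "runtime_hours"]]

# Fit on the fleet baseline; roughly 1% of cluster-hours expected anomalous.
model = IsolationForest(contamination=0.01, random_state=42).fit(X)
usage["flag"] = model.predict(X)            # -1 marks an outlier

runaway = usage.loc[usage["flag"] == -1, "cluster_id"].unique()
# These IDs feed the Terraform auto-remediation and Slack/PagerDuty alert path.
print(sorted(runaway))
```

Isolation Forest suits this job because it needs no labelled history of "bad" clusters; it simply isolates spend profiles that diverge from the fleet, which is exactly how a runaway job looks in billing data.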