Byte-Sized Design

The Trillion-Event Platform: How Spotify Built a Data System That Doesn't Break

Byte-Sized Design
Dec 27, 2025

TL;DR

Spotify processes 1.4 trillion data points daily. It grew from managing Europe’s largest Hadoop cluster to a 100+ engineer team running a full GCP-based platform. The turning point came when they stopped treating the data platform like infrastructure and started treating it like a product with real customers.


🎯 The Problem Space

Most companies hit the “we need a data platform” moment when their Slack is flooded with:

  • “Where’s that dataset again?”

  • “Why did this pipeline fail overnight?”

  • “Can someone explain why our numbers don’t match?”

Spotify hit all these triggers, but they also had a unique constraint: when your product is personalization, data isn’t a nice-to-have. It’s the entire business.

At scale, this meant:

  • 1 trillion+ events per day flowing through event delivery

  • 38,000+ scheduled pipelines running hourly and daily

  • 1,800+ event types representing user interactions

  • Teams across payments, ML, experimentation, and product all needing reliable, fast access
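To put the event volume in perspective, here is a rough back-of-the-envelope sketch that converts the "1 trillion+ events per day" figure into an average per-second rate. It assumes traffic is spread evenly across the day, which real traffic is not, so treat the result as a lower bound on the average rather than a peak estimate. The function name is purely illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_events_per_second(events_per_day: int) -> float:
    """Average rate, assuming events are spread evenly over 24 hours."""
    return events_per_day / SECONDS_PER_DAY


# Figure quoted in the list above; the "+" means the true value is higher.
events_per_day = 1_000_000_000_000

rate = average_events_per_second(events_per_day)
print(f"At least ~{rate / 1e6:.1f} million events per second on average")
```

Even under this flat-traffic assumption, event delivery has to absorb more than eleven million events every second, which helps explain why reliability is treated as a first-order concern.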

🏗️ Architecture That Actually Scales

The Three-Pillar Model
