Byte-Sized Design
🍿 Inside Netflix’s Radical Shift to a Single Foundation Model

Scaling personalized recommendations from siloed models to a single foundation engine

State of AI
Apr 30, 2025

👋 Welcome to the 297 new Byte-Sized Design subscribers since our last edition, glad to have you here!

This week, we’ve got something special: a guest post from the State of AI newsletter (give them a follow and subscribe if you’re interested in frontier AI research). This edition dives into how Netflix made a bold shift to a single foundation model for its recommendation system, and the massive impact it had on performance, personalization, and architecture.

You won’t want to miss it. Let’s dive in!


🚨 TL;DR

Netflix was juggling a swarm of specialized models to recommend content: one for your homepage, another for notifications, another for the “Because You Watched” row. Each was trained separately, each optimized in isolation. This worked until the costs, complexity, and inconsistencies became impossible to manage.

They rebuilt from the ground up: a single foundation model trained on the full timeline of user interaction across the platform. Instead of learning short-term behavior, this system learns long-term intent. It can generate predictions in milliseconds, adapt to new titles without training data, and serve as a shared source of embeddings for downstream teams.

This edition breaks down how Netflix structured the system, where they hit limits, and what challenges you’d face applying foundation models to recommendation at scale.


🧠 The Old Architecture: A Model for Every Use Case

For years, Netflix operated what you’d expect from a mature recommender system: many models, each designed for a narrow goal.

  • Ranking notifications: Personalized based on recent watch history

  • Homepage rows: Top Picks, Continue Watching, and Trending, each ranked by a separate pipeline

  • Search re-ranking: Suggestions fine-tuned by intent
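The siloed setup above can be sketched in a few lines. This is a toy illustration (not Netflix’s code, and the class and feature names are our own): each surface owns its own ranker with its own copy of the feature pipeline, so the same “recent watch history” feature gets rebuilt, slightly differently, per model.

```python
def recent_genres(history, window=5):
    """Feature: genres from the user's most recent watches.
    In the siloed world, each team re-implements a variant of this."""
    return {item["genre"] for item in history[-window:]}

class NotificationRanker:
    """Ranks push notifications; trained and tuned in isolation."""
    def rank(self, history, candidates):
        feats = recent_genres(history, window=3)   # its own feature copy
        return sorted(candidates,
                      key=lambda c: c["genre"] in feats, reverse=True)

class HomepageRowRanker:
    """Ranks a homepage row; a separate pipeline with a different window."""
    def rank(self, history, candidates):
        feats = recent_genres(history, window=10)  # slightly divergent copy
        return sorted(candidates,
                      key=lambda c: c["genre"] in feats, reverse=True)

history = [{"title": "Dark", "genre": "thriller"},
           {"title": "Chef's Table", "genre": "documentary"}]
candidates = [{"title": "Mindhunter", "genre": "thriller"},
              {"title": "Bridgerton", "genre": "drama"}]

# Two surfaces, two models, duplicated feature logic: the failure mode
# described in the bullets above.
print(NotificationRanker().rank(history, candidates)[0]["title"])  # Mindhunter
```

Every improvement to `recent_genres` has to be re-implemented per ranker, and any drift between the copies produces the inconsistent personalization described next.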

While the models were performant in isolation, the cracks showed over time:

  • Inconsistent personalization: Two parts of the UI might recommend completely different genres

  • Repeated feature engineering: Same features rebuilt in multiple training pipelines

  • Costly innovation: Improvements in one model rarely transferred to others

  • Short-term bias: Most models only used recent activity due to latency limits

In short, the personalization system didn’t scale with the platform or its audience.


🧱 The Foundation Model Approach

Instead of patching the existing system, Netflix moved to a foundation model paradigm—training a single, large model on the entirety of each user’s interaction history.
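A minimal sketch of that idea, assuming nothing about Netflix’s actual architecture: one encoder consumes the user’s full interaction timeline and emits a single embedding that every downstream surface scores against. A real system would use a large sequence model (e.g. a transformer trained on next-item prediction); simple recency-weighted averaging of item embeddings stands in for it here, and all names and numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = {"dark": 0, "mindhunter": 1, "chefs_table": 2, "bridgerton": 3}
ITEM_EMB = rng.normal(size=(len(VOCAB), 16))  # stand-in for learned item embeddings

def user_embedding(timeline):
    """Encode the FULL interaction history into one shared embedding.
    Exponential recency weighting stands in for a trained sequence model."""
    ids = [VOCAB[t] for t in timeline]
    weights = np.exp(np.linspace(-1.0, 0.0, num=len(ids)))  # recent events count more
    return (weights[:, None] * ITEM_EMB[ids]).sum(axis=0) / weights.sum()

def score(user_emb, title):
    """Any surface (homepage, notifications, search) scores candidates
    against the same embedding: one model, consistent personalization."""
    return float(user_emb @ ITEM_EMB[VOCAB[title]])

u = user_embedding(["chefs_table", "dark", "mindhunter"])
ranked = sorted(VOCAB, key=lambda t: score(u, t), reverse=True)
print(ranked)
```

The key architectural shift is that `user_embedding` is computed once by the foundation model and shared, rather than each team deriving its own features from its own slice of recent activity.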

Keep reading with a 7-day free trial

© 2025 Byte-Sized Design