Prime Video Gives Up On Distributed Services to be 90% Faster 🚗 💨
It's clickbait. Here's what really happened.
TLDR;
Prime Video was checking its live streams for lag and other issues in a really inefficient way. They saved live stream content to a separate database, then read it back from that database to check for errors.
Prime just dropped that database and now reads the live stream and detects errors on the same server. Actually correcting the issues still happens in another service.
So yeah, they still use microservices. Only the detection of video issues now runs on a single server.
What’s the context?
Prime Video has live streams like live sports and other content. If Amazon truly values customer obsession, then they’ll guarantee these live streams have no issues.
Common issues include audio that is out of sync, buffering, lag, and a blocky video feed.
But when you’re at Amazon’s scale, there are going to be a lot of viewers tuning in, and you’ll need the infrastructure to support that.
Go Explain the Problem!
Here’s the current design. It’s a complicated picture but we’ll explain it and why it doesn’t work.
1. There’s a service that collects the live video feed. It’s literally watching the live stream with you as quality control.
2. Short snippets are stored in S3, which acts as an intermediate database.
3. Another service reads from that database and detects errors in the live stream quality.
4. Errors are sent to another service to be handled.
If we focus on steps 1-3, the bottleneck is storing snippets in a database and then reading them back again to detect errors. It’s slow and inefficient.
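To make that round trip concrete, here is a minimal sketch of steps 1-3. Everything in it is made up for illustration: `FakeObjectStore` is an in-memory stand-in for S3, and the `GLITCH` check is a toy placeholder, not how Prime Video actually detects defects.

```python
from dataclasses import dataclass, field


@dataclass
class FakeObjectStore:
    """In-memory stand-in for the S3 storage (hypothetical, for illustration)."""
    objects: dict = field(default_factory=dict)
    writes: int = 0
    reads: int = 0

    def put(self, key: str, data: bytes) -> None:
        self.writes += 1
        self.objects[key] = data

    def get(self, key: str) -> bytes:
        self.reads += 1
        return self.objects[key]


def collect(store: FakeObjectStore, snippets: list) -> list:
    """Steps 1-2: the collector service writes every snippet to storage."""
    keys = []
    for i, snippet in enumerate(snippets):
        key = f"snippet-{i}"
        store.put(key, snippet)
        keys.append(key)
    return keys


def detect(store: FakeObjectStore, keys: list) -> list:
    """Step 3: a separate detector service reads every snippet back."""
    errors = []
    for key in keys:
        data = store.get(key)
        if b"GLITCH" in data:  # toy defect check, not a real detector
            errors.append(key)
    return errors


store = FakeObjectStore()
keys = collect(store, [b"ok", b"GLITCH frame", b"ok"])
old_errors = detect(store, keys)

# Every snippet costs one write plus one read before any analysis happens.
round_trips = store.writes + store.reads
```

With three snippets, the sketch already makes six storage calls. In a real system each of those is a network request, and that overhead grows with every stream you monitor.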
Give me the Requirements!
Drop the bottleneck to analyze live streams. One of the few times dropping the database is a good thing.
Continue to detect errors for a load 20x larger than what the current system handles.
What are we doing, boss?
Prime Video went from using AWS Step Functions → not using Step Functions.
Instead of using a database to hold the live stream footage to be analyzed, the video snippets are sent to the server directly. The server analyzes each snippet for errors and stores any errors it finds in another database, to be handled by another service.
So yes, there are still microservices! It’s just that this one service was changed into a single monolith that detects video errors.
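Here is a rough sketch of that idea, assuming a toy detector. The names `detect_defects` and `analyze_stream`, the `FREEZE` and `DESYNC` markers, and the plain list standing in for the error database are all hypothetical. The point is only that snippets stay in memory between collection and detection.

```python
def detect_defects(snippet: bytes) -> list:
    """Toy detector: looks for placeholder markers, not real video analysis."""
    found = []
    if b"FREEZE" in snippet:
        found.append("frozen_frame")
    if b"DESYNC" in snippet:
        found.append("audio_out_of_sync")
    return found


def analyze_stream(snippets: list, error_store: list) -> list:
    """Collect and detect in one process: no intermediate storage hop."""
    for i, snippet in enumerate(snippets):
        # The snippet is handed straight to the detector in memory.
        for defect in detect_defects(snippet):
            # Only the errors are persisted, for a separate service to handle.
            error_store.append({"snippet": i, "defect": defect})
    return error_store


error_store = []  # stand-in for the errors database
analyze_stream([b"ok", b"FREEZE", b"DESYNC FREEZE"], error_store)
```

The only thing written out is the list of errors, which is the piece the downstream handler service actually needs.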
I Swear It’s a Monolith!
But if we look at the design, the only part that’s a monolith is just detecting the errors. Actually handling the errors is done by another service.
Instead of two databases, there’s only one here to keep track of all the video errors for the other service to handle.
Sources and Official Article!
(Link to sources and the full design article is available to paid subscribers! They help support this newsletter and maintain service costs.)