The Rewrite That Saved Millions of Memories
How Dropbox Fixed Photo Uploads Without Breaking Trust
On paper, Camera Uploads sounds simple.
Take a photo. Back it up. Move on with your life.
But behind that simplicity is a distributed pipeline that has to tiptoe around OS restrictions, user expectations, flaky networks, limited battery, and edge cases you didn’t even know existed, like one user who hadn’t opened Dropbox since 2018 but expected every new photo to upload anyway.
In 2019, Dropbox realized the old system, which was written in C++ and shared across Android and iOS, was holding them back. The bugs were growing, tooling was non-existent, and debugging meant spelunking through platform-specific hacks written years ago by engineers who had long since left.
So they rewrote it.
📦 TL;DR
Dropbox rewrote Camera Uploads from a shared C++ library into native Kotlin and Swift implementations. The goal? Reliability, speed, and long-term maintainability. The Android team focused on parallel uploads, memory tuning, background resilience, and validating every byte. The result? Faster uploads, fewer crashes, and a system they can trust.
Here’s how they did it.
🧱 The Problem
The old C++ implementation did the job for a while. But modern Android had changed the rules of the game.
Strict background restrictions meant uploads were often throttled or blocked.
Network access came in short windows, and the C++ code didn’t know when to wait.
Retry logic was dumb: fail, retry, fail again, kill your battery.
Memory usage ballooned with larger libraries. Sometimes it crashed outright.
Worst of all? Fixing any of this meant touching code nobody wanted to touch.
🛠 The Rewrite
The new Android version was written in Kotlin, using WorkManager and Room. These tools speak Android fluently. No more duct-taping cross-platform logic into system-native workflows.
The upload pipeline now runs as two separate background workers:
Scanner: detects new photos, filters out dupes, builds the queue.
Uploader: breaks each file into 4MB chunks, hashes them, and commits them in batches.
Uploading starts without waiting for a full scan. New photos can begin uploading immediately, even while older ones are still being indexed.
🧵 Scanner and uploader work in parallel.
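To make the chunking step concrete, here is a minimal sketch of splitting a stream into 4MB chunks and hashing each one. The post doesn't say which hash function Dropbox's uploader uses, so SHA-256 is an assumption here, and hashChunks and CHUNK_SIZE are hypothetical names.

```kotlin
import java.io.InputStream
import java.security.MessageDigest

// 4MB, the chunk size mentioned in the post.
const val CHUNK_SIZE = 4 * 1024 * 1024

// Hypothetical helper: reads the stream chunk by chunk and returns one
// hex-encoded SHA-256 digest per chunk (the hash algorithm is an assumption).
fun hashChunks(input: InputStream, chunkSize: Int = CHUNK_SIZE): List<String> {
    require(chunkSize > 0) { "chunkSize must be positive" }
    val buffer = ByteArray(chunkSize)
    val hashes = mutableListOf<String>()
    while (true) {
        // read() may return fewer bytes than requested, so fill the buffer in a loop.
        var filled = 0
        while (filled < chunkSize) {
            val n = input.read(buffer, filled, chunkSize - filled)
            if (n < 0) break
            filled += n
        }
        if (filled == 0) break
        val digest = MessageDigest.getInstance("SHA-256")
        digest.update(buffer, 0, filled)
        hashes += digest.digest().joinToString("") { "%02x".format(it) }
        if (filled < chunkSize) break // short chunk means end of stream
    }
    return hashes
}
```

Because only one chunk-sized buffer is alive at a time, memory use stays flat no matter how large the photo or video is.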
🏎 Faster. Smarter. More Memory-Efficient.
Old system = 1 file upload at a time.
New system = parallel uploads using Kotlin coroutines.
They created a custom unorderedConcurrentMap operator to kick off multiple uploads at once, but only up to a safe limit. (Too many = crashes.)
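The idea behind such an operator can be sketched without coroutines. The version below is not Dropbox's implementation: it swaps in a plain JDK thread pool so it runs with no extra dependencies, and the signature is a guess. It keeps the two properties the post describes: a hard cap on in-flight work, and results delivered in completion order rather than input order.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.ExecutorCompletionService
import java.util.concurrent.Executors

// Hypothetical stand-in for the operator named in the post. Applies
// `transform` to every item with at most `maxConcurrency` running at once
// and returns results in the order they finish, not the order they started.
fun <T, R> unorderedConcurrentMap(
    items: List<T>,
    maxConcurrency: Int,
    transform: (T) -> R,
): List<R> {
    require(maxConcurrency > 0) { "maxConcurrency must be positive" }
    // The fixed pool size is what enforces the concurrency limit.
    val pool = Executors.newFixedThreadPool(maxConcurrency)
    try {
        val completion = ExecutorCompletionService<R>(pool)
        items.forEach { item -> completion.submit(Callable { transform(item) }) }
        // take() hands back whichever task finished next.
        return List(items.size) { completion.take().get() }
    } finally {
        pool.shutdown()
    }
}
```

Giving up ordering means one slow upload doesn't hold back the results of the fast ones queued behind it.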
Then they ran into an odd memory bug: byte buffers weren’t being released fast enough. GC was lagging behind.
They fixed it by:
Reusing byte buffers across uploads.
Switching from NIO-mapped files to direct byte buffers (no lingering caches).
Scaling concurrency dynamically based on available memory.
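A rough sketch of what those three fixes might look like together, assuming a fixed pool of direct buffers and a simple memory-based cap. BufferPool, concurrencyFor, and every parameter here are hypothetical; the post doesn't describe the actual heuristics.

```kotlin
import java.nio.ByteBuffer
import java.util.concurrent.ArrayBlockingQueue

// Hypothetical pool: allocates direct buffers once and lends them out,
// so uploads never wait on the garbage collector to reclaim a buffer.
class BufferPool(poolSize: Int, bufferBytes: Int) {
    private val free = ArrayBlockingQueue<ByteBuffer>(poolSize)

    init {
        repeat(poolSize) { free.put(ByteBuffer.allocateDirect(bufferBytes)) }
    }

    // Blocks until a buffer is free, which also acts as back-pressure.
    fun <R> withBuffer(block: (ByteBuffer) -> R): R {
        val buffer = free.take()
        try {
            buffer.clear()
            return block(buffer)
        } finally {
            free.put(buffer)
        }
    }
}

// Hypothetical heuristic: one upload slot per `bytesPerUpload` of available
// memory, never fewer than 1 and never more than `hardCap`.
fun concurrencyFor(availableBytes: Long, bytesPerUpload: Long, hardCap: Int): Int {
    require(bytesPerUpload > 0 && hardCap >= 1) { "invalid limits" }
    return (availableBytes / bytesPerUpload).coerceIn(1L, hardCap.toLong()).toInt()
}
```

With 10MB available, 4MB per upload, and a cap of 4, this heuristic allows two uploads at a time; with only 1MB available it still allows one, so the queue never stalls completely.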
Result: first-time uploads of huge libraries are now up to 4x faster. And they don’t crash your phone.
🧠 Smarter Retries, Fewer Mistakes
The old system retried forever. Which sounds helpful until you realize it’s chewing up bandwidth, battery, and server logs.
The new system:
Categorizes errors as transient vs. permanent.
Uses exponential backoff.
Avoids retrying doomed uploads.
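Here is one way the retry policy above could look in code. The error taxonomy is an assumption: this sketch treats IOException as transient and a hypothetical PermanentUploadException as permanent, and the delay values are placeholders rather than Dropbox's real settings.

```kotlin
import java.io.IOException

// Hypothetical marker for errors that will never succeed on retry
// (for example, a file deleted before it could be uploaded).
class PermanentUploadException(message: String) : Exception(message)

// Doubles the delay on each attempt, capped at maxMs. Attempt 0 waits baseMs.
fun backoffDelayMs(attempt: Int, baseMs: Long = 1_000, maxMs: Long = 60_000): Long {
    val shift = attempt.coerceIn(0, 20) // clamp so the shift cannot overflow
    return (baseMs shl shift).coerceAtMost(maxMs)
}

// Retries only IOException (assumed transient). Anything else, including
// PermanentUploadException, propagates on the first failure.
fun <T> retryTransient(
    maxAttempts: Int,
    sleep: (Long) -> Unit = { Thread.sleep(it) },
    action: () -> T,
): T {
    require(maxAttempts >= 1) { "maxAttempts must be at least 1" }
    var lastError: IOException? = null
    for (attempt in 0 until maxAttempts) {
        try {
            return action()
        } catch (e: IOException) {
            lastError = e
            if (attempt < maxAttempts - 1) sleep(backoffDelayMs(attempt))
        }
    }
    throw checkNotNull(lastError)
}
```

Passing sleep in as a parameter keeps the policy testable without real waits. A production version would likely add random jitter so many devices don't retry in lockstep.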
They also started validating upload state transitions. If something tries to move a photo from DONE → DONE, that’s flagged as a bug. (Which it was: WorkManager was starting duplicate workers before cancelling the old ones.)
📉 Duplicate uploads dropped. Throughput soared.
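A transition check like this can be sketched as a small lookup table. Only DONE comes from the post; the other state names and the allowed edges are illustrative assumptions.

```kotlin
// Hypothetical states: only DONE is named in the post.
enum class UploadState { QUEUED, UPLOADING, DONE, FAILED }

// Assumed transition table. DONE is terminal, so DONE -> DONE is illegal.
val allowedTransitions: Map<UploadState, Set<UploadState>> = mapOf(
    UploadState.QUEUED to setOf(UploadState.UPLOADING),
    UploadState.UPLOADING to setOf(UploadState.DONE, UploadState.FAILED, UploadState.QUEUED),
    UploadState.FAILED to setOf(UploadState.QUEUED),
    UploadState.DONE to emptySet(),
)

fun isValidTransition(from: UploadState, to: UploadState): Boolean =
    to in allowedTransitions.getValue(from)

// Applies the transition if legal. Otherwise reports it and keeps the old state.
fun transition(
    from: UploadState,
    to: UploadState,
    onViolation: (String) -> Unit,
): UploadState {
    if (!isValidTransition(from, to)) {
        onViolation("Illegal upload state transition: $from -> $to")
        return from
    }
    return to
}
```

Reporting the violation instead of throwing lets the upload continue while still surfacing the bug in logs, which is how a duplicate-worker problem like the one above becomes visible.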
🧪 Validating the Invisible
They didn’t just test in QA. They ran the new upload logic alongside the old one in production, comparing results.
Hash mismatches? Logged and investigated.
Turns out, the new Kotlin implementation actually fixed a rare C++ hashing bug that had gone unnoticed for years.
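The side-by-side comparison boils down to running both implementations on the same input and recording any disagreement. A minimal sketch, with hypothetical names and both hashers passed in as plain functions:

```kotlin
// Hypothetical record of one disagreement between the two implementations.
data class HashMismatch(
    val fileId: String,
    val legacyHash: String,
    val rewrittenHash: String,
)

// Runs both hashers over the same files and returns only the disagreements.
// A mismatch doesn't say which side is wrong; that still needs investigation.
fun findMismatches(
    fileIds: List<String>,
    legacyHash: (String) -> String,
    rewrittenHash: (String) -> String,
): List<HashMismatch> =
    fileIds.mapNotNull { id ->
        val old = legacyHash(id)
        val fresh = rewrittenHash(id)
        if (old == fresh) null else HashMismatch(id, old, fresh)
    }
```

As the C++ hashing bug shows, the legacy result can't be assumed to be the correct one.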
They also validated every database state change against a strict state machine to catch concurrency bugs.
And just in case the rewrite went sideways, they built a full rollback path back to C++. They never needed it, but they had it.
🚀 Rollout Without Wreckage
A rewrite this big can backfire if you rush it. So they didn’t.
Started with opt-in beta users.
Monitored success rates, backlogs, and crash reports.
Tuned retry logic and concurrency limits as they learned.
Read support tickets daily. Not just dashboards.
When the public launch came, it was uneventful in the best way. No regressions. No spikes. Just a smoother, faster experience.
🧭 Lessons from the Field