Byte-Sized Design

🌀 From Connector Chaos to Clean Streams: PLAID’s Real-Time Architecture Overhaul

How a team cut Kafka connector costs and scaled cleanly with MongoDB Atlas Stream Processing

Byte-Sized Design
Jul 23, 2025

The setup

PLAID, a Tokyo-based company behind the real-time analytics platform KARTE, had used MongoDB since 2015. Their product ingests and processes millions of user interaction events across web and mobile in near real-time. The goal was to let customers personalize user experiences with up-to-the-second context.

Originally, they ran MongoDB self-hosted. Then in 2021, they moved to MongoDB Atlas. With Atlas came the ambition to tighten the feedback loop: real-time event ingestion, transformation, and analysis, without the complexity of batch jobs. They wanted everything piped into BigQuery to drive their downstream ML and reporting.

That’s where things started to get messy.


The problem: pipeline sprawl

To stream data from Atlas to BigQuery, they relied on Kafka Connect with custom connectors. The shape of the pipeline looked like this:

MongoDB Atlas → Kafka Source Connector → Kafka → BigQuery Sink Connector

In theory, this worked.

But as the product scaled, reality set in:

  • Connector count exploded. One per collection, per environment (prod, staging, etc.). Multiply that by services and teams.

  • Pricing scaled linearly with connector count. Each new pipeline meant a higher Confluent Cloud bill.

  • Staging overhead doubled everything. Even feature testing meant provisioning duplicate connectors.

  • Monitoring was brittle. Kafka Connect didn’t integrate cleanly with Datadog, so they had limited visibility into lag and throughput.

  • Backfills were clunky. There was no clean way to replay historical data into Kafka or BigQuery without hacks.
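
To make the sprawl concrete, here is a rough back-of-the-envelope sketch of how connector count and cost grow under a one-connector-per-collection-per-environment model. Every number below (collection count, environment count, price) is a made-up placeholder, not a PLAID or Confluent figure, and `connectors_per_pipeline` is an assumed knob for counting source and sink connectors separately.

```python
def connector_count(collections: int, environments: int,
                    connectors_per_pipeline: int = 1) -> int:
    """Connectors needed when each collection gets its own pipeline in every environment."""
    return collections * environments * connectors_per_pipeline


def monthly_cost(collections: int, environments: int,
                 price_per_connector: float,
                 connectors_per_pipeline: int = 1) -> float:
    """Linear pricing: total cost is connector count times a flat per-connector price."""
    count = connector_count(collections, environments, connectors_per_pipeline)
    return count * price_per_connector


# Hypothetical inputs: 10 collections, prod + staging, a flat placeholder price of 50.0.
baseline = monthly_cost(collections=10, environments=2, price_per_connector=50.0)

# Adding a single collection adds one connector per environment.
one_more = monthly_cost(collections=11, environments=2, price_per_connector=50.0)

print(connector_count(10, 2), baseline, one_more - baseline)
```

The point of the sketch is the shape, not the values: every new collection costs the per-connector price once for each environment, so staging alone doubles the marginal cost of each pipeline.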

PLAID hit a tipping point: they were solving data engineering problems that had nothing to do with their core product.


The solution: stream closer to the source

Instead of solving these problems at the Kafka layer, PLAID looked upstream. In particular, they replaced the Kafka Source Connector with MongoDB Atlas Stream Processing (ASP).
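
For readers who haven't seen ASP, a stream processor is defined as an aggregation-style pipeline that starts with a `$source` stage and typically ends with a sink stage such as `$emit`. The sketch below only builds such a pipeline document in Python and prints it. The connection names, database, collection, and topic are hypothetical placeholders, and this is not PLAID's actual configuration.

```python
import json


def build_asp_pipeline(source_connection: str, database: str, collection: str,
                       kafka_connection: str, topic: str) -> list:
    """Build a minimal two-stage pipeline: read a collection's change stream, emit to Kafka."""
    return [
        # $source: read change events from an Atlas collection via a registered connection.
        {"$source": {"connectionName": source_connection,
                     "db": database,
                     "coll": collection}},
        # $emit: write each event to a Kafka topic via a registered Kafka connection.
        {"$emit": {"connectionName": kafka_connection,
                   "topic": topic}},
    ]


# All names here are illustrative assumptions, not values from the article.
pipeline = build_asp_pipeline(
    source_connection="exampleAtlasConnection",
    database="exampleDb",
    collection="events",
    kafka_connection="exampleKafkaConnection",
    topic="example.events",
)

print(json.dumps(pipeline, indent=2))
```

In practice a pipeline like this would be registered against a stream processing instance (for example with `sp.createStreamProcessor` in mongosh) rather than assembled in Python; the Python here is just a convenient way to show the structure.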

The new architecture:
