When was the last time you thought about which physical partition your Postgres row lives on? Which token range your Cassandra row hashes into? Which shard your MongoDB document routes to?
For almost every application, in almost every company, the answer is: never. The database world spent three decades making physical storage invisible to users. Tables are logical objects addressed by name. The system handles sharding, rebalancing, replication, and scale underneath — not because it's magic, but because application developers shouldn't have to care.
When LinkedIn open-sourced Kafka and Yahoo open-sourced Pulsar, exposing the partition was the honest move. Log-structured systems scale by parallelizing writes across machines, and the partition was the unit of that parallelism. Telling developers about it was telling them the truth about what the system could and couldn't do.
The problem is that the workloads have changed, but streaming APIs haven't kept up. A decade ago, streaming platforms handled predictable microservice workloads; teams could statically size topic partitions the way a DBA planned tables. Today's workloads are nothing like that. AI agents, lakehouse pipelines with skewed and shifting key distributions, and real-time activation systems mean the "right partition count" changes faster than static configuration can follow. This mirrors the pressure that pushed databases toward hidden sharding twenty years ago: static transparency stops working when workloads stop being predictable.
It's time to revisit that assumption. The next decade of streaming needs a model designed around intent, not around 2010s deployment constraints. Apache Pulsar 5.0 is what that looks like in practice.
The database world already figured this out
Spanner splits tablets when load crosses a threshold. DynamoDB rebalances partitions silently when a key turns hot. Cassandra's token ring is a detail of the implementation — a thing engineers learn about when they're debugging, not a thing application developers think about when they're writing code. Postgres with pg_partman and logical replication lets operators express partitioning declaratively, and the system handles the mechanics.
The API surfaces above all of this are equally restrained. Modern ORMs — SQLAlchemy, ActiveRecord, TypeORM, Prisma — don't expose a single Operation<T> with a runtime operationType enum. They expose Query, Insert, Update, Transaction as separate, type-safe interfaces. Each interface offers only the methods that make sense for its purpose. When you call insert, you can't accidentally invoke a query-specific method that would silently no-op at runtime. The compiler tells you the moment you try.
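To see the difference in code, consider a minimal sketch of the two shapes. The names below are invented for illustration; they are not any particular ORM's types.

```java
import java.util.List;

enum OperationType { QUERY, INSERT, UPDATE }

// Runtime-enum style: one interface for everything. Misuse compiles fine
// and only surfaces when the code runs.
interface Operation<T> {
    void setOperationType(OperationType type);
    List<T> fetchResults();   // meaningless for an INSERT
    void setValues(T entity); // meaningless for a QUERY
}

// ORM style: one interface per intent. An invalid call simply does not
// exist on the type, so it cannot be written.
interface Query<T> {
    Query<T> where(String predicate);
    List<T> fetch();
}

interface Insert<T> {
    Insert<T> values(T entity);
    void execute(); // no fetch() here; the compiler closes off the wrong move
}
```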
That's not a stylistic preference. It's a recognition that the shape of the API teaches people how to use the system. A well-shaped API closes off the wrong moves before the programmer can make them. Databases made that choice a long time ago, and it has paid off in a generation of applications that spend almost zero time thinking about their storage layer.
Streaming today: the last bastion of manual partitioning
In 2026, building on Kafka or Pulsar still means the following conversations are routine in architecture reviews:
- "How many partitions should we use for this topic?" Get it wrong and you pay for it for years. Adding partitions to a live topic with keyed messages breaks ordering for in-flight traffic, because hash(key) % N is not a stable mapping under a changing divisor (a short demo follows this list). Decreasing partitions is not supported at all, so operators end up over-provisioned forever.
- "Which subscription type should our consumer use?" In Pulsar, that question has four possible answers — Exclusive, Failover, Shared, Key_Shared — and the wrong choice manifests as a silent runtime no-op rather than a compile error. In Kafka, the equivalent question splits across consumer groups, the newly GA'd Share Groups, and external coordinator services. The API surface doesn't tell you which is right.
- "How do we handle rebalances?" A consumer disconnects because of a brief network glitch, and the rebalance protocol redistributes its partitions across the remaining consumers. Work pauses. Operations teams tune around the behavior. This is a user-visible fault. In a database, the equivalent — a read replica blipping — is handled by the connection pool without anyone noticing.
- "Do we use a partitioned topic or a non-partitioned one?" Two topic types with different scaling properties, different metadata, different operational characteristics. The choice depends on how you plan to scale, which depends on estimates that will be wrong.
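The partition-count trap in the first item is easy to demonstrate. The sketch below is a standalone illustration, not Kafka's or Pulsar's actual router code, but modulo routing behaves the same way everywhere: change the divisor and most keys move.

```java
import java.util.HashMap;
import java.util.Map;

public class ModuloRemap {
    // Route a key the way modulo-based partitioners do. Math.abs guards
    // against negative hashCode values.
    static int route(String key, int partitions) {
        return Math.abs(key.hashCode() % partitions);
    }

    public static void main(String[] args) {
        String[] keys = {"user-1", "user-2", "user-3", "user-4", "user-5"};

        // Record each key's partition at the original count of 4.
        Map<String, Integer> before = new HashMap<>();
        for (String k : keys) before.put(k, route(k, 4));

        // Expand the topic to 6 partitions and re-route the same keys.
        int moved = 0;
        for (String k : keys) {
            int after = route(k, 6);
            if (after != before.get(k)) moved++;
            System.out.printf("%s: partition %d -> %d%n", k, before.get(k), after);
        }
        // Keys that moved now have old messages on one partition and new
        // messages on another, so per-key ordering is broken in flight.
        System.out.println(moved + " of " + keys.length + " keys changed partition");
    }
}
```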
Every one of these is a question the database world made irrelevant. The answer in databases is: you don't think about it. The answer in streaming is: you think about it a lot, and you pay dearly when you get it wrong.
What changed, and why it has to catch up
The constraint that pushed streaming into exposing partitions was real. As noted above, the partition was the unit of write parallelism, each with a single leader broker and clear ownership, and surfacing it told developers the truth about what the system could and couldn't do.
But the shape of real workloads has changed. Streaming is no longer the back-office system that moves events between microservices in a carefully sized deployment. It's becoming the substrate for AI agents that consume event feeds at unpredictable rates. It's the connective tissue between lakehouse analytics and real-time activation. It's the log that backs state for stateful processors, serverless functions, and live model inference. In every one of these use cases, a permanent partition-count decision at topic creation is an architectural landmine.
The database world faced the same pressure twenty years ago and chose abstraction over transparency. Streaming now has to make the same choice — not because transparency is bad, but because it has stopped paying for itself. The complexity we push onto developers today is complexity that was reasonable in 2012 and is increasingly unreasonable in 2026.
What catching up looks like
Let's be concrete. Streaming's database moment has four parts, and they will land together in Pulsar 5.0.
Topics scale like tables. One topic type, the Scalable Topic, replaces partitioned and non-partitioned topics. You create a topic by name, and the system splits and merges range segments underneath based on load. Key ordering survives topology changes because routing is range-based rather than modulo-based: each key hashes to a fixed point in the hash space, a split divides the parent's range between its children, and consumers finish the parent before its children, so every key still maps to exactly one segment and its messages stay in order. Producers and consumers don't need to know any of this. They address topic://tenant/namespace/my-topic and the system handles the rest. No guessing partition counts. No recreating topics to change their scale. No downtime for rescaling.
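A minimal sketch makes the routing argument concrete. It assumes only what the paragraph above states: keys hash to a fixed point in a hash space, and a segment owns a contiguous sub-range of it. The class and method names are invented for illustration.

```java
import java.util.TreeMap;

public class RangeRouter {
    // Each entry maps the start of a hash range to the segment that owns
    // it; a segment owns [start, nextStart).
    private final TreeMap<Long, String> segments = new TreeMap<>();

    public RangeRouter() {
        segments.put(0L, "segment-A"); // one segment owns the whole space
    }

    // The key's hash point never changes, no matter how segments split.
    static long hashKey(String key) {
        return Integer.toUnsignedLong(key.hashCode());
    }

    String route(String key) {
        return segments.floorEntry(hashKey(key)).getValue();
    }

    // A split refines the parent's range into two children. Every key that
    // routed to the parent now routes to exactly one child; no key ever
    // jumps to an unrelated segment, unlike hash(key) % N under a new N.
    void split(long parentStart, long midpoint, String left, String right) {
        segments.put(parentStart, left);
        segments.put(midpoint, right);
    }
}
```

Per-key order then follows from consumers reading a parent segment to its end before starting its children, which is the parent-to-child ordering described above.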
Consumer APIs look like ORM methods. Instead of a single Consumer<T> with a runtime subscriptionType enum, Pulsar 5.0 ships three purpose-named consumer interfaces: StreamConsumer for ordered consumption, QueueConsumer for shared-dispatch queue semantics, and CheckpointConsumer for stream-processing frameworks that manage their own state. Each interface exposes only the operations that are valid for its intent. Invalid combinations become compile errors. The API tells you, at write time, what the system can do.
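A sketch of what that shape could look like follows. The three interface names come from the text above; every method name and signature is an assumption made for illustration, not the shipped Pulsar 5.0 API, and Message and MessageId are stand-ins for the existing client types.

```java
import java.util.concurrent.CompletableFuture;

interface Message<T> { T getValue(); } // stand-in for the client type
interface MessageId {}                 // stand-in for the client type

// Ordered consumption: cumulative acks make sense here, so they exist.
interface StreamConsumer<T> extends AutoCloseable {
    Message<T> receive() throws Exception;
    void acknowledgeCumulative(MessageId id);
}

// Shared dispatch: individual acks and redelivery only. There is no
// acknowledgeCumulative, because cumulative acks are meaningless when
// messages fan out across consumers; the wrong call cannot be written.
interface QueueConsumer<T> extends AutoCloseable {
    Message<T> receive() throws Exception;
    void acknowledge(MessageId id);
    void negativeAcknowledge(MessageId id);
}

// Stream-processing frameworks manage their own state, so this interface
// exposes seeking instead of acknowledgment.
interface CheckpointConsumer<T> extends AutoCloseable {
    Message<T> receive() throws Exception;
    CompletableFuture<Void> seek(MessageId id);
}
```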
Rebalances become infrastructure events, not application events. A consumer disconnect no longer triggers an immediate redistribution. The new controller layer holds segments in reserve for a grace period; if the consumer reconnects with the same identity before the period expires, its assignments come back unchanged. Operations teams stop tuning around rebalance semantics because they stop being user-visible most of the time.
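A grace-period lease is simple to express. The sketch below shows the idea under one assumption from the text, that a reconnecting consumer presents a stable identity; all names are invented for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class SegmentLeases {
    private final Duration gracePeriod = Duration.ofSeconds(30);
    private final Map<String, Lease> leases = new ConcurrentHashMap<>();

    record Lease(Set<String> segments, Instant disconnectedAt) {}

    // On disconnect: park the assignment instead of redistributing it.
    void onDisconnect(String consumerId, Set<String> segments) {
        leases.put(consumerId, new Lease(segments, Instant.now()));
    }

    // On reconnect within the grace period: same identity, same segments,
    // so the rest of the group never observes a rebalance.
    Set<String> onReconnect(String consumerId) {
        Lease lease = leases.remove(consumerId);
        if (lease != null &&
            Instant.now().isBefore(lease.disconnectedAt().plus(gracePeriod))) {
            return lease.segments(); // assignments come back unchanged
        }
        return Set.of(); // expired: segments were redistributed
    }
}
```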
Migration is additive — and easy. This is the part that matters most for teams already running Pulsar. The underlying storage primitives — BookKeeper, managed ledgers, brokers — are unchanged. A new coordination layer is added on top. That means roughly 98% of the existing client SDK is reused. Moving an existing topic to a scalable topic is a short code change for most applications, not a replatform. And for teams not ready to migrate yet, existing topics continue to work exactly as they do today. There is no flag day.
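For scale, here is what producer code looks like against today's stable client API. The post doesn't spell out the exact code change, so this shows the baseline that carries over: under the claimed 98% SDK reuse, an application of this shape keeps working, and moving to a scalable topic touches at most the topic it names.

```java
import org.apache.pulsar.client.api.Producer;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.PulsarClientException;

public class ProducerToday {
    public static void main(String[] args) throws PulsarClientException {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")
                .build();

        // Current, stable Pulsar client API. Per the post, code of this
        // shape is reused as-is; the client resolves routing behind the
        // topic name, whatever the topic's type.
        Producer<byte[]> producer = client.newProducer()
                .topic("persistent://tenant/namespace/my-topic")
                .create();

        producer.send("hello".getBytes());
        producer.close();
        client.close();
    }
}
```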
Together, these changes shrink the surface area developers have to learn. The arithmetic is simple: two topic types times four subscription types is eight conceptual combinations today. One topic type times three purpose-named consumers is three. That's a 62% reduction in user-facing primitives, not because features were removed, but because the primitives were reshaped around what developers actually mean.
Why this matters now
Every major architectural shift in the data stack has eventually hidden its sharding. Databases did it. Search engines did it. Blob storage did it. Caches did it. Streaming is the last component of the modern data platform where sharding leaks into application code at scale.
The people building on streaming in 2026 are not the people who built on it in 2016. The AI agent developer who wants an event feed to drive a reasoning loop doesn't want to make a partition-count decision. The product team instrumenting user actions for real-time activation doesn't want to cap their throughput at topic-creation time. The lakehouse engineer wiring streams into Iceberg tables doesn't want a separate operational playbook for rebalances. Streaming infrastructure that demands these decisions pushes friction back upstream, into product timelines and architecture reviews and late-night incident response.
The Pulsar 5.0 changes are one community's attempt to remove that friction. The design principles — range-based routing, segment DAG, persistent watch sessions, type-safe consumers, grace-period leases — will be covered in detail in the posts that follow this one. The Pulsar Improvement Proposals that specify them (PIP-460, PIP-466, PIP-468) are public on the Apache mailing lists and the apache/pulsar repository. The work is in the open from the first line.
The bigger claim is this: streaming's database moment is not optional. It's the natural next step in the evolution of a piece of infrastructure that has become too important to demand so much of its users. Different communities will arrive at it in different ways. Pulsar 5.0 is what one arrival looks like — shipped on top of a decade of production-hardened infrastructure, with a migration path that doesn't require teams to rebuild, and with an API designed around how developers actually think.
The agent era assumes streaming is easy. Today it isn't. Making it easy means making the right abstraction-layer choices now. We will talk more about Scalable Topics and Pulsar 5.0 in the coming weeks. Subscribe at streamnative.io/blog to get the full series as it ships.