Most well-known technologies for distributing data, events, or messages within distributed systems, such as Apache Kafka and RabbitMQ, were created in the late 2000s and early 2010s. At that time, building distributed systems was a huge challenge. These technologies emerged to help companies operate at scale on premises.
Another trend started gaining traction with the rise of containerization technologies such as Docker and Kubernetes. Many organizations started to become aware of the benefits of running their workloads in the cloud, either due to cost savings or ease of operations.
The internal architectures of existing queueing and streaming technologies were not a good fit for this new technical paradigm. It was difficult to scale those technologies dynamically, whether to react to spikes in traffic or to follow more gradual increases in volume over time.
Realizing this, individuals at Yahoo! started working on a “cloud-native messaging service” that eventually became Apache Pulsar.
The original creators of Apache Pulsar, who later founded StreamNative, designed an elastic architecture and added features that were missing from other technologies: multi-tenancy, third-party storage, geo-replication, both streaming and messaging consumption models, and more.
Pulsar was originally deployed inside Yahoo! as a consolidated messaging platform connecting critical Yahoo! applications such as Yahoo! Finance, Yahoo! Mail, and Flickr, to data.
Since 2021, Apache Pulsar has been recognized as one of the top five most active projects of the Apache Software Foundation, and it has more active contributors than Apache Kafka.
Apache Pulsar is now a mature and versatile cloud-native data streaming platform used by the most demanding enterprises.
Apache Pulsar brings numerous features to the table, including:
Pulsar’s decoupled architecture reduces architectural complexity and lets you put millions of topics into a single cluster. Its distributed storage system is segment-based, and its topic namespace is hierarchical, which is what allows Pulsar users to maintain that many topics within one cluster.
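To make the hierarchical topic namespace concrete, here is a minimal sketch in Python that splits a fully qualified topic name of the form `persistent://tenant/namespace/topic` into its levels. The `TopicName` type and `parse_topic_name` helper are hypothetical names used for illustration only; they are not part of any Pulsar client library.

```python
from typing import NamedTuple


class TopicName(NamedTuple):
    """Hypothetical container for the levels of a Pulsar topic name."""
    domain: str      # "persistent" or "non-persistent"
    tenant: str
    namespace: str
    topic: str


def parse_topic_name(full_name: str) -> TopicName:
    """Split '<domain>://<tenant>/<namespace>/<topic>' into its parts."""
    domain, sep, rest = full_name.partition("://")
    if not sep or domain not in ("persistent", "non-persistent"):
        raise ValueError(f"unexpected topic domain in {full_name!r}")
    # Split on the first two slashes only; the remainder is the topic name.
    parts = rest.split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected tenant/namespace/topic in {full_name!r}")
    return TopicName(domain, *parts)


if __name__ == "__main__":
    # Example names (tenant "acme", namespace "orders") are made up.
    print(parse_topic_name("persistent://acme/orders/created"))
```

Because every topic sits under a tenant and a namespace, policies and quotas can be attached at those levels rather than to each topic individually.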
Pulsar reduces complexity and management overhead compared to monolithic system architectures, which often require organizations to segment and silo their data and workloads across multiple clusters. Pulsar also provides granular resource management with both hard and soft isolation, making it possible to prevent producers, consumers, and topics from overwhelming the cluster. This ensures that consistent and predictable economy-scale performance can be maintained, even as the number of clients and data volumes grow.
Apache Pulsar was designed with a multi-layer architecture in which each layer is scalable, distributed, and decoupled from the other layers. With Pulsar, you can add new topics as needed and seamlessly scale performance. Scaling in Pulsar is a simple, non-disruptive operation.
With partition-centric storage architectures, expanding capacity often requires partition rebalancing, which in turn requires recopying entire partitions to new nodes. Recopying data is expensive and prone to errors, and it consumes network bandwidth and I/O. With Pulsar, you can scale as needed without the expense and disruption of repartitioning. Pulsar’s decoupling of the processing, message serving and storage layers means you can independently scale resources.
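The difference between segment-based and partition-centric scaling can be illustrated with a toy model. The sketch below is a deliberate simplification and not Pulsar's actual placement logic: it assumes each segment lives on a single storage node (real deployments replicate segments), and the `SegmentedTopic` class and the node names are hypothetical. It shows only the core idea that adding a node changes where new segments go, while existing segments stay in place.

```python
from itertools import count


class SegmentedTopic:
    """Toy model: a topic is a sequence of segments, each on one storage node."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.segments = []          # list of (segment_id, node) pairs
        self._ids = count()

    def add_node(self, node):
        # Scaling out only registers the node; no existing segment is moved.
        self.nodes.append(node)

    def roll_segment(self):
        # Place the next segment on the node that currently holds the fewest.
        load = {node: 0 for node in self.nodes}
        for _, node in self.segments:
            load[node] += 1
        target = min(self.nodes, key=lambda n: load[n])
        segment = (next(self._ids), target)
        self.segments.append(segment)
        return segment


if __name__ == "__main__":
    topic = SegmentedTopic(["node-1", "node-2"])
    for _ in range(4):
        topic.roll_segment()
    topic.add_node("node-3")        # capacity added, nothing recopied
    print(topic.roll_segment())     # the new segment lands on the new node
```

In a partition-centric design, the equivalent step would involve copying whole partitions to the new node before it could take load.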
Apache Pulsar replicates each message to multiple storage nodes and is designed to handle both single-node and multi-node failures. Because brokers are stateless, moving a topic to a different broker only requires a metadata update to transfer ownership; no data is copied during the transfer.
Pulsar supports full-mesh replication, in which individual topics can be replicated to any number of external data centers and replication is done in multiple directions, all with a simple configuration setting at the tenant level. In Pulsar, all replica repairs happen in the background and are handled by the storage layer to avoid impacting other workloads being processed by brokers.
Built by the original creators of Apache Pulsar, StreamNative brings speed and peace of mind to cloud-native application projects.
We offer free on-demand courses, videos, and a hands-on lab environment.
We offer 3-day hands-on courses, designed for developers and operators. Engage with Pulsar experts and your peers as part of this learning experience.
We maintain our own enterprise-level Pulsar service to ensure rapid fixes and updates. Our team members are also the top contributors to open source Pulsar.
SLA support is included in our deployment options.
Manager IT Business Platforms at Intelcom
Tech Lead at Tencent
Senior Staff Software Engineer at Iterable