Designing Load-Manager for Stateful Streaming Storage Systems (Apache Pulsar)
Apache Pulsar is a horizontally scalable messaging/streaming system, so the traffic in a logical cluster must be balanced across all the available Pulsar brokers as evenly as possible, in order to ensure full utilization of the broker layer.
In this talk, we will walk you through the capabilities and design of the broker load balancer in Apache Pulsar. Users can control the traffic distribution through several settings (topic bundles, bundle assignment, bundle unload, and bundle split), but using them well requires some context on how traffic is managed in Pulsar.
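As a rough illustration of the bundle idea, the sketch below treats a bundle as a contiguous range of a 32-bit hash space, maps a topic to the bundle whose range contains the topic's hash, and splits a bundle at its midpoint. It is a toy model under stated assumptions, not Pulsar's implementation: `zlib.crc32` is only a stand-in hash, and the helper names (`make_bundles`, `bundle_for`, `split_bundle`) and the example topic name are hypothetical.

```python
import bisect
import zlib

HASH_SPACE = 2 ** 32  # assumed 32-bit hash space for this sketch


def make_bundles(n):
    """Return sorted boundaries that cut the hash space into n contiguous ranges."""
    return [i * HASH_SPACE // n for i in range(n + 1)]


def topic_hash(topic):
    """Stand-in hash (CRC32), chosen only to keep the sketch stdlib-only."""
    return zlib.crc32(topic.encode("utf-8"))


def bundle_for(topic, boundaries):
    """Return the (low, high) range, low inclusive and high exclusive, owning the topic."""
    i = bisect.bisect_right(boundaries, topic_hash(topic)) - 1
    return boundaries[i], boundaries[i + 1]


def split_bundle(boundaries, bundle):
    """Split one bundle at its midpoint and return the new boundary list."""
    low, high = bundle
    i = bisect.bisect_left(boundaries, low)
    if i + 1 >= len(boundaries) or boundaries[i] != low or boundaries[i + 1] != high:
        raise ValueError("unknown bundle")
    if high - low < 2:
        raise ValueError("bundle too small to split")
    mid = (low + high) // 2
    return boundaries[: i + 1] + [mid] + boundaries[i + 1:]


# Hypothetical topic name, used only for illustration.
topic = "persistent://tenant/ns/orders"
boundaries = make_bundles(4)
before = bundle_for(topic, boundaries)
after = bundle_for(topic, split_bundle(boundaries, before))
```

In this model a load balancer moves whole bundles between brokers rather than individual topics, and splitting a hot bundle gives it two smaller units that can be placed independently.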
More recently, the Pulsar community developed a new version of the broker load balancer, based on topics and TableViews. https://github.com/apache/pulsar/issu.... In this project, we introduced Service Unit (Bundle) State and Load Data topics, which broadcast bundle ownership and load data among brokers for load balancing. These topics greatly reduce the data dependency on the metadata store (ZK), but they require us to resolve other distributed data issues, such as conflict resolution, consistency, recovery, and compaction. We will talk about the solutions to these data issues in this session.
At the end of the talk, you will have a better understanding of how Pulsar's broker-level auto-balancing works.
This session was originally presented at Pulsar Summit North America 2023.