Mar 2, 2021

Pulsar Isolation Part I: Taking an In-Depth Look at How to Achieve Isolation in Pulsar

Penghui Li
Engineering Lead of Messaging Team, StreamNative
Yu Liu


Apache Pulsar's multi-layer, segment-centric architecture and hierarchical resource management provide a solid foundation for isolation. You can isolate resources in the way you want, prevent resource competition, and attain stability.

This is the first blog in our four-part series on how to achieve resource isolation in Apache Pulsar. In this blog, we give an overview of three approaches to achieving isolation in Pulsar: separate Pulsar clusters, a shared BookKeeper cluster, and a single Pulsar cluster.

Separate Pulsar clusters

In this approach, you create a separate Pulsar cluster for each isolation unit.

How it works

Figure 1 shows the deployment of separate Pulsar clusters to achieve isolation.

Figure 1 - Deployment of separate Pulsar clusters

Here are some key points for understanding how it works:


  • Each Pulsar cluster exposes its service through a DNS entry point, and clients access the cluster through that entry point. A client can use one or more of the Pulsar URLs that the cluster exposes as its service URL.
  • Each Pulsar cluster has one or multiple brokers and bookies.
  • Each Pulsar cluster has one metadata store.
  • The metadata store can be split into a Pulsar metadata store and a BookKeeper metadata store. In this guide, "metadata store" refers to both without distinguishing between them.
  • Separate Pulsar clusters use a shared configuration store.
  • Pulsar's hierarchical resource management provides a solid foundation for isolation. In this approach, to achieve namespace isolation, you specify a cluster for a namespace. The cluster must be in the allowed cluster list of the tenant. Topics under the namespace are assigned to this cluster. For how to set a cluster for a namespace and how to manage Pulsar clusters, see the Pulsar documentation.
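
The rule in the last point (a namespace can only be assigned to a cluster that its tenant allows) can be pictured with a small model. This is only an illustrative sketch in plain Python, not Pulsar code: the `Tenant` class, its method, and the tenant, namespace, and cluster names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Tenant:
    """Toy model of a tenant (hypothetical, not a Pulsar API)."""
    name: str
    allowed_clusters: Set[str]
    # namespace name -> cluster the namespace is assigned to
    namespace_clusters: Dict[str, str] = field(default_factory=dict)

    def set_namespace_cluster(self, namespace: str, cluster: str) -> None:
        """Assign a namespace to a cluster, rejecting clusters the tenant may not use."""
        if cluster not in self.allowed_clusters:
            raise ValueError(
                f"cluster {cluster!r} is not in the allowed cluster list "
                f"of tenant {self.name!r}"
            )
        self.namespace_clusters[namespace] = cluster


# Example names below are made up for illustration.
tenant = Tenant("billing", allowed_clusters={"cluster-a", "cluster-b"})
tenant.set_namespace_cluster("billing/invoices", "cluster-a")
```

Assigning `billing/invoices` to a cluster outside the tenant's list (say, `cluster-z`) raises an error in this model, which mirrors the constraint described above.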

Migrate namespace

If you want to migrate namespaces between clusters, you need to enable geo-replication for the namespaces and disable it after all data has been replicated to the target cluster. For how to set geo-replication for a namespace, see the Pulsar documentation.
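
As a rough sketch of that sequence, the helper below only builds the two `pulsar-admin namespaces set-clusters` invocations that bracket a migration; it does not run them. It assumes geo-replication is toggled by changing the namespace's replication cluster list, and the helper name and the namespace and cluster names are hypothetical. Check the flags against the CLI reference for your Pulsar version.

```python
import shlex
from typing import List


def migration_commands(namespace: str, source: str, target: str) -> List[str]:
    """Return the two command lines that bracket a namespace migration.

    Step 1 replicates the namespace to both clusters. Step 2, which should
    only be run after all data has been replicated to the target, leaves
    the target as the only cluster.
    """
    enable = ["pulsar-admin", "namespaces", "set-clusters", namespace,
              "--clusters", f"{source},{target}"]
    disable = ["pulsar-admin", "namespaces", "set-clusters", namespace,
               "--clusters", target]
    return [shlex.join(enable), shlex.join(disable)]


# Hypothetical namespace and cluster names.
commands = migration_commands("billing/invoices", "cluster-a", "cluster-b")
for command in commands:
    print(command)
```

How you confirm that replication has caught up before running the second command depends on your environment and is not modeled here.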

Scale nodes up or down

Because each cluster has its own brokers and bookies, you scale brokers or bookies up or down within the corresponding cluster.

Shared BookKeeper cluster

In this approach, you deploy one BookKeeper cluster that is shared across multiple broker clusters.

How it works

Figure 2 shows the deployment of a shared BookKeeper cluster to achieve isolation.

Figure 2 - Deployment of shared BookKeeper cluster

Here are some key points for understanding how it works:


  • Each Pulsar cluster exposes its service through a DNS entry point, and clients access the cluster through that entry point. A client can use one or more of the Pulsar URLs that the cluster exposes as its service URL.
  • Each Pulsar cluster has one or multiple brokers.
  • Each Pulsar cluster has one metadata store.
  • Separate Pulsar clusters use a shared BookKeeper cluster.
  • Pulsar's hierarchical resource management provides a solid foundation for isolation. In this approach, to achieve namespace isolation, you specify a cluster for a namespace. The cluster must be in the allowed cluster list of the tenant. Topics under the namespace are assigned to this cluster. For how to set a cluster for a namespace and how to manage Pulsar clusters, see the Pulsar documentation.
  • As shown in figure 3, storage isolation is achieved by different bookie affinity groups.
  • All bookie affinity groups use a shared BookKeeper cluster and a metadata store.
  • Each bookie affinity group has one or several bookies.
  • You can specify one or several primary and secondary groups for a namespace. Topics under the namespace are created on the bookies in the primary group first and then on the bookies in the secondary group. For how to set bookie affinity groups, see the Pulsar documentation.
Figure 3 - Storage isolation is achieved by different bookie affinity groups
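
To make the primary-then-secondary ordering concrete, here is a toy model in plain Python. It is not Pulsar's actual placement logic, which is more involved. The function, the group names, the bookie addresses, and the idea of filling an ensemble of a fixed size are all assumptions made for illustration.

```python
from typing import Dict, List


def pick_bookies(groups: Dict[str, List[str]], primary: List[str],
                 secondary: List[str], ensemble_size: int) -> List[str]:
    """Pick bookies from the primary groups first, then from the secondary groups."""
    chosen: List[str] = []
    for group in primary + secondary:
        for bookie in groups.get(group, []):
            if len(chosen) == ensemble_size:
                return chosen
            if bookie not in chosen:
                chosen.append(bookie)
    if len(chosen) < ensemble_size:
        raise RuntimeError("not enough bookies in the primary and secondary groups")
    return chosen


# Hypothetical affinity groups and bookie addresses.
groups = {
    "group-a": ["bk1:3181", "bk2:3181"],
    "group-b": ["bk3:3181", "bk4:3181"],
}
ensemble = pick_bookies(groups, primary=["group-a"], secondary=["group-b"],
                        ensemble_size=3)
```

In this example the primary group only has two bookies, so the third one comes from the secondary group. Bookies outside both groups are never considered, which is the isolation property that figure 3 illustrates.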


Migrate namespace

If you want to migrate the message service of the namespace to another broker cluster, you need to change the cluster for the namespace.

If you want to migrate the namespace to another bookie affinity group, you need to change the bookie affinity group for the namespace. For how to set a bookie affinity group, see the Pulsar documentation. Because the BookKeeper cluster is shared across all broker clusters, there is no need to copy data to another BookKeeper cluster.

Scale nodes up or down

Broker

When scaling up or scaling down brokers, you need to take the following key points into consideration:

  • When scaling up brokers, specify the broker isolation group for the newly added broker using the primary or secondary group.
  • When scaling down brokers, make sure the broker isolation group has enough brokers.

Bookie

When scaling up or scaling down bookies, you need to take the following key points into consideration:

  • When scaling up bookies, specify the bookie affinity group for the newly added bookies.
  • When scaling down bookies, make sure the bookie affinity group still has enough bookies. For how to set bookie affinity groups, see the Pulsar documentation.
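
A pre-check for the scale-down case could look like the sketch below. It is a plain-Python illustration rather than a Pulsar or BookKeeper tool, and what counts as "enough" is an assumption here: the hypothetical `min_per_group` parameter could, for example, be set to the ensemble size your namespaces use.

```python
from typing import Dict, Set


def can_remove_bookies(group_members: Dict[str, Set[str]], to_remove: Set[str],
                       min_per_group: int) -> Dict[str, bool]:
    """For each affinity group, report whether it keeps at least min_per_group bookies."""
    return {
        group: len(members - to_remove) >= min_per_group
        for group, members in group_members.items()
    }


# Hypothetical affinity groups and bookie names.
affinity_groups = {
    "group-a": {"bk1", "bk2", "bk3"},
    "group-b": {"bk4", "bk5"},
}
report = can_remove_bookies(affinity_groups, to_remove={"bk3", "bk5"},
                            min_per_group=2)
```

Here `group-a` would still have two bookies after the removal, while `group-b` would drop to one, so the plan should be revised before any bookie is decommissioned.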

Single Pulsar cluster

In this approach, you do not deploy multiple broker clusters or multiple bookie clusters. Instead, you manage a single Pulsar cluster.

How it works

Figure 4 shows the deployment of a single Pulsar cluster to achieve isolation.

Figure 4 - Deployment of single Pulsar cluster

Here are some key points for understanding how it works:


  • The Pulsar cluster exposes its service through a DNS entry point, and clients access the cluster through that entry point. A client uses the Pulsar URL that the cluster exposes as its service URL.
  • Broker isolation is achieved by different broker isolation groups (Pulsar assigns topics to the brokers in the specified broker isolation group). For how to set broker isolation groups, see the Pulsar documentation.
  • Storage isolation is achieved by different bookie affinity groups. For how to set bookie affinity groups, see the Pulsar documentation.
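
Broker isolation groups can be thought of as pattern matching: a policy names the namespaces it covers and the brokers that may serve them. The sketch below is a simplified plain-Python model, not Pulsar's implementation. The `IsolationPolicy` class, the full-match regex semantics, the broker names, and the rule that secondary brokers are used only when no primary broker is live are all assumptions for illustration.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class IsolationPolicy:
    """Toy model of a namespace isolation policy (hypothetical)."""
    namespace_patterns: List[str]
    primary_broker_patterns: List[str]
    secondary_broker_patterns: List[str]


def _matches_any(patterns: List[str], value: str) -> bool:
    return any(re.fullmatch(pattern, value) for pattern in patterns)


def eligible_brokers(policy: IsolationPolicy, namespace: str,
                     live_brokers: List[str]) -> List[str]:
    """Return the brokers allowed to serve a namespace covered by the policy."""
    if not _matches_any(policy.namespace_patterns, namespace):
        raise ValueError(f"namespace {namespace!r} is not covered by this policy")
    primary = [b for b in live_brokers
               if _matches_any(policy.primary_broker_patterns, b)]
    if primary:
        return primary
    # Simplification: fall back to secondary brokers only if no primary is live.
    return [b for b in live_brokers
            if _matches_any(policy.secondary_broker_patterns, b)]


# Hypothetical policy and broker names.
policy = IsolationPolicy(
    namespace_patterns=[r"team-a/.*"],
    primary_broker_patterns=[r"broker-a-\d+"],
    secondary_broker_patterns=[r"broker-shared-\d+"],
)
brokers = ["broker-a-1", "broker-a-2", "broker-b-1", "broker-shared-1"]
serving = eligible_brokers(policy, "team-a/orders", brokers)
```

Topics under `team-a/orders` land only on the `broker-a-*` brokers in this model, and `broker-b-1` is never a candidate.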

Migrate namespace

If you want to migrate the namespace to another broker isolation group, you need to change the namespace isolation policy. For how to set a namespace isolation policy, see the Pulsar documentation.

If you want to migrate the namespace to another bookie affinity group, you need to change the bookie affinity group for the namespace. Note that this does not move the old data to the new bookie affinity group. For how to set a bookie affinity group, see the Pulsar documentation.
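
The point about old data is easy to miss, so here is a toy model of it in plain Python. It assumes that data is written in ledgers and that each ledger stays in the group that was configured when the ledger was created. The `Namespace` class and the namespace and group names are hypothetical, and nothing here calls Pulsar.

```python
from typing import List, Tuple


class Namespace:
    """Toy model: tracks which affinity group each ledger was created in."""

    def __init__(self, name: str, affinity_group: str) -> None:
        self.name = name
        self.affinity_group = affinity_group
        self.ledgers: List[Tuple[int, str]] = []  # (ledger id, group)

    def roll_ledger(self) -> int:
        """Create a new ledger in the currently configured affinity group."""
        ledger_id = len(self.ledgers)
        self.ledgers.append((ledger_id, self.affinity_group))
        return ledger_id


# Hypothetical namespace and group names.
ns = Namespace("team-a/orders", affinity_group="group-a")
ns.roll_ledger()                 # written while group-a is configured
ns.affinity_group = "group-b"    # "migrate" the namespace
ns.roll_ledger()                 # new data goes to group-b
```

After the change, the first ledger is still recorded under `group-a`. In practice this means the old group's bookies keep serving that data until it is no longer retained.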

Scale nodes up or down

Broker

When scaling up or scaling down brokers, you need to take the following key points into consideration:

  • When scaling up brokers, specify the broker isolation group for the newly added broker using the primary or secondary group.
  • When scaling down brokers, make sure the broker isolation group has enough brokers.

Bookie

When scaling up or scaling down bookies, you need to take the following key points into consideration:

  • When scaling up bookies, specify the bookie affinity group for the newly added bookies.
  • When scaling down bookies, make sure the bookie affinity group still has enough bookies. For how to set bookie affinity groups, see the Pulsar documentation.

Reference

In production environments, you can combine these isolation approaches or use none of them, depending on your needs. When choosing an isolation approach, you can use the following points as a reference:

For some critical businesses (such as billing, ads, and so on), you can have multiple small Pulsar clusters, which do not share storage or local ZooKeeper with the other clusters. This approach provides the highest level of isolation for the most critical workloads.

For an organization that consists of multiple teams, you can deploy a single large Pulsar cluster and use different namespaces for different isolation groups. The isolation groups can be determined by capacity, but more often they are determined by workload. For example, use cases with large amounts of fanout may need different hardware than those tuned for the lowest end-to-end latency.


Penghui Li
Penghui Li is passionate about helping organizations to architect and implement messaging services. Prior to StreamNative, Penghui was a Software Engineer at Zhaopin.com, where he was the leading Pulsar advocate and helped the company adopt and implement the technology. He is an Apache Pulsar Committer and PMC member. Penghui lives in Beijing, China.
Yu Liu
Yu Liu is an Apache Pulsar PMC member and a content strategist from StreamNative.
