This blog details how to create multiple separate Pulsar clusters to isolate resources. Because this approach segregates resources and shares neither storage nor the local ZooKeeper with other clusters, it provides the highest level of isolation. Use this approach if you want to isolate critical workloads (such as billing and ads): you can create a separate cluster dedicated to each workload.
To help you get started quickly, this blog walks you through every step for the following parts:
Deploy two separate Pulsar clusters
Verify data isolation of clusters
Synchronize and migrate data between clusters (optional)
Scale up and down nodes (optional)
Deploy environment
The examples in this blog were developed on macOS (version 11.2.3, 8 GB memory).
Software requirement
Java 8
You will deploy two clusters, each of which runs the following services:
1 ZooKeeper
1 bookie
1 broker
Each cluster runs its own local ZooKeeper, bookie, and broker, and the two clusters share one configuration store.
Prepare deployment
Download Pulsar and untar the tarball. In this example, Pulsar 2.7.0 is installed.
Create empty directories using the following structure and then rename them accordingly. You can create the directories anywhere in your local environment.
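A directory layout consistent with the paths used later in this walkthrough (configuration-store/zk1, cluster1/bk1, cluster1/broker1, cluster2/broker1, and so on) looks like this:
```
configuration-store/
└── zk1/
cluster1/
├── zk1/
├── bk1/
└── broker1/
cluster2/
├── zk1/
├── bk1/
└── broker1/
```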
Copy the untarred files to each directory you created in the previous step.
Start the configuration store. The configuration store operates at the instance level and provides configuration management and task coordination across clusters. In this example, cluster1 and cluster2 share one configuration store.
Input
```
cd configuration-store/zk1
bin/pulsar-daemon start configuration-store
```
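The configuration store is itself a ZooKeeper instance, so it must listen on a port that does not clash with either cluster's local ZooKeeper. A minimal sketch of its configuration (the port 2184 and the data directory are assumptions, not values from the original setup; in Pulsar 2.7 the configuration store reads conf/global_zookeeper.conf by default):
```
# conf/global_zookeeper.conf under configuration-store/zk1
# Port the configuration store listens on; must differ from the local
# ZooKeeper ports of cluster1 and cluster2 (assumed value)
clientPort=2184
# Where this ZooKeeper instance keeps its data (assumed path)
dataDir=data/configuration-store
```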
Deploy Pulsar cluster1
Start a local ZooKeeper. For each Pulsar cluster, you need to deploy 1 local ZooKeeper to manage configurations and coordinate tasks.
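The ZooKeeper commands are not shown at this point; a minimal sketch for cluster1 (the client port 2181 and the data directory are assumptions) would be:
```
cd cluster1/zk1
# Assumed settings in conf/zookeeper.conf:
#   clientPort=2181
#   dataDir=data/zookeeper
bin/pulsar-daemon start zookeeper
```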
Deploy BookKeeper. BookKeeper provides persistent storage for messages on Pulsar. Each Pulsar broker owns its own bookie, and the BookKeeper cluster shares the local ZooKeeper with its Pulsar cluster.
(1) Configure bookies. Change the values of the following configurations in the cluster1/bk1/conf/bookkeeper.conf file.
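The concrete values are not shown here; a sketch consistent with the rest of the walkthrough (bookie1 later appears on port 3181, and cluster1's local ZooKeeper is assumed to be on 2181):
```
# Port this bookie listens on (bookie1 shows up later as 192.168.0.105:3181)
bookiePort=3181
# cluster1's local ZooKeeper (assumed port)
zkServers=localhost:2181
# Storage locations for this bookie (assumed paths)
journalDirectory=data/bookkeeper/journal
ledgerDirectories=data/bookkeeper/ledgers
```
(2) Start the bookie from cluster1/bk1 with bin/pulsar-daemon start bookie.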
Deploy Pulsar cluster2
Deploy cluster2 the same way: a local ZooKeeper, a bookie, and a broker of its own, each on its own ports. Then start the broker in cluster2.
Input
```
cd cluster2/broker1
bin/pulsar-daemon start broker
```
Verify data isolation of clusters
This section verifies whether the data in the two Pulsar clusters is isolated.
Create namespace1 and assign it to cluster1.
Tip | The format of a namespace name is <tenant-name>/<namespace-name>. For more information, see Namespaces.
Input
```
cd cluster1/broker1
bin/pulsar-admin namespaces create -c cluster1 public/namespace1
```
Check the result.
Input
```
bin/pulsar-admin namespaces list public
```
Output
```
"public/default"
"public/namespace1"
```
Set the retention policy for namespace1.
Note | If the retention policy is not set and the topic has no subscriptions, the data stored on the topic is deleted automatically after a while.
Input
```
bin/pulsar-admin namespaces set-retention -s 100M -t 3d public/namespace1
```
Create topic1 in namespace1 and write 1000 messages to this topic.
Tip | The pulsar-client is a command line tool to send and consume data. For more information, see Pulsar command line tools.
Input
```
bin/pulsar-client produce -m 'hello c1 to c2' -n 1000 public/namespace1/topic1
```
Output
```
09:56:34.504 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 1000 messages successfully produced
```
Accessing public/namespace1 from cluster2 fails. The error message shows that public/namespace1 is assigned only to cluster1. This proves that the data is isolated.
Output
```
Namespace missing local cluster name in clusters list: local_cluster=cluster2 ns=public/namespace1 clusters=[cluster1]
Reason: Namespace missing local cluster name in clusters list: local_cluster=cluster2 ns=public/namespace1 clusters=[cluster1]
```
Write data to public/namespace1/topic1 in cluster2.
Input
```
cd cluster2/broker1
bin/pulsar-client produce -m 'hello c1 to c2' -n 1000 public/namespace1/topic1
```
Output
The output shows that 0 messages are written. The attempt failed because namespace1 is assigned only to cluster1. This proves that the data is isolated.
```
12:09:50.005 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 0 messages successfully produced
```
<script>
Synchronize and migrate data between clusters
After verifying that the data is isolated, you can synchronize (using geo-replication) and migrate data between clusters.
Assign namespace1 to cluster2, that is, add cluster2 to the cluster list of namespace1. This enables geo-replication, which synchronizes the data between cluster1 and cluster2.
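The command for this step is not shown here; with pulsar-admin it would look like the following (run from cluster1/broker1):
```
# Add cluster2 to namespace1's cluster list to enable geo-replication
bin/pulsar-admin namespaces set-clusters --clusters cluster1,cluster2 public/namespace1
```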
The output shows that there are 1000 messages on topic1 in cluster2. This proves that the data stored on topic1 in cluster1 is replicated to cluster2 successfully.
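One way to produce such a check (a sketch; the subscription name is illustrative) is to consume the replicated messages through cluster2's broker:
```
cd cluster2/broker1
# Read the 1000 replicated messages from topic1 via cluster2
bin/pulsar-client consume -s sub-verify -n 1000 public/namespace1/topic1
```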
At this point, you have replicated the data from topic1 in cluster1 to cluster2 and then removed the data from topic1 in cluster1.
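For a migration (rather than an ongoing synchronization), the final step would be restricting the namespace to cluster2 again; a sketch:
```
# Remove cluster1 from namespace1's cluster list so the data is
# served and retained only by cluster2
bin/pulsar-admin namespaces set-clusters --clusters cluster2 public/namespace1
```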
Scale up and down nodes
If you need to handle increasing or decreasing workloads, you can scale up or down nodes. This section demonstrates how to scale up and scale down nodes (brokers and bookies).
Broker
Scale up brokers
In this procedure, you’ll create 2 partitioned topics on cluster1/broker1 and add 2 brokers. Then, you’ll unload the partitioned topics and check the data distribution among the 3 brokers.
Check the information about brokers in cluster1.
Input
```
cd cluster1/broker1
bin/pulsar-admin brokers list cluster1
```
Output
The output shows that broker1 is the only broker in cluster1.
```
"192.168.0.105:8080"
```
Create 2 partitioned topics on cluster1/broker1. Create 6 partitions for partitioned-topic1 and 7 partitions for partitioned-topic2.
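The commands for this step are not shown here; with pulsar-admin they would be (the public/default namespace is an assumption):
```
# 6 partitions for partitioned-topic1, 7 for partitioned-topic2
bin/pulsar-admin topics create-partitioned-topic -p 6 public/default/partitioned-topic1
bin/pulsar-admin topics create-partitioned-topic -p 7 public/default/partitioned-topic2
```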
Create two empty directories (broker2 and broker3) under the cluster1 directory. Copy the untarred files in the Pulsar repository to these two directories.
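The configuration and start commands for the new brokers are not shown here; a sketch, assuming each new broker gets distinct service ports (all port numbers below are assumptions):
```
# In cluster1/broker2/conf/broker.conf (broker3 analogous, e.g. 6652/8082):
#   clusterName=cluster1
#   zookeeperServers=localhost:2181           # cluster1's local ZooKeeper (assumed)
#   configurationStoreServers=localhost:2184  # shared configuration store (assumed)
#   brokerServicePort=6651
#   webServicePort=8081
cd cluster1/broker2
bin/pulsar-daemon start broker

# Once both new brokers are up, one way to trigger reassignment is to
# unload the namespace so the load manager redistributes its bundles
# (and with them the topic partitions) across all 3 brokers
cd ../broker1
bin/pulsar-admin namespaces unload public/default
```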
Scale down brokers
After broker3 is stopped and the topics are unloaded again, the output shows that the data stored on partitioned-topic1 is distributed evenly between broker1 and broker2, which means that the data previously served by broker3 has been redistributed.
Bookie
Scale up bookies
In this procedure, you’ll add 2 bookies to cluster1. Then, you’ll write data to topic1 and check whether the replicas are saved.
Check the information about bookies in cluster1.
Input
```
cd cluster1/bk1
bin/bookkeeper shell listbookies -rw -h
```
Output
The output shows that bookie1 is the only bookie in cluster1.
```
12:31:34.933 [main] INFO org.apache.bookkeeper.tools.cli.commands.bookies.ListBookiesCommand - ReadWrite Bookies :
12:31:34.946 [main] INFO org.apache.bookkeeper.tools.cli.commands.bookies.ListBookiesCommand - BookieID:192.168.0.105:3181, IP:192.168.0.105, Port:3181, Hostname:192.168.0.105
```
Allow 3 bookies to serve.
Change the values of the following configurations in the cluster1/broker1/conf/broker.conf file.
```
# Number of bookies to use when creating a ledger
managedLedgerDefaultEnsembleSize=3
# Number of copies to store for each message
managedLedgerDefaultWriteQuorum=3
# Number of guaranteed copies (acks to wait for before a write completes)
managedLedgerDefaultAckQuorum=2
```
Restart broker1 to enable the configurations.
Input
```
cd cluster1/broker1
bin/pulsar-daemon stop broker
bin/pulsar-daemon start broker
```
Set the retention policy for the messages in public/default.
Note | If the retention policy is not set and the topic has no subscriptions, the data stored on the topic is deleted automatically after a while.
Input
```
cd cluster1/broker1
bin/pulsar-admin namespaces set-retention -s 100M -t 3d public/default
```
Create topic1 in public/default and write 100 messages to this topic.
Input
```
bin/pulsar-client produce -m 'hello' -n 100 topic1
```
Output
The data is not written successfully because there are not enough bookies: the ensemble size is now 3, but only 1 bookie is running.
```
...
12:40:38.886 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ClientCnx - [id: 0x56f92aff, L:/192.168.0.105:53069 - R:/192.168.0.105:6650] Received error from server: org.apache.bookkeeper.mledger.ManagedLedgerException: Not enough non-faulty bookies available
...
12:40:38.886 [main] ERROR org.apache.pulsar.client.cli.PulsarClientTool - Error while producing messages
...
12:40:38.890 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 0 messages successfully produced
```
Add bookie2 and bookie3.
(1) Prepare for deployment.
Create two empty directories (bk2 and bk3) under the cluster1 directory. Copy the untarred files in the Pulsar repository to these two directories.
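The configuration and start commands for the new bookies are not shown here; a sketch consistent with the ports that appear later (bookie2 on 3183, bookie3 on 3184; the ZooKeeper port and data paths are assumptions):
```
# In cluster1/bk2/conf/bookkeeper.conf (bk3 analogous, with bookiePort=3184):
#   bookiePort=3183
#   zkServers=localhost:2181                  # cluster1's local ZooKeeper (assumed)
#   journalDirectory=data/bookkeeper/journal  # assumed; must not clash with bk1
#   ledgerDirectories=data/bookkeeper/ledgers # assumed
cd cluster1/bk2
bin/pulsar-daemon start bookie
```
After bookie2 and bookie3 join, producing to topic1 again should succeed, with each message stored on 3 bookies as configured above.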
Scale down bookies
Tip | The following steps continue from the previous section “Scale up bookies”.
In this procedure, you’ll remove 2 bookies. Then, you’ll write data to topic2 and check where the data is saved.
Allow 1 bookie to serve. Change the values of the following configurations in the cluster1/broker1/conf/broker.conf file.
```
# Number of bookies to use when creating a ledger
managedLedgerDefaultEnsembleSize=1
# Number of copies to store for each message
managedLedgerDefaultWriteQuorum=1
# Number of guaranteed copies (acks to wait for before a write completes)
managedLedgerDefaultAckQuorum=1
```
Restart broker1 to enable the configurations.
Input
```
cd cluster1/broker1
bin/pulsar-daemon stop broker
bin/pulsar-daemon start broker
```
Check the information about bookies in cluster1.
Input
```
cd cluster1/bk1
bin/bookkeeper shell listbookies -rw -h
```
Output
All three bookies are running in cluster1, including bookie1 (3181), bookie2 (3183), and bookie3 (3184).
```
...
15:47:41.370 [main] INFO org.apache.bookkeeper.tools.cli.commands.bookies.ListBookiesCommand - ReadWrite Bookies :
15:47:41.382 [main] INFO org.apache.bookkeeper.tools.cli.commands.bookies.ListBookiesCommand - BookieID:192.168.0.105:3183, IP:192.168.0.105, Port:3183, Hostname:192.168.0.105
15:47:41.383 [main] INFO org.apache.bookkeeper.tools.cli.commands.bookies.ListBookiesCommand - BookieID:192.168.0.105:3184, IP:192.168.0.105, Port:3184, Hostname:192.168.0.105
15:47:41.384 [main] INFO org.apache.bookkeeper.tools.cli.commands.bookies.ListBookiesCommand - BookieID:192.168.0.105:3181, IP:192.168.0.105, Port:3181, Hostname:192.168.0.105
...
```
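The rest of the procedure is not spelled out here; a sketch of how it would continue (the topic name topic2 comes from the procedure description above, and the message count is illustrative):
```
cd cluster1/broker1
# With an ensemble size of 1, each entry of topic2 lands on a single bookie
bin/pulsar-client produce -m 'hello' -n 100 topic2

# Stop the bookies being removed so that only bookie1 keeps serving
cd ../bk2
bin/pulsar-daemon stop bookie
cd ../bk3
bin/pulsar-daemon stop bookie
```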
Ran Gao
Ran Gao is a software engineer at StreamNative. Before that, he was responsible for developing the search service at Zhaopin.com, and prior to that he worked on the logistics system at JD Logistics. Interested in open source and messaging systems, Ran is an Apache Pulsar committer.
Yu Liu
Yu Liu is an Apache Pulsar PMC member and a content strategist at StreamNative.