We are excited to announce the general availability of the Google Cloud Pub/Sub connector for Apache Pulsar. The connector enables seamless integration between Google Cloud Pub/Sub and Apache Pulsar, improving the diversity of the Apache Pulsar ecosystem.
What is the Google Cloud Pub/Sub connector?
The Google Cloud Pub/Sub connector is a Pulsar IO connector enabling data replication between Google Cloud Pub/Sub and Apache Pulsar. The connector provides two ways to import and export data between the systems: Source and Sink.
Why did StreamNative develop the Google Cloud Pub/Sub connector?
Apache Pulsar and Google Cloud Pub/Sub are two of the most popular and widely used messaging platforms in modern cloud environments. Apache Pulsar’s unified platform enables queueing data, analytics, and streaming in one underlying system. Google Cloud Pub/Sub is known for efficient performance, a powerful ecosystem in streaming analytics, and the capability of in-order delivery at scale.
Historically, however, users did not have a simple and reliable way of performing fully-featured messaging and streaming in one cloud pub/sub system, so they compensated by investing significant development effort to bridge the gaps.
The new StreamNative connector gives Google Cloud Pub/Sub users a way to route the flow of messages into Pulsar and use features unavailable elsewhere, while avoiding the connectivity problems that can appear when there are intrinsic differences¹ between systems or privacy requirements. The connector solves this problem by fully integrating with the rest of Pulsar’s system (including serverless functions, per-message processing, and event-stream processing). It presents a low-code solution with out-of-the-box capabilities such as multi-tenant connectivity, geo-replication, protocols for direct connection to end-user mobile or IoT clients, and more. These features are essential for two-way event traffic.
What are the benefits of using the Google Cloud Pub/Sub connector?
The integration between Google Cloud Pub/Sub and Apache Pulsar results in 3 key benefits.
Easy. You can quickly move data between Apache Pulsar and Google Cloud Pub/Sub without writing any code.
Efficient. You spend less time on the data layer and more time extracting business value from real-time data.
Scalable. You can run this connector on any node (standalone or distributed), allowing you to build reactive data pipelines that meet your business and operational needs in real time.
How do I start using the Google Cloud Pub/Sub connector?
You can be up and running with the connector in 3 easy steps:
Configure the services and download the connector
Configure the source connector
Configure the sink connector
Before you start
First, you must run an Apache Pulsar cluster and a Google Cloud Pub/Sub service.
Prepare the Pulsar service. You can quickly run a Pulsar cluster anywhere by running $PULSAR_HOME/bin/pulsar standalone. See Getting Started with Pulsar for details. Alternatively, get started with StreamNative Cloud, which provides an easy-to-use and fully-managed Pulsar service in the public cloud.
Set up the Google Cloud Pub/Sub connector. Download the connector from the Releases page, and then move it to $PULSAR_HOME/connectors.
Apache Pulsar provides a Pulsar IO feature to run the connector. Follow the steps below to quickly get the connector up and running.
Configure the source connector
Create a configuration file named google-pubsub-source-config.json to send messages from the pulsar-io-google-pubsub/test-google-pubsub-source topic of Google Cloud Pub/Sub to the public/default/test-google-pubsub-source topic of Apache Pulsar:
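The file can be written by hand. As a minimal sketch, the Python snippet below generates it so that each assumption can be commented. The top-level fields (`tenant`, `namespace`, `name`, `topicName`, `archive`, `parallelism`, `configs`) follow the usual Pulsar IO source layout. The connector-specific keys `pubsubProjectId` and `pubsubTopicId`, the connector name, the archive file name, and the reading of `pulsar-io-google-pubsub` as the Google Cloud project ID are assumptions, so check them against the README of the release you downloaded.

```python
import json
from pathlib import Path

# Assumption: placeholder archive path. Use the .nar file name of the
# release you actually moved into $PULSAR_HOME/connectors.
CONNECTOR_ARCHIVE = "connectors/pulsar-io-google-pubsub.nar"


def build_source_config() -> dict:
    """Return a Pulsar IO source config for the Pub/Sub connector (sketch)."""
    return {
        # Standard Pulsar IO source fields.
        "tenant": "public",
        "namespace": "default",
        "name": "google-pubsub-source",  # hypothetical connector instance name
        "topicName": "test-google-pubsub-source",  # Pulsar topic that receives the messages
        "archive": CONNECTOR_ARCHIVE,
        "parallelism": 1,
        # Assumption: connector-specific key names; verify against the README.
        "configs": {
            "pubsubProjectId": "pulsar-io-google-pubsub",
            "pubsubTopicId": "test-google-pubsub-source",
        },
    }


def write_config(config: dict, path: Path) -> None:
    """Write the config as pretty-printed JSON."""
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")


if __name__ == "__main__":
    write_config(build_source_config(), Path("google-pubsub-source-config.json"))
```

With the file in place, the source could be started locally with `$PULSAR_HOME/bin/pulsar-admin sources localrun --source-config-file google-pubsub-source-config.json`. Google Cloud credentials are not covered in this sketch.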
Configure the sink connector
Create a configuration file named google-pubsub-sink-config.json to send messages from the public/default/test-google-pubsub-sink topic of Apache Pulsar to the pulsar-io-google-pubsub/test-google-pubsub-sink topic of Google Cloud Pub/Sub:
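As with the source, the sink file can be written by hand; the Python sketch below generates it so that the assumptions are visible. A Pulsar IO sink config lists its input topics under `inputs` instead of `topicName`. The connector-specific keys `pubsubProjectId` and `pubsubTopicId`, the connector name, and the archive file name are assumptions to verify against the connector's README.

```python
import json
from pathlib import Path

# Assumption: placeholder archive path. Use the .nar file name of the
# release you actually moved into $PULSAR_HOME/connectors.
CONNECTOR_ARCHIVE = "connectors/pulsar-io-google-pubsub.nar"


def build_sink_config() -> dict:
    """Return a Pulsar IO sink config for the Pub/Sub connector (sketch)."""
    return {
        # Standard Pulsar IO sink fields.
        "tenant": "public",
        "namespace": "default",
        "name": "google-pubsub-sink",  # hypothetical connector instance name
        "inputs": ["test-google-pubsub-sink"],  # Pulsar topic(s) the sink reads from
        "archive": CONNECTOR_ARCHIVE,
        "parallelism": 1,
        # Assumption: connector-specific key names; verify against the README.
        "configs": {
            "pubsubProjectId": "pulsar-io-google-pubsub",
            "pubsubTopicId": "test-google-pubsub-sink",
        },
    }


def write_config(config: dict, path: Path) -> None:
    """Write the config as pretty-printed JSON."""
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")


if __name__ == "__main__":
    write_config(build_sink_config(), Path("google-pubsub-sink-config.json"))
```

With the file in place, the sink could be started locally with `$PULSAR_HOME/bin/pulsar-admin sinks localrun --sink-config-file google-pubsub-sink-config.json`.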
When you send a message to the public/default/test-google-pubsub-sink topic of Apache Pulsar, this message is persisted to the pulsar-io-google-pubsub/test-google-pubsub-sink topic of Google Cloud Pub/Sub.
The Google Cloud Pub/Sub connector is a major step in the journey of integrating other messaging systems into the Pulsar ecosystem. To get involved with the Google Cloud Pub/Sub connector for Apache Pulsar, check out the following featured resources:
Try out the Google Cloud Pub/Sub connector. To get started, download the connector and refer to the README, which walks you through the whole process.
Make a contribution. The Google Cloud Pub/Sub connector is a community-driven project whose source code is hosted in the StreamNative GitHub repository. We would love you to explore this new connector and contribute to its evolution. If you have feature requests or bug reports, do not hesitate to share your feedback and ideas or submit a pull request.
¹ Intrinsic differences exist between platforms that have no notion of schema and the ones that have sophisticated schema capabilities because there is no simple way to translate between them. These platform differences range from traditional messaging like Amazon SQS to multi-level hierarchical Avro schema written to a data lake. Distinctions also exist between platforms relying on different data representations, such as Pandas DataFrames and simple messages.
Zixuan Liu is a software engineer at StreamNative. He mainly contributes to Apache Pulsar and Pulsarctl, focusing on the security field.