14 min

Make The Most Out Of Your Pulsar Catalog

Apache Pulsar is a distributed, open source pub-sub messaging and streaming platform for real-time workloads. The Apache Pulsar DataStream API connector for Apache Flink is maintained by the Apache Flink community. However, the Apache Pulsar Table API connector and the Apache Pulsar Catalog are still released as part of StreamNative's fork of Apache Flink.

Currently, Apache Pulsar Catalog implements two separate mechanisms. The first, named native tables, automatically infers the metadata of already available Apache Pulsar tenants, namespaces, and topics. The second, named explicit tables, persists the metadata of tables registered via Flink SQL/Table API in an Apache Pulsar topic.

For this session, we only want a version of the Apache Pulsar Catalog that provides the native tables integration for persisting the metadata information of the tables registered via Flink SQL. Since no such separation exists yet within Apache Pulsar Catalog, we use a custom implementation of the catalog, which is essentially a read-only version for native tables. The current implementation is available in a dedicated fork of Apache Flink [1], based on the 1.15.2 release.

This session recording was originally presented at Pulsar Summit North America 2023.

Ali Zeybek
Team Lead Solutions Architect, Ververica

