Video

14 min

Make The Most Out Of Your Pulsar Catalog

Apache Pulsar is a distributed, open source pub-sub messaging and streaming platform for real-time workloads. Apache Flink's Apache Pulsar DataStream API connector is maintained by the Apache Flink community. However, the Apache Pulsar Table API connector and the Apache Pulsar Catalog are still released as part of StreamNative's fork of Apache Flink.

Currently, the Apache Pulsar Catalog implements two separate mechanisms. The first, named native tables, automatically infers the metadata for already available Apache Pulsar tenants, namespaces, and topics. The second, named explicit tables, persists the metadata of tables registered via Flink SQL/Table API in an Apache Pulsar topic.
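To make the native tables idea concrete, here is a minimal Python sketch of how a fully qualified Pulsar topic name could be turned into a catalog database and table. It assumes that a `tenant/namespace` pair maps to a database and the topic's local name maps to a table. The function and type names are hypothetical and chosen for illustration only; this is not the connector's actual code.

```python
from typing import NamedTuple


class NativeTable(NamedTuple):
    """Hypothetical result type: where a topic would appear in the catalog."""
    database: str  # assumption: "<tenant>/<namespace>"
    table: str     # assumption: the topic's local name


def topic_to_native_table(topic: str) -> NativeTable:
    """Map a fully qualified Pulsar topic name to a (database, table) pair."""
    # Fully qualified topics look like "persistent://tenant/namespace/topic".
    scheme, sep, rest = topic.partition("://")
    if not sep or scheme not in ("persistent", "non-persistent"):
        raise ValueError(f"not a fully qualified topic name: {topic!r}")

    parts = rest.split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected tenant/namespace/topic, got: {rest!r}")

    tenant, namespace, local_name = parts
    return NativeTable(database=f"{tenant}/{namespace}", table=local_name)
```

Under these assumptions, `topic_to_native_table("persistent://public/default/orders")` yields the database `public/default` and the table `orders`, so no table has to be registered by hand before it can be queried.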

For the purpose of this session, we only want a version of the Apache Pulsar Catalog that serves the purpose of the native tables integration for persisting the metadata information of the tables registered via Flink SQL. Since no such separation exists yet within the Apache Pulsar Catalog, we will use a custom implementation of it, which is basically a read-only version for the native tables. The current implementation is available in a specific fork of Apache Flink [1] based on the 1.15.2 release.

This session recording was originally presented at Pulsar Summit North America 2023.

Ali Zeybek

Team Lead Solutions Architect, Ververica