We are very glad to see that the Apache Pulsar community has successfully released version 2.5.1. This release is the result of a huge effort from the community, with over 130 commits and a long list of new features, general improvements, and bug fixes.
The following highlights a tiny subset of new features.
Refresh authentication credentials
In Pulsar 2.5.1, two more methods are introduced in the AuthenticationState interface, the single credentials holder. This enhances the Pulsar authentication framework to support credentials that expire over time and must be refreshed by forcing clients to re-authenticate.
Existing authentication plugins are unaffected. If a new plugin wants to support expiration, it only needs to override the isExpired() method. The Pulsar broker periodically checks the expiration status of the AuthenticationState of every ServerCnx object. You can use the authenticationRefreshCheckSeconds setting to control the frequency of the expiration check.
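As an illustration of the expiration check, the sketch below models a credentials holder whose isExpired() method compares the current time with an expiry timestamp. It does not use the Pulsar API: ExpiringState, AuthRefreshSketch, the check helper, and the "now" parameter are hypothetical names chosen for this example, and only the isExpired() method name comes from the text above.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical stand-in for a credentials holder; NOT the real AuthenticationState.
class ExpiringState {
    private final Instant expiresAt;

    ExpiringState(Instant expiresAt) {
        this.expiresAt = expiresAt;
    }

    // A plugin that supports expiration would override a method like this.
    // The current time is passed in (an assumption) so the sketch is deterministic.
    boolean isExpired(Instant now) {
        return !now.isBefore(expiresAt);
    }
}

class AuthRefreshSketch {
    // What a periodic check could decide for one connection (assumed behavior).
    static String check(ExpiringState state, Instant now) {
        return state.isExpired(now) ? "REAUTHENTICATE" : "OK";
    }

    public static void main(String[] args) {
        Instant issued = Instant.parse("2020-04-01T00:00:00Z");
        ExpiringState state = new ExpiringState(issued.plus(Duration.ofMinutes(10)));
        System.out.println(check(state, issued.plus(Duration.ofMinutes(5))));
        System.out.println(check(state, issued.plus(Duration.ofMinutes(15))));
    }
}
```

In this sketch, a plugin without expiring credentials would simply always return false from isExpired(), which matches the statement that existing plugins are unaffected.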
Upgrade Avro to 1.9.1
The library used to handle logical datetime values has changed from Joda-Time to JSR-310. To preserve forward compatibility, the Pulsar Java client uses the Joda-Time conversion for logical datetime values. To use the JSR-310 conversion, you can enable it in the schema definition.
By default, Avro 1.9.1 enables JSR-310 datetimes, which might introduce regressions if you use source code that was generated by the Avro 1.8.x compiler and contains datetime fields. It is recommended to recompile with the Avro 1.9.x compiler. Avro may remove Joda-Time support in the future, in which case it may also be removed from Pulsar.
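To make the JSR-310 side of this change concrete, the sketch below converts the raw values of two Avro logical types into java.time (JSR-310) objects using only the JDK. It is not Pulsar or Avro code: Jsr310Sketch and its method names are hypothetical, and the example relies only on the Avro definitions of timestamp-millis (milliseconds since the epoch) and date (days since the epoch).

```java
import java.time.Instant;
import java.time.LocalDate;

class Jsr310Sketch {
    // Avro's timestamp-millis logical type stores milliseconds since the Unix epoch.
    static Instant fromTimestampMillis(long millis) {
        return Instant.ofEpochMilli(millis);
    }

    // Avro's date logical type stores days since the Unix epoch.
    static LocalDate fromDate(int days) {
        return LocalDate.ofEpochDay(days);
    }

    public static void main(String[] args) {
        System.out.println(fromTimestampMillis(0L)); // 1970-01-01T00:00:00Z
        System.out.println(fromDate(1));             // 1970-01-02
    }
}
```

Code generated by the Avro 1.8.x compiler expects Joda-Time types for the same fields, which is why mixing the two can cause the regressions mentioned above.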
Support unloading all partitions of a partitioned topic
Before Pulsar 2.5.1, Pulsar supported unloading only a non-partitioned topic or a single partition of a partitioned topic. For a partitioned topic with many partitions, users had to get all the partitions and unload them one by one. Pulsar 2.5.1 supports unloading all partitions of a partitioned topic in one operation.
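To show what "one by one" meant in practice, the sketch below lists the per-partition topic names that previously had to be unloaded individually. It assumes the usual Pulsar naming convention in which partition N of a topic is named with the suffix -partition-N. PartitionNamesSketch and the example topic name are hypothetical, and no admin call is made.

```java
import java.util.ArrayList;
import java.util.List;

class PartitionNamesSketch {
    // Builds the internal topic name of every partition of a partitioned topic.
    static List<String> partitionNames(String topic, int partitions) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < partitions; i++) {
            names.add(topic + "-partition-" + i);
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical topic with three partitions.
        for (String name : partitionNames("persistent://public/default/orders", 3)) {
            System.out.println("unload " + name);
        }
    }
}
```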
Support evenly distributing topic count when splitting bundles
In Pulsar 2.5.1, we introduce an option (-balance-topic-count) for bundle split. When this option is set to true, the given bundle is split into two parts that each contain the same number of topics. In addition, we bring in a new Load Manager implementation named org.apache.pulsar.broker.loadbalance.impl.BalanceTopicCountModularLoadManager, which splits bundles by balanced topic count.
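The difference between splitting a bundle at the midpoint of its hash range and splitting it by topic count can be sketched as follows. This is a simplified illustration, not the broker's implementation: BalancedSplitSketch, its method names, the sample hash values, and the assumed range [0, 2000) are all hypothetical, and the sketch assumes every topic has a distinct hash.

```java
import java.util.Arrays;

class BalancedSplitSketch {
    // Picks a boundary so that the topics below it and the topics at or above it
    // differ in count by at most one (assuming distinct hashes).
    static long balancedBoundary(long[] topicHashes) {
        if (topicHashes.length < 2) {
            throw new IllegalArgumentException("need at least two topics to split");
        }
        long[] sorted = topicHashes.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }

    // Counts the topics that would land in the lower half for a given boundary.
    static int countBelow(long[] topicHashes, long boundary) {
        int count = 0;
        for (long hash : topicHashes) {
            if (hash < boundary) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        long[] hashes = {10, 20, 30, 1000};
        long rangeMidpoint = 1000; // midpoint of the assumed bundle range [0, 2000)
        System.out.println("range split, lower half: " + countBelow(hashes, rangeMidpoint));
        long boundary = balancedBoundary(hashes);
        System.out.println("balanced split, lower half: " + countBelow(hashes, boundary));
    }
}
```

With skewed hashes like these, the range midpoint leaves three topics in one half and one in the other, while the count-based boundary leaves two in each.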
Support reading keyValue schema data in Pulsar SQL
Before Pulsar 2.5.1, Pulsar SQL could not read keyValue schema data. In Pulsar 2.5.1, we add the prefix key. to key field names and the prefix value. to value field names, so Pulsar SQL can read keyValue schema data.
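The prefixing rule can be sketched in a few lines. This is an illustration of the naming scheme only, not the Pulsar SQL connector code: KeyValueColumnsSketch, its method, and the sample field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class KeyValueColumnsSketch {
    // Builds column names by prefixing key fields with "key." and value fields with "value.".
    static List<String> columns(List<String> keyFields, List<String> valueFields) {
        List<String> result = new ArrayList<>();
        for (String field : keyFields) {
            result.add("key." + field);
        }
        for (String field : valueFields) {
            result.add("value." + field);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical schemas: both the key and the value have a field named "id".
        System.out.println(columns(Arrays.asList("id"), Arrays.asList("id", "name")));
    }
}
```

One effect of the prefixes is that a field name used in both the key and the value, such as id here, maps to two distinct column names.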
Update Netty version to 4.1.45.Final
Netty 4.1.43 has a bug that prevents it from using the Linux native Epoll transport. As a result, Pulsar brokers fall back to NioEventLoopGroup even when running on Linux. The bug is fixed in Netty 4.1.45.Final.
Improve Key_Shared subscription message dispatching performance
In Pulsar 2.5.1, we make the following changes to reduce CPU usage in Key_Shared subscription message dispatching, which improves dispatch performance for non-batched messages:
Reduce the number of hash computations for message keys.
Reduce the number of consumer lookups for message keys.
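One way to picture both savings is to hash each distinct key once per dispatch batch and reuse the selected consumer for later messages with the same key. The sketch below illustrates that idea only and does not reproduce the broker's dispatcher: KeySharedDispatchSketch, its methods, the modulo-based consumer selection, and the lookups counter are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class KeySharedDispatchSketch {
    int lookups = 0; // counts hash-and-select operations, for illustration only

    // Hypothetical selector: hashes the key and maps it onto one of the consumers.
    int selectConsumer(String key, int consumerCount) {
        lookups++;
        return Math.floorMod(key.hashCode(), consumerCount);
    }

    // Groups a batch of message keys by consumer, selecting a consumer once per distinct key.
    Map<Integer, List<String>> groupByConsumer(List<String> keys, int consumerCount) {
        Map<String, Integer> consumerByKey = new HashMap<>();
        Map<Integer, List<String>> byConsumer = new LinkedHashMap<>();
        for (String key : keys) {
            Integer consumer = consumerByKey.get(key);
            if (consumer == null) {
                consumer = selectConsumer(key, consumerCount);
                consumerByKey.put(key, consumer);
            }
            byConsumer.computeIfAbsent(consumer, c -> new ArrayList<>()).add(key);
        }
        return byConsumer;
    }

    public static void main(String[] args) {
        KeySharedDispatchSketch dispatcher = new KeySharedDispatchSketch();
        List<String> keys = List.of("a", "b", "a", "a", "b");
        System.out.println(dispatcher.groupByConsumer(keys, 2));
        System.out.println("lookups: " + dispatcher.lookups);
    }
}
```

In this example five messages need only two hash-and-select operations, because there are only two distinct keys in the batch.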
Add Joda time logical type conversion
In Pulsar 2.5.1, Avro is upgraded to 1.9.x and the default time conversion is changed to JSR-310. For forward compatibility, we add the Joda-Time conversion in Pulsar 2.5.1 and enable it by default.
Support deleting inactive topics when subscriptions are caught up
Before Pulsar 2.5.1, Pulsar supported deleting only inactive topics that have no active producers and no subscriptions. In Pulsar 2.5.1, we expose an inactive topic delete mode in broker.conf that deletes inactive topics that have no active producers or consumers, provided that all subscriptions of the topic are caught up. You can enable this feature in broker.conf.
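The two deletion rules can be compared with a small decision function. This is a sketch of the rules as described in the paragraph above, not broker code: InactiveTopicSketch, the Mode enum and its constant names, and the canDelete signature are hypothetical, and each array element stands for the backlog of one subscription.

```java
class InactiveTopicSketch {
    // Hypothetical mode names; the real broker.conf values may differ.
    enum Mode { NO_SUBSCRIPTIONS, SUBSCRIPTIONS_CAUGHT_UP }

    // Decides whether an inactive topic may be deleted.
    // backlogs holds one entry per subscription (number of unconsumed messages).
    static boolean canDelete(Mode mode, int producers, int consumers, long[] backlogs) {
        if (producers > 0) {
            return false; // both rules require no active producers
        }
        if (mode == Mode.NO_SUBSCRIPTIONS) {
            return backlogs.length == 0; // earlier behavior: no subscriptions at all
        }
        if (consumers > 0) {
            return false; // new mode also requires no active consumers
        }
        for (long backlog : backlogs) {
            if (backlog > 0) {
                return false; // at least one subscription is not caught up
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long[] caughtUp = {0, 0};
        System.out.println(canDelete(Mode.NO_SUBSCRIPTIONS, 0, 0, caughtUp));
        System.out.println(canDelete(Mode.SUBSCRIPTIONS_CAUGHT_UP, 0, 0, caughtUp));
    }
}
```

The same topic with two idle, fully caught-up subscriptions is kept under the first rule and deleted under the second.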
Introduce maxMessagePublishBufferSizeInMB configuration to avoid broker OOM
Before Pulsar 2.5.1, if a broker had a small amount of direct memory (for example, 2 GB) and pulsar-perf was used to write messages, the broker became unstable. This is because the broker reads messages from the channel automatically, and a ByteBuf cannot be released until the entry is written to the bookie successfully or the timeout expires.
In Pulsar 2.5.1, we introduce the maxMessagePublishBufferSizeInMB configuration to avoid broker OOM (Out of Memory). If the total size of the messages being processed exceeds this value, the broker stops reading data from the connection. When the available size is again greater than half of maxMessagePublishBufferSizeInMB, the broker automatically resumes reading data from the connection. You can set the publish buffer size in broker.conf:
# Max memory size for broker handling messages sending from producers.
# If the processing message size exceed this value, broker will stop read data
# from the connection. The processing messages means messages are sends to broker
# but broker have not send response to client, usually waiting to write to bookies.
# It's shared across all the topics running in the same broker.
# Use -1 to disable the memory limitation. Default is 1/2 of direct memory.
maxMessagePublishBufferSizeInMB=
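The pause and resume rule described for maxMessagePublishBufferSizeInMB can be sketched as a small state machine. This is a simplified model, not the broker's implementation: PublishBufferSketch and its method names are hypothetical, sizes are plain byte counts, and a single instance stands in for the buffer shared by all topics on a broker.

```java
class PublishBufferSketch {
    private final long maxBytes;
    private long pendingBytes = 0;
    private boolean reading = true;

    PublishBufferSketch(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    // A message arrived from a producer and is waiting to be written to the bookies.
    void onMessageReceived(long size) {
        pendingBytes += size;
        if (pendingBytes > maxBytes) {
            reading = false; // stop reading from the connection
        }
    }

    // A message was persisted (or timed out) and its buffer was released.
    void onMessageCompleted(long size) {
        pendingBytes -= size;
        // Resume once the available size exceeds half of the limit,
        // that is, once pending bytes drop below half of the limit.
        if (!reading && pendingBytes < maxBytes / 2) {
            reading = true;
        }
    }

    boolean isReading() {
        return reading;
    }

    public static void main(String[] args) {
        PublishBufferSketch buffer = new PublishBufferSketch(100);
        buffer.onMessageReceived(60);
        buffer.onMessageReceived(50);
        System.out.println(buffer.isReading());
        buffer.onMessageCompleted(50);
        buffer.onMessageCompleted(20);
        System.out.println(buffer.isReading());
    }
}
```

Resuming at half of the limit rather than at the limit itself leaves a gap between the two thresholds, so the broker does not flip between paused and reading on every message.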
Support BouncyCastle FIPS provider
Pulsar 2.5.1 supports the BC-FIPS (BouncyCastle FIPS) provider. Before Pulsar 2.5.1, Pulsar supported only the BouncyCastle (BC) provider, and the BC JARs were tightly coupled to both the broker and the client code, so users could not switch from the BC provider to the BC-FIPS provider. This feature splits the BC dependency out into a separate module, so users can freely switch between the BC provider and the BC-FIPS provider.
Allow tenant Admin to manage subscription permission
In previous releases, we added support for granting subscriber permission to manage subscription-based APIs. However, the grant-subscription-permission API requires super-user access, which creates too much dependency on the system admin when many tenants want to grant subscription permission. In Pulsar 2.5.1, each tenant admin can manage subscription permission through the REST API or Pulsar Admin, which reduces the administrative effort for super users.
Allow enabling/disabling delayed delivery for messages on a namespace
In Pulsar 2.5.1, we add the set-delayed-delivery and set-delayed-delivery-time policies for namespaces. Pulsar 2.5.1 therefore allows you to enable or disable delayed delivery for messages at the namespace level.
Support offloader at namespace level
In previous releases, the offload operation had only cluster-level configuration, and users could not set the offload configuration at the namespace level. In Pulsar 2.5.1, we support using Pulsar Admin to set the offloader at the namespace level.
Disallow sub auto creation by Admin when disabling topic auto creation
In previous releases, when auto topic creation was disabled in KoP, non-partitioned topics were still created with the Flink Pulsar Source. To fix this bug, Pulsar 2.5.1 changes the admin code to disable sub auto creation by the Admin when auto topic creation is disabled.
Support Python 3.8 for Pulsar client
In Pulsar 2.5.1, we add 3.8 cp38-cp38 to support Python 3.8 for the Pulsar client, so users can install the Pulsar client on Python 3.8.
Provide another libpulsarwithdeps.a in Debian/RPM cpp client library
Pulsar 2.5.1 provides two additional Pulsar C++ client libraries in Debian/RPM:
pulsarSharedNossl (libpulsarnossl.so): similar to pulsarShared (libpulsar.so), but without SSL statically linked.
pulsarStaticWithDeps (libpulsarwithdeps.a): similar to pulsarStatic (libpulsar.a), but with the dependency libraries libboost_regex, libboost_system, libcurl, libprotobuf, libzstd, and libz statically archived into it.
Guangning E is an Apache Pulsar committer and the main contributor to Apache Pulsar IO and Apache Pulsar Manager. He works as a senior software engineer at StreamNative, where he specializes in cloud platform, cloud computing, and big data related fields.
This post was originally published by Guangning E on the Apache Pulsar blog.