Introducing StreamNative Platform
June 16, 2021
We are excited to announce StreamNative Platform 1.0, a cloud-native, unified messaging and streaming platform powered by Apache Pulsar. StreamNative Platform provides a complete, declarative API-driven experience for deploying and self-managing Apache Pulsar in your private environments. With StreamNative Platform, we’ve packaged our enterprise expertise with StreamNative Cloud to help you build your own private cloud Pulsar service.
Whether you are an agile development team that needs to get up and running with Apache Pulsar quickly, or a central infrastructure team responsible for enabling your engineering teams to build messaging and streaming applications, StreamNative Platform may be the right fit for you. StreamNative Platform enables you to:
- Reduce operational costs by using our API-driven automation to deploy and manage in the private environment of your choice.
- Reduce risk and costly resource investments by leveraging our Pulsar expertise to run a secure, reliable, and production-ready messaging and streaming platform.
- Run messaging and streaming workloads consistently and scale to meet business demands with efficient use of resources by deploying Apache Pulsar to any private cloud.
With StreamNative Platform, you can achieve the simplicity, flexibility, and efficiency of the cloud without the burden of complex infrastructure operations. We provide all the components of a complete platform—ready out of the box with enterprise-grade configurations.
StreamNative Platform features
StreamNative Platform is built on Apache Pulsar so that developers can move from traditional silos and monolithic applications to modern microservices and messaging and streaming applications, increasing agility and accelerating time to market.
The overall architecture of StreamNative Platform is illustrated in the following figure.
In this blog, we highlight the enterprise features for StreamNative Platform, including:
- Transaction (Pulsar 2.8.0)
- Function Mesh
- Enterprise-grade security (Vault & Audit Log)
- Declarative API
- Integration with the cloud-native ecosystem
Below we provide a deep dive on each.
Enable unrestricted developer productivity with Pulsar transactions
Built on Apache Pulsar 2.8, StreamNative Platform brings strong transactional guarantees to Pulsar. Transactional guarantees make it easier than ever to write real-time, mission-critical messaging and streaming applications. From tracking ad views to processing financial transactions, you can do it all reliably and in real time with Pulsar Transaction. You no longer have to design your applications around lost or duplicated data.
Pulsar PMC member and StreamNative Engineering Lead, Penghui Li, reviews this functionality in detail in the recent blog, Exactly-once Semantics with Transactions in Pulsar. Read this blog to learn more about the exactly-once semantics support in Pulsar.
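To make the guarantee concrete, here is a toy in-memory model of transactional produce. It is a sketch only and not the Pulsar client API: the `ToyBroker` and `ToyTransaction` classes, the topic names, and the message strings are all hypothetical. It illustrates the general idea that messages written inside a transaction become visible together on commit and are discarded on abort.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a broker. Hypothetical; NOT the Pulsar client API."""

    def __init__(self):
        # Committed messages per topic, i.e. what a consumer could read.
        self.topics = defaultdict(list)

    def begin(self):
        return ToyTransaction(self)


class ToyTransaction:
    def __init__(self, broker):
        self._broker = broker
        self._pending = []  # (topic, message) pairs buffered until commit
        self._open = True

    def produce(self, topic, message):
        if not self._open:
            raise RuntimeError("transaction already finished")
        self._pending.append((topic, message))

    def commit(self):
        # All buffered writes become visible together.
        if not self._open:
            raise RuntimeError("transaction already finished")
        for topic, message in self._pending:
            self._broker.topics[topic].append(message)
        self._finish()

    def abort(self):
        # None of the buffered writes become visible.
        self._finish()

    def _finish(self):
        self._pending = []
        self._open = False


broker = ToyBroker()

# First attempt fails midway and is aborted: nothing is published.
txn = broker.begin()
txn.produce("payments", "debit:alice:10")
txn.produce("ledger", "credit:bob:10")
txn.abort()

# The retry commits: both messages appear, each exactly once.
txn = broker.begin()
txn.produce("payments", "debit:alice:10")
txn.produce("ledger", "credit:bob:10")
txn.commit()

print(dict(broker.topics))
```

Because the aborted attempt leaves no trace, the retry does not produce duplicates, which is the property the application would otherwise have to enforce itself.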
Empower Kafka-API users to build upon a new streaming platform reimagined for Kubernetes
Developed by OVHCloud and StreamNative, Kafka-on-Pulsar (KoP) has become one of the most popular protocol handlers in the Apache Pulsar community. Companies including Tencent, Bigo, and Dada Nexus have deployed KoP at internet-scale to migrate their existing Kafka applications to Pulsar.
StreamNative Platform includes the GA release of Kafka-on-Pulsar to enable Kafka-API users to build event streaming applications on a streaming platform architected for Kubernetes. The GA release of Kafka-on-Pulsar includes the following features:
- Native support for Kafka protocols from 1.0 to 2.6.
- Native support for Kafka admin API. All existing Kafka tools can be seamlessly used without any code changes.
- Continuous offset to support a broader set of Kafka integrations, like the Kafka Spark connector.
- Enterprise-grade security features such as OAuth2 integration.
- Native Pulsar performance for Kafka clients (you get the same performance from Kafka clients as you get from Pulsar clients).
- Preview feature of Kafka Transaction Support.
Simplify building serverless streaming applications with Function Mesh
Pulsar Functions and Pulsar IO have proven to be two powerful building blocks for developing messaging and event streaming applications. However, running and orchestrating many functions and connectors at scale is not easy, and the complexity grows with the number of functions and connectors.
StreamNative Platform leverages Function Mesh to simplify building serverless event streaming applications. The key benefits include:
- Eases the management of Pulsar Functions and connectors when running multiple instances of Functions and connectors together.
- Utilizes the full power of Kubernetes Scheduler, including deployment, scaling and management, to manage and scale Pulsar Functions and connectors.
- Allows Pulsar Functions and connectors to run natively in the cloud environment, leading to greater possibilities when more resources become available in the cloud.
- Enables Pulsar Functions to work with different messaging systems and to integrate with existing tools in the cloud environment.
Monitor and audit Pulsar clusters with Structured Audit Logs
Once Pulsar is up and running within a large team, it’s critical to keep an eye on who is touching data and what they’re doing with it. Structured Audit Logs, which is GA on StreamNative Platform, provides an easy way to track user/application access so you can identify potential anomalies and bad actors.
Structured Audit Logs enable you to capture audit events in a set of dedicated Pulsar topics, on either a local or a remote cluster. Two categories of events can be captured:
- Low-volume, management-related activities, such as creating or deleting tenants, namespaces, or topics (enabled by default).
- High-volume activities, such as produce, consume, and acknowledge events (can be enabled as needed).
With audit events safely stored in Pulsar topics, you can use Pulsar integrated tools, like Pulsar Functions, Pulsar SQL, and Flink SQL, to process and analyze them. Additionally, you can offload audit events to external data lakes or data warehouses (like Snowflake or Databricks) for analysis using Pulsar IO connectors.
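As a rough illustration of the two capture categories and of a simple analysis over captured events, consider the sketch below. The action names, the event fields (`principal`, `action`), and the `should_capture` helper are assumptions made for this example; they are not the actual audit log schema or configuration of StreamNative Platform.

```python
from collections import Counter

# Hypothetical action names, grouped like the two categories described above.
MANAGEMENT_ACTIONS = {
    "create_tenant", "delete_tenant",
    "create_namespace", "delete_namespace",
    "create_topic", "delete_topic",
}
DATA_ACTIONS = {"produce", "consume", "acknowledge"}


def should_capture(action, capture_management=True, capture_data=False):
    """Management events are on by default; data events are opt-in."""
    if action in MANAGEMENT_ACTIONS:
        return capture_management
    if action in DATA_ACTIONS:
        return capture_data
    return False


# Hypothetical events, shaped as they might look after being read from a topic.
events = [
    {"principal": "alice", "action": "create_topic"},
    {"principal": "bob", "action": "produce"},
    {"principal": "alice", "action": "delete_namespace"},
    {"principal": "mallory", "action": "delete_tenant"},
]

captured = [e for e in events if should_capture(e["action"])]

# Count management operations per principal to spot unusual activity.
by_principal = Counter(e["principal"] for e in captured)
print(by_principal)
```

With the default settings, the high-volume `produce` event is skipped, so only the three management events are counted.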
Self-managing Pulsar with a fully-managed experience using declarative APIs
StreamNative Platform provides high-level declarative APIs by extending the Kubernetes API through Custom Resource Definitions to support the management of Pulsar services. As a user, you can interact with the Custom Resource Definition by defining a Custom Resource that specifies the desired state. Then the StreamNative Platform will take care of the rest.
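The desired-state idea can be sketched in a few lines. This is a simplified illustration of how a controller might compare a declared spec with what is actually running. It is not StreamNative Platform's controller code, and the component names and replica counts are made up for the example.

```python
def reconcile(desired, observed):
    """Return the actions needed to move the observed state toward the desired state."""
    actions = []
    for component, want in desired.items():
        have = observed.get(component, 0)
        if have < want:
            actions.append(("scale_up", component, want - have))
        elif have > want:
            actions.append(("scale_down", component, have - want))
    return actions


# Desired replica counts, as a user might declare them in a custom resource.
desired = {"zookeeper": 3, "bookkeeper": 4, "broker": 3}
# Replica counts currently running in the cluster.
observed = {"zookeeper": 3, "bookkeeper": 3, "broker": 5}

actions = reconcile(desired, observed)
print(actions)
```

The user only edits `desired`. A controller runs a comparison like this repeatedly until there is nothing left to do.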
Manage Pulsar components
StreamNative Platform provides a set of Custom Resource Definitions to deploy and manage Pulsar components: ZooKeeper, BookKeeper, Pulsar Broker, Pulsar Proxy, and StreamNative Console.
The declarative API enables you to leave the infrastructure handling to software automation, freeing you to focus on your core business applications.
- Scale Pulsar with a single change to the declarative spec. StreamNative Platform then spins up the required compute, networking, and storage, and starts the new components (bookies, brokers, or proxies).
- Deploy a fully secure Pulsar cluster with a single declarative spec. StreamNative Platform automates Pulsar configuration for strong authentication, authorization, and network encryption, and creates the set of TLS certificates that Pulsar components need to operate.
- Upgrade to the latest StreamNative Platform release by specifying the new version in the declarative spec. StreamNative Platform then orchestrates a rolling upgrade, deploying the new version without disruption to ongoing workloads.
- Deploy highly available infrastructure in any environment. StreamNative Platform understands the infrastructure topology of nodes, racks, and zones. It detects that topology automatically and configures the Pulsar service to be resilient to infrastructure failures.
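As one example of what this automation involves, a rolling upgrade can be sketched as replacing one replica at a time and stopping as soon as a replica fails its health check. This is a minimal sketch of the general technique, not the platform's implementation. The replica records, the version strings, and the `is_healthy` callback are hypothetical.

```python
def rolling_upgrade(replicas, new_version, is_healthy):
    """Upgrade replicas one at a time, stopping at the first unhealthy one."""
    for replica in replicas:
        previous = replica["version"]
        replica["version"] = new_version
        if not is_healthy(replica):
            # Roll this replica back and leave the remaining ones untouched.
            replica["version"] = previous
            return False
    return True


# Hypothetical broker replicas and version numbers.
brokers = [{"name": f"broker-{i}", "version": "1.0.0"} for i in range(3)]

ok = rolling_upgrade(brokers, "1.0.1", is_healthy=lambda replica: True)
print(ok, [b["version"] for b in brokers])
```

Because only one replica changes at a time, the remaining replicas keep serving traffic during the upgrade.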
Operating Apache Pulsar with a cloud-native ecosystem
With StreamNative Platform, you can use Kubernetes-native interfaces, integrations, and scheduling controls to operate consistently and cost-effectively alongside other applications and data systems.
Initially, we used Helm to provide a simple configuration abstraction on top of Kubernetes, letting you define a declarative spec as a Helm values YAML file. (Around 60% to 70% of Pulsar users deploy Pulsar on Kubernetes with the Pulsar or StreamNative Helm charts.)
However, after working to provide a fully managed Pulsar service (StreamNative Cloud) and thinking about how to provide automation and packaged best practices, we found that Helm templates were not the right architectural choice. Helm did not provide features that matter when running a stateful storage service, such as the ability to control deployment sequences and to add operations between deployment steps.
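To illustrate what controlling the deployment sequence means, the sketch below walks the components in dependency order and runs an optional extra operation after a step completes. The ordering and the "initialize cluster metadata" step reflect how a Pulsar cluster is commonly brought up, but the function, the step list, and the log messages are illustrative assumptions rather than the platform's actual controller logic.

```python
def deploy_in_order(steps):
    """Deploy each component in sequence, running an optional hook after each one."""
    log = []
    for name, post_hook in steps:
        log.append(f"deploy {name}")
        log.append(f"wait until {name} is ready")
        if post_hook:
            # An extra operation between deployment steps.
            log.append(post_hook)
    return log


# Hypothetical sequence: each component starts only after the previous one is ready.
steps = [
    ("zookeeper", "initialize cluster metadata"),
    ("bookkeeper", None),
    ("broker", None),
    ("proxy", None),
]

log = deploy_in_order(steps)
for line in log:
    print(line)
```

A controller can express this kind of ordering and in-between work directly in code, which is the capability we found missing in plain Helm templates.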
As a result, we moved to an industry standard and aligned on providing a Kubernetes-native interface with Custom Resource Definitions and Controllers. A Kubernetes-native experience provides a reliable API-driven approach with custom resources and leverages the ecosystem tooling and features inherent to Kubernetes. With this approach, you do not need specialized knowledge of how the applications are deployed, such as how to configure storage and network for stateful services.
Each Pulsar resource configuration spec is defined as a Kubernetes-native Custom Resource Definition, and each Pulsar resource provides an extensible configuration interface:
- Configure the service configuration, JVM configuration, and Log4j2 configuration for each Pulsar component.
- Manage the lifecycle of sensitive credentials and configurations separately, and only reference them in the Pulsar resource configuration spec.
- Leverage industry standards such as Kubernetes Secrets to manage the lifecycle of credentials.
- Specify workload scheduling rules through Kubernetes Node and Pod affinity.
- Monitor through full integration with Prometheus and Grafana. Prometheus on Kubernetes automatically discovers and scrapes metrics from Pulsar components.
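Two of these points, referencing credentials instead of embedding them and expressing scheduling rules through affinity, can be shown with a generic Kubernetes pod spec fragment. The fragment below uses standard Kubernetes fields (`podAntiAffinity`, `secretKeyRef`) and is built as a Python dictionary for illustration. It is not the schema of StreamNative Platform's custom resources, and the label, container, image, and Secret names are hypothetical.

```python
import json

pod_spec_fragment = {
    "affinity": {
        # Do not schedule two broker pods into the same zone.
        "podAntiAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [
                {
                    "labelSelector": {"matchLabels": {"component": "broker"}},
                    "topologyKey": "topology.kubernetes.io/zone",
                }
            ]
        }
    },
    "containers": [
        {
            "name": "broker",
            "image": "example.invalid/pulsar-broker:tag",  # placeholder image
            "env": [
                {
                    "name": "TOKEN_SECRET_KEY",
                    # Only a reference appears here; the value lives in a Secret
                    # whose lifecycle is managed separately.
                    "valueFrom": {
                        "secretKeyRef": {
                            "name": "broker-credentials",
                            "key": "token-secret-key",
                        }
                    },
                }
            ],
        }
    ],
}

rendered = json.dumps(pod_spec_fragment, indent=2)
print(rendered)
```

The rendered JSON contains the name of the Secret but never its contents, so the spec can be stored and reviewed without exposing credentials.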
The StreamNative team has experience running some of the largest Pulsar deployments in the world and operating StreamNative Cloud. StreamNative Platform brings a cloud-native experience for running Apache Pulsar workloads in on-premises environments. It provides an enterprise-ready deployment of Apache Pulsar that enhances Pulsar’s elasticity, ease of operations, and resiliency.
StreamNative Platform is a strong fit for the following use cases:
- If you have data on-premises that needs to be streamed.
- If you have regulatory requirements that mandate controls of data, systems, and applications to stay within your own isolated environments.
- If you want to provide the same StreamNative Cloud experience across all of your use cases.