January 6, 2026
10 min read

StreamNative’s 2025 Year in Review

Sijie Guo
Co-Founder and CEO, StreamNative

Welcome to the new year! As we kick off 2026, we’re thrilled to take a moment to reflect on 2025—a year of remarkable growth, innovation, and community momentum at StreamNative. From major product milestones like Ursa Engine reaching general availability to breakthroughs in real-time AI integration, 2025 was a pivotal year that solidified StreamNative’s role at the forefront of lakehouse-native data streaming. In this review, we highlight the key product releases, community achievements, business developments, and events that defined our year – and share a glimpse of what’s ahead in 2026.

Ursa Engine Goes GA and Everywhere – Lakehouse-Native Streaming at Scale

2025 was a breakthrough year for Ursa Engine, StreamNative’s next-generation, lakehouse-native streaming engine for Apache Pulsar and Kafka. Ursa Engine reached General Availability (GA) on AWS in Q1, delivering on its promise to slash streaming costs by up to 95% compared to traditional Kafka. Built on a leaderless, stateless architecture that writes data directly to cloud object storage in open table formats, Ursa dramatically reduces infrastructure overhead while remaining fully Kafka-compatible. Its innovative design was validated on the world stage when our Ursa paper won the Best Industry Paper award at VLDB 2025, underscoring Ursa as the first “lakehouse-native” streaming engine for Kafka.

We also put a name to the architectural shift we’ve been building toward: lakehouse-native data streaming. By “lakehouse-native,” we mean a streaming system where open lakehouse tables (Iceberg/Delta) on object storage are the primary storage layer, not an after-the-fact destination fed by connector pipelines. Instead of “stream first, copy later,” Ursa makes it possible to write once into open table formats and make the same data immediately usable for streaming consumers and analytics/AI engines through catalog integrations — reducing duplication, simplifying governance, and collapsing two infrastructures into one.

Ursa expanded to every cloud and cluster. Following AWS GA, we introduced Ursa on Microsoft Azure and Google Cloud in Public Preview. By late 2025, organizations could deploy Ursa in their own accounts on all three major clouds, or consume it as a fully-managed service. Crucially, Ursa’s lakehouse storage tier became available for every StreamNative Cloud cluster (Serverless, Dedicated, BYOC) via a new tiered storage extension. This means even classic Pulsar clusters can now offload data to Iceberg/Delta lakehouse tables, immediately making each topic a live stream and an analytics-ready table. Users get the familiar Pulsar/Kafka experience while data automatically lands in their cloud storage (e.g. S3 or ADLS) as compacted Parquet files. This “Ursa Everywhere” approach allows seamless upgrades to the full Ursa engine in the future, with data already in the right format and place – a pragmatic path to reduce total cost of ownership without disruptive migrations.

Deep integration with data lakehouse catalogs was another highlight. Ursa now natively integrates with popular governance and catalog systems to unify streaming and batch data under consistent governance. For example, Databricks Unity Catalog integration allows streaming topics to register as Unity Catalog–governed Delta or Iceberg tables, so real-time data inherits the same access controls and lineage as the rest of the lakehouse. Amazon S3 Tables integration enables Ursa to write streams directly into Iceberg tables backed by AWS S3, using Iceberg’s REST catalog for centralized metadata. And Snowflake Open Catalog integration makes Ursa’s Iceberg tables discoverable and queryable from Snowflake, bridging real-time data into Snowflake’s analytical ecosystem. Together, these “streaming augmented lakehouse” capabilities brought truly unified governance: streaming topics and batch tables can be one and the same, controlled by the same catalog policies.
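
To make the idea of streams doubling as governed tables concrete, here is a minimal sketch of querying a topic’s Iceberg table through an Iceberg REST catalog using PyIceberg. The catalog URI, token, and the streaming.orders table name are illustrative placeholders, not StreamNative-specific values:

```python
# A minimal sketch, assuming a topic has already landed in an Iceberg table
# exposed through an Iceberg REST catalog. All endpoints and names below are
# hypothetical placeholders.
from pyiceberg.catalog import load_catalog

catalog = load_catalog(
    "lakehouse",
    **{
        "type": "rest",
        "uri": "https://catalog.example.com/iceberg",  # hypothetical endpoint
        "token": "<access-token>",
    },
)

# The same data the streaming consumers see, queryable as a table.
table = catalog.load_table("streaming.orders")
df = table.scan(row_filter="amount > 100").to_pandas()
print(df.head())
```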

Finally, StreamNative’s Serverless offering reached General Availability on AWS, Google Cloud, and Azure in 2025. This Serverless mode delivers instant, elastic streams without cluster management, enabling teams to spin up Pulsar/Ursa clusters on-demand across all major clouds. With seamless auto-scaling and multi-tenancy, the GA release of StreamNative Serverless opened real-time streaming to a wider audience by removing operational overhead. Developers can now build real-time applications faster with instant start, automatic scaling, and support for both Pulsar and Kafka APIs on a unified serverless platform.

Adaptive Universal Linking – Seamless Kafka Migrations

To ease the journey to Ursa and modern streaming, we introduced Universal Linking (“UniLink”) – a powerful tool for seamless cross-cluster data migration. In March, UniLink entered Public Preview as a “full-fidelity Kafka-to-Ursa replication tool”. This allowed organizations to live-migrate from legacy Kafka (or classic Pulsar clusters) to Ursa Engine with zero downtime. UniLink continuously replicates topics, schemas, and consumer state from the source to Ursa, so teams can cut over applications at their own pace without data loss or dual-writes. By leveraging smart, zone-aware reads and writing directly to Ursa’s object storage, UniLink avoids broker bottlenecks and costly cross-AZ traffic during migration. This made migrating to Ursa’s leaderless architecture faster and cheaper, “replicating more while spending less, without compromise.”

Mid-year, UniLink evolved with Adaptive Linking to support more flexible migration strategies. Two linking modes – stateful vs. stateless – were introduced to let teams choose how to handle consumer offsets during migration. In stateful mode, UniLink preserves the exact offsets and ordering between source and destination clusters, so consumers see a continuous stream as if nothing changed. This allows a clean cutover (with full auditability) but requires coordinating a final producer switch in a maintenance window. In stateless mode, UniLink does not preserve offsets on the target, which greatly relaxes rollout: consumers can start reading from the new cluster independently of when producers move. This mode shines for migrations that may stretch over weeks or involve many independent teams, as it tolerates offset discontinuities that downstream systems can handle. Together, these modes turn “all-or-nothing” migrations into an engineering choice – tightly coordinated when needed, or gradual and decoupled when possible.
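
To illustrate the practical difference on the consumer side, here is a minimal sketch of a stateless-mode cutover using the open-source Pulsar Python client. The service URL, topic, and subscription names are placeholders, and UniLink’s own configuration surface is not shown; the point is that each team picks its own starting position on the new cluster:

```python
# A conceptual sketch of a stateless-mode cutover from the consumer's side,
# using the open-source Pulsar Python client. Names are illustrative.
import pulsar

client = pulsar.Client("pulsar://new-cluster.example.com:6650")

# Because stateless linking does not carry source offsets over, each team
# decides where its consumers begin on the new cluster, e.g. from the
# earliest retained message or only from newly arriving ones.
consumer = client.subscribe(
    "persistent://finance/orders/payments",
    subscription_name="billing-service",
    initial_position=pulsar.InitialPosition.Earliest,  # or .Latest
)

while True:
    msg = consumer.receive()
    # ... process idempotently, since replays are possible after cutover ...
    consumer.acknowledge(msg)
```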

UniLink also added support for topic rename mapping, making re-platforming even smoother. This lets users migrate a topic from one name/namespace to a different name on the new cluster – for example, mirror payments.orders into finance_orders – without breaking schema compatibility or consumer group behavior. Organizations used this to reorganize and clean up topic taxonomy during migration (e.g. consolidating topics or aligning naming conventions) while UniLink kept data and schema continuity intact. By the end of 2025, Adaptive Universal Linking enabled truly seamless cross-cluster migrations, whether upgrading from open-source Kafka, moving from self-managed Kafka to StreamNative Cloud, or consolidating multiple clusters. Companies could “link” their data streams over with confidence, knowing they can preserve critical ordering when required or opt for flexibility when speed is paramount.

Expanding Connectivity: Snowflake Snowpipe and Google Spanner Integration

We also significantly expanded our integrations and connectors in 2025, making it easier to connect diverse systems into the streaming platform. One major enhancement was Snowflake Snowpipe Streaming support in our Snowflake Sink Connector. The Snowflake Streaming Sink (introduced in private preview in late 2024) was upgraded with Snowpipe Streaming, enabling near-real-time loading of data into Snowflake tables. Instead of staging files on cloud storage and waiting for batch loads, the connector now uses Snowflake’s Snowpipe Streaming API to push messages directly into Snowflake as soon as they arrive. This delivers lower latency – data is queryable in Snowflake within seconds, not minutes. It also reduces cost and complexity by eliminating intermediate storage and batch jobs. In short, streaming pipelines from Pulsar/Ursa into Snowflake became faster, cheaper, and simpler, unlocking use cases like real-time analytics dashboards on Snowflake and up-to-date ML feature tables without complex ETL.

On the source side, StreamNative onboarded a suite of Debezium-powered CDC connectors in 2025, bringing a rich array of enterprise database integrations into the fold. We added fully-managed source connectors (built on Debezium Kafka Connect) for popular databases: MySQL, PostgreSQL, Microsoft SQL Server, MongoDB, and a universal JDBC connector for other relational DBs. These connectors capture change data capture (CDC) events from databases and stream them into Pulsar topics in real time – all as a native part of StreamNative Cloud (no self-managed Connect cluster needed). For example, the Debezium MySQL Source connector is available built-in on StreamNative Cloud; with a few clicks or CLI commands, users can start streaming MySQL binlog events into Pulsar. Similar connectors for Postgres, SQL Server, and MongoDB allow streaming inserts/updates/deletes with low latency. This year’s additions meant customers could use StreamNative Cloud as a universal data pipeline, seamlessly integrating operational databases into their event streams. With these new CDC sources, microservices can react to DB changes (e.g. an order status update) in real time, and data lakes can ingest fresh transactional data continuously rather than via nightly dumps.
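
As a sketch of what reacting to these CDC streams can look like, the following assumes Debezium change events arrive as JSON on a Pulsar topic (the topic name here is hypothetical). Debezium’s standard envelope carries before/after row images and an op code (c = create, u = update, d = delete):

```python
# A minimal sketch of consuming Debezium-style CDC events from a Pulsar topic.
# Topic and field names are illustrative placeholders.
import json
import pulsar

client = pulsar.Client("pulsar://localhost:6650")
consumer = client.subscribe(
    "persistent://public/default/mysql.inventory.orders",  # hypothetical topic
    subscription_name="order-status-watcher",
)

while True:
    msg = consumer.receive()
    event = json.loads(msg.data())
    payload = event.get("payload", event)  # unwrap if schemas are embedded
    if payload.get("op") == "u":  # an order row was updated
        before, after = payload.get("before"), payload.get("after")
        if before and after and before.get("status") != after.get("status"):
            print(f"order {after['id']} moved to {after['status']}")
    consumer.acknowledge(msg)
```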

Another noteworthy integration was the Debezium Cloud Spanner Source connector introduced in Q4. Google Cloud Spanner – a globally-distributed SQL database – can emit change streams, and StreamNative’s managed connector now taps into those to produce Pulsar events. This connector listens to Spanner’s change streams and publishes every row-level insert/update/delete event into a Pulsar topic in near real-time. It is fully managed and handles all the heavy lifting (scaling, partitioning, offset management), so users simply provide their Spanner instance details and let the platform stream the changes. Google Spanner integration unlocks powerful patterns: for example, applications can subscribe to Spanner change topics to trigger downstream processes the moment critical data changes (fraud detection, cache updates), and analytics pipelines can keep BigQuery or lakehouse tables in sync without batch jobs. All Debezium-based connectors include rich observability (throughput, lag, error rates in our console) and are designed for reliability at scale. With Snowpipe Streaming + a growing connector roster, 2025 solidified StreamNative’s vision of Universal Connectivity: whatever data source or sink you use – cloud data warehouse, relational database, NoSQL store – we likely have a native integration to plug it into your streaming pipeline.

Orca: Event‑Driven AI Agents Come to Life

Perhaps the most futuristic development of 2025 was the advent of Orca, our new Event-Driven Agent Engine for AI. Unveiled at the Data Streaming Summit in San Francisco, Orca entered Private Preview as the industry’s first event-driven runtime for production AI agents. The idea behind Orca is simple but powerful: if your enterprise data already streams through Pulsar, why not host your AI “agents” directly in the stream? Traditional LLM-powered agents often run as stateless APIs or notebook experiments, but Orca transforms AI agents from passive, request/response bots into persistent, real-time actors. An Orca agent can subscribe to one or more topics, maintain state (memory) between events, take actions (call APIs or trigger workflows), and emit new events – all with the resilience and scalability of Pulsar behind it.

In practice, Orca provides a production-grade sandbox for autonomous AI. Agents run inside a durable event loop: they consume messages from streams (e.g. a customer event topic), use an LLM or other AI logic to decide on an output, and produce results or commands to other topics. Unlike ephemeral Lambda functions, Orca agents can keep long-lived state (via in-memory or streaming storage), allowing them to “remember” past interactions or maintain a chain of thought over time. The Orca engine handles concurrency, fault tolerance, and observability – multiple agents can coordinate, no single agent stalls the system, and every decision or action is logged and traceable. In essence, Orca enables an “agent mesh” architecture where multiple AI agents collaborate via the Pulsar event bus, sharing context and tasks in real time. Notably, Orca is polyglot: it leverages Pulsar’s multi-protocol support, meaning it can work with OpenAI functions/agents, Google’s Agent Framework (ADK), LangChain/LangGraph, or custom Python agents without heavy rewrites.
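
Orca’s actual API is in Private Preview and isn’t shown here, but the event-driven agent pattern it productizes can be sketched with the open-source Pulsar Python client. Everything below is illustrative: decide() stands in for any LLM or agent-framework call, and the topics are placeholders:

```python
# A conceptual sketch of the event-driven agent pattern described above,
# NOT Orca's actual API. Uses the open-source Pulsar Python client;
# decide() is a hypothetical stand-in for LLM/agent reasoning.
import json
import pulsar

def decide(state: dict, event: dict) -> dict:
    """Hypothetical placeholder for LLM or agent-framework reasoning."""
    return {"action": "notify", "customer": event.get("customer_id")}

client = pulsar.Client("pulsar://localhost:6650")
consumer = client.subscribe("persistent://support/app/user-activity", "agent-1")
producer = client.create_producer("persistent://support/app/agent-actions")

state: dict = {}  # long-lived memory that persists across events

while True:
    msg = consumer.receive()
    event = json.loads(msg.data())
    result = decide(state, event)                # perceive + reason
    state[event.get("customer_id")] = result     # remember
    producer.send(json.dumps(result).encode())   # act by emitting a new event
    consumer.acknowledge(msg)
```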

The use cases for Orca are ground-breaking. Imagine a cybersecurity agent that subscribes to network intrusion events and autonomously orchestrates containment actions, or a customer support AI that listens to user activity streams and proactively engages with personalized responses. With Orca, such agents run natively in the streaming platform, eliminating latency and integration barriers. They don’t poll for data – they react the instant events occur. StreamNative built Orca with enterprise needs in mind: integration with corporate single sign-on and secrets management, role-based controls on what tools an agent can use, and full audit logs of agent decisions. By year’s end, Orca remained in Private Preview (initially available for BYOC deployments), but it had already sparked imagination among early users. Orca’s debut signals that autonomous, event-driven AI is no longer science fiction; it’s the next chapter of streaming, where data streams feed AI agents that continuously perceive and act.

Security and Governance: RBAC GA and Schema Governance Previews

StreamNative Cloud matured its enterprise security and governance features in 2025, making it easier for organizations to confidently run multi-tenant, production workloads. A major milestone was Role-Based Access Control (RBAC) reaching General Availability in Q3. RBAC in StreamNative Cloud is now GA, bringing a consistent, fine-grained security model across all Pulsar and Kafka interfaces. This means platform admins can centrally define who is allowed to do what – e.g. who can create or delete topics, publish or subscribe on a given namespace, or evolve a schema – all through a unified roles and permissions system. Roles can mirror real-world teams and least-privilege principles (for example, a Data Producer role that grants publish rights on specific topics but no consume rights). These permissions apply uniformly whether clients connect via Pulsar protocols or the Kafka API, ensuring no backdoor by using a different interface. With RBAC GA, enterprises no longer need ad-hoc ACL scripts or manual enforcement – they get a single source of truth for access control, manageable in the Cloud Console or via API/Terraform for automation. As noted in the announcement, “consolidating onto one platform doesn’t mean compromising on governance” – RBAC provides the guardrails to confidently host many applications and teams on the same streaming cluster.
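
As a rough illustration of the kind of least-privilege grant RBAC centralizes, here is how an equivalent namespace-level permission looks against open-source Pulsar’s admin REST API. StreamNative Cloud’s RBAC is managed through its Console, API, or Terraform, so treat the endpoint, tenant, and role names below as placeholders:

```python
# An illustrative sketch of a least-privilege grant, expressed against
# open-source Pulsar's admin REST API. All names/URLs are placeholders;
# StreamNative Cloud's RBAC has its own management surface.
import requests

ADMIN = "http://broker.example.com:8080/admin/v2"
TOKEN = "<admin-token>"

# Grant a "data-producer" role produce-only rights on one namespace.
resp = requests.post(
    f"{ADMIN}/namespaces/payments/orders/permissions/data-producer",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=["produce"],  # no "consume": least privilege
)
resp.raise_for_status()
```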

StreamNative also introduced new schema governance capabilities. Since Pulsar’s schema registry is built-in, RBAC now covers who can register or update schemas for each topic, adding protection against unauthorized or incompatible schema changes. Moreover, in January we launched Kafka Schema Registry RBAC in Private Preview. This feature extends fine-grained access control to the Kafka-compatibility Schema Registry API, allowing enterprises to enforce who can read or write schema definitions on a per-subject basis. By locking down schema evolution, companies can ensure only approved data models make it to production – a big win for compliance and data quality. These schema governance tools, combined with RBAC, move StreamNative Cloud toward a “secure by default” posture: no more open access by default; everything is governed by roles that map to business needs. It shifts access management from scattered configs to a single auditable model. And because RBAC applies to Pulsar and Kafka endpoints, security teams have one framework to understand, rather than separate ACL systems.
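
For context on what the schema registry enforces, here is a minimal sketch using Pulsar’s built-in schema support in the Python client: the broker records the schema on first use and rejects later clients whose schemas are incompatible. Topic and field names are illustrative:

```python
# A minimal sketch of Pulsar's built-in schema registry at work.
# Topic and field names are illustrative placeholders.
import pulsar
from pulsar.schema import JsonSchema, Record, String, Float

class Order(Record):
    order_id = String()
    status = String()
    amount = Float()

client = pulsar.Client("pulsar://localhost:6650")
producer = client.create_producer(
    "persistent://payments/orders/events",
    schema=JsonSchema(Order),  # schema is registered/validated by the broker
)
producer.send(Order(order_id="o-123", status="paid", amount=42.5))
```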

Other enhancements focused on administrative ease and platform hardening. We rolled out a new Organization Profile page in the Console for centralized org management. Administrators can now easily update key info like billing contacts and technical contacts, ensuring they don’t miss critical notifications. The profile page also provides a clear overview of the organization’s clusters and resources in one place, simplifying management for large teams. Under the hood, we delivered a “slim” StreamNative Cloud container image that uses a Bill of Materials for dependency management. This trimmed the core image size to ~1 GB, improving startup times and reducing the attack surface for security. A smaller image means faster autoscaling and easier upgrades, as well as fewer components to monitor for vulnerabilities. This change, though not visible to end users, exemplifies our commitment to enterprise-grade reliability and security. In sum, by the end of 2025 StreamNative Cloud offered a much tighter security and governance story: GA-grade RBAC for all resources, schema controls to prevent data chaos, and polished admin experiences – all contributing to a trustworthy, governable streaming platform for the enterprise.

Business Growth and Global Expansion

StreamNative’s business saw robust growth in 2025, underpinned by new customer wins, cloud footprint expansion, and industry recognition. Over the year, we saw more “AI-native” products depend on continuous, high-volume event streams, because when your product reacts in real time, your data pipeline can’t be batch.

Unify, an AI-native go-to-market platform, built a real-time backbone on StreamNative Cloud + Apache Pulsar that ingests tens of millions of events per day, replacing batch jobs and legacy queuing so their platform can react to buyer signals in seconds and trigger downstream workflows immediately. Safari AI scaled real-time computer vision analytics on top of customers’ existing camera infrastructure — tracking operational metrics like occupancy and queue wait times — and as they grew to 10,000+ pipelines and 50,000+ cameras, StreamNative helped them achieve a 50% infrastructure cost reduction while maintaining sub‑10‑second end-to-end delivery for real-time metrics. And in security and fraud prevention, Q6 Cyber replaced Google Cloud Pub/Sub with StreamNative’s Pulsar platform to process 85B+ cyberthreat records, using StreamNative as the transport layer at the center of their architecture while retaining the control they needed via BYOC. 

These fast-growing organizations chose StreamNative for its unique ability to handle both high-throughput streaming and mission-critical messaging on one platform – a perfect fit for AI use cases that ingest massive data streams and respond in milliseconds. We also continued to serve large enterprises modernizing their infrastructures: more Fortune 500 companies moved from self-managed Kafka or legacy messaging systems to StreamNative Cloud to cut costs and accelerate development. This broad adoption across startups and enterprises drove our cloud usage to new heights: in 2025, StreamNative’s Cloud business nearly tripled in revenue, while enterprise cloud revenue grew over 200% year over year, outpacing overall growth as large customers scaled mission-critical workloads.

On the global front, we made StreamNative Cloud more accessible than ever. In August, we launched StreamNative Cloud on Alibaba Cloud Marketplace, entering the Chinese and Asia-Pacific cloud ecosystem. Now Alibaba Cloud users can subscribe to StreamNative’s fully-managed Pulsar/Ursa service directly through their local cloud account. This public preview on Alibaba Cloud opened the door to organizations in regulated or region-specific markets who prefer Alibaba’s infrastructure. The offering brought StreamNative’s Data Streaming Platform (messaging + lakehouse streaming) to Alibaba’s customer base, with seamless integration to Alibaba services like OSS (object storage) for lakehouse tiered storage. In addition, we extended our marketplace availability – by end of year, StreamNative Cloud listings existed on all three major cloud marketplaces, simplifying procurement for cloud-first enterprises.

Industry analysts took note of StreamNative’s rise. Forrester Research included StreamNative in The Forrester Wave™: Streaming Data Platforms, Q4 2025, marking our first appearance in this influential evaluation of streaming vendors. We were recognized as a “Contender” in the Wave – an impressive showing for our debut year – with Forrester highlighting that “StreamNative excels at messaging and resource optimization” and supports real-time analytics and event-driven use cases with strong scalability. The report noted our cost-efficient, Kafka-compatible architecture as a key strength appreciated by customers. This independent validation echoed an earlier recognition from GigaOm, which named StreamNative a Leader in its 2024 Radar for Streaming Data Platforms. Such accolades boosted our credibility in the market and have driven an uptick in inbound interest from enterprises looking to modernize their data infrastructure.

Community Events and Thought Leadership

Throughout 2025, StreamNative invested heavily in community education and thought leadership, convening the Data Streaming Summit series as a forum for practitioners. In the spring, we hosted Data Streaming Summit Virtual 2025 (May 29), a free two-day online conference that attracted thousands of attendees from around the globe. The virtual summit featured 36+ sessions over multiple tracks, showcasing the latest trends and best practices in real-time data. A central theme was the emergence of “Agentic AI” – the idea of AI agents driven by streaming data – which was fitting given our Orca announcement. Talks from industry leaders explored how real-time streaming, unified lakehouse architectures, and open source technologies are converging to enable this next wave of intelligent systems. Other sessions dove into Pulsar 4.1’s improvements, user case studies of Pulsar replacing Kafka, and deep-dives into Ursa’s design. By removing geographical barriers, the virtual summit democratized knowledge, allowing anyone to learn from streaming experts. The engagement was tremendous – live Q&As, community Slack discussions, and thousands of views on session recordings.

Building on that momentum, Data Streaming Summit San Francisco 2025 took place in-person on September 29–30 at the Grand Hyatt SFO. This marked the return of an in-person community conference (after prior Pulsar Summits), and it did not disappoint. Over 300 practitioners gathered to network and learn. The summit offered 30+ sessions across four dedicated tracks: Deep Dive (covering architecture and internals), Use Cases (real-world deployments), AI + Stream Processing, and Streaming Lakehouse. The agenda was packed with exciting content – from how Netflix runs Kafka at massive scale, to insider talks from LinkedIn, Uber, and OpenAI on their streaming infrastructures. Notably, the event was intentionally vendor-neutral and multi-technology. While StreamNative played host, speakers and sponsors came from across the ecosystem: Amazon Web Services, Redpanda, Confluent, RisingWave, and more. This fostered honest discussions on comparing approaches and the future direction of streaming. A highlight was a keynote panel on real-time AI in production, featuring contributors from both Pulsar and Kafka communities discussing how streaming systems must evolve to support AI workloads. The energy at the summit was electric – it underscored that the real-time data community is vibrant and united by common challenges regardless of the tool. By convening these events (virtual and in-person), we continue to support the broad data streaming community and ecosystem, facilitating knowledge-sharing that benefits the entire industry.

Looking Ahead to 2026

As we celebrate the successes of 2025, we’re already gearing up for what’s next. Data streaming will continue to evolve from a siloed pipeline to an integrated “data backbone” for all enterprise analytics and AI. In 2026, StreamNative will double down on enabling the streaming lakehouse paradigm – expect even tighter integrations with lakehouse ecosystems, more connectors for real-time analytics, and features that make streaming data immediately usable for AI/ML. Our recently announced Agent Engine (Orca) will progress toward general availability, bringing event-driven agents into mainstream use. We plan to expand Orca’s capabilities, adding richer developer tooling, library integrations, and guardrails so that any organization can safely deploy AI agents that live in the stream. On the governance front, 2026 will see us delivering full schema governance and auditing features – from Schema Registry ACLs graduating to GA, to advanced schema validation and lineage tracking for streaming data.

In short, StreamNative’s vision for 2026 is an open platform where data streams, batch data, and AI agents all come together in a governed, seamless fashion. We anticipate more enterprises will converge their messaging queues, streaming logs, and data lakes into one cohesive system – and we aim to be the backbone for that transformation. The team is already hard at work on Pulsar 5.0 features, further performance optimizations, and one-click cloud experiences that push the envelope of simplicity and scale. Thank you to our customers, community, and partners for an incredible 2025 – and get ready for an even more exciting 2026, where real-time data powers intelligence like never before!

Sijie Guo
Sijie’s journey with Apache Pulsar began at Yahoo! where he was part of the team working to develop a global messaging platform for the company. He then went to Twitter, where he led the messaging infrastructure group and co-created DistributedLog and Twitter EventBus. In 2017, he co-founded Streamlio, which was acquired by Splunk, and in 2019 he founded StreamNative. He is one of the original creators of Apache Pulsar and Apache BookKeeper, and remains VP of Apache BookKeeper and PMC Member of Apache Pulsar. Sijie lives in the San Francisco Bay Area of California.
