Since we launched Ursa and Ursa For Kafka, the #1 question from the community has been: "What are you going to open-source, and when?"
We are publishing this article because we want to be transparent about what's coming and in what order.
As for the question, here's our answer, and it starts differently than you might expect.
We're taking an unconventional path to open-sourcing our solution, one we believe is better aligned with the 2026 age of coding agents.
We're open-sourcing the formally verified protocol specification behind Ursa's storage engine first: the Leaderless Log Protocol. At the end of this article we detail the steps that follow, but let us explain our rationale first.
Why Specs First?
Because specs outlive code. This is a core conviction at StreamNative, and it shapes everything we're doing with Ursa's open-source strategy.
We learned this lesson from Kafka itself. Kafka's wire protocol was never formally specified from the start. The protocol's authoritative definition was the Java implementation. When teams outside the Kafka community tried to build compatible clients — in Go, Rust, Python, C++ — they had to reverse-engineer behavior from the Java source code. Every client had to independently discover the same edge cases, the same undocumented invariants, the same subtle ordering requirements. This led to years of interoperability issues, incompatible client behaviors, and bugs that only surfaced when a non-Java client encountered a corner case the spec never documented. Jack Vanlightly's formal verification work on Kafka replication came much later — after the protocol had already been implemented, deployed, and debugged through production incidents (all across different implementations).
We don't want to repeat that mistake. With Ursa, we're doing it in the opposite order: spec first, then code.
And AI makes this approach not just principled but practical. We used to say "talk is cheap, show me the code." In the age of AI, code is cheap. Every engineering team has discovered this — AI coding agents generate implementation code at a pace that would have seemed absurd five years ago.
So what's expensive now? The design. The precise description of what your system should do under every possible interleaving, crash timing, and concurrent access pattern. For distributed systems, the hardest bugs are never coding bugs — they're design bugs. Race conditions, crash-recovery edge cases, subtle ordering violations that pass every test and surface at 3 AM six months into production.
In the age of coding agents, a formally verified spec IS the source code. Any team can take our spec, hand it to a coding agent, and, with enough tokens and testing, produce a correct implementation in their language of choice. The spec will outlast any particular Java, Rust, or Go implementation — and with coding agents, translating between languages becomes trivial. What doesn't become trivial is getting the protocol right. That's what we're open-sourcing.
What is the Leaderless Log Protocol?
In vanilla Kafka, every partition has a leader. The leader serializes writes, manages replication, and tracks which replicas are in sync. When a leader dies, everything stops until a new one is elected.
The Leaderless Log Protocol removes the leader entirely. Multiple brokers write to the same partition concurrently. Offset ordering is delegated to a single atomic operation on an external coordination store. No leader elections. No ISR rebalancing. No partition reassignment cascades. Instant failover when any node dies.
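As a rough sketch of the idea, concurrent writers can safely share a partition as long as offset allocation is a single atomic operation on a coordination store. The `CoordinationStore` and `Writer` names below are illustrative, not from the spec:

```python
import threading

class CoordinationStore:
    """Toy stand-in for the external coordination store (e.g. a KV store
    or object store with conditional writes). Not the real Ursa store."""
    def __init__(self):
        self._lock = threading.Lock()
        self._next_offset = 0

    def claim_offsets(self, count):
        # The single atomic operation: reserve a contiguous offset range.
        # No leader is involved; any writer may call this at any time.
        with self._lock:
            start = self._next_offset
            self._next_offset += count
            return start

class Writer:
    """Any broker can append to the same partition concurrently."""
    def __init__(self, store, log):
        self.store, self.log = store, log

    def append(self, records):
        start = self.store.claim_offsets(len(records))
        for i, rec in enumerate(records):
            self.log[start + i] = rec  # claimed ranges never collide
        return start

store, log = CoordinationStore(), {}
w1, w2 = Writer(store, log), Writer(store, log)
w1.append(["a", "b"])
w2.append(["c"])
assert sorted(log) == [0, 1, 2]  # dense, totally ordered offsets
```

Because ordering lives in the atomic claim rather than in a leader, a writer crashing mid-append blocks nothing: other writers keep claiming and appending.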
This is the core protocol powering Ursa, the storage engine behind Ursa For Kafka. It's what makes Ursa's zone-aware routing, instant failover, and concurrent writes possible. Our Ursa paper describes the detailed design of the system — but today we're open-sourcing the protocol that makes it all work.
How We Got Here: From Years of Implementation to an AI-Native Workflow
We didn't start with this workflow. We started the way every distributed systems team does — design docs, code, tests, production incidents, patches, repeat. The Leaderless Log Protocol was designed and implemented based on years of experience building Apache Pulsar, BookKeeper, and other distributed storage systems. The protocol's architectural decisions — coordination-delegated architecture, write-before-delete compaction ordering, ceiling-based sparse index lookup — came from hard-won lessons about what breaks in production.
But over time, as we watched AI tooling evolve, we realized something: the knowledge we'd accumulated over years of building distributed systems could be captured in a more durable, more verifiable form than code. We could take the protocol we'd already built and battle-tested, formalize it into a specification, and use AI-assisted formal verification to confirm properties that we'd previously only been confident about through operational experience.
That's when we started using Claude Code — not to generate Java code faster, but to build out a harness for developing reliable distributed systems. Here's what that workflow looks like:
Formalizing the spec. We took the protocol that was already running in production and described it precisely in structured Markdown — every state variable, every action, every guard condition, every correctness property. Claude Code helped us refine the spec iteratively, catching ambiguities and edge cases in the natural language description. Years of implicit knowledge became an explicit, reviewable artifact.
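To make the "state variables, actions, guards, correctness properties" structure concrete, here is a hypothetical toy rendering in Python. The real spec is a Markdown document; every name below is invented for illustration:

```python
from dataclasses import dataclass, field

# Illustrative only: a tiny executable rendering of the spec's
# "action = guard + effect" structure, plus one safety invariant.

@dataclass
class LogState:
    next_offset: int = 0
    fenced: bool = False
    records: dict = field(default_factory=dict)

def guard_append(state: LogState) -> bool:
    # Guard: appends are only enabled while the log is not fenced.
    return not state.fenced

def effect_append(state: LogState, record) -> LogState:
    # Effect: claim the next offset and write the record there.
    state.records[state.next_offset] = record
    state.next_offset += 1
    return state

def invariant_dense_offsets(state: LogState) -> bool:
    # Safety property: offsets are dense, with no gaps or overwrites.
    return sorted(state.records) == list(range(state.next_offset))

s = LogState()
if guard_append(s):
    effect_append(s, "hello")
assert invariant_dense_offsets(s)
```

Writing each action this way forces the implicit knowledge ("you can't append to a fenced log") into an explicit, checkable guard.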
Translating to formal languages. We used Claude Code to translate the Markdown spec into both TLA+ (the industry standard for distributed systems verification) and Fizzbee (a newer, more accessible model checker). The two models are fully independent: they share no code and were written separately, so having both catches translation bugs. Previously, this would have required months of specialized expertise. With Claude Code, it took hours.
Exhaustive verification. The model checkers explored ~200,000 states — every possible interleaving of concurrent writers, compaction, fencing, and readers. Every crash timing. Every race condition. Not sampling — exhaustive enumeration. Properties we'd been confident about from operational experience were now machine-checked proofs.
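The principle behind exhaustive enumeration can be sketched with a toy breadth-first checker over a two-writer model. This is an illustration only; the real verification uses TLA+ and Fizzbee over the actual protocol:

```python
from collections import deque

# Toy exhaustive model checker: explore EVERY reachable state and
# check an invariant in each one. No sampling.

def check(initial, actions, invariant):
    seen, frontier = {initial}, deque([initial])
    while frontier:
        state = frontier.popleft()
        assert invariant(state), f"invariant violated in {state}"
        for act in actions:
            nxt = act(state)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return len(seen)

# State: (next_offset, writes_left_for_w1, writes_left_for_w2)
def writer(idx):
    def act(state):
        offset, *left = state
        if left[idx] == 0:
            return None  # action disabled (its guard fails)
        left[idx] -= 1
        return (offset + 1, *left)
    return act

# Conservation invariant: offsets claimed + writes remaining == 4.
explored = check((0, 2, 2), [writer(0), writer(1)],
                 lambda s: s[0] + s[1] + s[2] == 4)
print(explored)  # prints 9: every reachable state, checked
```

The real state space (concurrent writers, compaction, fencing, readers, crashes) is vastly larger, which is why ~200,000 states were needed, but the checking discipline is the same: every reachable state, every invariant.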
Catching a real design bug. Even with years of production experience, verification found a genuine conceptual error that no unit test would have caught. The original model assumed that compacted data could become temporarily unreadable — but in reality, compaction reorganizes data without ever deleting it. This wasn't a coding bug. It was a design bug, a misunderstanding of the system's own invariants. The model checker found it. We fixed the spec. We re-verified. Then we updated the implementation. Years of running in production hadn't surfaced this — exhaustive model checking did.
From spec to implementation in one shot. Once the protocol was verified, we handed the spec to a coding agent with a single prompt: "Implement this." It produced a working reference implementation in Rust from the spec alone, without any back-and-forth.
The lesson: AI didn't replace our distributed systems expertise. It gave us a way to formalize that expertise into a verified, reusable harness — and then use that harness to build reliable systems faster than ever before.
Harness Engineering: Specs as Guardrails for Coding Agents
Here's the problem with using coding agents to build distributed systems: the hardest parts — crash recovery, concurrent interleavings, ordering guarantees — are exactly the parts that AI is worst at getting right. A coding agent can generate a Kafka producer in minutes. But will it correctly handle the three-step compaction ordering that prevents data loss during a crash between step 1 and step 2? Almost certainly not — because that correctness property lives in the protocol design, not in any single function's logic.
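The article doesn't spell out the three steps, so purely as an assumed illustration, suppose compaction (1) writes the new compacted segment, (2) swaps the index pointer, and (3) deletes the old segment. Write-before-delete ordering means a crash between any two steps leaves readers pointing at intact data:

```python
# Hedged illustration of write-before-delete compaction ordering.
# The protocol's actual 3-step index update may differ; the steps
# below are assumptions made for the sake of the example.

class Store:
    def __init__(self):
        self.segments = {}   # segment_id -> list of (key, value)
        self.index = None    # which segment readers should use

    def compact(self, old_id, new_id, crash_after=None):
        # Step 1: write the new compacted segment (old data untouched).
        self.segments[new_id] = dedupe(self.segments[old_id])
        if crash_after == 1:
            return  # simulated crash
        # Step 2: atomically switch the index to the new segment.
        self.index = new_id
        if crash_after == 2:
            return  # simulated crash
        # Step 3: only now is the old segment safe to delete.
        del self.segments[old_id]

def dedupe(records):
    latest = {}
    for key, value in records:
        latest[key] = value  # keep only the latest value per key
    return list(latest.items())

s = Store()
s.segments["seg-1"] = [("k", 1), ("k", 2)]
s.index = "seg-1"
s.compact("seg-1", "seg-2", crash_after=1)  # crash between steps 1 and 2
# The index still points at intact data: no loss, readers unaffected.
assert s.segments[s.index] == [("k", 1), ("k", 2)]
```

Reverse the ordering (delete before the index swap) and the same crash strands readers on a deleted segment. That is exactly the kind of design-level property a coding agent improvising from training data tends to miss.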
This is where the idea of harness engineering comes in. A formally verified protocol spec acts as a harness — a set of constraints that enforce correctness, safety, and liveness for any implementation built within it. The harness defines what the system must do (actions, guards, state transitions) and what must never happen (safety invariants). A coding agent working within this harness can generate code fast — that's what AI is good at — but the correctness comes from the harness, not from the agent's training data.
Think of it this way: you wouldn't let a coding agent design your distributed consensus protocol from scratch. But you would let it implement a protocol that's already been formally verified across 200,000 states. The spec is the harness. The coding agent is the executor. The model checker is the proof.
This changes what it means to open-source a distributed system.
Traditionally, open-sourcing means publishing implementation code — tied to a specific language, dependencies, and trade-offs. But when you open-source a formally verified spec — a harness — you're giving the community something more fundamental:
- Anyone can implement it. Hand the harness to a coding agent, get a correct implementation in Go, Rust, Java, Python — whatever fits your stack.
- Correctness is enforced, not assumed. The model checker has already verified every safety and liveness property across the full state space. The harness carries those guarantees into any implementation.
- The community can extend it. Modify the spec for your domain, re-verify your variant, then implement. The harness is the starting point, not the code.
We're calling this specs-first open source. Open-source the verified harness. Let the community — and their coding agents — build reliable systems on a proven foundation.
Proof by Example: S3-Queue
To demonstrate that this workflow actually works end-to-end, we built S3-Queue — a distributed message queue built entirely on S3-compatible object storage, with zero external dependencies beyond S3.
Here's the pipeline we followed:
Protocol Spec → System Spec → System Implementation. Each step was done with Claude Code.
First, we took the Leaderless Log Protocol spec — the formally verified, abstract protocol — and asked Claude Code to produce a system spec: a concrete design document that adapts the protocol for a queue on S3, using S3's conditional writes as the coordination primitives. One conversation, one self-contained Markdown document.
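The coordination primitive itself is easy to picture. Below is a simulated version of S3's conditional PUT (`If-None-Match: *`), which succeeds only if the key doesn't already exist; how S3-Queue actually composes this primitive may differ from this sketch:

```python
# Simulated S3 conditional write. In real S3 this is a PutObject
# request with the If-None-Match: * precondition; a losing writer
# gets HTTP 412 Precondition Failed instead of silently overwriting.

class FakeS3:
    def __init__(self):
        self.objects = {}

    def put_if_absent(self, key, body):
        # Exactly one concurrent writer can create a given key.
        if key in self.objects:
            return False  # would be HTTP 412 against real S3
        self.objects[key] = body
        return True

def claim_offset(s3, partition, offset, record):
    # Writers race to claim a specific offset by creating its key;
    # zero-padded keys keep offsets in lexicographic (= numeric) order.
    return s3.put_if_absent(f"{partition}/{offset:020d}", record)

s3 = FakeS3()
assert claim_offset(s3, "p0", 0, b"a") is True   # first writer wins
assert claim_offset(s3, "p0", 0, b"b") is False  # loser retries at 1
assert claim_offset(s3, "p0", 1, b"b") is True
```

With this one primitive, S3 itself plays the role of the coordination store, which is why S3-Queue needs no external dependencies beyond S3.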
Then we handed that system spec to Claude Code with a single prompt: "Implement S3-Queue according to this spec." It produced a working Rust implementation — concurrent producers, background compaction with the protocol's 3-step index update, consumer cursors, administrative fencing — all generated from the spec alone, without back-and-forth.
This is harness engineering in action. The protocol spec enforces correctness. The system spec translates that into a concrete design. The coding agent translates the design into working code. At every stage, the agent is working within the harness — not improvising concurrency logic from training data.
You can try it yourself. Point your favorite coding agent at the S3-Queue system spec and ask it to produce an implementation in whatever language you prefer. The spec is self-contained — it has everything a coding agent needs to produce a correct implementation in one shot. That's the whole point: the harness is reusable across any number of implementations.
What We're Open-Sourcing Today
The complete specification suite for the Leaderless Log Protocol:
- Canonical protocol specification — human-readable Markdown defining every state variable, action, guard, effect, and correctness property
- TLA+ formal model — independently verified against all safety and liveness properties
- Fizzbee formal model — second independent verification
- Reference implementation — S3-Queue, a distributed message queue built entirely from the spec on S3-compatible object storage
We've also included the Coordination-Delegated Task Claiming Protocol — a second formally verified protocol in the same family for distributed task execution.
https://github.com/lakestream-io/leaderless-log-protocol
What's Next: The Open-Source Roadmap
We want to be transparent about what's coming and when. The streaming ecosystem doesn't need yet another Kafka fork — it needs a solid foundation that the community can build on and reason about. That's why we're open-sourcing in this order: the core protocol specs (today), then the core Ursa storage library, and Ursa For Kafka (UFK) within the next nine months.
Why this way? Because we believe the industry needs a stream storage standard more than it needs another Kafka fork. If we open-source the code first, people will focus on the code. If we open-source the specs first, people will focus on the design. That's where the real value is, and that's where the real conversation should be. We want the community to engage with the protocol, challenge our design decisions, and potentially adopt the spec for their own systems, before the discussion gets anchored to one particular implementation.
We have two deeper technical blog posts coming — one on the protocol itself and one on the AI-augmented development workflow — for those who want to go deeper into the formal verification and the protocol-first development approach. Subscribe to our blog at streamnative.io/blog so you don't miss them.
We invite the distributed systems community to review, critique, extend, and build on this work. Take the spec. Hand it to your AI coding agent. Build something reliable.