April 30, 2025
3 mins

Diskless, Stateless, Leaderless – A Comic Guide to Modern Data Streaming

Sijie Guo
Co-Founder and CEO, StreamNative

Goal of this comic‑blog: Explain three buzz‑worthy architectures in plain English so any dev, PM, or VP can walk away nodding, “Got it!”

Meet the Three Amigos 🧑‍🚀🧑‍🔧🧑‍🎨

Imagine a squad of message brokers that keep your data flowing 24/7. Each broker can adopt one (or more) of these personality quirks:

  1. Diskless – “No backpack full of disks for me. I stash my stuff in the cloud!”
  2. Stateless – “Goldfish memory. I deliver messages then forget them.”
  3. Leaderless – “Nobody here is the boss. We pass the ball around like pickup basketball.”

Why should you care? Because the mix you choose decides how cheap, fast, and fault‑tolerant your pipeline can be.

1. Diskless – No Hard Drives, No Heavy Lifting ☁️

What it means

Brokers write straight to cloud/object storage. Their own disks disappear completely.

Why it’s cool

  • Elastic scale – Spin up new brokers in seconds; no terabytes to copy.
  • Cloud pricing – Pay pennies per GB instead of gold‑plated SSDs.
  • Throughput that scales – Object storage spreads reads and writes across many servers, so aggregate throughput grows with your needs (within your provider's request-rate limits).

Gotchas

  • Adds a few hundred extra milliseconds – Each write waits for the cloud to say “stored!”.
  • Backpressure on the producers – Messages must be kept on the client until they are persisted in the cloud.
  • Your cloud bucket is now the single source of truth – keep it durable and monitored.
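
To make the write path and the backpressure gotcha concrete, here is a minimal sketch in Python. Everything in it is hypothetical: `ObjectStore` is an in-memory stand-in for a cloud bucket, and `DisklessBroker` and `Producer` are illustrative names, not the API of Kafka, Pulsar, or Ursa. It only shows the shape of the idea: the broker acknowledges after the bucket confirms, and the producer holds messages until then.

```python
from collections import deque


class ObjectStore:
    """In-memory stand-in for a cloud bucket (assumption: a real one is S3, GCS, etc.)."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        # A real put would block on a network round trip before confirming.
        self.objects[key] = data
        return True


class DisklessBroker:
    """Keeps nothing on local disk; acknowledges only after the bucket confirms."""

    def __init__(self, store):
        self.store = store
        self.next_segment = 0

    def append(self, batch):
        key = f"segment-{self.next_segment:08d}"
        ok = self.store.put(key, list(batch))
        if ok:
            self.next_segment += 1
        return ok


class Producer:
    """Holds messages client-side until the broker says they are persisted."""

    def __init__(self, broker, max_pending=4):
        self.broker = broker
        self.pending = deque()
        self.max_pending = max_pending

    def send(self, message):
        if len(self.pending) >= self.max_pending:
            self.flush()  # backpressure: drain the buffer before accepting more
        self.pending.append(message)

    def flush(self):
        if self.pending and self.broker.append(self.pending):
            self.pending.clear()  # safe to forget only after the ack


store = ObjectStore()
producer = Producer(DisklessBroker(store), max_pending=4)
for i in range(10):
    producer.send(f"msg-{i}")
producer.flush()
```

Batching several messages into one object is the usual way to soften the latency hit: one round trip to the bucket covers the whole batch.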

Who's doing it?

  • Kafka – Mostly disk‑full today; Diskless Topics (KIP‑1150) are still experimental.
  • Redpanda – Mostly disk‑full; shadow‑indexing can tier cold data to S3, but brokers still rely on local disks.
  • Pulsar – Brokers are diskless, but BookKeeper storage nodes keep spinning disks.
  • Ursa – 100% diskless; everything lands straight in S3 + Iceberg.

👉 Use diskless if you love cloud economics or need limitless retention. Stick with disks if every millisecond counts or you run on‑prem without object storage.

2. Stateless – Memory of a Goldfish 🐠

What it means

Brokers keep no durable state. They push every message to an external store (and maybe cache a few in RAM). If they crash, a twin broker just resumes the job.
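
A toy sketch of that "twin broker resumes the job" behaviour, in Python. The class names and methods are made up for illustration and do not mirror any real broker's API; `ExternalStore` is an in-memory stand-in for the durable layer. The point is that both the messages and the consumer's position live outside the broker, so a replacement can pick up where the old one stopped.

```python
class ExternalStore:
    """Stand-in for the durable layer (assumption: BookKeeper, S3, or similar)."""

    def __init__(self):
        self.log = []      # persisted messages, in order
        self.cursors = {}  # consumer name -> next offset to deliver

    def append(self, message):
        self.log.append(message)
        return len(self.log) - 1

    def read(self, offset):
        return self.log[offset] if offset < len(self.log) else None


class StatelessBroker:
    def __init__(self, name, store):
        self.name = name
        self.store = store
        self.cache = {}  # RAM only; losing it costs speed, not data

    def publish(self, message):
        offset = self.store.append(message)
        self.cache[offset] = message
        return offset

    def deliver(self, consumer):
        offset = self.store.cursors.get(consumer, 0)
        message = self.cache.get(offset)
        if message is None:
            message = self.store.read(offset)  # cache miss: extra hop to storage
        if message is not None:
            self.store.cursors[consumer] = offset + 1
        return message


store = ExternalStore()
broker_a = StatelessBroker("a", store)
for word in ["one", "two", "three"]:
    broker_a.publish(word)
first = broker_a.deliver("app")

del broker_a  # simulate a crash: the broker and its cache are gone

broker_b = StatelessBroker("b", store)  # the twin starts with an empty cache
rest = [broker_b.deliver("app"), broker_b.deliver("app")]
```

The replacement broker serves "two" and "three" without any data being copied to it first, which is what makes kill-and-redeploy upgrades cheap.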

Why it’s cool

  • Replaceable pods – Perfect fit for Kubernetes auto‑scaling.
  • Ops bliss – Rolling upgrades? Kill and redeploy without data shuffles.
  • Elastic scalability – Add or remove brokers instantly, with no data to copy.

Gotchas

  • More moving parts – You must run external storage (BookKeeper, S3, etc.) and a metadata service.
  • Extra hop – Reads may fetch from storage instead of a local disk, adding a tiny overhead.

Who's doing it?

  • Kafka – Classic mode is stateful (data lives on broker disks).
  • Redpanda – Also stateful; each node stores logs locally just like Kafka.
  • Pulsar – Brokers are proudly stateless; BookKeeper holds the data.
  • Ursa – Stateless ++; any broker can serve any partition (thanks to Leaderless, next section).

👉 Use stateless when you crave effortless scaling or multi‑tenant isolation. Choose stateful if you prefer one simple box that “just works.”

3. Leaderless – No Single Boss Here 🤝

What it means

There’s no “captain” broker for a partition. Any broker can accept writes; ordering is managed by a shared “scoreboard” (a fast metadata/index service).

Why it’s cool

  • Failover magic – A broker dies? Clients just talk to another—no election delay.
  • Spread the load – Hot partitions aren’t locked to one über‑busy leader.

Gotchas

  • New brain to babysit – That metadata service (Oxia, etcd, Spanner) is now mission‑critical. Keep it HA and low‑latency.
  • Two‑hop writes – Broker ➜ metadata ➜ storage adds a smidge of latency.
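
Here is a small Python sketch of the scoreboard idea and the two-hop write described above. All names (`Scoreboard`, `SharedStorage`, `Broker`, `send`) are hypothetical, and the scoreboard is a plain in-process counter standing in for a real metadata service such as Oxia or etcd. It is a sketch of the pattern under those assumptions, not how any particular system implements it.

```python
import itertools


class Scoreboard:
    """Stand-in for the metadata/index service that decides ordering."""

    def __init__(self):
        self._counters = {}

    def next_offset(self, partition):
        counter = self._counters.setdefault(partition, itertools.count())
        return next(counter)


class SharedStorage:
    """Stand-in for storage that every broker can reach."""

    def __init__(self):
        self.records = {}  # (partition, offset) -> (broker name, payload)

    def write(self, partition, offset, broker, payload):
        self.records[(partition, offset)] = (broker, payload)

    def read_partition(self, partition):
        keys = sorted(k for k in self.records if k[0] == partition)
        return [self.records[k] for k in keys]


class Broker:
    def __init__(self, name, scoreboard, storage):
        self.name = name
        self.scoreboard = scoreboard
        self.storage = storage
        self.alive = True

    def write(self, partition, payload):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        offset = self.scoreboard.next_offset(partition)  # hop 1: metadata
        self.storage.write(partition, offset, self.name, payload)  # hop 2: storage
        return offset


def send(brokers, partition, payload):
    """Client-side failover: try brokers in turn, with no election to wait for."""
    for broker in brokers:
        try:
            return broker.write(partition, payload)
        except ConnectionError:
            continue
    raise RuntimeError("no broker available")


scoreboard, storage = Scoreboard(), SharedStorage()
brokers = [Broker(n, scoreboard, storage) for n in ("b1", "b2", "b3")]

send(brokers, "orders", "first")         # handled by b1
brokers[0].alive = False                 # b1 dies
send(brokers, "orders", "second")        # client falls through to b2
send(brokers[::-1], "orders", "third")   # another client happens to hit b3 first
log = storage.read_partition("orders")
```

Three different brokers wrote to the same partition, yet the log stays in order because the scoreboard, not any one broker, hands out offsets. A production design also has to handle a broker that claims an offset and then crashes before writing, which this toy skips.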

Who's doing it?

  • Kafka – Leader‑based (for now).
  • Redpanda – Same leader‑follower pattern as Kafka.
  • Pulsar – One broker owns a topic at any moment ⇒ still leader‑based.
  • Ursa – Fully leaderless; brokers & storage coordinate via Oxia.

👉 Use leaderless if you need five‑nines uptime, global clusters, or you’re tired of hot‑leader bottlenecks. Stick with leaders when ultra‑low latency or simple ops trump everything.

Putting It All Together 🏆

Takeaways

  • Diskless slashes storage costs and boosts elasticity.
  • Stateless makes brokers cattle, not pets.
  • Leaderless removes single points of failure (but needs a rock‑solid metadata brain).

Mix & match based on your pain point—storage cost, scaling headaches, or availability.

Ready to play with the Three Amigos? 🎲

  1. Spin up Pulsar to feel the stateless joy of instant topic reassignment.
  2. Give Ursa a whirl to taste the full diskless + stateless + leaderless combo.
  3. Join our free two‑day Data Streaming Summit Virtual 2025 (May 28‑29) for live demos, real‑world war stories, and Q&A with the folks building these systems.

See you on the stream! 🚀

Sijie Guo
Sijie’s journey with Apache Pulsar began at Yahoo! where he was part of the team working to develop a global messaging platform for the company. He then went to Twitter, where he led the messaging infrastructure group and co-created DistributedLog and Twitter EventBus. In 2017, he co-founded Streamlio, which was acquired by Splunk, and in 2019 he founded StreamNative. He is one of the original creators of Apache Pulsar and Apache BookKeeper, and remains VP of Apache BookKeeper and PMC Member of Apache Pulsar. Sijie lives in the San Francisco Bay Area of California.
