Video | Sep 30, 2025 | 35 mins

Kafka Under Pressure: Netflix's Blueprint for Unshakeable Kafka Resilience

Session Overview

Learn how Netflix ensures Kafka resilience under extreme traffic. Discover strategies for broker stability, adaptive clients, and high-throughput operations at scale.

How does a Kafka cluster handle 10× or even 100× traffic spikes while maintaining high throughput and availability? At Netflix, live streaming events place unprecedented demands on our core Kafka infrastructure, requiring innovative solutions to keep services resilient under extreme load.

In this talk, we share Netflix’s blueprint for Kafka resilience, covering strategies that go beyond out-of-the-box configurations to maximize uptime, minimize data loss, and maintain service performance during peak loads.

Key topics include:

  • Broker Stability Under Overload: Techniques to ensure Kafka brokers remain stable even during extreme traffic surges.
  • Adaptive Clients: Transforming producers and consumers into active participants that dynamically adjust behavior in real time to protect cluster health.
  • Operational Insights: Lessons learned from scaling Kafka at Netflix, including monitoring, failure mitigation, and proactive management strategies.
  • High-Throughput Design Patterns: Architectures and operational patterns to sustain performance during unpredictable traffic spikes.
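To make the "adaptive clients" idea concrete, here is a minimal sketch of one common way a client can adjust its own send rate: additive increase, multiplicative decrease (AIMD). This is an illustration only, not the mechanism presented in the talk. The class name, method names, and all numeric defaults are hypothetical, and the overload signal (for example a broker throttle response or a request timeout) is assumed to be detected elsewhere by the caller.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveRateLimiter:
    """Hypothetical AIMD send-rate controller for a producer client.

    Not taken from the talk; all defaults are illustrative assumptions.
    """

    rate: float = 1000.0      # current allowed records per second
    min_rate: float = 10.0    # never throttle below this floor
    max_rate: float = 10000.0  # never exceed this ceiling
    increase: float = 50.0    # additive step after a healthy send
    decrease: float = 0.5     # multiplicative factor after an overload signal

    def on_success(self) -> float:
        """Probe for more capacity by raising the rate a fixed step."""
        self.rate = min(self.max_rate, self.rate + self.increase)
        return self.rate

    def on_overload(self) -> float:
        """Back off quickly when the cluster signals distress."""
        self.rate = max(self.min_rate, self.rate * self.decrease)
        return self.rate
```

With the defaults above, one overload signal halves the rate from 1000 to 500 records per second, and each subsequent healthy send raises it by 50. The asymmetry is the point of the design: clients shed load quickly when brokers struggle and recover gradually, so a traffic spike does not immediately re-trigger the overload.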

Whether you’re a Kafka engineer, platform architect, or operations lead, this talk provides actionable strategies and insights for building resilient, scalable, and high-performing Kafka infrastructures capable of surviving even the most demanding workloads.

About the Speakers

Jorge Rodriguez

Jorge Rodriguez is a Senior Software Engineer on the Data Movement Engines team at Netflix. For the past four years, he has contributed to the Apache Kafka and Flink platforms to enable real-time data processing at Netflix.

Vinay Rayini

Vinay Rayini is a Software Engineer on the Data Movement Engines team at Netflix, where he has spent the last two years developing and scaling the Kafka as a Service platform. This platform is crucial for collecting and transporting over 23 trillion events and 50 petabytes of data daily. Previously, he worked at Microsoft and Google on distributed systems and real-time data processing initiatives.