Insights from streaming 300B telemetry trace spans per day with Flink
Amrit Sarkar
Vineet Khadloya

How do you make sense of 300 billion distributed tracing spans per day? At Salesforce, the Monitoring Cloud Telemetry Tracer team tackles this challenge head-on, using Apache Flink to process massive real-time telemetry streams and construct accurate, up-to-date service dependency maps across hundreds of microservices.

In this session, we’ll share key architectural decisions, scaling lessons, and operational insights from running telemetry pipelines at extreme scale. You’ll learn how we:

  • Process and correlate hundreds of billions of spans in real time
  • Design robust stateful streaming pipelines for telemetry data
  • Handle out-of-order events and massive fan-out scenarios
  • Manage partitioning and state at scale for high-throughput workloads
  • Maintain performance and reliability in mission-critical observability systems
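To make the out-of-order handling above concrete, here is a minimal plain-Java sketch (no Flink dependency; class and method names are illustrative, not from the talk) of the bounded-out-of-orderness watermarking idea that Flink's event-time processing is built on: the watermark trails the highest span timestamp seen by a fixed delay, and any span at or below the watermark is treated as late.

```java
// Sketch of bounded-out-of-orderness watermarking for trace spans.
// The watermark asserts "no span with event time <= watermark is still expected";
// spans that violate this are late and need side-output or late-data handling.
final class BoundedOutOfOrderness {
    private final long maxDelayMs;                    // maximum expected out-of-orderness
    private long maxTimestampSeen = Long.MIN_VALUE;   // highest event time observed so far

    BoundedOutOfOrderness(long maxDelayMs) {
        this.maxDelayMs = maxDelayMs;
    }

    /** Observe a span's event time and return the updated watermark. */
    long onSpan(long eventTimeMs) {
        maxTimestampSeen = Math.max(maxTimestampSeen, eventTimeMs);
        return maxTimestampSeen - maxDelayMs;
    }

    /** A span is late if its event time is at or below the current watermark. */
    boolean isLate(long eventTimeMs) {
        return eventTimeMs <= maxTimestampSeen - maxDelayMs;
    }
}
```

In actual Flink pipelines the equivalent built-in is `WatermarkStrategy.forBoundedOutOfOrderness(Duration)` on the DataStream API; the trade-off is the same either way: a larger delay tolerates more reordering but holds back window results longer.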

This talk is a must-watch for engineers and architects building large-scale telemetry, tracing, or observability pipelines, and for anyone interested in pushing Flink to its real-time processing limits.

Amrit Sarkar
Lead Member of Technical Staff, Salesforce

Amrit Sarkar, an engineer with eight years of experience, specializes in the Search and Big Data domains.

Vineet Khadloya
Senior Member of Technical Staff (SMTS), Salesforce

Engineer on the Tracer and Moncloud API Platforms Team
