Flink Streaming Ingestion to Cloud-lake at Scale
Zhenqiu Huang
Shiyan Xu

At Uber, real-time data powers everything—from pricing and matching to logistics and safety. In this session, Uber engineers share how Apache Flink and Apache Hudi are used to build streaming ingestion pipelines for Uber’s Cloud Lake, enabling real-time machine learning and analytics in a hybrid cloud environment.

You’ll get a detailed look at:

  • Architecture: How Uber manages thousands of Flink ingestion pipelines with built-in deployment safety, failover mechanisms, and disaster recovery.
  • Runtime: How Uber ensures cross-cloud data security, column-level access control, and data privacy at scale.
  • Operational Efficiency: How Flink Autoscaler as a Service dynamically adjusts workload parallelism to reduce cost, and how partial sort optimization drives further efficiency.
  • Apache Hudi: Deep dive into Hudi’s streaming integration for Flink and its non-blocking concurrency control (NBCC) design.

If you’re running large-scale streaming systems or exploring real-time ingestion for ML and analytics, this session delivers practical insights and architectural patterns proven in Uber’s production environment.

Zhenqiu Huang
Software Engineer, Uber

Zhenqiu Huang is a long-time member of the Apache Flink community. He has built streaming platforms at Uber Technologies and Apple Inc. Most recently, he has worked with the Apache Hudi community on building streaming ingestion to the Cloud Lake at Uber.

Shiyan Xu
Founding team member, Onehouse

Shiyan Xu works as a data architect for open source projects at Onehouse. A PMC member of Apache Hudi, he currently leads the development of Hudi-rs, the native Rust implementation of Hudi, and the writing of the book "Apache Hudi: The Definitive Guide" (O'Reilly). He also consults with community users and helps run Hudi pipelines at production scale.
