StreamLink: Real-Time Data Ingestion at OpenAI Scale

Real-time data ingestion at scale is a cornerstone of modern AI and analytics. In this talk, explore how OpenAI built StreamLink, a high-performance streaming ingestion platform powered by Apache Flink, designed to meet the data needs of both humans and AI systems.
Join this deep dive to learn:
- How StreamLink enables large-scale, real-time ingestion in OpenAI’s lakehouse
- The architecture behind its Kubernetes-native deployment using the Flink K8s Operator
- How adaptive autoscaling and self-service onboarding keep operations fast and lean
- Best practices and design patterns for building streaming systems at enterprise scale
Whether you’re operating a data platform or scaling streaming infrastructure, this session offers practical insights for powering the next generation of real-time, AI-driven analytics.
