Video · Oct 29, 2024 · 15 mins

Truly Scalable Operational Data Layers for Data Pipelines

Session Overview

As streaming systems scale to keep pace with ever-growing application data volumes, how should data engineers think about the scaling properties of the sources and destinations of that data? In this session, we’ll discuss scaling from the perspective of an operational data layer, which acts as both a destination and a source. More tangibly, it is the global source of truth for data aggregated from all internal sources. Engineers use this layer to park data for additional processing or operational business intelligence. Almost every large business has one or is building one, possibly without knowing it. This talk precisely defines the layer and discusses how to think about its scalability as it serves workloads from many different places. We’ll break “scalability” down into fine-grained parts that apply in nearly all cases. By the end of the talk, you’ll know what it means to have a truly scalable operational data layer.

About Speaker

Matthew Penaroza, Senior Solution Engineer, PingCAP