24 min

Building Reliable Lakehouses with Apache Pulsar and Delta Lake

Pulsar Summit San Francisco 2022


Data lakehouses combine the best of both worlds: databases and data lakes. Databases provide relative simplicity and ACID transactional protection for your data, while data lakes provide flexibility, scalability, and support for unstructured data on cheap object stores. In this session, we describe Delta Lake, which brings reliability to data lakes by providing a transactional layer on top of them. We will talk about the key features of Delta Lake that enable the lakehouse architecture. Finally, we will talk about the work we are doing to build the ecosystem around Delta Lake, including support for multiple languages (Python, Rust, Java, etc.) as well as data processing systems (Apache Pulsar, Apache Flink, Apache Hive, PrestoDB, TrinoDB, Apache Spark™, etc.).

Nick Karpov
Staff Developer Advocate, Databricks

