Introducing Delta Lake: The Definitive Guide

November 25, 2024

by R. Tyler Croy

deltalake

python

rust

As one of the co-creators of delta-rs, I am thrilled to introduce Delta Lake: The Definitive Guide. The book is truly a definitive guide for new and existing Delta Lake users, and it includes the chapter I was fortunate enough to contribute on building native applications with Python or Rust.

Delta Lake The Definitive Guide book cover

From the official description:

Authors Denny Lee, Tristen Wentling, Scott Haines, and Prashanth Babu (with contributions from Delta Lake maintainer R. Tyler Croy) share expert insights on all things Delta Lake--including how to run batch and streaming jobs concurrently and accelerate the usability of your data. You'll also uncover how ACID transactions bring reliability to data lakehouses at scale.

The chapter on building native applications with Python or Rust walks the reader through the basics of using Python with Pandas or Rust with Apache DataFusion to perform practical data engineering work. It also shares a couple of examples of implementing data processing tasks with AWS Lambda, a pattern we use heavily at Buoyant Data to deliver highly efficient and cost-effective data platform infrastructure.

Other sections of the book will also help readers:

  • Understand key data reliability challenges and how Delta Lake solves them
  • Explain the critical role of Delta transaction logs as a single source of truth
  • Learn the Delta Lake ecosystem with technologies like Apache Flink, Kafka, and Trino
  • Architect data lakehouses with the medallion architecture
  • Optimize Delta Lake performance with features like deletion vectors and liquid clustering
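To make the transaction log point a little more concrete, here is a minimal sketch (not taken from the book) of why the log can act as a single source of truth: a reader can replay the JSON commit files under `_delta_log/` in version order and apply each `add` and `remove` action to work out which data files currently make up the table. It is deliberately simplified and assumes a table with only JSON commits. It ignores checkpoints, the `metaData` and `protocol` actions, and everything else a real reader such as delta-rs handles. The `write_commit` and `demo` helpers and the file names are hypothetical, included only so the example runs on its own.

```python
import json
import tempfile
from pathlib import Path


def active_files(table_path):
    """Replay JSON commits in version order and return the live data files."""
    log_dir = Path(table_path) / "_delta_log"
    files = set()
    # Commit files are named with a zero-padded version number, so sorting
    # by name also sorts by version. Anything else in the directory is skipped.
    commits = sorted(p for p in log_dir.glob("*.json") if p.stem.isdigit())
    for commit in commits:
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)  # one JSON action per line
            if "add" in action:
                files.add(action["add"]["path"])
            elif "remove" in action:
                files.discard(action["remove"]["path"])
    return files


def write_commit(table_path, version, actions):
    """Hypothetical helper: write one simplified commit file for the demo."""
    log_dir = Path(table_path) / "_delta_log"
    log_dir.mkdir(parents=True, exist_ok=True)
    body = "\n".join(json.dumps(a) for a in actions)
    (log_dir / f"{version:020d}.json").write_text(body)


def demo():
    """Build a two-commit log in a temp directory and replay it."""
    with tempfile.TemporaryDirectory() as table:
        # Version 0 adds two files; version 1 replaces one of them.
        write_commit(table, 0, [
            {"add": {"path": "part-0.parquet"}},
            {"add": {"path": "part-1.parquet"}},
        ])
        write_commit(table, 1, [
            {"remove": {"path": "part-0.parquet"}},
            {"add": {"path": "part-2.parquet"}},
        ])
        return sorted(active_files(table))


if __name__ == "__main__":
    print(demo())
```

The data files themselves are never listed or inspected here; the table's state comes entirely from the log, which is the property that lets concurrent readers and writers agree on what the table contains.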

Accompanying the book is a public Git repository that contains numerous examples, all of which are open source and ready to help developers get started with kafka-delta-ingest, delta-rs, or Delta with Apache Spark.

Pick up your physical or digital copy today!


If your team is interested in adopting Delta Lake with Python, Rust, or Apache Spark, drop me an email and we'll chat!