From B-Trees to V-Order. Told differently than usual.

2025

TL;DR

We’ll explore query optimization techniques from B-Trees, through Z-Order and Liquid Clustering, to V-Order, diving into their mathematical foundations, challenging our intuition, and uncovering the mechanisms driving their performance.

Session Details

In this session, we will walk through the basic query optimization techniques, starting with classic indexes (B-Tree) for relational databases, moving through Partitioning, Z-Order, and Liquid Clustering for data lake and lakehouse architectures, and ending with V-Order, the mechanism introduced by Microsoft to speed up queries in Direct Lake mode.

Throughout this hour, we will often delve into the mathematical foundations behind the mechanisms we use daily, while also verifying how far off our intuition can be, and challenging what we sometimes take for granted.

We will answer, among other things:

> What’s the difference between partial order and linear order, and how does it relate to sorting rows in tables?
> Where did Morton and Hilbert curves come from before they were used to optimize the "Data Skipping" mechanism?
> What does a Parquet file consist of, how does Predicate Pushdown work, and why aren't Z-Order and V-Order mutually exclusive?

And also…

> How many new guests can fit into a fully occupied hotel with an infinite number of rooms? ;)

3 things you'll get out of this session

- An overview of optimization techniques and an understanding of how they have addressed evolving challenges over time.
- Expanding knowledge with fundamentals often overlooked in sessions covering similar topics.
- Inspiration to occasionally revisit the mathematical foundations behind the mechanisms we use daily.

Speakers

Tomasz Kostyrka

pl.seequality.net