Approximate functions: How do they work?

Proposed session for SQLBits 2026

TL;DR

In this session, we will look at how the "approximate query processing" functions in SQL Server 2019 and 2022 are implemented internally, and at what their benefits and drawbacks are.

Session Details

Sometimes, a close approximation is good enough. And sometimes, a close approximation is a lot faster. Microsoft has introduced “Approximate Query Processing” (the APPROX_COUNT_DISTINCT, APPROX_PERCENTILE_CONT, and APPROX_PERCENTILE_DISC functions) to give you exactly that benefit when you don't need exact answers.

But do you have a good response when you propose to use these functions and your manager asks you to explain how they work first? Or is your only option to claim "black magic by smart Microsoft engineers"?

The algorithms used are not a secret: HyperLogLog and KLL Sketch. And now you most likely know exactly as much as you did before. And when you google those terms ... you end up with a headache.

Time to join me for a session where I explain the black magic in the simplest possible terms, so that you can then explain it to your manager!

3 things you'll get out of this session

Explain the algorithm used for APPROX_COUNT_DISTINCT.

Explain the algorithm used for APPROX_PERCENTILE_DISC and APPROX_PERCENTILE_CONT.

Explain the error margins of these approximate functions.