An Empirical Look at Key Lookups

I’m going to presume you’re familiar with what key lookups are, and why they exist.

SQL Server’s cost-based optimizer hates them, because its cost model assumes they’re random reads from a spinning disk – an unlikely setup for a modern enterprise server. What I’m curious about is just how bad these assumptions are in practice.

The Setup

Here’s the million-row test table I’m going to use, along with fake data. I’ll be testing on my laptop, which has a standard SSD and an Intel processor with a variable clock speed (it seems to average a bit below 4 GHz for these tests).

CREATE TABLE dbo.InternetRecipes (
ID INT IDENTITY PRIMARY KEY,
Calories INT,
SizeOfIntro INT,
RecipeName NVARCHAR(300)
)

INSERT dbo.InternetRecipes
SELECT TOP (1000000)
(ROW_NUMBER() OVER(ORDER BY 1/0))%1000,
(ROW_NUMBER() OVER(ORDER BY 1/0))%10000+500,
'Paleo Fat Free Cotton Candy'
FROM master..spt_values a,
master..spt_values b


--With includes and not, for testing
CREATE INDEX IX_InternetRecipes_Calories
ON dbo.InternetRecipes(Calories)

CREATE INDEX IX_InternetRecipes_Calories_incl
ON dbo.InternetRecipes(Calories)
INCLUDE (RecipeName)

Time/Cost Comparison

Using an index hint, we can see a direct comparison between a plan with key lookups and one without. The “without” plan makes use of the index that includes RecipeName. Normally this is described as a covering index, but today I’m designating it the “inclusive” index.

--7.7 query bucks
SELECT RecipeName
FROM dbo.InternetRecipes WITH(INDEX(IX_InternetRecipes_Calories_incl))
WHERE Calories >= 0
AND RecipeName = 'Vegan Steak'

--200 query bucks
SELECT RecipeName
FROM dbo.InternetRecipes WITH(INDEX(IX_InternetRecipes_Calories))
WHERE Calories >= 0
AND RecipeName = 'Vegan Steak'
OPTION(MAXDOP 1)

Some interesting details emerge here. First: key lookup cost does not scale linearly.

Working the math backwards shows the cost comes from one million CPU costs, but only 11,495 IO costs. Incidentally, 11,495 is the number of data pages in the table. Key Lookup IO costs are capped at the number of pages in the table, although you’re unlikely to hit that cap naturally. Of course, the next thing for me to test was everything in between, and those experiments showed that key lookup costing does not assume each lookup causes an IO.
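
For the endpoint case, a rough back-of-the-envelope works with the commonly cited costing constants of 0.0001581 CPU per row and 0.003125 per IO – they’re undocumented, so consider them assumptions:

--Reconstructing the ~200 query buck estimate for the million-lookup plan,
--using the commonly cited (undocumented) costing constants.
SELECT 1000000 * 0.0001581 AS LookupCpuCost --~158 query bucks of CPU
      ,11495   * 0.003125  AS LookupIoCost  --~36 query bucks of IO, capped at the table's page count

Add in the seek and the nested loops join and you land in the neighborhood of the 200 query bucks above.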

The exact formula still eludes me though. I think it’s a delightfully nerdy problem that nobody will care about, so I’ll be working on it as soon as possible.

The one million key lookup query is costed about 26X more than the simple seek, and takes about 18X longer to run. The optimizer does a decent job here! Or does it? I was running the test with a warm cache, so what if we ignore IO costs?

--MS will laugh at you if this breaks your server
DBCC SETIOWEIGHT(0)

The optimizer’s cost ratio balloons to 149X (while the actual runtime ratio remains 18X, of course). So even though these executions involved no IO at all, the optimizer’s estimate is closer to reality when it keeps IO costs in the model.

Let’s fix that IO costing thing…

DBCC SETIOWEIGHT(1)

And test with a cold cache…

CHECKPOINT --if we just created the table/indexes
DBCC DROPCLEANBUFFERS  --repeat as needed

The execution time ratio is about 15X with IO – not much of a change. If I disable prefetch with trace flag 9115 (thanks Paul), the time ratio moves to 17X, and the query buck costs stay unchanged. So the optimizer does a decent job costing the inclusive index against the key lookup plan, but only when it assumes IO costs.
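
For completeness, one timing pass of the cold-cache test looks roughly like this. The TRACEON call is my assumption about how the (undocumented) prefetch flag gets applied, so treat it as a sketch for a test box only:

CHECKPOINT
DBCC DROPCLEANBUFFERS
--DBCC TRACEON(9115) --only for the no-prefetch runs; undocumented, test server only

SET STATISTICS TIME ON

SELECT RecipeName
FROM dbo.InternetRecipes WITH(INDEX(IX_InternetRecipes_Calories))
WHERE Calories >= 0
AND RecipeName = 'Vegan Steak'
OPTION(MAXDOP 1)

SET STATISTICS TIME OFF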

Tipping Point Testing

Because SQL Server loathes key lookups so much, it will sometimes choose to scan the entire table just to avoid them. In my experience, it does this way too early. Can I prove it?

Let’s remove the inclusive index, and select results into a trash variable for testing.

DROP INDEX IX_InternetRecipes_Calories_incl
ON dbo.InternetRecipes

DECLARE @trash NVARCHAR(300)

SELECT @trash = RecipeName
FROM dbo.InternetRecipes
WHERE Calories >= 996
OPTION(MAXDOP 1);

I find that the tipping point occurs between Calories values of 996 and 997, which is between 3000 and 4000 rows/lookups.

The full scan plan takes on average 135ms, while the key lookup plan for 4000 rows is only 9ms.

It’s not until Calories value 925, or 75k rows, that the true tipping point occurs, and the key lookup plan takes as long as the scan. Even testing with a cold cache, the tipping point is at 41k rows. So for my laptop with a warm cache, the optimizer switches to a scan at just 4% of the optimum number of rows, or 25X too soon (roughly).
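
If you want to reproduce the comparison, forcing each plan shape at the crossover point is enough. The hints below are just my way of pinning each plan; WITH(INDEX(0)) forces a scan of the base table:

DECLARE @trash NVARCHAR(300) --re-declared for a fresh batch

--Key lookup plan
SELECT @trash = RecipeName
FROM dbo.InternetRecipes WITH(INDEX(IX_InternetRecipes_Calories))
WHERE Calories >= 925
OPTION(MAXDOP 1);

--Scan plan
SELECT @trash = RecipeName
FROM dbo.InternetRecipes WITH(INDEX(0))
WHERE Calories >= 925
OPTION(MAXDOP 1);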

Throughput Testing

What else can we learn about key lookup impact? Let’s cram it in a loop and set spin cycle to ultra high.

First, bring back the inclusive index.

CREATE INDEX IX_InternetRecipes_Calories_incl
ON dbo.InternetRecipes(Calories)
INCLUDE (RecipeName)

Now, the general loop structure:

DECLARE @i INT = 0
DECLARE @trash NVARCHAR(300)
DECLARE @dt DATETIME2 = SYSDATETIME()

WHILE @i < 10000
BEGIN

	SELECT TOP(1000) @trash = RecipeName
	FROM dbo.InternetRecipes WITH(INDEX(IX_InternetRecipes_Calories))
	WHERE Calories = 666 --Devil's food cake
	OPTION(MAXDOP 1);

	SET @i += 1

END

SELECT DATEDIFF(MILLISECOND,@dt,SYSDATETIME())

Using TOP to limit the number of rows/lookups, I measure how long it takes to run a loop of 10k queries. Here are my averaged numbers:

Rows          Key Lookup Time (ms)    Inclusive Seek Time (ms)
1000          18,223                  1,747
100           1,857                   331
10            391                     200
1             250                     181
1k, random    21,284                  1,824

These numbers are specific to my laptop and this particular test. Still, what I see is that the relative impact of key lookups grows with the number of lookups per query. That makes sense, since there’s some fixed overhead per query, and it dominates at low row counts.

If I assume all extra time in the 1000 row test comes from the lookups, I get a value of about 1.6 microseconds per lookup. To put that in perspective, to occupy an extra core of CPU would take 600,000 lookups/second.
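
The arithmetic behind those numbers, if you want to plug in your own timings:

--Extra time per lookup: (key lookup ms - inclusive seek ms), spread over 10,000 executions of 1,000 lookups each
SELECT (18223 - 1747) * 1000.0 / (10000 * 1000) AS MicrosecondsPerLookup --~1.65
--Lookups per second needed to burn one core: one second divided by the per-lookup cost
SELECT 1000000.0 / 1.65 AS LookupsPerSecondPerCore --~606k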

I was curious whether using a repeated value took unfair advantage of the CPU cache, so I also tested with randomized “Calories” values (sketched below), which showed a minor slowdown. An extra core under these conditions (probably more representative of a production server) would need about 514k lookups/s. And a big caveat: all of these tests use a single table design and are run on a personal laptop.
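
The randomized variant just swaps the constant for a per-iteration random value. Here’s a sketch of how I’d do it – the modulo trick is illustrative, not necessarily the exact code I ran:

DECLARE @i INT = 0
DECLARE @trash NVARCHAR(300)
DECLARE @cal INT
DECLARE @dt DATETIME2 = SYSDATETIME()

WHILE @i < 10000
BEGIN

	SET @cal = ABS(CHECKSUM(NEWID())) % 1000 --random Calories value each iteration

	SELECT TOP(1000) @trash = RecipeName
	FROM dbo.InternetRecipes WITH(INDEX(IX_InternetRecipes_Calories))
	WHERE Calories = @cal
	OPTION(MAXDOP 1);

	SET @i += 1

END

SELECT DATEDIFF(MILLISECOND,@dt,SYSDATETIME())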

Summary

Lookups hurt performance, but probably not as much as you think, and definitely not as much as the optimizer thinks when considering the seek/scan tipping point. If you see a scary cost percentage number by a key lookup, remember that it’s probably junk.

Avoiding Sorts with the $PARTITION Function

Backstory

I imagine everybody has them – those annoying, bloated, misshapen tables that don’t really belong in the database. I mean, this is an expensive enterprise RDBMS supporting a high-volume OLTP environment, and a depressingly large portion of the storage (also expensive, btw) is dedicated to decades’ worth of nvarchar(max) email attachments.

I know how hard it can be to undo someone else’s decision to treat SQL Server like a landfill, as this usually involves getting others to do something, i.e. politics. But sometimes we DBAs can make a helpful change to these archive tables on our own. And that’s where I found myself, with billions of rows in a neglected heap that couldn’t be moved, but could be turned into a clustered columnstore table for 5X compression.

Setup Script

Here’s the setup script in case you want to follow along for the repro. No billions of rows here, just 10 million.

CREATE PARTITION FUNCTION pf_years( DATETIME )
    AS RANGE RIGHT FOR VALUES ('2010','2011','2012','2013','2014','2015',
	'2016','2017','2018','2019','2020');

CREATE PARTITION SCHEME ps_years
AS PARTITION pf_years
ALL TO ( [PRIMARY] );

--Here's a table we're loading
CREATE TABLE BunchaDates
(somedate datetime,
junk char(10)
) ON ps_years(somedate)

CREATE CLUSTERED COLUMNSTORE INDEX CCX_BunchaDates ON dbo.BunchaDates ON ps_years(somedate)

--Original table, full of junk
CREATE TABLE BunchaDatesSource (
somedate datetime,
junk char(10)
)

INSERT BunchaDatesSource
SELECT TOP (10000000)
DATEADD(MINUTE,24*60*365*6*RAND(CHECKSUM(NEWID())),'1/1/2014')
,'demodemo'
FROM master..spt_values a
,master..spt_values b
,master..spt_values c

The Problem

My plan was to load the data from the heap into the columnstore table, later doing a drop and rename, but I wanted to transfer a limited amount at a time. My environment has an Availability Group set up, so my constraint was minimizing log generation. (If you’re looking for a technique to load a columnstore table as quickly as possible, check out Joe Obbish.) But, when I started loading a limited time range of data, I quickly ran into a problem.

INSERT BunchaDates
SELECT somedate,junk
FROM BunchaDatesSource bs
WHERE YEAR(bs.somedate) = '2014'

A Sort operator shows up! To explain my distaste: the columnstore table doesn’t need it – it’s inherently unordered (at least in the way we usually think about ordering). Moreover, I’m selecting rows guaranteed to reside within a single partition. Yet SQL Server insists on sorting the rows, because it very helpfully wants to avoid the performance penalty of switching between partitions insert-by-insert. Instead I get a slow query, a ginormous memory grant, and tempdb spills. Yay!

Ok. You might notice that I used YEAR() in my predicate. What if I use a proper date range instead?
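
Something along these lines (the exact dates are mine, just to illustrate the shape of the predicate):

INSERT BunchaDates
SELECT somedate,junk
FROM BunchaDatesSource bs
WHERE bs.somedate >= '2014/1/1' AND bs.somedate < '2015/1/1'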

Crud. Ok. Was it the boundary dates? Maybe a narrow date range entirely within the year will work?
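
Reconstructing that attempt as well (again, illustrative dates):

INSERT BunchaDates
SELECT somedate,junk
FROM BunchaDatesSource bs
WHERE bs.somedate >= '2014/6/1' AND bs.somedate < '2014/6/2'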

Fine. What about an exact time? Pretty please, oh kind and generous Query Optimizer?
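
Something like this (the specific timestamp is made up):

INSERT BunchaDates
SELECT somedate,junk
FROM BunchaDatesSource bs
WHERE bs.somedate = '2014/6/1 12:34'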

The Solution

Looping through all possible times of a datetime field in order to avoid a sort is a very bad way to do things. But thankfully, there’s a better way: $PARTITION.

The $PARTITION function uses your partition function to return the partition number a value belongs to. Importantly, it can be used to guarantee to SQL Server that the insert touches only a single partition, which prevents the superfluous sort.

INSERT BunchaDates
SELECT somedate,junk
FROM BunchaDatesSource bs
WHERE $PARTITION.pf_years(somedate) = $PARTITION.pf_years('2014')

No Sort! But there’s still room for improvement:

You see, $PARTITION doesn’t qualify for predicate pushdown, and on a query involving billions of rows, that adds up.

But guess what, we can still add the date range, and that does get predicate pushdown.

INSERT BunchaDates
SELECT somedate,junk
FROM BunchaDatesSource bs
WHERE $PARTITION.pf_years(somedate) = $PARTITION.pf_years('2014')
AND bs.somedate >= '2014/1/1' AND bs.somedate < '2015/1/1'

Conclusion

Sometimes, the optimizer misses something, and you have to babysit it. In my case, using $PARTITION allowed me to finish the table load in my time window, eventually freeing up significant space to be used for other questionable data.

The takeaway? $PARTITION lets you guarantee for the optimizer that only a single partition is involved, but you might want to retain your other predicates for predicate pushdown or perhaps index qualification.

Now excuse me while I compose yet another carefully worded email about that email attachment table…