The Half Apply Query

I came across a bizarre query plan at work, and ended up getting mad at the optimizer. Of course, when I shared the query with Joe Obbish, he told me to “stop thinking like a human.” Thanks, Joe.

See this estimated plan? Buncha seeks, small arrows? Looks like a safe OLTP plan, yeah?

Well, it’s actually a CPU-eating monster that rampaged through my server, causing timeouts and panicked phone calls. Here’s my setup to repro the issue (minus phone calls):

CREATE TABLE #A (
    GroupID INT,
    PersonID INT,
    DetailID INT,
    CONSTRAINT PK_#A_GroupID_PersonID PRIMARY KEY CLUSTERED (GroupID, PersonID)
)

CREATE TABLE #B (
    GroupID INT,
    PersonID INT,
    Junk CHAR(100)
)

CREATE INDEX IX_#B_GroupID_PersonID ON #B (GroupID, PersonID)

INSERT #A
SELECT TOP 100000
    (ROW_NUMBER() OVER(ORDER BY (SELECT 1/0))) % 3,
    ROW_NUMBER() OVER(ORDER BY (SELECT 1/0)),
    ROW_NUMBER() OVER(ORDER BY (SELECT 1/0))
FROM master.dbo.spt_values x
CROSS JOIN master.dbo.spt_values y

INSERT #B
SELECT TOP 100000
    (ROW_NUMBER() OVER(ORDER BY (SELECT 1/0))) % 3,
    ROW_NUMBER() OVER(ORDER BY (SELECT 1/0)),
    'data'
FROM master.dbo.spt_values x
CROSS JOIN master.dbo.spt_values y

And here's what the tables look like – three distinct GroupIDs (0, 1, and 2), with everything else unique.
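If you want to sanity-check the distribution yourself, a quick GROUP BY does it:

SELECT GroupID, COUNT(*) AS rows_per_group
FROM #A
GROUP BY GroupID
ORDER BY GroupID
--Expect GroupIDs 0, 1, and 2 with roughly 33k rows each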

What happened in prod was a plan getting compiled for a value outside of the histogram. I could simulate that with some more inserts, but I'm lazy – OPTIMIZE FOR works just fine. I also include a hint to force the legacy CE, since that's what I need on my machine. Also, my code plugin is still borked, so I'm pasting as little ugly code as possible.

DECLARE @id1 INT = 2
SELECT *
FROM #A a
LEFT JOIN #B b
ON b.GroupID = a.GroupID AND b.PersonID = a.PersonID
WHERE a.GroupID = @id1 AND a.DetailID < 1000
--Uncomment the OPTION clause to get the bad plan
--OPTION(OPTIMIZE FOR (@id1 = 3)
--,USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION'))

Here’s the in-histogram plan:

And here’s the out-of-histogram plan of fail and awfulness:

Compare it to the estimated plan up top, and you’ll see something has gone terribly wrong. The number of rows read on that bottom seek is really high, and if we look at the operator, we see why.

What the heck? It’s only seeking on GroupID, even though the other key is part of the index! If we look at the leftmost join, we finally see the PersonID comparison. Oops.

So what went wrong? In the good plan, both keys are pushed down into the seek under a nested loops join in an APPLY pattern. (Read more about JOIN vs APPLY Nested Loops from Paul White here.) In the bad plan, however, only one of the keys gets pushed down, while the other is checked at the join. Weird. Using the new CE gave me the better plan, but some of my helpful friends testing with the new CE still got the bad plan.
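If the JOIN vs APPLY distinction is fuzzy, here's a syntax-level sketch of the apply shape. A caveat: this is about the shape of the plan, not the T-SQL you write – the optimizer is free to transform either written form into either physical pattern.

DECLARE @id1 INT = 2
SELECT *
FROM #A a
OUTER APPLY (
    SELECT b.GroupID, b.PersonID, b.Junk
    FROM #B b
    WHERE b.GroupID = a.GroupID AND b.PersonID = a.PersonID
) b
WHERE a.GroupID = @id1 AND a.DetailID < 1000
--Both correlations live inside the apply, where they can drive a
--two-column seek on (GroupID, PersonID) instead of a check at the join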

To a human, it should be obvious that pushing both keys down is better. But let’s put our think-like-the-optimizer hat on.

When compiling for an out-of-range value, Dr. Optimizer hypothesizes that no rows will be returned from #A. Then, seeking on just the matching GroupID in #B should also return no rows (rounded up to 1 row) – exactly the same estimate as seeking on both GroupID and PersonID.
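You can see the cliff edge directly by peeking at the histogram on #A's clustering key (run the SELECT once first, or UPDATE STATISTICS #A, so the stats reflect the inserts):

DBCC SHOW_STATISTICS ('tempdb..#A', 'PK_#A_GroupID_PersonID') WITH HISTOGRAM
--The last RANGE_HI_KEY is 2, so a compile-time value of 3
--falls off the end of the histogram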

Thus we end up with a situation where both the optimal and the awful plan have the same query buck cost, because every operator is costed at one row. (Feel free to test this out by adding a FORCESEEK hint on #B that specifies both key columns – sketch below.) It also explains why only some of my friends were able to repro this: faced with plans of equal cost, SQL Server used some factor X to choose one or the other on different machines.
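Here's one way to run that test, forcing the two-column seek so you can compare estimated costs against the bad plan:

DECLARE @id1 INT = 2
SELECT *
FROM #A a
LEFT JOIN #B b WITH (FORCESEEK ([IX_#B_GroupID_PersonID] (GroupID, PersonID)))
ON b.GroupID = a.GroupID AND b.PersonID = a.PersonID
WHERE a.GroupID = @id1 AND a.DetailID < 1000
OPTION(OPTIMIZE FOR (@id1 = 3)
,USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION'))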

So yeah, the solution to this is to stop thinking like a human, and realize that the optimizer is a terrible Scrooge who cares only about cost. With equivalently costed plans, of course it might pick the bad (according to humans) one. Special thanks to my friends Joe and Paul for helping me reason through this.

P.S. For anyone interested, it appears that rule SelToIdxStrategy creates the incomplete seek, and AppIdxToApp the two-column seek.
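P.P.S. If you want to go rule-spelunking on your own test box, the usual tools are undocumented trace flags – 3604 to route output to the client, 8619 to list the transformation rules applied during optimization. Strictly for-curiosity territory, and sysadmin only:

DBCC TRACEON(3604)
DECLARE @id1 INT = 2
SELECT *
FROM #A a
LEFT JOIN #B b
ON b.GroupID = a.GroupID AND b.PersonID = a.PersonID
WHERE a.GroupID = @id1 AND a.DetailID < 1000
OPTION(RECOMPILE, QUERYTRACEON 8619 --lists rules like SelToIdxStrategy as they fire
,OPTIMIZE FOR (@id1 = 3)
,USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION'))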

Cost of a Query

There’s something about being a DBA that gives us special insight into the dysfunctions of both code and organizations. When we’re the ones keeping databases running, we get a first row seat to all the dumb that floats up.

Yet sometimes it feels hard getting developers to care. Five thousand deadlocks an hour? Don’t worry, we have retry logic. Entity Framework uses the wrong datatype in a query, causing it to scan and consume half a server’s CPU? Oh no worries, everything is still working. Remember…servers are cattle, not pets.

Bitterness aside, I found something that helps communicate impact: cost. No, not query bucks – dollars. One of the wonderful benefits of Azure is that organizations can see the cost of garbage code. Well, those of us working with on-premises servers can use the same approach. Send an email informing developers (and their managers) that one of their bad queries would cost $80k a year to run in Azure, and you'll get traction. It's also fun to show your boss how much your indexing was worth, right around review time.

Here’s the core of my approach:

  • Find an approximately equivalent resource in Azure (e.g. if you have an AG, look at a Business Critical Managed Instance).
  • Use your company’s pricing info (or the handy online estimator) to look at cost, and then calculate dollars per core-hour.
  • Add in scaling assumptions. For example, if any server consuming more than 75% CPU is considered too small and gets resized, then the usable compute pool is effectively 75% of capacity, not 100%, for our calculation.
  • Grab CPU usage metrics from your favorite source (yay Query Store!) for the offending query. Gather them during peak hours if possible, because that’s what you size your physical server around.
  • Scale the cost to per-year (yes, I admit it’s to get a larger number, but it’s a timeframe the business usually budgets around too).

Step 2: Math. Step 3: Profit! I figured you might not want to math it out yourself, so here's a Google Sheet with my formulas, plus a rough T-SQL version below.
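If you'd rather stay in T-SQL than a spreadsheet, this sketch runs the same math against Query Store. The dollar rate and scaling factor are placeholders you'd swap for your own numbers, and it assumes the runtime stats you're aggregating cover about one day:

DECLARE @CostPerCoreHour DECIMAL(10,4) = 0.50 --placeholder $/core-hour from your Azure estimate
DECLARE @UsablePool DECIMAL(5,2) = 0.75       --scaling assumption: servers get resized at 75% CPU

SELECT TOP 10
    q.query_id,
    SUM(rs.avg_cpu_time * rs.count_executions) / 1000000.0 / 3600.0 AS cpu_core_hours, --avg_cpu_time is in microseconds
    SUM(rs.avg_cpu_time * rs.count_executions) / 1000000.0 / 3600.0
        / @UsablePool * @CostPerCoreHour * 365.0 AS annual_azure_dollars --daily core-hours scaled to a year
FROM sys.query_store_query q
JOIN sys.query_store_plan p ON p.query_id = q.query_id
JOIN sys.query_store_runtime_stats rs ON rs.plan_id = p.plan_id
GROUP BY q.query_id
ORDER BY cpu_core_hours DESC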

I’ve had a lot of success communicating this way, and I’m especially happy when I have actual Azure resources to work with. There are still some imperfect assumptions in my approach (scale and comparable Azure resources I think are the weakest), so I’d love feedback and suggestions. Now go use math to beat up developers!