I had an interesting conversation with a developer the other day – it went something like this:
Developer: “Why does it take three minutes to delete a single row from this table?”
Me: “Hold on while I look at the plan. Oh, it’s because SQL Server is checking the 460 foreign keys that reference that table, 447 of which don’t have a supporting index.”
I know the pains of foreign keys without indexes. Oh, do I know. So when I saw a cascading single-row delete plan with a scan on the child table, I thought, “Aha! I know how to fix this, an index!”
But the index was already there, a perfect one for supporting that foreign key with the abominable ON DELETE CASCADE . Something was obviously wrong, and I immediately suspected our recent switch to 2016 compatibility level – meaning a new cardinality estimator.
Sure enough, in 2012 mode the query optimizer produced a seek.
I was able to come up with a repro of the issue (which is what the above plans are from).
--set up two tables, one with a foreign key with cascading delete DROP TABLE IF EXISTS dbo.Child DROP TABLE IF EXISTS dbo.Parent GO CREATE TABLE dbo.Parent ( ID INT IDENTITY PRIMARY KEY CLUSTERED, junk CHAR(1) ) INSERT dbo.Parent SELECT TOP 1000000 'a' FROM master..spt_values a CROSS JOIN master..spt_values b CREATE TABLE dbo.Child ( ID INT IDENTITY PRIMARY KEY CLUSTERED, morejunk CHAR(5), ParentID INT ) ALTER TABLE dbo.Child ADD CONSTRAINT [💩] FOREIGN KEY (ParentID) REFERENCES dbo.Parent (ID) ON DELETE CASCADE ON UPDATE CASCADE CREATE INDEX IX_agggghhhhhh ON dbo.Child(ParentID) INSERT dbo.Child SELECT TOP 1100000 'bah!', 42 --repro also works with random numbers for FK column FROM master..spt_values a CROSS JOIN master..spt_values b GO ----------------------------------------------------------------- --Just need estimated plan here. Does it scan on Child? DELETE dbo.Parent WHERE ID = -1 --any value here too low or too high will cause a scan, often including values that do exist in Parent!
The short of it is, with any value too far outside of the histogram of the parent table, the new cardinality estimator assumes that EVERY SINGLE ROW of the child table will get deleted.
This can be seen even when a seek is forced through a USE PLAN hint.
Interestingly enough, running the new cardinality estimator in 2014 compatibility level with Query Optimizer Fixes off will also produce a seek plan. I found a few other ways to get the good plan, and stuck them in a bonus script.
DELETE dbo.Parent WHERE ID = -1 OPTION( USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION') ) DELETE dbo.Parent WHERE ID = -1 OPTION( USE HINT('ASSUME_JOIN_PREDICATE_DEPENDS_ON_FILTERS') ) --legacy CE with base containment (TF2301) still seeks though --Repro is dependent on statistics SELECT * FROM sys.dm_db_stats_properties_internal(OBJECT_ID('dbo.Parent'),1) SELECT * FROM sys.dm_db_stats_histogram(OBJECT_ID('dbo.Parent'),1) SELECT * FROM sys.dm_db_stats_histogram(OBJECT_ID('dbo.Child'),2) UPDATE STATISTICS dbo.Parent WITH SAMPLE 5 PERCENT --allows problem UPDATE STATISTICS dbo.Parent WITH FULLSCAN --changes histogram and prevents problem UPDATE STATISTICS dbo.Parent WITH SAMPLE 99 PERCENT --still allows problem
There’s something I don’t understand happening with the cardinality estimator here (as is usual). It seems when the value being deleted from the Parent table is within its histogram, the optimizer uses stats from the Child. Otherwise, it assumes the entire Child table matches. Ouch.
Remember how my demo only has two tables? The real scenario I encountered was a cascading delete hitting forty tables, and, you guessed it, scanning each one. Block party!
Forcing plans with the help of Query Store only goes so far, especially when it’s playing whack-a-mole with ORM-generated SQL. The best solution for now might be to retreat to 2012 compatibility and complain on the internet. Bah!