Maybe you’ve seen the Microsoft recommendations on tempdb, that multiple datafiles help reduce contention. They don’t simply say 1 file per CPU though, but to stop at 8 and slowly increase from there. Of course, my curiosity was piqued. Why increase slowly and carefully? Can we run into problems with too many datafiles? How many is too many?
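(For context, it’s easy to see how many data files your own tempdb has before doing anything drastic; the count comes straight from tempdb.sys.database_files, the same view the script below leans on.)

--How many data files does tempdb currently have? ([type] = 0 means data files)
SELECT COUNT(*) AS tempdb_data_file_count
FROM tempdb.sys.database_files
WHERE [type] = 0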
Since there’s no official Microsoft guidance on breaking tempdb (geez, guys…), I looked to the wisdom of an ancient meme, and modified tempdb to have over 9000 datafiles.
That should be enough.
--Set space available in GB
DECLARE @TotalSizeGB DECIMAL(9,2) = 500
DECLARE @ExtraFileCount INT = 9000
DECLARE @SizeMB INT

--Convert to MB
SET @SizeMB = @TotalSizeGB * 1024

--Find space allowed per file
DECLARE @ExistingFileCount INT = (SELECT COUNT(*) FROM tempdb.sys.database_files WHERE [type] = 0)
DECLARE @SizePerFile INT = FLOOR(@SizeMB/(@ExtraFileCount+@ExistingFileCount))

--Resize existing files
DECLARE @resizescript NVARCHAR(MAX)
SELECT @resizescript = ISNULL(@resizescript,'')
    + 'DBCC SHRINKFILE('+[name]+','+CONVERT(VARCHAR(20),@SizePerFile)+')'+CHAR(13)
FROM tempdb.sys.database_files
WHERE [type] = 0

--sometimes it's hard to shrink tempdb files unless these are run first
PRINT 'USE tempdb
GO
CHECKPOINT
GO
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
DBCC FREESYSTEMCACHE(''All'')
GO'
PRINT @resizescript

--tempdb file location will have to be manually edited
DECLARE @i INT = 1
DECLARE @iText VARCHAR(20)

WHILE @i <= @ExtraFileCount
BEGIN
    SET @iText = CONVERT(VARCHAR(20),@i)
    PRINT 'ALTER DATABASE tempdb ADD FILE (NAME = ''HorribleIdea'+@iText+''', FILENAME = ''L:\tempdb\HorribleIdea'+@iText+'.ndf'', SIZE = '+CONVERT(VARCHAR(20),@SizePerFile)+'MB, FILEGROWTH = 0)
GO'
    SET @i += 1
END
40 minutes later, I’m ready to see if anything breaks.
The obvious starting point is creating a temp table, and sure enough, it’s slower than before, about 30X slower.
DROP TABLE IF EXISTS #DBZepisode

CREATE TABLE #DBZepisode (
    ID INT IDENTITY PRIMARY KEY,
    Content CHAR(8000)
)

INSERT #DBZepisode
SELECT TOP 100000 'Pointless dialogue'
FROM master..spt_values a
CROSS JOIN master..spt_values b
OPTION(MAXDOP 1)
Also interesting is that scans are really slow. Realllllllly slow.
--Picking a filter that won't return any rows
SELECT *
FROM #DBZepisode
WHERE Content = 'Brief combat scene'
OPTION(MAXDOP 1)
Even though there’s obviously a problem here, nothing shows up in the wait stats. (Below is a simple differential script I use).
------------Run at start--------------------------
DROP TABLE IF EXISTS #waits

SELECT *
INTO #waits
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN (
    'LOGMGR_QUEUE','CHECKPOINT_QUEUE','LAZYWRITER_SLEEP','FT_IFTS_SCHEDULER_IDLE_WAIT','XE_DISPATCHER_WAIT',
    'REQUEST_FOR_DEADLOCK_SEARCH','XE_TIMER_EVENT','HADR_FILESTREAM_IOMGR_IOCOMPLETION','DIRTY_PAGE_POLL','SQLTRACE_INCREMENTAL_FLUSH_SLEEP',
    'QDS_PERSIST_TASK_MAIN_LOOP_SLEEP','BROKER_TO_FLUSH','SLEEP_TASK','SP_SERVER_DIAGNOSTICS_SLEEP','BROKER_TASK_STOP','HADR_WORK_QUEUE','HADR_LOGCAPTURE_WAIT',
    'BROKER_TRANSMITTER','HADR_CLUSAPI_CALL','QDS_ASYNC_QUEUE','REDO_THREAD_PENDING_WORK','HADR_TIMER_TASK')

-------------Run at end----------------------------
SELECT ws.wait_type,
    ws.wait_time_ms - w.wait_time_ms AS wait_ms,
    ws.waiting_tasks_count - w.waiting_tasks_count AS wait_count
FROM sys.dm_os_wait_stats ws
JOIN #waits w ON ws.wait_type = w.wait_type
WHERE ws.waiting_tasks_count > w.waiting_tasks_count
ORDER BY wait_ms DESC
What’s SQL Server burning CPU on? To figure this out I turn to PerfView (ht Josh).
GetNextScanIAM? The engine appears to be spending most of its time traversing the IAM chain, which means it’s using an Allocation Order Scan. In short, SQL Server has more than one access method for reading a table. An allocation order scan is only available under certain conditions, which my query against a large local temp table apparently qualifies for: instead of following the leaf level page-to-page, the engine locates pages from the mapping pages (the IAMs). Under normal circumstances I find an allocation order scan to be faster, but here it chokes, because there are 5k+ IAM pages to walk.
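If you want to count the IAM pages yourself, one option is the undocumented sys.dm_db_database_page_allocations DMF (SQL Server 2012+). This is only a sketch, and since the DMF is undocumented, treat the column names and its behavior as best-effort.

--Roughly count the IAM pages behind the temp table (undocumented DMF, so consider this a sketch)
--DETAILED mode touches every page, so expect it to be slow on a table this size
SELECT COUNT(*) AS iam_page_count
FROM sys.dm_db_database_page_allocations(
         DB_ID('tempdb'),
         OBJECT_ID('tempdb..#DBZepisode'),
         NULL, NULL, 'DETAILED')
WHERE page_type_desc = 'IAM_PAGE'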
I want to test my hypothesis though. I know that allocation order scans don’t occur when there’s an ORDER BY, so what happens when I add that?
SELECT *
FROM #DBZepisode
WHERE Content = 'Brief combat scene'
ORDER BY ID
OPTION(MAXDOP 1)
Wow. From 22387ms of CPU down to 60ms.
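(The CPU numbers here are the kind of thing SET STATISTICS TIME reports. If you want to reproduce the before-and-after comparison yourself, something like this works; I’m assuming that’s how you’d want to measure it, and your numbers will obviously differ.)

--Compare CPU time for the allocation order scan vs. the ordered scan
SET STATISTICS TIME ON

SELECT * FROM #DBZepisode WHERE Content = 'Brief combat scene' OPTION(MAXDOP 1)
SELECT * FROM #DBZepisode WHERE Content = 'Brief combat scene' ORDER BY ID OPTION(MAXDOP 1)

SET STATISTICS TIME OFF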
The fun doesn’t stop here of course (by “fun” I mean the horrible performance problems you might expect from having thousands of datafiles).
It turns out MAX-length variables start having issues after a certain size, since they can be stored in tempdb. Again, the wait stats are silent, but PerfView makes me suspect these variables are written to tempdb as part of a LOB.
DECLARE @ReallyLongYell VARCHAR(MAX) = 'RAHHHH'

SELECT TOP 10000 @ReallyLongYell += REPLICATE('H',60)
FROM master..spt_values a
CROSS JOIN master..spt_values b
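One rough way to back up that suspicion is sys.dm_db_session_space_usage, which tracks tempdb page allocations per session. I’m assuming the variable’s LOB pages land in the internal objects bucket, so take this as a sketch rather than a confirmed diagnosis.

--Check this session's tempdb allocations after running the batch above
--(assumption: the variable's LOB pages count as internal objects)
SELECT session_id,
    internal_objects_alloc_page_count,
    user_objects_alloc_page_count
FROM sys.dm_db_session_space_usage
WHERE session_id = @@SPID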
The version store (used for optimistic isolation levels) lives in tempdb, so I thought I’d test it out. I crafted a decent workload, confirmed it added significantly to the version store, and then watched it have no problems whatsoever. In fact, it appears to be faster! Oh well… I did learn that the version store usually (at least in my test) doesn’t cause physical writes to tempdb, but just hangs out in memory until it’s cleaned up.
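(If you want to watch the version store during a test like this, the 'Version Store Size (KB)' performance counter is one simple option; this is just how I’d eyeball it, not necessarily the only way to measure it.)

--Current version store size, straight from the perf counters
SELECT counter_name,
    cntr_value AS version_store_kb
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Version Store Size (KB)'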
Spills get written to tempdb, but I didn’t see any performance issues there either (well, other than the fact that there’s a spill happening in the first place).
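For completeness, this is the kind of thing I mean by a spill: starve a big sort of memory and it writes to tempdb. Consider this a hypothetical repro using the MAX_GRANT_PERCENT hint (SQL Server 2016+, plus later 2012/2014 service packs), not the exact workload I ran.

--Force a sort spill to tempdb by capping the memory grant (hypothetical repro;
--the actual execution plan should show a sort spill warning)
SELECT *
FROM #DBZepisode
ORDER BY Content
OPTION(MAXDOP 1, MAX_GRANT_PERCENT = 1)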
Oh hey, remember how SQL Server rebuilds tempdb every time it restarts? And that I mentioned creating over 9000 files took 40 minutes? You can guess what happens next. Though the SQL Server service restarted within seconds, logins failed for a half hour while tempdb was rebuilding. Fun!
Even a developer would have recognized (probably) that adding thousands of datafiles is a Bad Idea. But this is a Hold My Beer demo, and it ended up teaching me a few things about SQL Server. I hope it helped you too!