Early in Covid, even before the mass awareness set in, I remember reading some blog posts from the rationalist community. They fundamentally said two things: the disease and the math are real, and, if you rely on social affirmation, this is your permission to take it seriously.
I would like to say there is a general awareness of how much AI has altered the day-to-day work of software engineering, but given how many people I personally still introduce to Claude Code, I’m not sure how true that is. Let this post be my contribution to saying AI is real, and you have permission to take it seriously.
For production systems, I am now actually using AI query tuning. It’s here. My current setup is not fully autonomous, but it provides maybe a 5-10X speedup.
There are still genuine questions, but not whether AI is useful. Anyone still using the phrase “stochastic parrots” is describing themselves more than the AI they critique. I would put AI tuning at roughly the level of a mid-level DBA, although AI capabilities don’t map cleanly onto humans. In fact, that’s important enough to emphasize again: AI abilities are weird and do not compare evenly to human abilities.
Assuming you would like to know more:
For me, the central idea of building an AI query tuner comes from Karpathy: that which can be verified can be automated.
The linchpin of my setup is a copy of prod on a decent-sized server (i.e. a performance-testing environment). If you don’t give the AI somewhere to test, it can come up with ideas but can’t shoot them down, and it will waste your time instead of its own.
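To make “that which can be verified can be automated” concrete, here’s a minimal sketch of the accept/reject loop. The `run_query` callable is hypothetical — it stands in for however you execute SQL against the perf copy — and the run counts and thresholds are illustrative, not my actual numbers:

```python
import statistics
import time
from typing import Callable

def is_improvement(
    run_query: Callable[[str], None],  # executes SQL against the perf copy (hypothetical hook)
    baseline_sql: str,
    candidate_sql: str,
    runs: int = 5,
    min_speedup: float = 1.2,
) -> bool:
    """Accept a candidate rewrite only if the perf environment proves it faster."""
    def median_seconds(sql: str) -> float:
        timings = []
        for _ in range(runs):
            start = time.perf_counter()
            run_query(sql)
            timings.append(time.perf_counter() - start)
        # Median, not mean: one cold-cache outlier shouldn't decide the verdict.
        return statistics.median(timings)

    return median_seconds(baseline_sql) / median_seconds(candidate_sql) >= min_speedup
```

The point isn’t the timing code — it’s that the AI’s ideas die here, on the perf server, instead of in production.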
It is of course possible to test directly in production, as I have no doubt some of you will do after reading this. I would just like to say, most emphatically, that that level of AI is not here yet. If you wish to --dangerously-skip-permissions while using a Windows login with godmode access on the money-making foundation of your corporate paycheck…
…
…oof, let’s just say “don’t” and allow your human imagination to generate the epitaph.
Secondary ideas
Getting this to production does require more than a perf server, and I’m happy to share lessons learned with readers, human or otherwise.
Write a query plan minimizer
AI can work directly on raw XML query plans, but it doesn’t do as well, because XML query plans have a serious case of chonk. Sure, you might be able to fit War and Peace (~3MB), err, I mean your 200-node query plan XML, directly into the context window nowadays, but as we DBAs know, a limit is not a goal. AI performs worse when an excessive amount of context is used, including when most of it is angle brackets. </chonk>
Thankfully, it’s pretty easy to have AI write a minimizer that keeps only the limited, useful information from a query plan while shedding the rest. My own manages maybe a 70% token reduction. Cheaper, faster, better. I also recommend moving some logic into it, like calculating per-node timings.
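A toy version of such a minimizer might look like this. The element and attribute names come from the showplan XML schema; exactly which fields are worth keeping is a judgment call, and this sketch keeps far less than the real thing would:

```python
import xml.etree.ElementTree as ET

# SQL Server showplan XML lives in this namespace.
NS = {"sp": "http://schemas.microsoft.com/sqlserver/2004/07/showplan"}

def minimize_plan(showplan_xml: str) -> list[dict]:
    """Flatten a showplan into one small dict per operator, shedding the rest.
    Cost percentages are deliberately not carried over (see 'negative attractors')."""
    root = ET.fromstring(showplan_xml)
    ops = []
    for relop in root.iter(f"{{{NS['sp']}}}RelOp"):
        op = {
            "node": int(relop.get("NodeId")),
            "op": relop.get("PhysicalOp"),
            "est_rows": float(relop.get("EstimateRows", 0)),
        }
        # Aggregate runtime counters across threads, when the plan has them.
        actual_rows = 0.0
        elapsed_ms = 0
        for counters in relop.findall("sp:RunTimeInformation/sp:RunTimeCountersPerThread", NS):
            actual_rows += float(counters.get("ActualRows", 0))
            elapsed_ms = max(elapsed_ms, int(counters.get("ActualElapsedms", 0)))
        op["actual_rows"] = actual_rows
        op["elapsed_ms"] = elapsed_ms
        ops.append(op)
    return ops
```

A plain list of dicts serializes to a fraction of the original XML’s tokens, and the AI no longer has to wade through angle brackets to find the one slow operator.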
Use scripts
Generalizing from the above: if something is deterministic and common, you should have a script for it that the AI can use, instead of leaving it to chance and burning tokens. Good news: AI can write that script for you.
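As one toy example (hypothetical, not my actual tooling): row-estimate skew is pure arithmetic, so there’s no reason to spend tokens having the AI eyeball it on every operator.

```python
def estimate_skew(est_rows: float, actual_rows: float) -> float:
    """Row-estimate skew as a symmetric ratio: 1.0 is a perfect estimate,
    10.0 means the optimizer was off by 10x in either direction."""
    if est_rows <= 0 or actual_rows <= 0:
        # A zero on one side only is maximal skew; zero on both sides is a match.
        return float("inf") if est_rows != actual_rows else 1.0
    return max(est_rows / actual_rows, actual_rows / est_rows)
```

Precomputing this per node and putting the number in the minimized plan means the AI reacts to the value rather than redoing (and occasionally botching) the division.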
Beware negative attractors
Certain ideas are so magnetic to the AI that it has trouble avoiding them. You might repeat “ignore X” a hundred times, but that just makes the AI focus on X even more. The two I encountered were Cost Percentage (which trips up low-skill humans too) and any time actual vs estimated rows differed by more than a smidge.
For Cost Percentage, I simply scrubbed that info from the minimal plan altogether. Nothing like hiding an object that gets misused. My toolkit for parenting my 11-month-old has a surprising amount of overlap with managing AI.
Estimated vs actual rows wasn’t something I wanted to remove, so I let the AI struggle with it. Sure, I’ve added plenty to the instructions that it often finds a way to ignore, but when it discovers by testing that its panic is misplaced, those instructions are enough to get it back on track.
Save bad plans
Something I’ve been doing for the past year, and am so glad I did: saving actual, real-world bad plans from incidents. I also automatically collect live query plans for any query using more than X cores of CPU, via sys.dm_exec_query_statistics_xml.
I manually analyzed 20 of them and kept my analyses, which let me benchmark AI performance and find feature gaps.
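A sketch of the collection side. The T-SQL shape is one reasonable guess, not gospel — sys.dm_exec_query_statistics_xml takes a session_id and returns the in-flight plan (it needs lightweight query profiling enabled) — and the file naming is purely illustrative:

```python
import pathlib
import time

# Snapshot live plans for expensive in-flight requests.
# The '?' placeholder is pyodbc-style; the threshold is supplied by the poller.
LIVE_PLAN_SQL = """
SELECT r.session_id, r.cpu_time, qs.query_plan
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_query_statistics_xml(r.session_id) AS qs
WHERE r.cpu_time > ?;
"""

def save_plans(rows, out_dir="bad_plans"):
    """Write each captured plan to disk, named by session, capture time, and CPU burned."""
    pathlib.Path(out_dir).mkdir(exist_ok=True)
    written = []
    for session_id, cpu_ms, plan_xml in rows:
        path = pathlib.Path(out_dir) / f"s{session_id}_{int(time.time())}_{cpu_ms}ms.sqlplan"
        path.write_text(plan_xml)
        written.append(path)
    return written
```

The .sqlplan extension means the saved files open straight in SSMS, which keeps the collection useful to humans as well as to the AI.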
Context stuffing
Saving bad plans and analyses had another benefit – I was able to take a bunch of small plans+analyses with real-world cases I especially cared about, and load them into the query analyzer agent’s context. One scenario I made sure to include was where the query doesn’t actually need to be tuned, but where the application usage needs to be questioned. Another was “I don’t know, there’s not enough info.”
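Mechanically, the stuffing can be as dumb as packing (plan, analysis) pairs into the prompt until a token budget runs out. A sketch, using a crude chars-per-token estimate and hypothetical tag names:

```python
def build_fewshot_context(examples, budget_tokens=20000, chars_per_token=4):
    """Pack saved (minimized plan, human analysis) pairs into a prompt prefix,
    smallest-first, until an approximate token budget is hit."""
    parts = []
    used = 0
    for plan, analysis in sorted(examples, key=lambda e: len(e[0]) + len(e[1])):
        chunk = f"<example>\nPLAN:\n{plan}\nANALYSIS:\n{analysis}\n</example>"
        cost = len(chunk) // chars_per_token  # rough estimate, not a real tokenizer
        if used + cost > budget_tokens:
            break
        parts.append(chunk)
        used += cost
    return "\n\n".join(parts)
```

Smallest-first is a choice, not a law — you could just as well rank by how much you care about each scenario, which is what the “doesn’t need tuning” and “not enough info” cases argue for.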
Persona tuning is mostly obsolete
Older AI used to respond really strongly to “You are an expert X,” but current models don’t nearly as much; the signal is even weaker than it was a year ago. Alas, when I tested “You are ___, a SQL Server expert and skilled at query tuning” with Brent vs Erik, they tied with each other and with the control group (I didn’t try Paul White because I didn’t want to translate from hex). Maybe they’ll let me train a model on all their blog posts for a rematch.
UI matters
Remember how much resistance there was to extended events when they were introduced, even though they were, across many dimensions, just plain better? The UI was craptastic, a slow and painful slog, and your reward was getting to read XML. That annoyance made adoption slow.
One of the biggest barriers to sharing my Query Tuning AI setup internally, even with other DBAs, is the same issue: using it is annoying, especially the setup. I do not have a fix yet.
Output
Some things are so fundamental to the investigative process that I actually include them in a standardized output. Mine is currently structured into four basic sections:
– Plan basics (type of plan, rows output, duration vs CPU)
– Mechanical explanation (where is time going, and why, at an operator level)
– Root issue (why did the plan end up in this shape)
– Recommendations (what should be investigated and tried next)
The structured output moderately improved accuracy, but it also helped devs understand the analysis when I handed it to them.
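If you want a starting point, the four sections map naturally onto a small data structure. The field names here are illustrative, not my exact schema:

```python
from dataclasses import dataclass, field

@dataclass
class TuningReport:
    """One analysis, in the four standardized sections."""
    plan_basics: str      # type of plan, rows output, duration vs CPU
    mechanics: str        # where the time goes at the operator level, and why
    root_issue: str       # why the plan ended up in this shape
    recommendations: list[str] = field(default_factory=list)  # what to try next

    def render(self) -> str:
        recs = "\n".join(f"- {r}" for r in self.recommendations)
        return (f"## Plan basics\n{self.plan_basics}\n\n"
                f"## Mechanical explanation\n{self.mechanics}\n\n"
                f"## Root issue\n{self.root_issue}\n\n"
                f"## Recommendations\n{recs}")
```

Having the AI fill a fixed schema, rather than free-form prose, is also what makes the accuracy benchmarking against the 20 hand-analyzed plans tractable.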
This breaks the skill-loop
One of my complaints about ORMs is that they produce low-complexity SQL just fine, but when they break (and they most certainly do), the devs who rely on them have no practice writing or fixing SQL. This will happen with AI. This WILL happen with AI.
It’s good enough to be a huge speedup for low- and mid-complexity work, but it guarantees skill atrophy while still failing often enough that a human is needed. The short-term benefits are so large that every manager will accept the tradeoff before you do. Good luck.
Other thoughts
Two observations are so common as to be almost trite, but I can’t help making them too. First, there’s still a lot of work to do. Getting faster at certain DBA work just means I hit the other bottlenecks sooner and spend my time addressing them. It also means I get to review far more queries, because the cost of tuning is so much lower.
Second, things are weird, and going to get weirder. This is a shakeup larger than the invention of the internet, even if AI weren’t going to continue improving. My level of uncertainty is off the charts. My day-to-day work looks nothing like it did 6 months ago, and I expect it to be quite different in another 6 months. I hope I still have a job in 5 years – I expect to in some form – but have to admit I don’t know.
Anyways, there was a common saying in EVE Online back when I played: adapt or die.




































