The following is a guest post by Microsoft Certified Master Robert L Davis. Find more from him at SQLSoldier.com, and follow him on twitter @SQLSoldier.
I was talking recently with my friends and former colleagues at Idera about the things that slow down SQL Server. Specifically, the discussion was about things that people could change without a lot of trouble to make their SQL Server run faster. I like to use the phrase “quick performance wins” for these easy-to-fix issues that can generate considerable performance boost.
This conversation led to a white paper on 5 things that are making your SQL Servers slow. (You can pick that white paper up here!)
Earlier, I blogged about index hints that slow down queries (Why is Your SQL Server Slow? Here’s a Hint), a real production issue that I had encountered recently. I was consulting on a report of the infamous “general slowness” of a production server that was affecting critical processes to the point that they were in danger of missing their service level agreements (SLAs) for completing the processing of data. Another issue that I noted while investigating slowness was that they had left the parallelism settings at the default of zero for max degree of parallelism (maxDOP) and 5 for cost threshold for parallelism. This was allowing a few very large queries to consume too many resources on the server and overwhelm everything running on it.
There were an extremely high number of processes running on the server at the time of the investigation. A majority of the processes running were INSERT – VALUES ad hoc queries using Linq2SQL to send the queries as dynamic SQL. I’m not going to rant about Linq2SQL or dynamic queries today though. I’ll save that for another day. These processes weren’t a big problem today, but they were experiencing the downsides of the problem that was occurring.
In addition to the simple insert queries, there were several very large queries and processes running. Many of the queries in procedures had the maxDOP query hint added to them by the development team, but no attempts to throttle parallelism was made at the server level. Some of the big queries I saw were using a maxDOP hint of 4 and some were using a maxDOP hint of 8. All-in-all, this was not too horrible. Many of these queries were complex, though, and had 4 to 6 tasks running at the same time in the same query. For example, one session ID (SPID) was consuming 33 threads, 4 tasks each using 8 threads, plus 1 parent thread to monitor the parallel task.
The server had 20 logical CPUs in a single NUMA node. So that means that a query using 33 threads has threads overlapping each other and sharing the same CPU thread as other threads of the same query. When there are many of these queries running at the same time, it is very easy for the very large queries to affect performance of every query running on the server. Simple queries end up waiting for a CPU thread, because the small number of large queries was overwhelming the worker threads.
Add in to the mix here that they were also manually updating the statistics on several indexes (9 total update index jobs) every 15 minutes. This practice is another rant for another day, but they were updating statistics on the table that all of these processes were hitting, which repeatedly further backed up the processes.
My recommendations here were simple: set the parallelism settings to appropriate values. Microsoft has a KB article to help you determine a good starting point for the maxDOP setting here: Recommendations and guidelines for the “max degree of parallelism” configuration option in SQL Server. In short, for NUMA machines, never exceed the number of CPUs per NUMA node or 8, whichever is less. For non-NUMA machines, never exceed a maxDOP of 8. These are starting points and may need to be tuned to lower values if the workload dictates it.
Additionally, although they did not seem to have an issue with small queries being parallelized and consuming a lot of worker threads, I recommended increasing the Cost Threshold for Parallelism to 50. I make this recommendation for any system as a good starting point to ensure that small queries don’t get parallelized needlessly, and it saves the extra worker threads for the big queries that need it more.
When you have a server that is running with the default parallelism settings, it is very easy for large processes to over-parallelize and overwhelm the server, causing everything to seem to run slower. When you find a distressed server with lots of over-parallelized queries, setting the parallelism settings can increase performance of the other processes on the server and alleviate a lot of problems. It’s an easy fix that can generate a big boost to performance, making this a definite quick performance win.