Troubleshooting Under-Indexing for Performance

by Feb 9, 2021

Introduction

All too often, people treat a good index as a silver bullet in terms of SQL Server performance. It is well-known that we will encounter performance issues when we inadequately index our environment. However, we can also expect to encounter problems if we have more indexes than we need or if these indexes do not perform efficiently. This blog post will primarily address under-indexing, while the next will address the issue of over-indexing.

I have seen many different index cases during my more than ten years as an SQL Server Performance Tuning and Optimization expert. The bottom line is that a single quality index can often make a slow-performing query run as though on adrenaline. Today we are going to discuss the issue of under-indexing for SQL Server Performance.

What is Under-Indexing?

As we have learned in the first part of this series, a table can have a single clustered index. It is a widespread practice to use a primary key for a relational database. By default, SQL Server will automatically create a clustered index upon the creation of a primary key. Of course, it is possible to create a primary key without a clustered index, but this is not common practice.

I often see tables with a clustered index in the real world because they have a primary key but no other index. A table can benefit from non-clustered indexes and columnstore indexes – yet people often fail to create these due to a lack of awareness.

I commonly see a single column index created on virtually every column of the table. While this increases the count of indexes in general, there are often too few indexes that are most helpful to most queries. In truth, I view this as a case of both over-indexing and under-indexing. The table is over-indexed because it has many poorly-performing indexes. It is under-indexed because too few helpful indexes are available to aid query performance.

How do I Detect Under-Indexing?

This question is a widespread one that I often receive from clients. It is challenging to figure out the case of under-indexing. In the next blog, we look at a five-step process to figure out over-indexing and then fix this. The same method will not work directly for under-indexing because we need the new (not yet created) indexes to compare with the existing workload.

So this brings us to a fascinating point about moving ahead with performance tuning if we can’t determine the need for more indexes. However, there is an experiment I often suggest to my clients, and the results provide a fantastic starting point. Because under-indexing and over-indexing are so closely related, I am providing the five-step over-indexing troubleshooting method below, as well as in the dedicated over-indexing blog post.

There is a five-step process in which we check the workload on two different tables. I always suggest this method to my clients, and it generally yields excellent results.

Let me describe the process in its various steps here:

Step 1: Identify the top three (3) or top five (5) tables from your database that you consider most important for your business.

Step 2: Now capture your generic workload (from the production server) either by extended event (preferred) or profiler (not preferred). The purpose here is to figure out what insert, update, delete, or select operations are running on your table.

Step 3: Now, in your development environment, create two identical tables for the table you selected for this experiment. The first table will contain all the indexes which you have on your production server. The second table will only contain clustered indexes if your original table contains clustered indexes. Otherwise, the second table will contain no indexes.

Step 4: First, you will run your workload on the original table schemas (with all the indexes) and measure the experiment’s completion time. Call this “Time A.”

Step 5: Now run the same workload on the second set of tables with only the clustered index or just a heap. Here, measure the full time of the experiment. Call this “Time B.”

Now you will have to compare both times and decide which is most optimal for your SQL Server instance. If you find that Time A is much better than Time B, you are in better shape overall, and you may not be a victim of over-indexing. However, if you notice that Time B is far better than Time A, you may now need to delve deeper into your indexes and hunt down the culprits.

How do I Fix Under-Indexing?

Before going to the next section of how to fix under indexing, you should have already addressed the over-indexing scenario for your database. To address this problem, follow the over-indexing troubleshooting steps (above or in my next article) and fix this problem if you have it. If you do not first resolve any over-indexing, your search for under-indexing may yield false-positive results.

SQL Server engine tracks all the indexes utilization since the last restart of the SQL Server. That means if you have not restarted your SQL Server for a while, you get better suggestions for your database.

There is a five-step process where we check the workload on two different tables. I always suggest this method to my clients, and so far, it has worked out for most of my clients.

Let me describe the process in various steps here:

Step 0: Check for over-indexing as set out above, and in the next article. If necessary, follow the process to fix over-indexing set out in my next blog post.

Step 1: Identify the top 3-5 tables from your database, which you consider most important for your business.

Step 2: Now capture your generic workload (from production server) either by extended event (preferred) or profiler (not preferred). Essentially, you will try to figure out what all insert, update, delete or select is happening on your table.

Step 3: Nowin your development environment create an identical table for each of the table you selected for this experiment in step 1. The table will contain all the indexes which you have on your production server.

Step 4: First, you will run your workload on the original table schemas (with all the indexes) and measure the experiment’s completion time. Call this time as Time A.

Step 5: Now is the time when you go and download the missing index script. Once you download the missing index script, run that against your database. The script may give you no result if you are running it for the master database, so you must change the context of your database to your user database.

Step 6: Next step is list top 1 (yes, only one) index for the tables you have identified in step 1 of this process. There will be an index creation script for it is the result of the missing index script. Create the top 1 index identified for your table in the database.

Step 7: Now run the same workload on which you ran in step 4 on the tables with new indexes. Over here measure the complete time of the experiment. Call this time as Time B.

Now is when you will have to compare both the times and decide which is most optimal for your SQL Server instance. If you find your Time A to be much better than Time B, you are in better shape overall and may not be a victim of under-indexing. However, if you notice that your Time B is far better than Time A, you have started on the right foot and created a good index for your query.

Repeat from Step 1-7: Once you complete step 7, after one week, you can repeat from Step 1-7 to see if you can find a more efficient index for your SQL Server database. It is good to repeat this process once a month to check if there are new index suggestions based on your changing workload.

Summary

In most cases, indexes are very helpful for query performance. However, there are scenarios where they can negatively impact on performance. We should always be vigilant with our indexes to avoid becoming victims of under-or-over-indexing within our SQL Server database. In the next episode of this series, we will discuss a real-world scenario of detecting and troubleshooting over-indexing.