Troubleshooting Over-Indexing for Performance

by Feb 16, 2021

Introduction

I have now warned several times against the mindset that suggests indexing is a silver bullet for SQL Server performance. In the previous blog post, we discussed under-indexing. It is well-known that we may have performance issues if we have not indexed correctly. However, we also often face performance issues when we have more indexes than we need or indexes doing more work than we need.

I have seen a wide range of troubleshooting and maintenance cases involving indexes during my more than ten years as an SQL Server Performance Tuning and Optimization expert. The bottom line is that a high-quality index can take an underperforming performing query on an inefficient database, and make the whole environment function like a well-oiled machine. Today we are going to discuss the issue of over-indexing for SQL Server Performance.

What is Over-Indexing?

As we have learned in the first part of this series, a table can have a single clustered index. It is a popular practice to have a primary key for a relational database. By default, SQL Server will automatically create a clustered index upon the creation of a primary key. Of course, it is possible to create a primary key without a clustered index, but this is not common practice.

I often see tables with a clustered index in the real world because they have a primary key but no other index. A table can benefit from non-clustered indexes and columnstore indexes – yet people often fail to create these due to a lack of awareness.

At many places, I commonly see a single column index created on pretty much every single column of the table. While this increases the count of indexes in general, many indexes are irrelevant to the majority of queries. This example is a classic case of over-indexing, which often includes an element of under (or at least inadequate and misdirected) indexing. The table is over-indexed because it has many under-performing indexes. Yet, the indexing functionality ultimately proves insufficient, because so few helpful indexes are available to help query performance.

How to Detect Over-Indexing

This one is a common question I often receive from my clients.

I strongly suggest that you consider this next section a starting point or guideline. If you believe this method will solve all of your problems, well, that may not happen.

There is a five-step process in which we check the workload on two different tables. I always suggest Let me describe the process in its various steps here:

Step 1: Identify the top three (3) or top five (5) tables from your database that you consider most important for your business.

Step 2: Now capture your generic workload (from the production server) either by extended event (preferred) or profiler (not preferred). The purpose here is to figure out what insert, update, delete, or select operations are running on your table.

Step 3: Now, in your development environment, create two identical tables for the table you selected for this experiment. The first table will contain all the indexes which you have on your production server. The second table will only contain clustered indexes if your original table contains clustered indexes. Otherwise, the second table will contain no indexes.

Step 4: First, you will run your workload on the original table schemas (with all the indexes) and measure the experiment’s completion time. Call this “Time A.”

Step 5: Now run the same workload on the second set of tables with only the clustered index or just a heap. Here, measure the full time of the experiment. Call this “Time B.”

Now you will have to compare both times and decide which is most optimal for your SQL Server instance. If you find that Time A is much better than Time B, you are in better shape overall, and you may not be a victim of over-indexing. If you notice that Time B is far better than Time A, you may now need to delve deeper into your indexes and fix the over-indexing problem.

How do I Fix Over-Indexing?

This question represents the most crucial conversation concerning over-indexing. Although it may seem like a very complex process, the reality is much less complicated. There are two different methods to handle over-indexing.

Method 1: Remove all unused indexes from the system. Now, this is a relatively vital process. The SQL Server engine tracks index utilization since the last restart of the SQL Server. That means if you have not restarted your SQL Server for a while, you get better suggestions for your database.

After your SQL Server has been running for around four (4) weeks, download the script from here and run it against your database to identify unused indexes. In the results, you can check the column UserSeek and UserScans in the table. If you discover that they have zero values, you may deem that index not used for four (4) weeks and set for deletion during the next maintenance cycle.

This method is the easiest way to remove indexes that are not in regular use. Most of my clients usually disable the unused non-clustered indexes and eventually drop them. Remember: you should not remove the clustered index from your table using the script discussed in this blog post.

Method 2: So you’ve already cleaned up all indexes with Method 1. If Time A is still better than Time B from our workload test, you should consider using this alternate method.

In this method, you need to start looking for indexes with similar definitions and key columns and start to consolidate them. When you consolidate the indexes, you should make sure that you pay special attention to the index’s first column. This first column primarily drives index selection for the query.

Along with removing the indexes, sometimes we have to create new indexes, which I will discuss in the next blog post. You can also use the script for identifying missing indexes and use that as a starting point for index consolidation, which I will also discuss in future blog posts.

Summary

It is very tempting to create new indexes frequently when they instantly perform. However, be very careful not to make too many indexes, since more indexes may negatively impact the SQL Server database’s performance. When dealing with indexes, we should be cautious with what we create and what we delete.

To sum up my advice on over-indexing and under-indexing, I offer you one line of advice: Always Be Vigilant!

In the next episode of this series, we will discuss a real-world scenario of Filtered Indexes.