Rx for Demystifying Index Tuning Decisions

Nov 30

Review

In Parts 1 through 5 of this blog series the author discussed various aspects of SQL Server indices such as:

Types of indices and their variations, e.g., clustered, nonclustered, filtered, and covering
Physical versus logical index access methods, e.g., seeks, scans, key/RID lookups, singleton lookups, and range scans
Why index tuning should begin with queries
Determining an Appropriate Strategy for Index Tuning
Determining Queries that Need Indices the Most
Capturing query performance information using Extended Events
Using query plans to determine relationships between queries and indices
Query Optimizer
Statistics and how they affect index usage and query performance
Consequences of too many indices
Overlapping Indices

Part 6 contains an overview of dynamic management functions as well as a detailed discussion of the sys.dm_db_index_usage_stats DMV (and how to use it).

Dynamic Management Views and Functions

Dynamic Management Views and Functions are “Designed to give you a window into what’s going on inside SQL Server.” There are two types: DMV – Pure view, i.e., no parameters required and DMF – Table-valued function, i.e., parameters required. The parameters usually specify database, table, index, partition, etc. They provide a significant amount of information regarding the system, databases, performance, and internal workings of SQL Server. Most DMVs and DMFs are simple to use, but one must reconcile the numeric IDs with static views that contain textual names.

Most values are accumulated from last SQL Server instance restart. To determine interval-specific values one must calculate differences between individual sample records. However, one must be certain to difference records with same database ID, object ID, index ID, and possibly, partition ID. This data is perfect for periodic or intermittent sampling because no data is lost during the sampling process. Note: sys.dm_db_index_operational_stats may be the exception if a table is used intermittently because when tables haven’t been used for a period of time they are removed from the cache and their metrics are cleared. Capture rates can range from every 30 seconds to a few times per day; the need for granular analysis normally dictates the collection frequency.

Although several methods for collecting data exist, two general usage scenarios are instructive.

Method 1

Capture a snapshot before monitored activities begin and store the results in a SQL table or a spreadsheet
Execute the workload (whether natural production or test)
Capture a snapshot again and compare value differences
Load into spreadsheets for further analysis, if necessary

Method 2

Capture a snapshot every <n> minutes and store results in a flat file
After the workload and capture processes are complete, load data into SQL tables
Use SQL Server to compare incremental value differences and store results
Extract interval data to spreadsheets for further analysis

Several general and static views are required for converting numeric DMV and DMF identifiers into understandable text

sys.databases

Lists all databases and their IDs so proper associations can be made

sys.partitions

Only way to decode HOBT (Heap or Binary Tree) IDs returned by lock-specific information, e.g., blocked process records and sys.dm_os_waiting_tasks

sys.configurations

Provides information regarding OS and SQL Server configurations

Several database-specific views are needed to convert various database-specific IDs into understandable text. Each of these must be interrogated for each database separately.

sys.objects

Lists all database objects such as tables, views, stored procedures, etc.

sys.indexes

Lists all indices and their associated table IDs

Does not provide row counts as sysindexes does

sys.filegroups

Lists all file groups and their IDs

sys.database_files

Lists all physical database files and their IDs

sys.schemas

Lists all database schemas

Index Related Dynamic Management Views and Functions

Two DMV/DMFs are used to obtain index-related performance information and they are BOTH needed to obtain an accurate and comprehensive perspective regarding index usage. The differences between physical and logical index access metrics were discussed earlier in this series. sys.dm_db_index_usage_stats returns information regarding the query code usage of tables and indices, e.g., inserts, updates, deletes, random accesses, and sequential accesses. These metrics most closely match Query Plan operators, and the information is retained much longer than operational (physical) stats, so it is quite useful for long-term evaluation of index usage, i.e., whether indices are used a great deal or not at all. sys.dm_db_index_operational_stats “returns current low-level I/O, locking, latching, and access method activity for each partition of a table or index in the database.” These metrics track actual index and table operations, e.g., singleton lookups and range scans. Information for certain structures may be deallocated when table no longer in cache, so this is not a reliable source of long-term index usage.

sys.dm_db_index_usage_stats DMV

The sys.dm_db_index_usage_stats DMV requires no input parameters and returns one row per index, the number of seeks, scans, lookups, and updates for user and system queries. As discussed earlier in this series, seeks are random or pseudo-random (single record or range of records as long as a key is used). Scans are fully sequential. The time of the last seek, scan, lookup, and update for user queries is also provided by this function. The metrics are divided into two categories, system and user. The System category is comprised of work generated by maintenance activities, e.g., statistics updates. The User category is comprised of insert, update, delete, and select operations. The metrics report user statements, not record counts. Therefore, one insert statement can result in millions of rows being inserted, so the metric will show one, not millions. This DMV helps determine index and table usage patterns and is particularly useful for identifying indices that are seldom, if ever, used by queries, especially if SQL Server instance has been running for a very long time, e.g., months. As cited previously, these metrics are much more trustworthy for long-term evaluations than operational stats because of possible cache removal issues. Table 8 illustrates the type of information that is available from this function. Each of the highlighted indices was usually accessed sequentially, i.e., a full scan. Since one of the tables contained 71 million rows and the other contained almost 34 million rows, these statistics clearly demonstrate tuning opportunities. Reminder: random accesses can be either single record lookups or a filtered range scan. Table 10 shows data from an actual customer’s SQL Server instance that had been up for 271 days, i.e., approximately nine months. This view highlights the indices that are fully scanned and those that are hardly accessed. The highlighted row in Table 9 shows an index that is updated, but never used. It is easy to become distracted by all the update activity and lose sight of the fact that the index had not been used for inquiry purposes in 271 days.

Table 8: Index Usage Stats Summary

Table 9: Index Usage To-Date Summary (271 Days)

The following code uses sys.dm_db_index_usage_stats to list rarely-used indices for a specific database with resolved names and its output appears in Table 10.

declare @dbid int = db_id()

select objectname=object_name(inxusage.object_id), inxusage.object_id, indexname=sysinx.name, sysinx.index_id, user_seeks, user_scans, user_lookups, user_updates

from sys.dm_db_index_usage_stats inxusage,

sys.indexes sysinx

where database_id = @dbid and objectproperty(inxusage.object_id,‘IsUserTable’) = 1 and sysinx.object_id = inxusage.object_id and

sysinx.index_id = inxusage.index_id

order by (user_seeks + user_scans + user_lookups + user_updates) asc

Table 10: Example – Unused Tbl1 Indices

Table 11 illustrates the kind of overall summary report that this view when combined with static index information and the missing index DMVs that will be discussed later can generate. This table shows clearly which SQL Server tables have several indices, how many are duplicated or unused, and how many others were suggested by SQL Server. The row counts are often invaluable for this kind of overall analysis, which can direct an index tuning study.

Table 11: Index/Table Overall Summary

The next article in this blog series will cover the sys.dm_db_index_operational_stats DMF in detail. Until then…good luck with your tuning efforts and please contact us with any questions.

For more information about blog posts, concepts and definitions, further explanations, or questions you may have…please contact us at SQLRx@sqlrx.com. We will be happy to help! Leave a comment and feel free to track back to us. Visit us at www.sqlrx.com!