Yesterday we looked at how you could determine the cost of a single row stored in SQL Azure. Part of my focus for doing so was to help you understand that a bad database design can lead to higher costs. Creating unnecessarily wide tables will result in more storage which then results in a higher cost. You will be able to provide a rough idea of just how much that 349 column customer table is costing your company.
Yesterday we focused on the size of a row on a page. For indexes there is a size limit: 900 bytes is all you get (this is for both SQL Azure and for on-premise versions of SQL Server). At first glance you may be thinking to yourself “self, why would I care about something that would be so small? And that is a fair question, so let me explain something to you about database design that I have seen repeatedly in my career as a production DBA:
People tend to create WAY more indexes than they need.
Creating indexes to solve performance problems is like crack: the first one is free but the last one is the one that causes your heart to come to a stop.
How many times have you come across tables that are over-indexed? How about tables with an index for every column (you know, just in case we need them). I’ve seen it throughout my days as a production DBA and today I am here to help show you how to find those costs in SQL Azure.
The code we used yesterday gave us the starting point. We will take that piece of code and instead of pointing it at the sys.objects view we will instead point at the sys.indexes. We can also filter for just the non-clustered indexes, since every table in SQL Azure has a clustered index defined we already know the costs for them (we did that yesterday). This code will show me the size of my non-clustered indexes:
SELECT OBJECT_NAME(sys.indexes.object_id)AS [TableName], sys.indexes.name AS [IndexName], SUM(reserved_page_count) * 8.0 / 1024 as [SizeinMB] FROM sys.dm_db_partition_stats INNER JOIN sys.indexes ON sys.dm_db_partition_stats.object_id = sys.indexes.object_id AND sys.dm_db_partition_stats.index_id = sys.indexes.index_id WHERE type_desc = 'NONCLUSTERED' GROUP BY sys.indexes.object_id, sys.indexes.name ORDER BY sys.indexes.name
I can then take this information and break down the cost for my indexes, similar to what we did yesterday. I will use the same calculated monthly cost so if you need to know how I got that number you should go and read what I did yesterday. If I remove the filter for non-clustered indexes then we can see a summary of all objects:
SELECT OBJECT_NAME(sys.indexes.object_id)AS [TableName], sys.indexes.name AS [IndexName], SUM(reserved_page_count) * 8.0 / 1024 as [SizeinMB], row_count [Row Count], (CASE row_count WHEN 0 THEN 0 ELSE (sum(reserved_page_count) * 8192)/ row_count END) [Bytes Per Row], (CASE row_count WHEN 0 THEN 0 ELSE ((sum(reserved_page_count) * 8192)/ row_count) * (0.00000257749) END) [Monthly Cost Per Row] FROM sys.dm_db_partition_stats INNER JOIN sys.indexes ON sys.dm_db_partition_stats.object_id = sys.indexes.object_id AND sys.dm_db_partition_stats.index_id = sys.indexes.index_id GROUP BY sys.indexes.object_id, sys.indexes.name, row_count ORDER BY [Bytes Per Row] DESC
You could rewrite the code and sum up the amount of data based upon the tables in order to get an idea of the true cost for a table in SQL Azure. Or you could decided to break out the costs per day. You have a lot of options here, SQL Azure makes the information available for you to analyze.
The reason I wanted to show you all this? Mostly it has to do with helping to improve database design, same as yesterday. If you have ever come across a database that has WAY too many indexes you are going to be able to put a price tag on what those extra indexes would cost. Usually you aren’t able to attach a cost, and cost is also the biggest motivating factor for how and why things get done (or not get done).