Improving Performance with SQL Server Indexed Views

Improving Performance with SQL Server 2008 Indexed Views

SQL Server 2008
31 out of 54 rated this helpful – Rate this topic


SQL Server Technical Article


Writers: Eric Hanson, Susan Price

Contributors: Gail Erickson, Lubor Kollar, Jason Ward

Technical Reviewer: Susan Price


Published: October 2008

Applies to: SQL Server 2008


Summary: This document describes the indexed views capability of SQL Server 2005 and SQL Server 2008, including the new support for partition-aligned views added to SQL Server 2008. Indexed views are explained and specific scenarios in which they may provide performance improvements are discussed.

For many years, Microsoft SQL Server has supported the ability to create virtual tables known as views. Historically, these views served these main purposes:

  • To provide a security mechanism that restricts users to a certain subset of data in one or more base tables.
  • To provide a mechanism that allows developers to customize how users can logically view the data stored in base tables.

With SQL Server 2000, the functionality of SQL Server views was expanded to provide system performance benefits. It is possible to create a unique clustered index on a view, as well as nonclustered indexes, to improve data access performance on the most complex queries by precomputing and materializing the view. This is often particularly effective for aggregate views in decision support or data warehouse environments. In SQL Server, a view that has a unique clustered index is referred to as an indexed view.

The discussion in this paper applies to SQL Server 2005 and SQL Server 2008. In SQL Server 2005, certain operations on partitioned tables with indexed views required dropping the index and then re-creating the index on the view. In SQL Server 2008, the need to drop an indexed view on a partitioned table during common maintenance operations is greatly reduced, so indexed views are more easily maintained over large partitioned tables.

From the database management system (DBMS) perspective, a view is a description of the data (a form of metadata). When a typical view is created, the metadata is defined by encapsulating a SELECT statement that defines a result set to be represented as a virtual table. When a view is referenced in the FROM clause of another query, this metadata is retrieved from the system catalog and expanded in place of the view’s reference. After view expansion, the SQL Server query optimizer compiles a single execution plan for executing the query. The query optimizer searches though a set of possible execution plans for a query, and it chooses the lowest-cost plan it can find, based on estimates of the actual time it will take to execute each query plan.

In the case of a nonindexed view, the portions of the view necessary to solve the query are materialized at run time. Any computations such as joins or aggregations are done during query execution for each query referencing the view [Note1] . After a unique clustered index is created on the view, the view’s result set is materialized immediately and persisted in physical storage in the database, saving the overhead of performing this costly operation at execution time.

The indexed view can be used in a query execution in two ways. The query can reference the indexed view directly, or, more importantly, the query optimizer can select the view if it determines that the view can be substituted for some or all of the query in the lowest-cost query plan. In the second case, the indexed view is used instead of the underlying tables and their ordinary indexes. The view does not need to be referenced in the query for the query optimizer to use it during query execution. This allows existing applications to benefit from the newly created indexed views without changing those applications.

Note: Indexed views are a feature of all versions of SQL Server 2000 and SQL Server 2005. In the Developer and Enterprise editions of SQL Server 2000 and SQL Server 2005, the query processor can use an indexed view to solve queries that structurally match the view, even if they don’t refer to the view by name. In other versions, you must reference the view by name and use the NOEXPAND hint on the view reference to query the contents of an indexed view.

Using indexes to improve query performance is not a new concept; however, indexed views provide additional performance benefits that cannot be achieved using standard indexes. Indexed views can increase query performance in the following ways:

  • Aggregations can be precomputed and stored in the index to minimize expensive computations during query execution.
  • Tables can be prejoined and the resulting data set stored.
  • Combinations of joins or aggregations can be stored.

The graph demonstrates the typical performance increases that can be achieved when the query optimizer uses an indexed view. The represented queries varied in complexity (for example, the number of aggregate calculations, the number of tables used, or the number of predicates) and included large multi-million row tables from a real production environment.


Figure 1 Typical performance improvement with use of an indexed view

Secondary, nonclustered indexes on views can provide additional query performance. Similar to nonclustered indexes on tables, nonclustered indexes on views may provide more options for the query optimizer to choose from during the compilation process. For example, if the query includes columns not covered by the clustered index, the optimizer can choose one or more secondary indexes in the plan and avoid a time-consuming full scan of the indexed view or base tables.

Adding indexes to the schema increases the overhead on the database because the indexes will require ongoing maintenance. Careful consideration should be given to finding the right balance of indexes and maintenance overhead.

Analyze your database workload before implementing indexed views. Use your knowledge of the queries as well as various tools (for example, SQL Server Profiler) to identify the queries that can benefit from indexed views. Frequently occurring aggregations and joins are the best candidates for indexed views. Whether or not a query is asked frequently, it may be a candidate for an indexed view if it takes significant time to answer, and the value of getting the answer quickly is high. For example, some developers find it useful to create indexed views that precompute and store the answers to queries for reports run at the end of each month by senior executives.

Not all queries will benefit from indexed views. Similar to ordinary indexes, if the indexed views are not used, there is no benefit. In this case, not only are performance gains not realized, but the additional cost of disk space, maintenance, and optimization is incurred. However, when indexed views are used, they can provide significant improvements (by orders of magnitude) in data access. This is because the query optimizer uses the precomputed results stored in the indexed view, which substantially reduces the cost of the query execution.

The query optimizer considers indexed views only for queries with nontrivial cost. This avoids situations where trying to match various indexed views during the query optimization costs more than the savings achieved by the indexed view usage. Indexed views are rarely used in queries with a cost of less than 1.

Applications that benefit from the implementation of indexed views include:

  • Decision support workloads.
  • Data marts.
  • Data warehouses.
  • Online analytical processing (OLAP) stores and sources.
  • Data mining workloads.

From the query type and pattern point of view, the benefiting applications can be characterized as those containing:

  • Joins and aggregations of large tables.
  • Repeated patterns of queries.
  • Repeated aggregations on the same or overlapping sets of columns.
  • Repeated joins of the same tables on the same keys.
  • Combinations of the above.

On the contrary, online transaction processing (OLTP) systems with many writes, or database applications with frequent updates, may not be able to take advantage of indexed views because of the increased maintenance cost associated with updating both the view and underlying base tables.

In SQL Server Enterprise, the SQL Server query optimizer automatically determines when an indexed view can be used for a given query execution. The view does not need to be referenced directly in the query for the optimizer to use it in the query execution plan. Therefore, existing applications can take advantage of the indexed views without any changes to the application itself; only the indexed views have to be created.

The query optimizer considers several conditions to determine if an indexed view can cover the entire query or a portion of it. These conditions correspond to a single FROM clause in the query and consist of the following:

  • The tables in the query FROM clause must be a superset of the tables in the indexed view FROM clause.
  • The join conditions in the query must be a superset of the join conditions in the view.
  • The aggregate columns in the query must be derivable from a subset of the aggregate columns in the view.
  • All expressions in the query select list must be derivable from the view select list or from the tables not included in the view definition.
  • One predicate subsumes another if it matches a superset of the rows matched by the other. For example, “T.a=10” subsumes “T.a=10 and T.b=20.” Any predicate subsumes itself. The part of the predicate of the view that restricts values of one table must subsume the part of the predicate of the query that restricts the same table. Furthermore, it must do so in a way that SQL Server can verify.

All columns in the query search condition predicates that belong to tables in the view definition must appear in one or more of the following in the view definition:

  1. A GROUP BY list.
  2. The view select list if there is no GROUP BY.
  3. The same or equivalent predicate in the view definition.

Cases (1) and (2) allow SQL Server to apply a query predicate to rows from the view to further restrict the rows of the view. Number (3) is a special case where no filtering is needed on the column, so the column needn’t appear in the view.

If the query contains more than one FROM clause (subqueries, derived tables, UNION), the optimizer may select several indexed views to process the query, and apply them to different FROM clauses. [Note2]

Example queries demonstrating these conditions are presented at the end of this document. Allowing the query optimizer to determine which indexes, if any, to use in the query execution plan is the recommended best practice.

Using the NOEXPAND View Hint

When SQL Server processes queries that refer to views by name, the definitions of the views normally are expanded until they refer only to base tables. This process is called view expansion. It’s a form of macro expansion.

The NOEXPAND view hint forces the query optimizer to treat the view like an ordinary table with a clustered index. It prevents view expansion. The NOEXPAND hint can only be applied if the indexed view is referenced directly in the FROM clause. For example, the following statement includes the indexed view View1 in the FROM clause.

SELECT Column1, Column2, ... FROM Table1, View1 WITH (NOEXPAND) WHERE ...

Use NOEXPAND if you want to be sure to have SQL Server process a query by reading the view itself instead of reading data from the base tables. If for some reason SQL Server chooses a query plan that processes the query against base tables when you’d prefer that it use the view, consider using NOEXPAND. You must use NOEXPAND in all versions of SQL Server other than Developer and Enterprise editions to have SQL Server process a query against an indexed view directly. You can see a graphical representation of the plan SQL Server chooses for a statement using the SQL Server Management Studio tool Display Estimated Execution Plan feature. Alternatively, you can see different nongraphical representations using SHOWPLAN_ALL, SHOWPLAN_TEXT, or SHOWPLAN_XML. See SQL Server Books Online for a discussion of the different versions of SHOWPLAN.

When processing a query that refers to a view by name, SQL Server always expands the views, unless you add the NOEXPAND hint to the view reference. It attempts to match indexed views to the expanded query, unless you specify the EXPAND VIEWS query hint in an OPTION clause at the end of the query. For example, suppose there is an indexed view View1 in the database. In the following query, View1 is expanded based on its logical definition (its CREATE VIEW statement), and then the EXPAND VIEWS option prevents the indexed view for View1 from being used in the plan to solve the query.

SELECT Column1, Column2, ... FROM Table1, View1 WHERE ...

Use EXPAND VIEWS if you want to be sure to have SQL Server process a query by accessing data directly from the base tables referenced by the query, instead of possibly accessing indexed views. EXPAND views may in some cases help eliminate lock contention that could be experienced with an indexed view. Both NOEXPAND and EXPAND VIEWS can help you evaluate performance with and without use of indexed views when you test your application.

SQL Server 2005 introduced many improvements for indexed views. Starting with SQL Server 2005, the set of indexable views includes those based on:

  • Scalar aggregates, including SUM and COUNT_BIG without GROUP BY.
  • Scalar expressions and user-defined functions. For example, given a table T(a int, b int, c int) and a scalar user-defined function dbo.MyUDF(@x int), an indexed view defined on T can contain a computed column such as a+b or dbo.MyUDF(a).
  • Persisted imprecise columns. An imprecise column is one whose type is float or real, or a computed column that is derived from a float or real column. In SQL Server 2000, an imprecise column could be used in the select list of an indexed view if it was not part of the index key. An imprecise column could not be used elsewhere inside the view definition either, such as in the WHERE or FROM clauses. SQL Server allows an imprecise column to participate in the key or inside the view definition if the column is persisted in the base table.

Persisted columns include regular columns and computed columns marked PERSISTED. The fundamental reason that imprecise, nonpersisted columns can’t participate in indexes or indexed views is that it is necessary to be able to detach a database from one computer and attach it to another. After the move, all computed column values stored in indexes or indexed views must be derivable in exactly the same way on the new hardware as on the old hardware, down to the individual bit. Otherwise, these indexed views are logically corrupted with respect to the new hardware. Because of this corruption, on the new hardware, queries to the indexed views could return different answers depending on whether the plan used the indexed view or the base tables to derive the view data. Furthermore, the indexed views couldn’t be maintained correctly on the new computer.

Unfortunately, floating point hardware on different computers (even with the same processor architecture from the same manufacturer) does not always stay 100% the same from version to version of the processor. A firmware upgrade might mean that (a*b) on the new hardware is not equal to (a*b) on the old hardware, for some floating point values a and b. For example, the results might be very close, but differ in the least significant bit. Persisting the imprecise computed values before indexing them solves this detach/attach inconsistency problem since all expressions are evaluated on the same computer during database update and maintenance of indexes and indexed views.

  • Common Language Runtime (CLR) types. A major new feature of SQL Server 2005 is support for user-defined types (UDTs) and user-defined functions based on the CLR. Indexed views can be defined on CLR UDT columns, or expressions derived from those columns, provided that the columns or expressions are deterministic, and precise, persisted, or both. CLR user-defined aggregates cannot be used in an indexed view.

The capability of the optimizer to match queries to indexed views, and thus use them in query plans, includes:

  • New expression types in the SELECT list or condition of a query or view that involve:
    • Scalar expressions, such as (a+b)/2.
    • Scalar aggregates.
    • Scalar user-defined functions.
  • Interval subsumption. The optimizer can detect whether interval conditions in an indexed view definition cover, or subsume, interval conditions in a query. For example, the optimizer can determine that “a>10 and a<20” covers “a>12 and a<18.”
  • Expression equivalence. Certain expressions that can be shown to be the same even though they are syntactically different are treated the same. An example is that “a=b and c<>10” is equivalent to “10<>c and b=a.”

In addition, if there is a large number of indexed views in the database, compilation performance for tables on which the views are defined is typically significantly faster in SQL Server 2005 and later than in SQL Server 2000.

Partitioning large fact tables can decrease the time required for maintaining large fact tables. Using indexed views can greatly improve performance of aggregate queries over large fact tables. In SQL Server 2005, an indexed view created on a partitioned table must be dropped before switching a partition in or out of a partitioned table. The indexed view must then be re-created after the partition switch. If the partitioned table is large, re-creating the indexed view may consume considerable time.

SQL Server 2008 includes a new class of indexed views, partition-aligned indexed views, which can dramatically improve the usability of indexed views built on partitioned tables. A partition-aligned indexed view has the same characteristics as a regular indexed view except that the query processor automatically maintains the indexed view when a new partition of the table is switched in. Switching a partition in or out of a partitioned table does not require dropping and re-creating a partition-aligned indexed view.

The availability of partitioned-aligned indexed views substantially improves the manageability of indexed views. As part of your data warehouse load process, a partition of a partition-aligned indexed view can be prepared in a staging area together with a new partition of a partitioned table, and additional maintenance time is not required for re-creating the whole indexed view.

Partition-aligned indexed views are useful whenever you want to do both of the following:

  • Partition a table to simplify maintenance.
  • Create indexed views over the data in the partitioned tables to improve the performance of queries that access the indexed data.

Partition-aligned indexed views allow table partitioning and indexed views to work together smoothly and efficiently.

For example, suppose you want to store daily sales data, but most of your queries target the most recent day’s data. You want to use indexed views to improve performance of data access. You might partition the table by date and load the data for each day into a separate partition. You could switch data in the oldest partition out of the table, either moving the data into an archive table or discarding it, and switch new data into a new partition at the end of each day. You will not need to drop and rebuild your indexed views each day when you switch data in and out of the partitions.

In order for a view to be indexed and automatically maintained during partition switching, the base table and the view must share equivalent partitioning schemes. That means that if the base table and the view use different partition schemes, the two partition schemes must be based on the same, or equivalent, partition functions. For the two partition functions to be equivalent, the partition functions must have identical range clauses (both must specify either LEFT or RIGHT) and the values clauses must specify the same boundary points. More specifically, the view must meet the following requirements to qualify as a partition-aligned indexed view:

  • The view definition must qualify for a regular indexed view.
  • The partition functions for the base table and the indexed view must define the same number of partitions, must define the same boundary values for the partitions, and must use the same column as the argument to the partition function.
  • The projection list of the view definition must include the partitioning key (not a computation of the partitioning key) of the partitioned table.
  • If the view definition includes a grouping, the partitioning key must be one of the grouping columns in the view definition.
  • If the view refers to multiple tables, exactly one of the tables must be partitioned.

Additional details regarding designing and using partitions and the requirements for using partition-aligned indexed views are documented in SQL Server 2008 Books Online.

An illustration of how to use partition-aligned indexed views appears in the “Examples” section later in this paper.

Identifying an appropriate set of indexes for a database system can be complex. While there are numerous possibilities to consider when designing ordinary indexes, adding indexed views to the schema dramatically increases the complexity of the design and the potential results. For example, indexed views can be used on:

  • Any subset of tables referenced in the query.
  • Any subset of the conditions in the query for that subset of tables.
  • Grouping columns.
  • Aggregate functions (for example, SUM).

Indexes on tables and indexed views should be designed concurrently to obtain the best results from each construct. Because both indexes and indexed views may be useful for a given query, designing them separately can lead to redundant recommendations that incur high storage and maintenance overhead. While you tune the physical design of a database, tradeoffs must be made between the performance requirements of a diverse set of queries and updates that the database system must support. Therefore, identifying an appropriate physical design for indexed views is a challenging task, and the Database Tuning Advisor should be used wherever it is possible.

Query optimization cost can increase substantially if there are many indexed views that the query optimizer may consider for a particular query. A query optimizer may consider all indexed views that are defined on any subset of tables in the query. Each view has to be investigated for the potential substitution before it is rejected. This may take some time, especially if there are hundreds of such views for a given query.

A view must meet several requirements before you can create a unique clustered index on it. During the design phase, consider these requirements:

  • The view, and all tables referenced in the view, must be in the same database and have the same owner.
  • The indexed view does not need to contain all the tables referenced in the query to be used by the optimizer.
  • A unique clustered index must be created before any other indexes can be created on the view.
  • Certain SET options (discussed later in this document) must be set correctly when the base tables, view, and index are created, and whenever data in the base tables and view are modified. In addition, the query optimizer will not consider the indexed view unless these SET options are correct.
  • The view must be created using schema binding, and any user-defined functions referenced in the view must also be created with the SCHEMABINDING option.
  • Additional disk space will be required to hold the data defined by the indexed view.

Consider these guidelines when designing indexed views:

  • Design indexed views that can be used by several queries or multiple operations.
  • For example, an indexed view that contains the SUM of a column and the COUNT_BIG of a column can be used by queries that contain the functions SUM, COUNT, COUNT_BIG, or AVG. The queries will be faster because only a small number of rows from the view need to be retrieved rather than the full number of rows from the base tables and a portion of the computations required for performing the AVG function have already been done.
  • Keep the index key compact.
  • By using the fewest number of columns and bytes possible in the index key, access to the rows of the indexed view can be done more efficiently because the indexed view rows are narrower, and key comparisons are faster than with a wider key. Additionally, the clustered index key is used as a row locator in every nonclustered index defined on the indexed view. The cost of a larger index key increases in proportion to the number of nonclustered indexes on the view.
  • Consider the size of the resulting indexed view.
  • In the case of pure aggregation, the indexed view may not provide any significant performance gains if its size is similar to the size of the original table.
  • Design multiple smaller indexed views that accelerate parts of the process.
  • You may not be able to always design an indexed view that addresses the entire query. Should that occur, consider creating several indexed views, each performing a portion of the query.

Consider these examples:

  • A frequently executed query aggregates data in one database, aggregates data in another database, and then joins the results. Because an indexed view cannot reference tables from more than one database, you cannot design a single view to perform the entire process. However, you can create an indexed view in each database that does the aggregation for that database. If the optimizer can match the indexed views against existing queries, at least the aggregation processing will be faster, without the need to recode existing queries. Although the join processing is not faster, the overall query is faster because it uses the aggregations stored in the indexed views.
  • A frequently executed query aggregates data from several tables, and it then uses UNION to combine the results. UNION is not allowed in an indexed view. You can design views to perform each of the individual aggregation operations. The optimizer can then select the indexed views to speed up queries with no need to recode the queries. While the UNION processing is not improved, the individual aggregation processes are improved.

The Database Engine Tuning Advisor [Note3] is a SQL Server feature that helps database administrators tune their physical database design. Database Engine Tuning Advisor recommends indexed views in addition to recommending indexes on base tables, and it also recommends table and index partitioning strategies. Using Database Engine Tuning Advisor enhances an administrator’s ability to determine the combination of indexes, indexed views, and partitioning strategies that optimize the performance of the typical mix of queries executed against a database. Database Engine Tuning Advisor can recommend a wide variety of indexed views. These include ones that take advantage of the new features for indexed views for SQL Server 2005 described in the section “Indexed Views in SQL Server 2005”. Database Engine Tuning Advisor doesn’t eliminate the need for good judgment by the database administrator when designing physical storage structures. However, it can simplify the physical database design process. Database Engine Tuning Advisor operates in cooperation with the cost-based query optimizer by proposing a set of hypothetical index, indexed view, and partition structures. Database Engine Tuning Advisor uses the optimizer to estimate the cost of your workload with and without these structures, and it recommends structures that provide low overall cost.

Because the Database Engine Tuning Advisor forces all the required SET options (to ensure the result set is correct), its indexed view creation will succeed. However, your application may not be able to take advantage of the views if its option settings are not set as required. Inserts, updates, or deletes may fail on tables that participate in indexed view definitions if the SET options aren’t specified as required.

Indexed views are similarly maintained; however, if the view references several tables, updating any of them may require updating the indexed view. Unlike ordinary indexes, a single row insert into any of the participating tables may cause multiple row changes in an indexed view. This is because the single row may join with multiple rows of another table. The same is true for updates and deletes. Consequently, maintaining an indexed view may be more expensive than maintaining an index on the table. Conversely, maintaining an indexed view with a highly selective condition may be much less expensive than maintaining an index on a table, because most inserts, deletes, and updates to base tables the view references will not affect the view. These operations can be filtered out with respect to the indexed view without accessing other database data.

In SQL Server, some views can be updated. When a view is updatable, the underlying base tables are modified directly through the view using INSERT, UPDATE, and DELETE statements. Creating an index on a view does not prevent the view from being updatable. Updates to an indexed view really cause updates to the base table(s) underlying the view. These updates propagate back to the indexed view automatically as part of indexed view maintenance. For more information about updatable views, see “Modifying Data Through a View” in SQL Server 2008 Books Online.

Consider the following points when you plan indexed views:

  • Additional storage is required in the database for the indexed view. The result set of an indexed view is physically persisted in the database in a manner similar to that of typical table storage.
  • SQL Server maintains views automatically; therefore, any changes to a base table on which a view is defined may initiate one or more changes in the view indexes. Thus, additional maintenance overhead is incurred.

The net performance improvement achieved by a view is the difference of the total query execution savings offered by the view and the cost to store and maintain the view.

It is relatively easy to approximate the required storage the view will consume. Evaluate the SELECT statement encapsulated by the view definition with the SQL Server Management Studio execution plan option Display Estimated Execution Plan. This graphical display will yield an approximation of the number of rows returned by the query and the size of the row. By multiplying these two values together, it is possible to approximate the potential size of the view; however, this is only an approximation. The actual size of the index on the view can be accurately determined only by executing the query in the view definition, or by creating the index on the view.

From the standpoint of automated maintenance considerations performed by SQL Server, the Display Estimated Execution Plan option may give some insight on the impact of this overhead. If a statement that modifies the view (UPDATE on the view, INSERT into a base table) is evaluated with SQL Server Management Studio, an execution plan displayed for the statement will include the maintenance operation for that statement. Taking this cost into consideration along with an idea of how many times this operation will occur in the production environment may indicate the potential cost of view maintenance.

As a general recommendation, any modifications or updates to the view or the base tables underlying the view should be performed in batches if possible, rather than singleton operations. This may reduce some overhead in the view maintenance.

The steps required to create an indexed view are critical to the successful implementation of the view:

  1. Verify the setting of ANSI_NULLS is correct for all existing tables that will be referenced in the view.
  2. Verify ANSI_NULLS is set correctly for the current session as shown in the table in “Using SET Options to Obtain Consistent Results” before creating any new tables.
  3. Verify ANSI_NULLS and QUOTED_IDENTIFIER are set correctly for the current session as shown in the table in “Using SET Options to Obtain Consistent Results” before creating the view.
  4. Verify the view definition is deterministic.
  5. Create the view using the WITH SCHEMABINDING option.
  6. Verify your session’s SET options are set correctly as shown in the table in “Using SET Options to Obtain Consistent Results” before creating the unique clustered index on the view.
  7. Create the unique clustered index on the view.

Use the OBJECTPROPERTY function to check the value of ANSI_NULLS and QUOTED_IDENTIFIER on an existing table or view.

Evaluating the same expression can produce different results in SQL Server if different SET options are enabled for the current session when the query is executed. For example, after the SET option CONCAT_NULL_YIELDS_NULL is set to ON, the expression ‘abc’ + NULL returns the value NULL. But after CONCAT_NULL_YIEDS_NULL is set to OFF, the same expression produces ‘abc’. Indexed views require fixed values for several SET options for the current session and for objects referenced by the view to ensure that the views can be maintained correctly and return consistent results.

The SET options ANSI_NULLS and QUOTED_IDENTIFIER of the current session must both be set to ON at the time a view on which you wish to build an index is created. This is because these two options are stored with the view definition in the system catalogs.

The SET options of the current session must be set to the values shown in the Required value column for the current session whenever these operations occur:

  • An index is created on a view.
  • There is any INSERT, UPDATE, or DELETE operation performed on any table participating in the indexed view.
  • The indexed view is used by the query optimizer to produce the query plan.

The following table lists the SET options.

SET options

Required value Default server value .NET SqlClient, OLE DB and ODBC value

DB LIB value


The ARITHABORT option does have to be ON for the current session to create an indexed view, but it is implicitly ON in SQL Server once ANSI_WARNINGS is ON, so it does not need to be set explicitly. If you are using a .NET SqlClient, OLE DB, or ODBC server connection, you do not have to modify any SET options from their defaults to create, use, and maintain indexed views. All DB LIB values must be set correctly either at the server level using sp_configure or from the application using the SET command. For more information about SET options, see “Using Options in SQL Server” in SQL Server Books Online.

The definition of an indexed view must be deterministic. A view is deterministic if all expressions in the select list, as well as the WHERE and GROUP BY clauses, are deterministic. Deterministic expressions always return the same result any time they are evaluated with a specific set of input values. Only deterministic functions can participate in deterministic expressions. For example, the DATEADD function is deterministic because it always returns the same result for any given set of argument values for its three parameters. GETDATE is not deterministic because it is always invoked with the same argument, yet the value it returns changes each time it is executed. For more information, see “Deterministic and Nondeterministic Functions” in SQL Server Books Online.

Even if an expression is deterministic, if it contains float expressions, the exact result may depend on the processor architecture or version of microcode. To ensure data integrity in SQL Server when moving a database from one computer to another, such expressions can participate only as non-key columns of indexed views. Deterministic expressions that do not contain float expressions are called precise. Only deterministic expressions that are persisted and/or precise may participate in key columns and WHERE or GROUP BY clauses of indexed views. Persisted expressions are references to stored columns, including regular columns and computed columns marked PERSISTED.

Use the COLUMNPROPERTY function and IsDeterministic property to determine if a view column is deterministic. Use the COLUMNPROPERTY function and IsPrecise property to determine if a deterministic column in a view with SCHEMABINDING is precise. COLUMNPROPERTY returns 1 if the property is TRUE, 0 if FALSE, and NULL for invalid input. For example, in this script the SELECT returns 0 for IsPrecise because the b column is of type real.

CREATE TABLE T(a int, b real, c as getdate(), d as a+b)
SELECT object_id('VT'), COLUMNPROPERTY(object_id('VT'),'b','IsPrecise')

You can experiment with COLUMNPROPERTY to find out whether the other columns of T are deterministic and precise.

The set of views that can be indexed is a subset of the set of possible views. Any indexable view can exist with or without an index.

In addition to the requirements listed in the design guidelines, “Using Set Options to Obtain Consistent Results”, and “Using Deterministic Functions” in this white paper, the following requirements must be met before you can create a unique clustered index on a view.

Base tables referenced by the view must have the correct value of the SET option ANSI_NULLS set at the time the table is created. The OBJECTPROPERTY function can be used to check the value of ANSI_NULLS on an existing table.

User-defined functions referenced by the view must be created using the WITH SCHEMABINDING option.

The view must meet the following requirements:

  • The view must be created using the WITH SCHEMABINDING option.
  • Tables must be referenced by the view using two-part names (schemaname.tablename).
  • User-defined functions must be referenced by the view using two-part names (schemaname.functionname).
  • SET options ANSI_NULLS and QUOTED_IDENTIFIER must be set correctly.

To create an index on a view in SQL Server, the view definition must not contain any of the following:

Arithmetic on imprecise (float, real) values OPENXML
CONVERT producing an imprecise result OUTER join
COUNT(*) References to a base table with a disabled clustered index
GROUP BY ALL References to a table or function in a different database
Derived tables (subquery in FROM list) References to another view
Expressions on aggregate results (for example, SUM(x)+SUM(x)) STDEV, STDEVP, VAR, VARP, AVG
Imprecise constants (for example, 2.34e5) SUM on nullable expressions
Inline or table-valued functions Table hints (for example, NOLOCK)
MIN, MAX text, ntext, image, filestream, or xml columns
Nondeterministic expressions TOP
Non-Unicode collations UNION
Contradictions SQL Server can detect that mean the view would be empty (for example, where 0=1 and …)

Note: The indexed view may contain float and real columns; however, such columns cannot be included in the clustered index key if they are nonpersisted computed columns.

If GROUP BY is present, the VIEW definition:

  • Must contain COUNT_BIG(*).
  • Must not contain HAVING, CUBE, ROLLUP, or GROUPING().

These restrictions are applicable only to the indexed view definition. A query can use an indexed view in its execution plan even if it does not satisfy these GROUP BY restrictions.

The index must meet the following requirements:

  • The user executing the CREATE INDEX statement must be the view owner.
  • If the view definition contains a GROUP BY clause, the key of the unique clustered index can reference only the columns specified in the GROUP BY clause.
  • The index must not be created with the IGNORE_DUP_KEY option enabled.

The examples in this section illustrate the use of indexed views with two major groups of queries: aggregations and joins. They also demonstrate the conditions used by the query optimizer when determining if an indexed view is applicable. For information, including a complete list of conditions, see “How the Query Optimizer Uses Indexed Views.”

The queries are based on tables in AdventureWorks, the sample database provided in SQL Server 2005, and AdventureWorksDW2008. For SQL Server 2008, the sample databases AdventureWorks and AdventureWorksDW are available as separate downloads from CodePlex at The queries can be executed as written. You may want to use the Display Estimated Execution Plan execution plan option in SQL Server Management Studio to view the plans selected by the query optimizer before and after the views are created. Although the examples demonstrate how the optimizer chooses the lower cost execution plan, the sample databases are too small to show performance gains.

Before you begin working on these examples, make sure your session has the correct options set by running these commands.



The following queries show two methods to return the five products with the largest total discount from the Sales.SalesOrderDetail table.

Query 1

SELECT TOP 5 ProductID, Sum(UnitPrice*OrderQty) -
     Sum(UnitPrice*OrderQty*(1.00-UnitPriceDiscount)) AS Rebate
FROM Sales.SalesOrderDetail

Query 2

SUM(UnitPrice*OrderQty*UnitPriceDiscount) AS Rebate
FROM Sales.SalesOrderDetail

The execution plan selected by the query optimizer contains:

  • A Clustered Index Scan on the Sales.SalesOrderDetail table with a row estimate of 121,317 rows.
  • A Hash Match/Aggregate operator that puts the selected rows into a hash table based on the GROUP BY column and computes the SUM aggregation for each row.
  • A TOP 5 sort operator based on the ORDER BY clause.

View 1

Adding an indexed view that includes the aggregations required for the Rebate column will change the query execution plan for Query 1. On a large table (multi-million rows), the query’s performance would also improve significantly.

SELECT SUM(UnitPrice*OrderQty) AS SumPrice,
SUM(UnitPrice*OrderQty*(1.00-UnitPriceDiscount)) AS SumDiscountPrice,
COUNT_BIG(*) AS Count, ProductID
FROM Sales.SalesOrderDetail

The execution plan for the first query shows that the Vdiscount1 view is used by the optimizer. However, the view will not be used by the second query because it does not contain the SUM(UnitPrice*OrderQty*UnitPriceDiscount) aggregate. Another indexed view can be created that will address both queries.

View 2

SELECT SUM(UnitPrice*OrderQty)AS SumPrice,
SUM(UnitPrice*OrderQty*(1.00-UnitPriceDiscount))AS SumDiscountPrice,
SUM(UnitPrice*OrderQty*UnitPriceDiscount)AS SumDiscountPrice2,
COUNT_BIG(*) AS Count, ProductID
FROM Sales.SalesOrderDetail

With this indexed view, after dropping Vdiscount1, the query execution plan for both queries now contains:

  • A Clustered Index Scan on the Vdiscount2 view with a row estimate of 266 rows.
  • A TOP 5 Sort function based on the ORDER BY clause.

The query optimizer selected the view because it provided the lowest execution cost even though it was not referenced in the query.

Query 3

Query 3 is similar to the previous queries, but ProductID is replaced by the column SalesOrderID, which is not included in the view definition. This violates the condition that all expressions in the query select list on tables in the view definition must be derivable from the view select list in order to use the indexed view in the query plan.

SELECT TOP 3 SalesOrderID,
SUM(UnitPrice*OrderQty*UnitPriceDiscount) OrderRebate
FROM Sales.SalesOrderDetail

A separate indexed view would be required to address this query. Vdiscount2 could be modified to include SalesOrderID; however, the resulting view would contain as many rows as the original table and would not provide a performance improvement over using the base table.

Query 4

This query produces the average price for each product.

SELECT p.Name, od.ProductID,
AVG(od.UnitPrice*(1.00-od.UnitPriceDiscount)) AS AvgPrice,
SUM(od.OrderQty) AS Units
FROM Sales.SalesOrderDetail AS od, Production.Product AS p
WHERE od.ProductID=p.ProductID
GROUP BY p.Name, od.ProductID

Complex aggregates (for example, STDEV, VARIANCE, AVG) cannot be included in the definition of an indexed view. However, indexed views can be used to execute a query containing an AVG by including the simple aggregate functions that, when combined, perform the complex aggregation.


This indexed view contains the simple aggregate functions needed to perform an AVG function. When Query 4 is executed after the creation of View3, the execution plan shows the view being used. The optimizer can derive the AVG expression from the view’s simple aggregation columns Price and Count.

SELECT ProductID, SUM(UnitPrice*(1.00-UnitPriceDiscount)) AS Price,
COUNT_BIG(*) AS Count, SUM(OrderQty) AS Units
FROM Sales.SalesOrderDetail

Query 5

This query is the same as Query 4, but it includes one additional search condition. View3 will work for this query even though the additional search condition references only columns from a table not included in the view definition.

SELECT p.Name, od.ProductID,
AVG(od.UnitPrice*(1.00-UnitPriceDiscount)) AS AvgPrice,
SUM(od.OrderQty) AS Units
FROM Sales.SalesOrderDetail AS od, Production.Product AS p
WHERE od.ProductID=p.ProductID AND p.Name like '%Red%'
GROUP BY p.Name, od.ProductID

Query 6

The query optimizer cannot use View3 for this query. The added search condition od.UnitPrice>10 contains a column from the table in the view definition, but the column does not appear in the GROUP BY list, nor does the search predicate appear in the view definition.

SELECT p.Name, od.ProductID,
AVG(od.UnitPrice*(1.00-UnitPriceDiscount)) AS AvgPrice,
SUM(od.OrderQty) AS Units
FROM Sales.SalesOrderDetail AS od, Production.Product AS p
WHERE od.ProductID = p.ProductID AND od.UnitPrice > 10
GROUP BY p.Name, od.ProductID

Query 7

In contrast, the query optimizer can use View3 for Query 7 because the column defined in the new search condition od.ProductID between 0 and 995 is included in the GROUP BY clause in the view definition.

SELECT p.Name, od.ProductID,
AVG(od.UnitPrice*(1.00-UnitPriceDiscount)) AS AvgPrice,
SUM(od.OrderQty) AS Units
FROM Sales.SalesOrderDetail AS od, Production.Product AS p
WHERE od.ProductID = p.ProductID AND od.ProductID between 0 and 995
GROUP BY p.Name, od.ProductID


This view will satisfy the conditions for Query 6 by including the columns SumPrice and Count in the view definition to allow the AVG in the query to be computed.

SELECT p.Name, od.ProductID,
SUM(od.UnitPrice*(1.00-UnitPriceDiscount)) AS SumPrice,
SUM(od.OrderQty) AS Units, COUNT_BIG(*) AS Count
FROM Sales.SalesOrderDetail AS od, Production.Product AS p
WHERE od.ProductID = p.ProductID AND od.UnitPrice > 10
GROUP BY p.Name, od.ProductID
CREATE UNIQUE CLUSTERED INDEX VdiscountInd on View4 (Name, ProductID)

Query 8

The same index on View4 will also be used for a query where a join to the table Sales.SalesOrderHeader is added. This query meets the condition that the tables listed in the query FROM clause are a superset of the tables in the FROM clause of the indexed view.


SELECT p.Name, od.ProductID,
AVG(od.UnitPrice*(1.00-UnitPriceDiscount)) AS AvgPrice,
SUM(od.OrderQty) AS Units
FROM Sales.SalesOrderDetail AS od, Production.Product AS p,
     Sales.SalesOrderHeader AS o
WHERE od.ProductID = p.ProductID AND o.SalesOrderID = od.SalesOrderID
     AND od.UnitPrice > 10
GROUP BY p.Name, od.ProductId

The final two queries are modifications of Query 8. Each modification violates one of the optimizer conditions, and unlike Query 8, they cannot use View4.

Query 8a

Query 8a cannot use the indexed view because of the WHERE clause mismatch between UnitPrice > 10 in the view definition and UnitPrice > 25 in the query, and the fact that UnitPrice does not appear in the view. The query search condition predicate must be a superset of the search condition predicates in the view definition.

SELECT p.Name, od.ProductID, AVG(od.UnitPrice*(1.00-UnitPriceDiscount))
     AvgPrice, SUM(od.OrderQty) AS Units
FROM Sales.SalesOrderDetail AS od, Production.Product AS p,
     Sales.SalesOrderHeader AS o
WHERE od.ProductID = p.ProductID AND o.SalesOrderID = od.SalesOrderID
     AND od.UnitPrice > 25
GROUP BY p.Name, od.ProductID

Query 8b

Observe that the table Sales.SalesOrderHeader does not participate in the indexed view View4 definition. In spite of that, adding a predicate on this table will disallow using the indexed view because the added predicate may change or eliminate additional rows participating in the aggregates shown in Query 8b.

SELECT p.Name, od.ProductID, AVG(od.UnitPrice*(1.00-UnitPriceDiscount))
     AS AvgPrice, SUM(od.OrderQty) AS Units
FROM Sales.SalesOrderDetail AS od, Production.Product AS p,
     Sales.SalesOrderHeader AS o
WHERE od.ProductID = p.ProductID AND o.SalesOrderID = od.SalesOrderID
     AND od.UnitPrice > 10 AND o.OrderDate > '20040728'
GROUP BY p.Name, od.ProductID

View 4a

View4a extends View4 by including the UnitPrice column in the SELECT list and GROUP BY clause. Query 8a can use View4a because the UnitPrice values that are be greater than 10 can be filtered further to leave only those greater than 25. This is an example of interval subsumption.

SELECT p.Name, od.ProductID, od.UnitPrice,
SUM(od.UnitPrice*(1.00-UnitPriceDiscount)) AS SumPrice,
SUM(od.OrderQty) AS Units, COUNT_BIG(*) AS Count
FROM Sales.SalesOrderDetail AS od, Production.Product AS p
WHERE od.ProductID = p.ProductID AND od.UnitPrice > 10
GROUP BY p.Name, od.ProductID, od.UnitPrice
     ON View4a (Name, ProductID, UnitPrice)


View5 contains an expression in its select and GROUP BY lists. Notice that LineTotal is a computed column, so by itself it is an expression. This expression is in turn nested inside a call to the FLOOR function.

SELECT FLOOR(LineTotal) FloorTotal, COUNT_BIG(*) C
FROM Sales.SalesOrderDetail

Query 9

Query 9 contains the expression FLOOR(LineTotal) in its select and GROUP BY lists. With the extensions to view matching for expressions added in SQL Server 2005, this query uses the index on View5.

SELECT TOP 5 FLOOR(LineTotal), Count(*)
FROM Sales.SalesOrderDetail


View6 stores information about line items for the three days at the end of a month. This clusters together these rows on a small number of pages so that queries to Sales.SalesOrderDetail for these days can be satisfied quickly.

SELECT SalesOrderID, SalesOrderDetailID, CarrierTrackingNumber, OrderQty,
     ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount, rowguid,
FROM Sales.SalesOrderDetail
WHERE ModifiedDate IN ( convert(datetime, '2004-07-31', 120),
                        convert(datetime, '2004-07-30', 120),
                        convert(datetime, '2004-07-29', 120) )
     ON View6(SalesOrderID, SalesOrderDetailID)

Query 10

The following query can match View6 and the system can produce a plan that scans the VendJuly04Ind index on the view instead of scanning the entire Sales.SalesOrderDetail table. This also demonstrates expression equivalence (because the order of days is different in the query than in the view, and the data formats are different) and predicate subsumption (because the query asks for a subset of the results stored in the view).

SELECT h.*, SalesOrderDetailID, CarrierTrackingNumber, OrderQty,
   ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount, d.rowguid,
FROM Sales.SalesOrderHeader as h, Sales.SalesOrderDetail as d
WHERE (d.ModifiedDate = '20040729' OR d.ModifiedDate = '20040730')
and d.SalesOrderID=h.SalesOrderID


Developers also sometimes find it convenient to use indexed views to enforce specialized integrity constraints. For example, a constraint you can enforce with an indexed view is “Column a of table T is unique except there may be multiple 0 values in the column.” Indexed view View7 enforces this constraint. If you run the following script, it runs successfully until the final insert. That statement is disallowed because it adds a nonzero duplicate value.


USE tempdb
FROM dbo.T
WHERE a <> 0
-- legal:
INSERT INTO T VALUES(0) -- duplicate 0
-- disallowed:

Query 11

Queries 11 – 13 are based on a table in the sample database AdventureWorksDW2008. The following script creates a partitioned table containing a subset of data in the AdventureWorksDW2008FactResellerSales table and then creates a partition-aligned indexed view.


USE AdventureWorksDW2008;
-- create a partition function
AS RANGE LEFT FOR VALUES (20011231, 20021231, 20031231, 20041231);
-- create the partition scheme
-- create a fact table
CREATE TABLE dbo.FactSales (OrderDateKey INT NOT NULL, ProductKey INT, EmployeeKey INT, SalesAmount MONEY)
ON PS1(OrderDateKey)
INSERT INTO factSales (OrderDateKey, ProductKey, EmployeeKey, SalesAmount)
 SELECT OrderDateKey, ProductKey, EmployeeKey, SalesAmount
 FROM AdventureWorksDW2008.dbo.FactResellerSales
-- create a clustered index - note that it is partitioned using the partition scheme specified
CREATE CLUSTERED INDEX ciFactsales on dbo.factSales (OrderDateKey, ProductKey, EmployeeKey) ON PS1(OrderDateKey)
--create an indexed view
SELECT OrderDateKey, ProductKey, EmployeeKey, COUNT_BIG(*) AS cnt, SUM(ISNULL(SalesAmount,0)) AS SalesAmount
FROM dbo.FactSales
GROUP BY OrderDateKey, ProductKey, EmployeeKey
-- materialize the view
-- uses same partitioning scheme as the partitioned table
ON dbo. SalesByDateProdEmp (OrderDateKey, ProductKey, EmployeeKey) ON PS1(OrderDateKey)

Query 12

The following script switches the data in PARTITION 1 from the table created in Query 11 into a new archive table.

-- create an archive table to receive the partition that will be switched out of the partitioned table
-- in this example, the archive table is not partitioned
CREATE TABLE dbo.SalesArchive (OrderDateKey INT NOT NULL, ProductKey INT, EmployeeKey INT, SalesAmount MONEY)
-- create a clustered index to match that on the fact table
CREATE CLUSTERED INDEX ciSalesArchive on dbo.SalesArchive (OrderDateKey, ProductKey, EmployeeKey)
-- create an indexed view with view definition matching SalesByDateProdEmp
-- on table FactSales.
SELECT OrderDateKey, ProductKey, EmployeeKey, COUNT_BIG(*) AS cnt, SUM(ISNULL(SalesAmount,0)) AS SalesAmount
FROM dbo.SalesArchive
GROUP BY OrderDateKey, ProductKey, EmployeeKey
-- materialize the view
-- the indexed view on the nonpartitioned archive table is not partitioned.
ON dbo. SalesArchiveByDateProdEmp (OrderDateKey, ProductKey, EmployeeKey)
-- switch the data in the old partition of the source table [FactSales]
-- and the indexed view [SalesByDateProdEmp]
-- out to the archive table [SalesArchive]
-- and the indexed view [SalesArchiveByDateProdEmp].
ALTER TABLE dbo.FactSales SWITCH PARTITION 1 TO dbo.SalesArchive

Query 13

In this script, we switch new data into an empty partition in the table created in Query 11.


-- create a staging table with data to be switched into the target table
CREATE TABLE dbo.SalesStaging (OrderDateKey INT NOT NULL, ProductKey INT, EmployeeKey INT, SalesAmount MONEY)
ON PS1(OrderDateKey)
-- populate the table
INSERT INTO dbo.SalesStaging VALUES (20050801, 355, 282, 5626.32)
INSERT INTO dbo.SalesStaging VALUES (20050901, 344, 283, 5389.45)
INSERT INTO dbo.SalesStaging VALUES (20050501, 347, 283, 2034.23)
INSERT INTO dbo.SalesStaging VALUES (20050401, 345, 282, 11895.20)
INSERT INTO dbo.SalesStaging VALUES (20050201, 351, 283, 6798.54)
-- create a clustered index on the staging table using the same partitioning scheme
ON SalesStaging(OrderDateKey, ProductKey, EmployeeKey) ON PS1(OrderDateKey)
-- create an indexed view
SELECT OrderDateKey, ProductKey, EmployeeKey, COUNT_BIG(*) AS cnt, SUM(ISNULL(SalesAmount,0)) AS SalesAmount
FROM dbo.SalesStaging
GROUP BY OrderDateKey, ProductKey, EmployeeKey
-- materialize the view
ON SalesStagingByDateProdEmp (OrderDateKey, ProductKey, EmployeeKey) ON PS1(OrderDateKey)
-- switch in the data
ALTER TABLE SalesStaging switch PARTITION 5 TO FactSales PARTITION 5

Query 14

This example shows how to use the NOEXPAND hint to rewrite a query to use an indexed view with only a small syntax change compared to the original query referencing the fact table. Suppose you wrote the following query originally.

SELECT d.FiscalYear, d.FiscalQuarter, p.Color, SUM(f.SalesAmount) AS SalesAmount
FROM dbo.factSales AS f, DimDate d, DimProduct p
WHERE f.OrderDateKey = d.DateKey
AND f.ProductKey = p.ProductKey
AND d.FiscalYear IN (2002, 2003, 2004)
AND p.Color IN ('Red', 'Black', 'Silver')
GROUP BY d.FiscalYear, d.FiscalQuarter, p.Color
ORDER BY d.FiscalYear ASC, d.FiscalQuarter ASC, p.Color ASC;

You might find that indexed view matching didn’t take effect. The following slightly modified query forces use of the index on the view in the query plan, and produces a fast execution reliably.

SELECT d.FiscalYear, d.FiscalQuarter, p.Color, SUM(f.SalesAmount) AS SalesAmount
FROM dbo.SalesByDateProdEmp AS f WITH(NOEXPAND), DimDate d, DimProduct p
WHERE f.OrderDateKey = d.DateKey
AND f.ProductKey = p.ProductKey
AND d.FiscalYear IN (2002, 2003, 2004)
AND p.Color IN ('Red', 'Black', 'Silver')
GROUP BY d.FiscalYear, d.FiscalQuarter, p.Color
ORDER BY d.FiscalYear ASC, d.FiscalQuarter ASC, p.Color ASC

The book by Adamson identified in the reference list at the end of this paper is a useful reference about how to do manual query rewrites using summary aggregates, similar to the one above, to speed up data warehouse queries. In SQL Server you can use the methods described in the book effectively with indexed views and the NOEXPAND hint.

Q. Why are there restrictions on the kind of views I can create an index on?

A. To make sure that it is logically possible to incrementally maintain the view, to restrict the ability to create a view that would be expensive to maintain, and to limit the complexity of the SQL Server implementation. A large set of views is nondeterministic and context-dependent; their contents ‘change’ independently of DML operations. These can’t be indexed. Examples are any views that call GETDATE or SUSER_SNAME in their definition.

Q. Why does the first index on a view have to be CLUSTERED and UNIQUE?

A. It must be UNIQUE to allow easy lookup of records in the view by key value during indexed view maintenance, and to prevent creation of views with duplicates, which would require special logic to maintain. It must be clustered because only a clustered index can enforce uniqueness and store the rows at the same time.

Q. Why isn’t my indexed view being picked up by the query optimizer for use in the query plan?

A. There are three primary reasons the indexed view may not be being chosen by the optimizer:

  • You are using a version other than Enterprise or Developer edition of SQL Server. Only Enterprise and Developer editions support automatic query-to-indexed-view matching. Reference the indexed view by name and include the NOEXPAND hint to have the query processor use the indexed view in all other editions.
  • The cost of using the indexed view may exceed the cost of getting the data from the base tables, or the query is so simple that a query against the base tables is fast and easy to find. This often happens when the indexed view is defined on small tables. You can use the NOEXPAND hint if you want to force the query processor to use the indexed view. This may require you to rewrite your query if you don’t initially reference the view explicitly. You can get the actual cost of the query with NOEXPAND and compare it to the actual cost of the query plan that doesn’t reference the view. If they are close, this may give you confidence that the decision of whether or not to use the indexed view doesn’t matter.
  • The query optimizer is not matching the query to the indexed view. Double-check the definition of the view and the definition of the query to make sure that a structural match between the two is possible. CASTS, converts, and other expressions that don’t logically alter your query result may prevent a match. Also, there are limits to the expression normalization and equivalence and subsumption testing that SQL Server performs. It may not be able to show that some equivalent expressions are the same, or that one expression that is logically subsumed by the other is really subsumed, so it may miss a match.

Q. I update my data warehouse once a week. Indexed views speed up my queries a lot during the week, but slow down the weekly update. What should I do?

A. Consider dropping the indexed views before the weekly update, and creating them again afterwards.

Q. My view has duplicates, but I really want to maintain it. What can I do?

A. Consider creating a view that groups by all the columns or expressions in the view you want and adds a COUNT_BIG(*) column. Then create a unique clustered index on the grouping columns. The grouping process ensures uniqueness. This isn’t really the same view, but it might satisfy your needs.

Q. I have a view defined on top of another view. SQL Server won’t let me index the top-level view. What can I do?

A. Consider expanding the definition of the nested view by hand into the top-level view and then indexing it, indexing the innermost view, or not indexing the view.

Q. Why do indexed views have to be defined WITH SCHEMABINDING?

A. So that both of the following conditions are met:

  • All objects the view references are unambiguously identified using schemaname.objectname, regardless of which user is accessing the view.
  • The objects referred to in the view definition can’t be altered in a way that would make the view definition illegal or force SQL Server to re-create the index on the view.

Q. Why can’t I use OUTER JOIN in an indexed view?

A. Rows can logically disappear from an indexed view based on OUTER JOIN when you insert data into a base table. This makes incrementally updating OUTER JOIN views relatively complex to implement, and the performance of the implementation would be slower than for views based on standard (INNER) JOIN.

Indexed views can substantially improve query performance when used appropriately. An indexed view is persistently stored, meaning that the data can be accessed directly without the need to execute the query that defines the view. This is particularly useful for storing precomputed aggregate data.

Good schema design requires balancing the benefits of indexed views with their costs. Indexed views require additional storage space, and updating the base tables on which a view is defined may require updating the indexed view.

Partition-aligned indexed views extend the usefulness of indexed views to scenarios involving partitioned tables. When large tables are partitioned and data is switched in and out of the partitions, partition-aligned indexed views allow you to maintain the indexed views during switching without incurring additional maintenance cost.

For more information:

SQL Server Web site:

SQL Server TechCenter:

SQL Server DevCenter:

SQL Server Magazine at

Mastering Data Warehouse Aggregates by Christopher Adamson, Wiley, 2006

Did this paper help you? Please give us your feedback. Tell us on a scale of 1 (poor) to 5 (excellent), how would you rate this paper and why have you given it this rating? For example:

Are you rating it high due to having good examples, excellent screenshots, clear writing, or another reason?

Are you rating it low due to poor examples, fuzzy screenshots, unclear writing?

This feedback will help us improve the quality of white papers we release.

Send feedback.

End Notes

[Note1] The view does not always need to be fully materialized. The query can contain additional predicates, joins, or aggregations that can be applied to the tables and views referenced in the view, which eliminates the need for full materialization.

[Note2] There are exceptional situations when the optimizer may collapse two FROM clauses into one (subquery to join, or derived table to join transformation). If that happens, the indexed view substitution may cover more than one FROM clause in the original query.

[Note3] Database Engine Tuning Advisor is an enhancement of the Index Tuning Wizard found in SQL Server 2000.

Database Tuning Advisor bug

I love the ability to profile SQL Server activity and then run the database tuning advisor.

Unfortunately, sometimes I start this DTA advisor, and I let it run for a day, and it doesn’t complete.  Often, when I kill DTA in Task Manager, I have noticed that this leaves orphaned Indexed Views left in my database.

This can cause unintended problems.  Indexed views are extremely powerful, but the bottom line is that having 100 orphaned indexed views is going to slow down your database (it actually might speed it up in some situations, but not likely- 100 indexed views is just ridiculous).

I use a script like this in order to clean up these fragments

select ‘drop view [‘ + name + ‘]’
From sysobjects
where name like ‘%dta_mv_%’

This is just like any other situation where I use Dynamic SQL.. Write a script that generates the DROP VIEW ______ statements, instead of doing this by hand:

drop view [_dta_mv_0]
drop view [_dta_mv_1]
drop view [_dta_mv_2]
drop view [_dta_mv_3]
drop view [_dta_mv_4]
drop view [_dta_mv_5]
drop view [_dta_mv_6]
drop view [_dta_mv_7]
drop view [_dta_mv_8]
drop view [_dta_mv_9]
Now of course, you can copy the results from this script into a new query Window and execute the results in order to drop the views.

Entity Framework – 5200 lines of code in generated SQL

I’ve never understood the point of objects. If procedural code is FASTER then what is the point?

I’ve also never understood the point of ORM mapping software.

ANYTHING that generates 5200 line queries should be thrown onto the trash heap.

Pitfalls and Pointers for a Base Logging Class in EF Models

Julie Lerman


Julie LermanI recently spent time with a client who was experiencing some occasional—but severe—performance issues with their Entity Framework-related code. Using a tool to profile the queries generated by Entity Framework, we discovered a 5,800-line SQL query hitting the database. (Learn about profiling tools in my December 2010 Data Points column, “Profiling Database Activity in the Entity Framework,” I gasped when I saw that the EDMX model contained an inheritance hierarchy I had taught friends, loved ones and developers to avoid. The model had a single base entity from which every other entity derived. The base entity was used to ensure each entity had properties to track logging data such as DateCreated and DateLastModified. Because this model was created using Model First, the inheritance was interpreted by Entity Framework as a Table per Type (TPT) model in which each entity mapped to its own table in the database. To the uninitiated, this looks innocent enough.

But TPT inheritance is notorious in Entity Framework for its generic query-building pattern that can result in sprawling, poorly performing SQL queries. You might be starting out with a new model where you can avoid TPT, or you might already be stuck with an existing model and TPT inheritance. Either way, this month’s column is dedicated to helping you comprehend the potential performance pitfalls of TPT in this scenario, and showing you a few tricks you can leverage to get around them if it’s too late to modify the model and database schema.

The Scary Model

Figure 1 shows an example of a model I’ve seen all too often. Notice the entity named TheBaseType. Every other entity derives from it in order to auto-matically inherit a DateCreated property. I understand why it’s tempting to design it this way, but Entity Framework schema rules also demand that the base type owns the key property of each derived entity. To me, that’s already a red flag signaling it’s not an appropriate use of inheritance.

All Classes Inherit from TheBaseType
Figure 1 All Classes Inherit from TheBaseType

It’s not that Entity Framework specifically created a problem with this design; it’s a design flaw in the model itself. In this case, inheritance says that Customer is a TheBaseType. What if we changed the name of that base entity to “LoggingInfo” and then repeated the statement “Customer is a LoggingInfo”? The fallacy of the statement becomes more obvious with the new class name. Compare that to Customer is a Person. Perhaps I’ve now convinced you to avoid doing this in your models. But if not, or if you’re already stuck with this model, let’s take it a little further.

By default, the Model First workflow defines a database schema with one-to-one relationships between the base table and all of the tables that represent the derived types. This is the TPT hierarchy mentioned earlier.

If you were to execute a few simple queries, you might not notice any problem—especially if, like me, you’re not a DBA or other flavor of database guru.

For example, this LINQ to Entities query retrieves the DateCreated property for a particular customer:

  1. context.TheBaseTypes.OfType<Customer>()
  2.   .Where(b => b.Id == 3)
  3.   .Select(c => c.DateCreated)
  4.   .FirstOrDefault();

The query results in the following TSQL executed in the database:

  1. SELECT TOP (1)
  2. [Extent1].[DateCreated] AS [DateCreated]
  3. FROM  [dbo].[TheBaseTypes] AS [Extent1]
  4. INNER JOIN [dbo].[TheBaseTypes_Customer]
  5. AS [Extent2] ON [Extent1].[Id] = [Extent2].[Id]
  6. WHERE 3 = [Extent1].[Id]

It’s a perfectly good query.

A query to retrieve an entire entity is a little uglier because of the need to perform a nested query. The base query retrieves all of the fields that represent the join between TheBaseTypes table and the table containing the derived type. Then a query over those results projects the fields to be returned to Entity Framework to populate the type. For example, here’s a query to retrieve a single product:


  1. context.TheBaseTypes.OfType<Product>().FirstOrDefault();

Figure 2 shows the TSQL executed on the server.

Figure 2 Partial Listing of a Nested TSQL Query When Specifying a Type

  2. [Limit1].[Id] AS [Id],
  3. [Limit1].[C1] AS [C1],
  4. [Limit1].[DateCreated] AS [DateCreated],
  5. [Limit1].[ProductID] AS [ProductID],
  6. [Limit1].[Name] AS [Name],
  7. […continued list of fields required for Product class…]
  8. FROM ( SELECT TOP (1)
  9.       [Extent1].[Id] AS [Id],
  10.       [Extent1].[DateCreated] AS [DateCreated],
  11.       [Extent2].[ProductID] AS [ProductID],
  12.       [Extent2].[Name] AS [Name],
  13.       [Extent2].[ProductNumber] AS [ProductNumber],
  14.       […continued list of fields from Products table aka “Extent2” …],
  15.       [Extent2].[ProductPhoto_Id] AS [ProductPhoto_Id],
  16.       ‘0X0X’ AS [C1]
  17.       FROM  [dbo].[TheBaseTypes] AS [Extent1]
  18.       INNER JOIN [dbo].[TheBaseTypes_Product]
  19.       AS [Extent2] ON [Extent1].[Id] = [Extent2].[Id]
  20. )  AS [Limit1]

That’s still not such a bad query. If you were profiling your queries up to this point, you might not notice any issues caused by the model’s design.

But what about this next “simple” query, which wants to find all objects that were created today, regardless of type? The query could return Customers, Products, Orders, Employees or any other type in your model that derives from the base. Thanks to the model design, this seems like a reasonable request, and the model plus LINQ to Entities makes it easy to express (DateCreated is stored in the database as a date type, so I don’t have to be concerned about comparing to DateTime fields in my example queries):

  1. var today= DateTime.Now.Date;
  2. context.TheBaseTypes
  3.   .Where(b => b.DateCreated == today)  .ToList();

Expressing this query in LINQ to Entities is short and sweet. But don’t be fooled. It’s a hefty request. You’re asking EF and your database to return instances of any type (be it Customer or Product or Employee) created today. Entity Framework must begin by querying each table that maps to the derived entities and joining each one to the single related TheBaseTypes table with the DateCreated field. In my environment, this creates a 3,200-line query (when that query is nicely formatted by EFProfiler), which can take Entity Framework some time to build and the database some time to execute.

In my experience, a query like this belongs in a business analysis tool anyway. But what if you’ve got the model and you want to get that info from within your app, perhaps for an administrative reporting area of an application you’re building? I’ve seen developers try to do this type of query in their applications, and I still say you need to think outside of the Entity Framework box. Build the logic into the database as a view or stored procedure and call that from Entity Framework, rather than asking EF to build the query for you. Even as a database procedure, this particular logic isn’t simple to build. But there are benefits. First, you have a greater chance of building a better-performing query. Second, EF won’t have to take the time to figure out the query. Third, your application won’t have to send a 3,300 (or more!)-line query across the pipe. But be warned that the more you dig into this problem and attempt to solve it from within the database or by using EF and .NET coding logic, the clearer it will become that the problem is not so much Entity Framework as the overall model design that’s getting in your way.

If you can avoid querying from the base type and query-specific types, your queries will be much simpler. Here’s an example that expresses the previous query to focus on a particular type:

  1. context.TheBaseTypes.TypeOf<Product>()
  2.   .Where(b => b.DateCreated == today)
  3.   .ToList();

Because EF didn’t have to be prepared for every type in the model, the resulting TSQL is a simple 25-line query. With the DbContext API, you don’t even have to use TypeOf to query derived types. It’s possible to create DbSet properties for the derived types. So I could query even more simply:

  1. context.Products
  2.   .Where(b => b.DateCreated == today)
  3.   .ToList();

In fact, for this model I’d completely remove the TheBaseTypes DbSet from my context class and prevent anyone from expressing queries directly from this base type.

Logging Without the Scary Model

I’ve focused so far on a hierarchy scenario I strongly advise devel-opers to avoid when building models with Entity Framework: using a mapped entity as a base from which every single entity in the model also derives. Sometimes I come upon scenarios where it’s just too late to change the model. But other times I’m able to help my clients avoid this path completely (or they’re early enough in development that we’re able to change the model).

So how to better achieve the goal—which is to provide commonly tracked data, such as logging data—for every type in your model?

Often, the first thought is to keep the inheritance but change the type of hierarchy. With Model First, TPT is the default but you can change that to Table per Hierarchy (TPH) using the Entity Designer Generation Power Pack (available in Visual Studio Gallery via Extension Manager). Code First defaults to TPH when you define inheritance in your classes. But you’ll quickly see that this is not a solution at all. Why? TPH means that the entire hierarchy is contained in a single table. In other words, your database would consist of just one table. I’m hoping no more explanation is necessary to convince you that this is not a good path.

As I said earlier (when I asked if a Customer is really a type of LoggingInfo), the specific scenario I’ve focused on, to solve the problem of tracking common data, begs that you just avoid inher-itance altogether for that goal. I’d recommend you consider an interface or complex types instead, which will embed the fields into each table. If you’re already stuck with the database that created a separate table, a relationship will do.

To demonstrate, I’ll switch to a model based on classes using Code First instead of an EDMX (although you can achieve these same patterns using an EDMX model and the designer).

In the first case I’ll use an interface:

  1. public interface ITheBaseType
  2. {
  3.   DateTime DateCreated { get; set; }
  4. }

Each class will implement the interface. It will have its own key property and contain a DateCreated property. For example, here’s the Product class:

  1. public class Product : ITheBaseType
  2. {
  3.   public int ProductID { get; set; }
  4.   // …other properties…
  5.   public DateTime DateCreated { get; set; }
  6. }

In the database, each table has its own DateCreated property. Therefore, repeating the earlier query against the Products creates a straightforward query:

  1. context.Products
  2. .Where(b => b.DateCreated == today)
  3. .ToList();

Because all of the fields are contained in this table, I no longer need a nested query:

  1. SELECT TOP (1) [Extent1].[Id]                     AS [Id],
  2.                [Extent1].[ProductID]              AS [ProductID],
  3.                [Extent1].[Name]                   AS [Name],
  4.                [Extent1].[ProductNumber]          AS [ProductNumber],
  5.                …more fields from Products table…
  6.                [Extent1].[ProductPhoto_Id]        AS [ProductPhoto_Id],
  7.                [Extent1].[DateCreated]            AS [DateCreated]
  8. FROM   [dbo].[Products] AS [Extent1]
  9. WHERE  [Extent1].[DateCreated] = ‘2012-05-25T00:00:00.00’

If you prefer to define a complex type and reuse that in each of the classes, your types might look like this:

  1. public class Logging
  2. {
  3.   public DateTime DateCreated { get; set; }
  4. }
  5. public class Product{
  6.   public int ProductID { get; set; }
  7.   // …other properties…
  8.   public Logging Logging { get; set; }
  9. }

Note that the Logging class doesn’t have a key field (such as Id or LoggingId). Code First conventions will presume this to be a complex type and treat it as such when it’s used to define properties in other classes, as I’ve done with Product.

The Products table in the database has a column generated by Code First called Logging_DateCreated, and the Product.Logging.DateCreated property maps to that column. Adding the Logging property to the Customer class would have the same effect. The Customers table will also have its own Logging_DateCreated property, and it maps back to Customer.Logging.DateCreated.

In code, you’ll need to navigate through the logging property to reference that DateCreated field. Here’s the same query as before, rewritten to work with the new types:

  1. context.Products.Where(b => b.Logging.DateCreated == DateTime.Now).ToList();

The resulting SQL is the same as the interface example except the field name is now Logging_DateCreated rather than DateCreated. It’s a short query that only queries the Products table.

One of the benefits of inheriting from the class in the original model is that it’s easy to code logic to automatically populate the fields from the base class—during SaveChanges, for example. But you can create logic just as easily for the complex type or for the interface, so I don’t see any disadvantage with these new patterns. Figure 3 shows a simple example of setting the DateCreated property for new entities during SaveChanges (you can learn more about this technique in the Second and DbContext editions of my “Programming Entity Framework” book series).

Figure 3 Setting the Interface’s DateCreated Property During SaveChanges

  1. public override int SaveChanges()
  2. {
  3.   foreach (var entity in this.ChangeTracker.Entries()
  4.     .Where(e =>
  5.     e.State == EntityState.Added))
  6.   {
  7.     ApplyLoggingData(entity);
  8.   }
  9.   return base.SaveChanges();
  10. }
  11. private static void ApplyLoggingData(DbEntityEntry entityEntry)
  12. {
  13.   var logger = entityEntry.Entity as ITheBaseType;
  14.   if (logger == null) return;
  15.   logger.DateCreated = System.DateTime.Now;
  16. }

Some Changes in EF 5

Entity Framework 5 does bring some improvements to queries that are generated from TPT hierarchies that help but do not alleviate the problems I demonstrated earlier. For example, rerunning my query that initially resulted in 3,300 lines of SQL on a machine that has the Microsoft .NET Framework 4.5 installed (with no other changes to the solution) generates a query that’s reduced to 2,100 lines of SQL. One of the biggest differences is that EF 5 doesn’t rely on UNIONs to build the query. I’m not a DBA, but my understanding is that such an improvement wouldn’t impact the performance of the query in the database. You can read more about this change to TPT queries in my blog post, “Entity Framework June 2011 CTP: TPT Inheritance Query Improvements,” at

Not All Inheritance Is Evil

Having a single base type for all entities in your model is an extreme example of modeling and inheritance gone wrong. There are many good cases for having an inheritance hierarchy in your model—for example, when you do want to describe that a Customer is a Person. What’s also important is the lesson that LINQ to Entities is just one tool that’s available to you. In the scenario my client showed me, a clever database developer reconstructed the query against the base type fields as a database stored procedure, which brought a multisecond activity down to one that took just 9 milliseconds. And we all cheered. We’re still hoping they’ll be able to redesign the model and tweak the database for the next version of the software, though. In the meantime, they’re able to let Entity Framework continue to generate those queries that aren’t problematic and use the tools I left behind to tweak the application and database for some fantastic performance improvements.