Saturday 3 November 2012

SharePoint List Indexes : Under the Hood


You must be already aware that SharePoint provides the functionality to index columns so querying on them will be faster and that Throttling will not occur. Let us look at how SharePoint maintains this index and how exactly is it stored in the Content Database.

SharePoint maintains multiple tables in the content database with names starting with dbo.NameValuePair and followed by Culture Names where the Site ID, Web ID, List Id, Item ID and Value of the Indexed fields are stored.

If the value of the field is not Culture Dependent, e.g. the DateTime Fields, the Person or Group Fields etc. then the Value is stored in the table named dbo.NameValuePair.

If the value of the field is Culture Dependent e.g. the Text fields, then the value is stored in a table named dbo.NameValuePair_and followed by Culture Name. E.g. If the current language being used is English, then the value is stored in the table called dbo.NameValuePair_Latin1_General_CI_AS.

When a query is made to any of the lists which has an indexed field, then a JOIN is performed on the dbo.AllLists table and the relevant dbo.NameValuePair table and the joined data is presented.

Since the data in the indexed fields is stored completely in a different table, list throttling does not occur even if the query is made to more than 5000 rows for a normal user.

To test this out, I created a brand new WebApplication and new root level site collection under that with the Team Site template. After that, I created an index in my “Tasks” list on the Assigned To field which is a Person or Group field and added a sample task to the list. Then I opened the Content Database of the new WebApplication and had a look at the dbo.NameValuePair table:


The row which is highlighted with red contains the SiteId, WebId, ListId, ItemID and the value which I entered in the Assigned To field of my Tasks List. It is showing as 9 because it is a Person field and that is the ID of the User I assigned the Task to.

The 2 rows below the highlighted rows are the values of the Modified field from the “Site Pages” list. The values belong to the Home.aspx and the How To Use This Library.aspx as these 2 items are created by default in a Team Site.  This field is added to the Index by default and since it is a field of type Date and Time (which is not culture dependent) we can see it in the dbo.NameValuePair table.

After that I opened the dbo.NameValuePair_Latin1_General_CI_AS table and had a look in there:

So just like we saw earlier, all the data related to the Title field and its value (which is culture dependent) is stored in this table.

Composite Indexes:


SharePoint also allows composite indexes to be created on a list. However there are limitations on which type of columns can be included as the Primary and the Secondary Columns in the composite index.

If the Primary Column in a composite index is selected as column whose value is stored in the dbo.NameValuePair table, then the secondary column must also be selected which will be stored in the same table. So in short, if the primary column is language independent, then the secondary column in the index should also be language independent.

If the primary column in the list is language dependent, then you cannot specify a secondary column in the composite index.

Hope you had fun knowing more about indexes! Happy SharePointing!

No comments: