mercredi 25 mars 2015

Clustering a heap by date in stead of unique composite key


I've been working on improving the performance of a large reporting database.

This database is 2TB in size, and one of the larger tables is a heap with three non-clustered indexes on it.


The non-clustered indexes each deal with a combination of Case_ID's with relevant data.

However the case id's are not at all unique, as each case entry is stored in the same heap with the Case_ID.

Inquiries at the business side of things have led me to the fact that a combination of Case_ID, Line_Number and Document_ID is unique.


However, nearly all of the queries are claused by dates.

Usually by WHERE DATE = 'xxxx-xx-xx 00:00:00.000, or WHERE DATE > 'xxxx-xx-xx 00:00:00.000'. And the Line_Number and Document_ID values are in most cases not even included in the reports.


As such, I created a clustered index on the heap (on DEV), keyed on the DATE column.

Performance of the reports has increased by a decent amount, reducing query times from 5 minutes to 3 minutes for most of my test reports, and I haven't seen any negative impact on the reports not claused by date.


I am however concerned that I'm looking too far as I understand that non unique clustered indexes are rarely the solution. Is what I'm doing a valid approach? Or should I simply index on the unique composite key, and create a covering index on the DATE column?





Aucun commentaire:

Enregistrer un commentaire