An example from this question shows that SQL Server will choose a full index scan to solve a query like this:
select distinct [typeName] from [types]
Where [typeName] has a non-clustered, non-unique ascending index on it. His example has 200M rows but only 76 unique values. It seems like a seek plan would be a better choice with that density (~76 multiple binary searchs)?
His case could be normalized but the reason for the question is that I really want to solve something like this:
select TransactionId, max(MaxCreatedUtc) from TxLog group by TransactionId.
There is an index on (TransactionId, MaxCreatedUtc).
Re-writing using a normalized source (dt) does not not change the plan.
select dt.TransactionId, MaxCreatedUtc
from [Transaction] dt -- distinct transactions
cross apply
(
select MaxCreatedUtc from TxLog tl
where tl.TransactionId = dt.TransactionId
) ca
Running just the CA subquery as a Scalar UDF does show a plan of 1 seek.
select max(CreatedUtc) as MaxCreatedUtc from Pub.TransactionLog where TransactionID = @TxId;
Using that Scalar UDF in the original query seems to work but loses parallelism (known issue with UDFs):
select t.typeName,
Pub.ufn_TransactionMaxCreatedUtc(t.TransactionId) as MaxCreatedUtc
from Pub.[Transaction] t
Rewriting using an Inline TVF reverts it back to the scan based plan.
Aucun commentaire:
Enregistrer un commentaire