jeudi 26 mars 2015

Is it possible to get seek based parallel plan for distinct/group by?


An example from this question shows that SQL Server will choose a full index scan to solve a query like this:



select distinct [typeName] from [types]


Where [typeName] has a non-clustered, non-unique ascending index on it. His example has 200M rows but only 76 unique values. It seems like a seek plan would be a better choice with that density (~76 multiple binary searchs)?


His case could be normalized but the reason for the question is that I really want to solve something like this:



select TransactionId, max(MaxCreatedUtc) from TxLog group by TransactionId.



There is an index on (TransactionId, MaxCreatedUtc).


Re-writing using a normalized source (dt) does not not change the plan.



select dt.TransactionId, MaxCreatedUtc
from [Transaction] dt -- distinct transactions
cross apply
(
select MaxCreatedUtc from TxLog tl
where tl.TransactionId = dt.TransactionId
) ca


Running just the CA subquery as a Scalar UDF does show a plan of 1 seek.


select max(CreatedUtc) as MaxCreatedUtc from Pub.TransactionLog where TransactionID = @TxId;


Using that Scalar UDF in the original query seems to work but loses parallelism (known issue with UDFs):



select t.typeName,
Pub.ufn_TransactionMaxCreatedUtc(t.TransactionId) as MaxCreatedUtc
from Pub.[Transaction] t


Plans for Cross apply, Just the UDF, Using UDF


Rewriting using an Inline TVF reverts it back to the scan based plan.





Aucun commentaire:

Enregistrer un commentaire