Database : Is it possible to get seek based parallel plan for distinct/group by?

vendredi 27 mars 2015

Is it possible to get seek based parallel plan for distinct/group by?

An example from this question shows that SQL Server will choose a full index scan to solve a query like this:


select distinct [typeName] from [types]

Where [typeName] has a non-clustered, non-unique ascending index on it. His example has 200M rows but only 76 unique values. It seems like a seek plan would be a better choice with that density (~76 multiple binary searchs)?

His case could be normalized but the reason for the question is that I really want to solve something like this:


select TransactionId, max(CreatedUtc) 
from TxLog 
group by TransactionId

There is an index on (TransactionId, MaxCreatedUtc).

Re-writing using a normalized source (dt) does not not change the plan.


select dt.TransactionId, MaxCreatedUtc
 from [Transaction] dt -- distinct transactions
 cross apply
   (
        select Max(CreatedUtc) as MaxCreatedUtc 
          from TxLog tl
         where tl.TransactionId = dt.TransactionId         
   ) ca

Running just the CA subquery as a Scalar UDF does show a plan of 1 seek.


select max(CreatedUtc) as MaxCreatedUtc
 from Pub.TransactionLog 
 where TransactionID = @TxId;

Using that Scalar UDF in the original query seems to work but loses parallelism (known issue with UDFs):


select t.typeName, 
       Pub.ufn_TransactionMaxCreatedUtc(t.TransactionId) as MaxCreatedUtc
  from Pub.[Transaction] t

Plans for Cross apply, Just the UDF, Using UDF

Rewriting using an Inline TVF reverts it back to the scan based plan.

From answer/comment @ypercube:


select TransactionId, MaxCreatedUtc        
 from Pub.[Transaction]  t
  cross apply
   (
        select top (1) CreatedUtc as MaxCreatedUtc 
        from Pub.TransactionLog l
        where l.TransactionID = t.TransactionId
        order by CreatedUtc desc                     
   ) ca

Plan using top/order

Plan looks good. No parallelism but pointless since so fast. Will have to try this on a larger problem sometime. Thanks.

Database

vendredi 27 mars 2015

Is it possible to get seek based parallel plan for distinct/group by?

Aucun commentaire:

Enregistrer un commentaire