vendredi 27 février 2015

Subselect of MAX(Version) takes many minutes, though there's only ~20K records


I have a pretty complicated data model. I can't understand most SQL examples without an explanation of what's being modeled, so I'll try to explain.


Mainlines -> Releases -> Overlays -> Calibrations <- Parameters


Calibrations, then, is the child of the the Mainline-Release-Overlay chain, and Parameters. This is the set I want to return.


Now, the complexity is that -- to save space -- we are storing the base Mainline's calibrations, and then just the DIFFERENCES when changes happen at the Release and Overlay levels. This produces a "base" set of Calibrations of about 16K rows, and then a few hundred changes each Release, and perhaps just a few changes each Overlay. Deletions of parameters happen. To keep track of this, Calibrations has a Status field (tinyint) which is set to 1 for a deletion.


Overlays get versioned successively. So, to get a complete "calibration," we need to query the Calibrations table for the latest version of a Parameter's data, up to a particular Overlay version number. Parameter metadata may change, but the names stay the same, so these Calibrations may refer to differing versions of Parameters, albeit with the same name.


Up till now, the following has worked perfectly, and instantaneously (with the following caveats):



SELECT c.Parameter_ParameterID, p.Designation AS Name, c.Data, o.Version
FROM Calibrations c, Parameters p, Overlays o, Releases r
WHERE r.Mainline_MainlineID = 9
AND o.Release_ReleaseID = r.ReleaseID
AND c.Overlay_OverlayID = o.OverlayID
AND c.Parameter_ParameterID = p.ParameterID
AND o.Version =
(SELECT MAX(o1.Version)
FROM Parameters p1, Calibrations c1, Overlays o1, Releases r1
WHERE r1.Mainline_MainlineID = 9
AND o1.Release_ReleaseID = r1.ReleaseID
AND c1.Overlay_OverlayID = o1.OverlayID
AND c1.Parameter_ParameterID = p1.ParameterID
AND p1.Designation = p.Designation -- New condition
AND o1.Version <= 68)
AND c.Status != 1 -- Changed from boolean to tinyint
ORDER BY Version DESC, Name


The problem has come in since we have started (properly) tracking Parameter versions, so the inner match for the Parameter can no longer by on the ParameterID, but rather the Designation (i.e., Name). At first this wasn't a problem, but it is now that Calibration Status is no longer a boolean for deletion. Now it can also be set to 2 for newly introduced, and 3 for reinserted.


Adding these two changes have caused this query to take 10-15 minutes to run! How is that even possible? Even if I contorted the query to connect every row to every other row, it shouldn't take this long!


There are two maddening things here. One is that, when this initially became a problem, I put an index on the Parameters Designation field, and it made it work again. Now, deleting and recreating it doesn't make a difference. Also, there's still a special case, in that, if I have exactly 2 Mainlines in the database, it still runs instantaneously. If I have only 1 or more than 2, it takes forever.


I've tried variously commenting out sections, like the o1.Version <= condition and the c.Status != condition, and sometimes, it works again. I can't get a handle on the differences in the various cases, and I'm at the end of my rope. I know I'm a tiny trick away from getting it to run fast again, but it's apparently beyond me, and this is a show-stopper for my beta plans next month.


I've been programming for 35 years, but I've never had to deal with SQL like this before, and SSMS's tools for showing explain plans mean nothing to me. I have no reputation on this site, so I don't have rep to spend, and I hope it's not a violation of this site's TOS, but I'd PayPal someone $100 who can fix this for me, and help me understand the fundamental issues.





Aucun commentaire:

Enregistrer un commentaire