Friday, February 27, 2015

Optimizing queries for 25+ million rows


I am using MS SQL Server, and I have to run several queries against the same table with different criteria. At first I ran each query against the original table, even though they all share some filters (e.g. date, status). This took a long time (around 2 minutes).


The table contains duplicate rows, and all of its indexes are NON-CLUSTERED. Only 4 columns are relevant to my criteria, and every query should output only a count.


Columns needed: TABLE, FIELD, AFTER, DATE. There is an index on each of DATE and TABLE.


After creating a temp table with only the fields I need, the time went down to 1:40, which is still very bad.



CREATE TABLE #TEMP
(
    [TABLE] VARCHAR(30) NULL,   -- TABLE, FIELD, AFTER, DATE are T-SQL keywords, so they must be bracket-quoted
    [FIELD] VARCHAR(30) NULL,
    [AFTER] VARCHAR(1000) NULL,
    [DATE] DATETIME,
    SORT_ID INT IDENTITY(1,1)
)
CREATE CLUSTERED INDEX IX_ADT ON #TEMP(SORT_ID)

INSERT INTO #TEMP ([TABLE], [FIELD], [AFTER], [DATE])
SELECT [TABLE], [FIELD], [AFTER], [DATE]
FROM mytbl WITH (NOLOCK)
WHERE [TABLE] = 'OTB'
  AND [FIELD] = 'STATUS'
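Since the insert itself is the slow part, one variant worth measuring (a sketch of the same load, not a confirmed fix) is to load into a plain heap first and build the clustered index once afterwards, so the index is not maintained row by row during the insert:

```sql
-- Hypothetical variant of the load above: heap first, index after.
CREATE TABLE #TEMP2
(
    [TABLE]  VARCHAR(30) NULL,
    [FIELD]  VARCHAR(30) NULL,
    [AFTER]  VARCHAR(1000) NULL,
    [DATE]   DATETIME,
    SORT_ID  INT IDENTITY(1,1)
)

INSERT INTO #TEMP2 ([TABLE], [FIELD], [AFTER], [DATE])
SELECT [TABLE], [FIELD], [AFTER], [DATE]
FROM mytbl WITH (NOLOCK)
WHERE [TABLE] = 'OTB' AND [FIELD] = 'STATUS'

-- Sorting ~216k rows once is often cheaper than maintaining
-- the index during the insert.
CREATE CLUSTERED INDEX IX_ADT ON #TEMP2(SORT_ID)
```

Whether this helps here depends on how much of the 1:19 is index maintenance versus reading the 25M-row base table.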


Running this returns: (216598 row(s) affected)


Since not all queries rely on a date range, I didn't include one in the insert. The problem is that the insert alone takes well over a minute: the insert above took 1:19.
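If the base table's indexes can be changed, a filtered covering index might remove the need for the temp table entirely (a sketch; the index name and column choices are my assumptions):

```sql
-- Hypothetical filtered index: it stores only the 'OTB'/'STATUS' slice
-- of mytbl, keyed on DATE with AFTER included, so count queries on that
-- slice can be answered from the index alone, without a table scan.
CREATE NONCLUSTERED INDEX IX_mytbl_OTB_STATUS
ON mytbl ([DATE])
INCLUDE ([AFTER])
WHERE [TABLE] = 'OTB' AND [FIELD] = 'STATUS'
```

The one-time cost of building the index replaces the per-run cost of populating #TEMP.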


I want to run something like this for several queries:



SELECT COUNT(*) AS [COUNT]   -- COUNT is a keyword too, so bracket the alias
FROM #TEMP
WHERE [AFTER] = 'R'
  AND [DATE] >= '2014-01-01'
  AND [DATE] <= '2015-01-01'
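Since every query returns only a count, the separate SELECTs can also be collapsed into a single pass over #TEMP with conditional aggregation (a sketch; the second criterion is a made-up placeholder):

```sql
-- One scan of #TEMP instead of one scan per query.
-- COUNT(CASE ... THEN 1 END) counts only rows matching each predicate,
-- because COUNT ignores the NULLs produced when the CASE has no match.
SELECT
    COUNT(CASE WHEN [AFTER] = 'R'
                AND [DATE] >= '2014-01-01'
                AND [DATE] <= '2015-01-01'
               THEN 1 END) AS CNT_R_2014,
    COUNT(CASE WHEN [AFTER] = 'D'        -- hypothetical second criterion
               THEN 1 END) AS CNT_D
FROM #TEMP
```

Each extra criterion becomes one more column, so the table is read once no matter how many counts are needed.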


The problem is with the insert more than the selection, but the temp table has far fewer rows than the original, so querying it repeatedly should beat scanning the full table several times.


How can I optimize this?




