Database : MySql GROUP BY on two large joined tables

mardi 3 mars 2015

MySql GROUP BY on two large joined tables

I have two tables:

Person

PersonID (PK)

GroupID

DeletedDate

PersonData

PersonID (PK)

FieldID (PK)

Value

The fields are custom-created; hence the schema. I am trying to run a histogram-type query to discover the range of values for a given FieldID together with a count of the times each value occurs. Here is the query:


SELECT Value, count(*) 
FROM Person p 
INNER JOIN PersonData pd ON p.PersonID=cd.PersonID
WHERE p.DeletedDate IS NULL AND GroupID=‘cAPHSWovx9d2yaN’
GROUP BY Value

We use GUIDs for the GroupIDs.

Performance is fine for small groups but degrades rapidly. The group in question has 5 million records in table Person and an order of magnitude beyond that in PersonData. The above query takes an hour to run. It would be acceptable to get it down to a few minutes.

I created an index on PersonData on PersonID/FieldID/Value in that order. MySql generally chooses not to use it when I run explain. (Actual results of explain vary based on FieldID.) I also have an index on Person on GroupID/DeletedDate.

I have two questions:

On the explain I always get a “Using where; Using index; Using temporary; Using filesort” on the Person table. Why does MySql need to sort the Person table before joining to PersonData?

Why does MySql generally decline to use the PersonID/FieldID/Value index? Is there a better index I could use? I assumed I needed the PersonID in the index to help with the join.

Edit Here is the relevant portion of the explain:


id    select_type    table    type    key              key_len    ref          rows       Extra
1     SIMPLE         p        ref     deleted_person   67         const,const  3476105    Using where; Using index; Using temporary; Using filesort
1     SIMPLE         pd       ref     person_field_val 62         p.PersonID   10         Using index

Edit 2 Forgot the GROUP BY in the query above (face palm)

Database

mardi 3 mars 2015

MySql GROUP BY on two large joined tables

Aucun commentaire:

Enregistrer un commentaire