Wednesday, 24 December 2014

mongoexport not exporting all possible data




I have a collection with documents similar to the following:



{
_id: 'p_123456',
id: 123456,
kind: 'person',
data: [...]
}


The id field can contain either positive or negative integers.

The collection contains close to 100 million documents, and I use a Python script to process the data from it.


I'm trying to export all the data, split across 8 separate processes, using the $mod operator in the following way:

mongoexport -u user -p password -d db -c collection --query "{\$and:[{kind:'person'},{id:{\$mod:[8,\$i]}}]}" | python process.py

where $i is a number from 0 to 7.


For some reason, when I use this method with 8 processes, not all of the data is exported: I get only 65 million of the 87 million documents of this specific kind.


If I run a single mongoexport process with only the query {kind:'person'}, all 87 million documents are exported.


Is it possible that running 8 different processes with $mod:[8,0] through $mod:[8,7] isn't enough to export all the data? What am I missing here?
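
One thing that may be worth checking is how $mod treats negative id values. The sketch below is plain Python, not MongoDB code. It assumes, as the MongoDB documentation for $mod describes, that a negative dividend produces a negative remainder. The helper names truncated_mod and uncovered_ids and the sample ids are hypothetical and only for illustration.

```python
def truncated_mod(value: int, divisor: int) -> int:
    """Remainder that takes the sign of the dividend (C-style).

    Assumption: this mirrors how $mod behaves for negative ids.
    Python's own % operator floors instead, so it is not used directly.
    """
    remainder = abs(value) % divisor
    return -remainder if value < 0 else remainder


def uncovered_ids(ids, divisor=8):
    """Return the ids that none of the queries $mod:[divisor, 0..divisor-1] would match."""
    requested = set(range(divisor))  # remainders 0..7 for divisor 8
    return [i for i in ids if truncated_mod(i, divisor) not in requested]


# Hypothetical sample ids, mixing positive and negative values.
sample_ids = [123456, -123456, 17, -17, -16, -3, 0]
missed = uncovered_ids(sample_ids)
print(missed)
```

If that assumption holds, negative ids that are not multiples of 8 would have remainders from -1 to -7, which none of the 8 queries asks for.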


Thanks in advance




