I have a collection with documents similar to the following:
{
  _id: 'p_123456',
  id: 123456,
  kind: 'person',
  data: [...]
}
The id field can contain either positive or negative integers.
The collection contains close to 100 million documents, and I have a Python script that I use to process its data.
I'm trying to export all the data, but split across 8 different processes using the $mod operator in the following way:
mongoexport -u user -p password -d db -c collection --query "{\$and:[{kind:'person'},{id:{\$mod:[8,\$i]}}]}" | python process.py -
where $i is a number between 0 and 7.
For some reason, when I use this method with 8 processes, not all the data is exported: the 8 processes together export only 65 million of the 87 million documents of this specific kind.
If I run a single mongoexport process with the query {kind:'person'} only, all 87 million documents are being exported.
Is it possible that running 8 different processes with $mod:[8,0] to $mod:[8,7] isn't enough to export all the data? What am I missing here?
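One thing I think is worth checking, since id can be negative: if $mod keeps the sign of the dividend (a truncated-division remainder, as in C or JavaScript), then negative ids would give remainders from -7 to 0 and would never match a query for a remainder between 1 and 7. Below is a minimal Python sketch of that assumption. The helper names truncated_mod and matched_by_some_process are hypothetical and only simulate the arithmetic; they don't talk to MongoDB.

```python
def truncated_mod(value, divisor):
    # Remainder that keeps the sign of the dividend (C/JavaScript style).
    # Assumption: this mirrors how $mod treats negative ids.
    remainder = abs(value) % divisor
    return -remainder if value < 0 else remainder


def matched_by_some_process(value, processes=8):
    # True if one of the queries {id: {$mod: [processes, i]}} with
    # i in 0..processes-1 would select this id under the assumed semantics.
    return truncated_mod(value, processes) in range(processes)


# Made-up sample ids, both positive and negative.
sample_ids = [123456, 17, -8, -17, -123457]
for doc_id in sample_ids:
    print(doc_id, truncated_mod(doc_id, 8), matched_by_some_process(doc_id))
```

If that assumption holds, only the negative ids that are exact multiples of 8 would be picked up, and covering the rest would need extra queries for the remainders -1 to -7 (or a different way of splitting the work).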
Thanks in advance