lundi 2 février 2015

Recommend a suitable DB for storing events


I have rows of the following type:



identifier (string), type (int), date (date), count (int), extra info


Rows have a unique constraint on (identifier, type, date).


I want to perform the following online queries:



  • all entries, ordered by date

  • all entries for identifier, ordered by date

  • all entries for type, ordered by date


These are not log files, but they are similar. Data is inserted in batches, not in a continuous stream. It is not inserted in any particular order. I want a database for querying this data.


I had some success using MySQL MyISAM tables, but now I've got to 300 million entries it's getting unwieldy and I expect to reach at least 800 million.


The nature of the queries mean that I need indexes on identifier, type and to sort by date.


I've considered perhaps using MongoDB and de-normalising so I'm storing the data once per query type (i.e. tables for per-identifier, per-type).


The only snag is that I need to enforce the uniqueness constraint, which may involve over-writing duplicate values.





Aucun commentaire:

Enregistrer un commentaire