Monday, 2 February 2015

Greenplum database storage model for time series data


I have to deploy a Greenplum database for analysis of time series data. I will have around 50 different time series (s1, s2, s3, ..., s50), and each series will consist of many (time, value) pairs, where each value is a 1-hour average and the data spans 2 years.
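To get a feel for the data volume, here is a rough back-of-the-envelope calculation in Python. It assumes 365-day years and no missing hours, which the description above does not state explicitly, so treat the numbers as an estimate only.

```python
# Rough sizing of the data set described above.
# Assumptions (not stated in the post): 365-day years, no gaps in the hourly data.
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365
YEARS = 2
NUM_SERIES = 50  # "around 50" series

# Number of hourly points in one series over the whole period.
points_per_series = HOURS_PER_DAY * DAYS_PER_YEAR * YEARS

# Total number of stored values across all series.
total_values = points_per_series * NUM_SERIES

print(points_per_series)  # 17520
print(total_values)       # 876000
```

So each series has roughly 17,500 points, and the whole data set is under a million values.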


I can store this data in two formats:

Format 1:



Series  T1   T2   T3   T4   …
s1      val  val  val  val  …
s2      val  val  val  val  …
s3      val  val  val  val  …
…       …    …    …    …    …


Format 2:



Time  Series1  Series2  Series3  …
T1    val      val      val      …
T2    val      val      val      …
T3    val      val      val      …
…     …        …        …        …


With Format 1, I would use a row-oriented table. With Format 2, a column-oriented table looks like the better option. Both choices follow from my access pattern: most of my queries will fetch data for a single series over a specific time range. I will also update the table in bulk periodically (for example, every month).
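To make the access pattern concrete, here is a minimal Python sketch of the same query ("one series over a time range") against both layouts. It uses in-memory toy data rather than Greenplum; the series names, values, and the helper functions `query_format1` and `query_format2` are hypothetical and only illustrate which part of each layout the query touches.

```python
from datetime import datetime, timedelta

# Hypothetical toy data: 3 series, 6 hourly points (all values are made up).
start = datetime(2015, 1, 1)
times = [start + timedelta(hours=h) for h in range(6)]
series_names = ["s1", "s2", "s3"]

# Format 1: one row per series, one value per timestamp.
format1 = {
    name: [float(i * 10 + h) for h in range(6)]
    for i, name in enumerate(series_names)
}

# Format 2: one row per timestamp, one column per series.
format2 = [
    {"time": t, **{name: format1[name][h] for name in series_names}}
    for h, t in enumerate(times)
]

def query_format1(name, t_from, t_to):
    """Pick the single row for `name`, then keep the time columns in range."""
    row = format1[name]
    return [(t, v) for t, v in zip(times, row) if t_from <= t <= t_to]

def query_format2(name, t_from, t_to):
    """Filter rows by time, then read only the `time` and `name` columns."""
    return [(r["time"], r[name]) for r in format2 if t_from <= r["time"] <= t_to]

# Same query against both layouts: series s2 between the 2nd and 4th hour.
lo, hi = times[1], times[3]
result1 = query_format1("s2", lo, hi)
result2 = query_format2("s2", lo, hi)
print(result1 == result2)  # True
```

In Format 1 the query reads one row and selects a subset of its columns, so the table needs one column per timestamp (about 17,500 for 2 years of hourly data, assuming 365-day years). PostgreSQL, on which Greenplum is based, documents a limit of 1600 columns per table; whether the same limit applies to a given Greenplum version is an assumption worth checking. In Format 2 the query filters rows by time and reads two columns, which is the case a column-oriented table is designed for.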


Which format is better suited for my needs?




