Page 1 of 1

Rate Limiting; Performace

PostPosted: Tue May 25, 2010 6:55 am
by jamus
A couple questions:

1) Is there rate limiting? I have variables that are updated every 15 seconds, yet I only see a total of 1440 entries per day (1 per minute). BTW, I don't "need" historical data every 15 seconds, but I would like to make sure the behavior I'm seeing is correct.

2) Has anyone compared the performance of sqlite vs postgres? A query like "select * from variable_history where var_name='kwH_month' and date(ts)="2010-03-19" order by ts desc limit 1;" takes 12 seconds to execute. That seems slow to me, but I don't work with databases much. This is on a G4 1.5ghz PPC. There are 3254601 records in this table.

Re: Rate Limiting; Performace

PostPosted: Tue May 25, 2010 11:39 am
by seanadams
jamus wrote:
A couple questions:

1) Is there rate limiting? I have variables that are updated every 15 seconds, yet I only see a total of 1440 entries per day (1 per minute). BTW, I don't "need" historical data every 15 seconds, but I would like to make sure the behavior I'm seeing is correct.


If you are updating variables every 15 seconds that lead me to wonder if there might be a better event-driven solution. I.e. only taking action when the thing of interest becomes interesting. [edit: never mind, for a data logging application this might be appropriate. However you might want to check out rrdtool for this sort of thing, which includes a both a database designed for time-series data, and a graphing engine for viewing it]

2) Has anyone compared the performance of sqlite vs postgres? A query like "select * from variable_history where var_name='kwH_month' and date(ts)="2010-03-19" order by ts desc limit 1;" takes 12 seconds to execute. That seems slow to me, but I don't work with databases much. This is on a G4 1.5ghz PPC. There are 3254601 records in this table.


This means you need to create an index. That causes SQL to maintain a sorted view of the column(s) you're accessing, so that it doesn't have to process a large data set during retrieval. Try:

CREATE INDEX idx_var_name_ts ON variable_history(var_name, ts);

Re: Rate Limiting; Performace

PostPosted: Tue May 25, 2010 12:11 pm
by jamus
seanadams wrote:
If you are updating variables every 15 seconds that lead me to wonder if there might be a better event-driven solution. I.e. only taking action when the thing of interest becomes interesting. [edit: never mind, for a data logging application this might be appropriate. However you might want to check out rrdtool for this sort of thing, which includes a both a database designed for time-series data, and a graphing engine for viewing it]


I'll have to revisit rrdtool. When I checked a while ago, it didn't support a second y-axis. It looks like support has been added since.

seanadams wrote:
This means you need to create an index. That causes SQL to maintain a sorted view of the column(s) you're accessing, so that it doesn't have to process a large data set during retrieval. Try:

CREATE INDEX idx_ts ON variable_history(kwH_month, ts);


I used
Code: Select all
CREATE INDEX idx_ts ON variable_history(ts)
since kwH_month is a value not a column, which changed the query to near instantaneous. Since this is Indigo's database, is there going to be any interaction issues with it? I'm thinking probably not, because I didn't modify the schema of the table, but want to make sure.

Re: Rate Limiting; Performace

PostPosted: Tue May 25, 2010 12:17 pm
by seanadams
I used
Code: Select all
CREATE INDEX idx_ts ON variable_history(ts)
since kwH_month is a value not a column, which changed the query to near instantaneous. Since this is Indigo's database, is there going to be any interaction issues with it? I'm thinking probably not, because I didn't modify the schema of the table, but want to make sure.


Yeah, I realized just after posting it and edited my comment - the updated code indexes both columns together (first by var_name, then by ts) which is the correct solution when you are filtering on one and sorting on the other. Only indexing the date is much better than nothing but it still means the database has to walk through all the entries for that day to compare if var_name is the one you're looking for.

If you want to try it, first do: ALTER TABLE variable_history DROP INDEX idx_ts;

No this will not affect other clients using the database. It just takes up some disk space for the index.

Re: Rate Limiting; Performace

PostPosted: Tue May 25, 2010 12:33 pm
by seanadams
As for sqlite vs postgres:

They are very comparable in performance UNLESS you have a situation where there are many clients attempting to simultaneously write to the db, in which case sqlite becomes slow because of its coarse file locking.

I prefer sqlite for this sort of application because it is small, public domain, and easy to embed in other programs (no daemon to configure and run). If I were making a web-based application for many simultaneous users I would use postgres or mysql.

However, with any db you would have to know to create indexes to facilitate such queries.

Re: Rate Limiting; Performace

PostPosted: Tue May 25, 2010 1:52 pm
by jamus
seanadams wrote:
If you want to try it, first do: ALTER TABLE variable_history DROP INDEX idx_ts;


Just in case others are following...the index isn't part of the table, so the command to use is:

DROP INDEX idx_ts;

Re: Rate Limiting; Performace

PostPosted: Tue May 25, 2010 3:09 pm
by seanadams
jamus wrote:
seanadams wrote:
If you want to try it, first do: ALTER TABLE variable_history DROP INDEX idx_ts;


Just in case others are following...the index isn't part of the table, so the command to use is:

DROP INDEX idx_ts;



That syntax varies by dbm. I guessed wrong!