Rrd4j long term data storage

I am trying to store data over a longer period from my daily power consumption. I track the kWh and reset the value every day. I installed rrd4j a long time ago and it works fine for the general data storage (I only put an item in the .persist file). Now I am trying to use the rrd4j.cfg file to consolidate the data over a longer time.
I have this in the rrd4j.cfg:

kWh.def=GAUGE,900,0,100,600
kWh.archives=MAX,0.5,1,60:MAX,0.5,60,24:MAX,0.5,144,720
kWh.items=zwave_device_3cab72a2_node51_meter_kwh

And I need to have this in the rrd4j.persist file:

zwave_device_3cab72a2_node51_meter_watts : strategy = everyMinute, restoreOnStartup
zwave_device_3cab72a2_node51_meter_kwh : strategy = everyMinute, restoreOnStartup

even though it is a double definition. Without the strategy in the persist file I do not get stored values.
The .rrd data file that is created is only 2 kB (folder userdata…). I was expecting it to create either several files, or one file with different time spans and intervals. What I get is 144 data points that are 10 min apart. The time span does not change over the file.
The archive definition should indeed create 10 min time spans at first:

kWh.def=GAUGE,900,0,100,600

I would expect 60 points with a 10 min interval because of this:

kWh.archives=MAX,0.5,1,60

With:

…:MAX,0.5,60,24

I was expecting 24 points in the file that cover a time span of 600s*60=3600s (1h) each.
The last row should create a point every day for 720 days.

It seems the consolidation function does not work (I have another case with an AVERAGE consolidation function that also does not work).

Does it work for anyone? What is wrong in the definitions?

Rrd4j is a round robin database designed to overwrite old data after a configured period of time. I do not know where the rrd4j configuration is stored.

Why don’t you use the default setup for that? It stores values for 10 years.

Where do you see anything defined double? Actually I’m missing your second item in the items line, that way this item will use the default database.

Which is totally correct. This is the setting that tells openHAB what and when to persist. The .cfg file tells the persistence service how to persist.

Obviously you didn’t read the rrd4j documentation.
Your setup does create a .rrd file that holds 3 archives: the first with 60 values, the second with 24 values and the third with 720 values. In total 804 values; 2 kB does sound OK to me!

How did you check that? Whenever requesting data from an .rrd file (via the REST API?), data gets fetched from the archive that covers the requested timeframe (the default is from now backward 24 hrs). Asking for the same timespan will always give the same number of values (or fewer if the file isn’t filled up yet). That is by design!
I do not see any archive in your setup that would give 144 values, especially not your archive 1.
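Such a REST request can be sketched with a small Python helper that only builds the query URL (host, item name and dates here are placeholders; the endpoint and parameter names follow the openHAB REST API):

```python
# Sketch of an openHAB persistence REST query URL; host/item/dates are placeholders.
from urllib.parse import urlencode

def persistence_url(host, item, starttime=None, endtime=None):
    """Build http://<host>:8080/rest/persistence/items/<item>?... ."""
    params = {"serviceId": "rrd4j"}  # query the rrd4j persistence service explicitly
    if starttime:
        params["starttime"] = starttime  # ISO-8601, e.g. 2019-03-01T00:00:00
    if endtime:
        params["endtime"] = endtime      # omitted -> defaults to "now"
    return "http://{}:8080/rest/persistence/items/{}?{}".format(
        host, item, urlencode(params))

print(persistence_url("Your_openHAB_IP",
                      "zwave_device_3cab72a2_node51_meter_kwh",
                      starttime="2019-03-01T00:00:00"))
```

Opening such a URL in a browser (or with curl) returns the persisted values as JSON, fetched from whichever archive covers the requested timeframe.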

Additionally your second archive has 24 values, each covering 60*600 seconds, which is 10 hrs!
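To make that arithmetic concrete, here is a small Python sketch (not part of openHAB) that computes what each archive of the original setup covers, from the step in the .def line and the archive parameters:

```python
# Sketch: what does each rrd4j archive cover?
# An archive row consolidates <steps_per_value> samples of <step_seconds> each.

def archive_coverage(step_seconds, steps_per_value, values):
    """Return (seconds per consolidated value, total seconds covered)."""
    per_value = step_seconds * steps_per_value
    return per_value, per_value * values

# Original setup: kWh.def=GAUGE,900,0,100,600  ->  step = 600 s
# kWh.archives=MAX,0.5,1,60 : MAX,0.5,60,24 : MAX,0.5,144,720
for steps, values in [(1, 60), (60, 24), (144, 720)]:
    per_value, total = archive_coverage(600, steps, values)
    print("{} values, {:g} h each, {:g} days total".format(
        values, per_value / 3600, total / 86400))
```

So with a 600 s step the second archive indeed holds one MAX per 10 hours, and only the third archive reaches daily values.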

Why on earth are you using a sample rate of 600 seconds?
With one sample per 10 minutes, how would you find the MAX of the last 10 minutes?
Consolidation functions DO WORK.

You’ve got your instructions to rrd4j about configuring an archive, and your instructions to openHAB persistence about what to persist when. Two different functions.

What I consider double configured is that 60 sec time period. That I have to tell openHAB what to persist is understood. Anyhow, it does not work without a strategy stated within the persist file. So it is okay.

First of all, thank you for looking into this. I’ll try to answer the questions.

I want a certain number of points per time unit throughout the covered period. In the end I want one point per day over ~1.5 years, and it should be the maximum value of that day.

In the persist file it is defined that every minute a point needs to be taken, while in the rrd4j.cfg file some other frequency is stated. That is a double definition. Anyhow, it does not work at all if I leave the strategy out of the .persist file. If that is not an issue, I am fine with it as long as the frequency in the .cfg file is used.
The second item has a different definition. I think I can fix it once this one is fixed. I tried not to overwhelm you with too many details.

Good to know. Then this is not the issue.

Why so offensive? Well, I did read the documentation, several times. If openHAB creates three archives, that is good and correct. The question is then how to get the data out again.

Yes, I used the REST API with a start date well in the past and an end date well in the future. I also used the link provided in the REST API in Firefox. The length and data are the same, 24 h only. In addition, in a chart I also only see 24 h for this item. For the items that do not have their own .cfg entry (standard configuration) I can see longer periods in the charts (several months). So I concluded the archives for kWh are not created in the intended way. Btw. the documentation does not state how to get the data out. Just guessing, maybe I did not read it.

So how do I get all the values that are stored for this item?

I’m sorry if you understood that as offensive, please take my apologies.
When querying a rrd4j database for a single data point in time or for a timeframe, only the archive that covers the whole requested timeframe is used. A standard query will never give all data points in the rrd file, only those in that archive. The exception is the “rrd-inspector”, which is built to show and manipulate the data in the .rrd file.
For openHAB the data is queried with REST calls, the same as you already tried. However, trying to get future data is impossible.
Let’s get back to what you are really after:

Starting point is the .def line, I suggest you do it like this
kWh.def=GAUGE,900,0,100,60
Yes, use a sample rate of 60 seconds.

For the archives you stated:

For that, use an archive 1 that takes a value every minute for 24 hours AND an archive 2 that has a value per day for 550 days (roughly 1.5 years).
kWh.archives=MAX,0.5,1,1440:MAX,0.5,1440,550
Only two archives because you did not ask for more detailed data.
Additionally the .items line has to state all items that should use this config.
A strategy of everyMinute in the .persist file is MANDATORY as well.
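Put together, the suggested setup would look roughly like this (item name taken from the earlier posts; treat it as a sketch, not a drop-in file):

```
# services/rrd4j.cfg
kWh.def=GAUGE,900,0,100,60
kWh.archives=MAX,0.5,1,1440:MAX,0.5,1440,550
kWh.items=zwave_device_3cab72a2_node51_meter_kwh

# persistence/rrd4j.persist (inside the Items { } block)
zwave_device_3cab72a2_node51_meter_kwh : strategy = everyMinute, restoreOnStartup
```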

After all that is set remove the existing .rrd file for all items that should use that config, that way you can be assured that a new file is created. My guess is that you missed that step before.

Using the REST API you can monitor values getting persisted. Starting with a call without any time setting (i.e. a request for the last 24 hours), you will see the number of stored values increasing to 1440.
After a day, try a query for a timeframe starting more than a day ago; omitting the end date will take the default (now). You will then see another value stored after each day.

I’m standing by for further questions.

Thank you, that helps already.
It still does not work as expected. So I think I have to work on my expectations further. Your help is very appreciated.
I deleted the files again. I was not sure anymore whether I had deleted them consistently after each change.
The hint about the archive-wise reading of the time frame was good. That helps in understanding. Is there a reading resource for rrd databases where this and more is listed?

I still do not understand the behavior:
The following definition I used:

watt.def=GAUGE,900,0,16000,60
watt.archives=AVERAGE,0.5,1,2880:AVERAGE,0.5,30,17520
watt.items=zwave_device_3cab72a2_node51_meter_watts

However, although the chart has a longer time frame, I only see 6 h in the chart. Interestingly, this is also the case when I use 30 s between two data points in watt.def.
Could it be that openHAB only shows 360 values at most? It seems to hold for both the REST API and the chart.
Could it be that only a 60 s time interval between data points is supported? That would at least fit the *.persist item definition as a minimum time frame.

I totally forgot the “Persistence Viewer” built by user 5iver. Having installed it (see its GitHub page), you can use this (adjusted) link http://Your_openHAB_IP:8080/static/PersistenceViewer/ to view the data.

For how long did you persist data? (My guess: only the 6 hours seen; persisting longer should show more.)
Changing the sample rate changes the whole database setup! If you changed only the sample rate, archive 1 of your setting would cover 30 seconds * 2880 = 24 hours. Due to the everyMinute strategy, a new value would have been passed to the database only every minute.

No, I’m looking at charts with data for a year (in steps of 4 hours).

Not understood!

Using a sample rate lower than 60 seconds should produce shorter intervals in the datasource; however, due to the everyMinute strategy a new value would have been passed to the database only every minute, hence the same data gets persisted until a new one is provided.

You could use a finer-grained setting than everyMinute; however, a lower sample rate is required here in order to really persist those values.
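For illustration, such a finer-grained strategy could be defined via a cron expression in the .persist file. This is a sketch only: the strategy name every30s is made up, and the cron expression uses the Quartz syntax that openHAB persistence understands:

```
Strategies {
    every30s : "0/30 * * * * ?"
    default = everyChange
}

Items {
    jeelink_lacrosse_47_temperature : strategy = every30s, restoreOnStartup
}
```

As said above, this only helps if the sample rate in the .def line is lowered to match.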

Is that in an openhab chart or in the additionally installed viewer?


The behavior I see is weird. I hope you can find the time to look at this. It both supports and contradicts what you are saying.
What I did:
I changed the config of “watts” and added more archives so that openHAB may use another archive to show more than 6 h of data. The configuration is:

watt.def=GAUGE,900,0,16000,60
watt.archives=AVERAGE,0.5,1,2880:AVERAGE,0.5,4,360:AVERAGE,0.5,28,360:AVERAGE,0.5,30,17520
watt.items=zwave_device_3cab72a2_node51_meter_watts

In addition I made up a new config for a temperature sensor:

temp.def=GAUGE,900,-30,50,60
temp.archives=AVERAGE,0.5,1,2880
temp.items=jeelink_lacrosse_47_temperature

I also saved the config file and deleted both rrd files. New rrd files were created:


The temperature archive was created later, yet it is bigger in size. How can that be?
This is reflected in the chart:

The blue temperature curve starts at ~2 pm as expected. The yellow power curve is exactly 6 h long. It is always 6 h long. When looking into the chart 10 min later, it starts 10 min later and again covers exactly 6 h.

I double checked that there is not another “watts” .rrd file. There is no other. I also made a new chart and selected the sensor anew. The result is seen above. In the REST API there are exactly 360 values in the 24 h archive, so the chart shows what is in the file. In contrast, the temperature archive already shows 415 data points.

Any idea what to do?

Actually, no!

One possible cause: the persistence service should be stopped, the .rrd file deleted, and then the service started again.

I would like to check your .rrd files with the rrd4j-inspector; however, I can’t get it to open any of my files ATM. Using the inspector we could see the actually used datasource settings.
[Edit:]
After a night’s sleep and some more tries I’m now able (again) to open .rrd files created with openHAB 2.X. I can’t say why it did not work before.
Could you send me the .rrd files from the post above via PM? For a little safety, please stop the rrd4j bundle before you copy the files (I’m not sure if that was the problem).

If you want to do it yourself, read this thread and use the download link in the second post.

Thank you for further digging with me to the ground.

I do not know how to stop the persistence service. Do you have a link for it?

What I did yesterday was:

  1. I stopped openhab
  2. I cleaned the cache of openhab
  3. I restarted openhab by reboot

The result is still the same: the last 6 h of data, nothing more. Strangely enough, the temperature data archive, which I started with the same first archive setting, works fine. I see all the data starting at the time the old file was deleted and the new config was used.

I will check the viewer and send the file to you.

For stopping any “bundle” (without the need to stop openHAB completely), start the karaf console (ssh -p 8101 YourOpenHABIP@localhost), use bundle:list to show all started bundles, search for the number of the openHAB RRD4j Persistence Bundle and use this number for bundle:stop XX. For starting, use bundle:start XX.
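The console session could look roughly like this (assuming the default console user openhab; the bundle number varies per installation, 210 is just a placeholder):

```
ssh -p 8101 openhab@YourOpenHABIP      # open the karaf console
bundle:list | grep -i rrd4j            # find the RRD4j persistence bundle number
bundle:stop 210                        # stop it (use your own number)
bundle:start 210                       # start it again
```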

Don’t do that just for the heck of it; it causes disruption. If you have a particular purpose go ahead, but expect to have to reboot again afterwards to restore normal workings.

Thank you for writing up how to do it. I already cleaned the cache and rebooted twice. Does that have the same effect, or shall I also stop the service?

I did reboot after that, indeed. I already started the Pi another time.

I took out the item of the persistence and let the system run. Then I rebooted.
After running the system for a while, I deleted the rrd file and introduced the configs again. Guess what: it still does not persist more than 6 h. Right after starting the persistence, the file size is 4 kB.

The .rrd file size is fixed. It is created with the full size and will not grow!
PM me the .rrd file and I’ll have a look into it and your
services/rrd4j.cfg

Restarting a bundle will ONLY restart that part of openHAB. It has no effect on the cache.
I recommended that one in order to be sure that the new settings for the rrd4j datasource are being used.


I tried to run the persistence for the item with standard settings, i.e. no entries in the cfg file. I deleted the rrd file and also deleted the entries in the persist and cfg files. No persistence was available, as expected. Then I only added the entry in the persist file, expecting the normal archive size. Again only a 4 kB file size was found, and only the usual 6 h were seen.

I am going to delete the entries in the persist and cfg files, then delete the rrd file and then stop the persistence service. Then I will add the config and persist entries again.
I would then like to take your offer, OPUS, and send you whatever comes out, plus the config files.


A look into the (deleted) .rrd file would have been useful as well.