RRD4J, historicState(), and NaN gaps in data

I’ve got an openHAB development system on an old laptop; it does not run 24/7.
I’m generating simulated numeric meter readings and use RRD4J persistence, with the usual everyMinute settings and default consolidation strategy.

Because the system is not 24/7, I have gaps in the persisted data, naturally enough. Here’s a sample from rrdInspector, 2nd archive where it’s been consolidated to every 4 minutes.


No surprises so far.

If I now try to recover values using historicState() in rules, I get into trouble.
Yes, I realise the older data may have accuracy compromises because it has been averaged over time, but that is not the problem here.

I don’t really know how historicState() decides which archives to look in, but I tested here with a target time that is not in the first every-minute archive, but is in the second 4-minute archive.

Requesting historicState() for 11:59 gives me the expected object with state 3806.75 and timestamp 11:56.
I say expected, because what I would expect is for searching for 11:59 exactly to “fail”, but then look for preceding data instead.

Likewise, seeking for 12:02 returns data stamped 12:00.
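The two successful lookups above are consistent with the requested time being rounded down to the start of its 4-minute step. A minimal Python sketch of that arithmetic follows. The helper name `bucket_start` and the date are made up for illustration, and this only models the observed timestamps, not how rrd4j or openHAB actually compute them.

```python
from datetime import datetime, timezone

STEP_SECONDS = 4 * 60  # step of the second (4-minute) archive


def bucket_start(ts, step_seconds=STEP_SECONDS):
    """Round a timestamp down to the start of its archive step."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % step_seconds, tz=timezone.utc)


# Arbitrary date; only the time of day matters for this illustration.
for hour, minute in [(11, 59), (12, 2)]:
    asked = datetime(2000, 1, 1, hour, minute, tzinfo=timezone.utc)
    print(asked.strftime("%H:%M"), "->", bucket_start(asked).strftime("%H:%M"))
```

UTC is used so that the step boundaries line up with whole hours regardless of the local timezone.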

Requesting historicState() for 12:10 returns null.
That’s not what I expect. As I understand it, if there is no record at the specified time instant, historicState() should grope back in time until it finds a record, making the assumption that data persisted until the given time.
In short, return the next-oldest data.
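To make the difference concrete, here is a small Python model of the two lookup strategies. It is only a sketch: the function names are hypothetical, times are minutes since midnight, and apart from the 3806.75 reading at 11:56 the row values are invented for illustration. It does not reproduce openHAB's actual code.

```python
import math

STEP = 4  # minutes per row in the consolidated archive

# (minutes since midnight, value); NaN marks the gap rows.
rows = [
    (716, 3806.75),   # 11:56, value quoted in the post
    (720, 3810.0),    # 12:00, made-up value
    (724, math.nan),  # 12:04, gap
    (728, math.nan),  # 12:08, gap
]


def exact_step_lookup(rows, t, step=STEP):
    """Return only the row whose step contains t; a NaN row yields None."""
    start = t - t % step
    for ts, value in rows:
        if ts == start:
            return None if math.isnan(value) else (ts, value)
    return None


def reach_back_lookup(rows, t):
    """Return the newest non-NaN row at or before t, or None if there is none."""
    valid = [(ts, value) for ts, value in rows if ts <= t and not math.isnan(value)]
    return max(valid, key=lambda row: row[0]) if valid else None


print(exact_step_lookup(rows, 719))  # 11:59 -> (716, 3806.75)
print(exact_step_lookup(rows, 730))  # 12:10 -> None, as observed
print(reach_back_lookup(rows, 730))  # 12:10 -> (720, 3810.0), as expected
```

The first function matches what was observed; the second is the "next-oldest data" behaviour described above.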

It’s arguable that the NaN records do represent “the next-oldest data”.
But those NaN records have been created, synthesized, by the rrd4j aggregation process. No data was actually recorded for those times at all.
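A simplified Python model of how averaging into fixed steps ends up with NaN rows where nothing was recorded. This is an assumption-laden sketch, not rrd4j's algorithm (which also has heartbeat and xff settings that are ignored here); `consolidate_average` and the sample values are invented for illustration.

```python
import math


def consolidate_average(samples, start, end, step):
    """Average per-minute samples into fixed steps; empty steps become NaN.

    samples maps a minute number to a reading. Every step in [start, end)
    gets a row, whether or not anything was recorded during it.
    """
    rows = []
    for bucket in range(start, end, step):
        values = [samples[m] for m in range(bucket, bucket + step) if m in samples]
        rows.append((bucket, sum(values) / len(values) if values else math.nan))
    return rows


# Readings for minutes 0-3, then the system is off until minute 12.
samples = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0, 12: 10.0}
for row in consolidate_average(samples, 0, 16, 4):
    print(row)
```

The rows for minutes 4 and 8 come out as NaN even though no sample was ever written for them.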

Note that whatever it is that the BasicUI simple Chart widget does to access the same gappy dataset, it does not choke on it and draws a chart using the “ignore NaN and assume last value remains valid” strategy.

I think this is a bug, but welcome discussion. @opus perhaps?

Upfront I have to say that I am an interested user of rrd4j, and your statement “I don’t really know how historicState() decides which archives to look in” is true for me as well.
Both of the samples where you got a result lie within a period where values had been stored; the one that returned null does not have a directly preceding value in its timestep. To me it LOOKS like historicState looks into the first archive that fits and only returns what is stored as a value for the requested timestep. In other words, it does not go backwards until it finds a value but takes exactly the value that is stored for the “touched” timestep. I never dared to look that deep into the code.

Thanks for taking the time to look.

Yep, that’s the thing. Is it supposed to reach back for a valid value? (as it would in any other database).
I think I’ll lodge a formal bug later.

It may be that this is something that is not easy to overcome, because of the way rrd4j works, in uhh “fabricating” NaN data to span gaps.

Most users would never see it unless they target a timeslot corresponding to a reboot or such.

EDIT -

Just for info, I made a kludgy workaround for my purpose.

I’m actually recording meter readings. Yes, influxdb would be “better” for that, but I already use rrd4j for temperature charting etc. I’m aware of the limitations due to data consolidation, but it’s good enough for my purpose.

So, my task is just to get meter reading for yesterday 00:00 and today 00:00, and calculate yesterday’s daily consumption. This goes wrong if the historicState() function I’d usually use to retrieve these data points hits a NaN record, as described earlier.

My cheat is to use minimumSince() instead. This skips over NaN, and because my meter is only incrementing, the first valid record it finds will be the lowest. It’s NOT the same, but good enough for my needs for the time being.
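The idea behind the workaround, modelled in Python. This is a sketch under stated assumptions: `minimum_since` here is a stand-in written for illustration and not openHAB's minimumSince(), times are hours counted from yesterday 00:00, and all readings are made up.

```python
import math


def minimum_since(rows, since):
    """Smallest non-NaN value at or after `since`, or None if there is none."""
    values = [value for ts, value in rows if ts >= since and not math.isnan(value)]
    return min(values) if values else None


# (hours since yesterday 00:00, meter reading); NaN rows are gaps.
rows = [
    (0, math.nan),   # yesterday 00:00, system was off
    (1, 100.0),
    (12, 106.0),
    (24, math.nan),  # today 00:00, system was off
    (25, 112.5),
    (30, 114.0),
]

# For a meter that only counts up, the minimum since a point in time is
# the first valid reading after it.
yesterday_start = minimum_since(rows, 0)   # 100.0
today_start = minimum_since(rows, 24)      # 112.5
print(today_start - yesterday_start)       # 12.5
```

Note that each figure is the first valid reading after midnight rather than the reading at midnight itself, which is why the result is only approximately yesterday's consumption.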


I do not yet have a viable OH3 system to verify if this problem, err, persists.

Since my github issue report was against 1.x persistence extension as used in OH2, I am concerned it will now get lost but that the issue may still exist in OH3. If anyone can verify or encounters this in OH3, we should make a new issue report.

I am a bit concerned since, for most users, this will look like a transient error and will not be understood. They will have few NaN gaps in the rrd4j database, and are probably using now.minusHours type queries that will likely work when run the next day etc.

I have a .rrd file which was built by the OH3 default persistence and which has a short timeframe with NaN.
Doing some requests for historicState around the time when only a single value is persisted gave the following:

2021-01-02 17:15:37.077 [INFO ] [.core.model.script.TestHistoricState] - Value(Day Before): null
2021-01-02 17:15:37.099 [INFO ] [.core.model.script.TestHistoricState] - Value: 16.12.20, 22:45: CPU_Load -> 0.1
2021-01-02 17:15:37.121 [INFO ] [.core.model.script.TestHistoricState] - Value: 16.12.20, 22:45: CPU_Load -> 0.1
2021-01-02 17:15:37.140 [INFO ] [.core.model.script.TestHistoricState] - Value: null
2021-01-02 17:15:37.157 [INFO ] [.core.model.script.TestHistoricState] - Value: null
2021-01-02 17:15:37.178 [INFO ] [.core.model.script.TestHistoricState] - Value: null
2021-01-02 17:15:37.198 [INFO ] [.core.model.script.TestHistoricState] - Value: null
2021-01-02 17:15:37.217 [INFO ] [.core.model.script.TestHistoricState] - Value: null
2021-01-02 17:15:37.238 [INFO ] [.core.model.script.TestHistoricState] - Value(Day After): 17.12.20, 21:00: CPU_Load -> 0.1

The first and last calls were for a day before and a day after; all others are at 15-minute steps (same as the steps in the archive). During this time only the value for 22:45 is in the database; all others are NaN.

In other words: No change!
