How to calculate average from "first data point in set of persisted data"

I am using rrd4j as persistence service for values read from my energy meter. One thing I would like to display in my sitemap is the current power consumption (“now”) versus the average power consumption of my household.

For the “average power consumption” I am using the .averageSince() method from the persistence extensions. Since I would like to have the average since “dawn of times” (= whenever I started measuring the power consumption), I have - in my rules - a variable defining the epoch (as a DateTime), and then I use this variable as input to the .averageSince() method.

To find the value of the epoch I have simply used HABmin and looked at the table view of the data in question.

Now, my setup seems to work well - for a period of time. Once in a while the .averageSince() method fails (it returns null, I believe). When this happens I can see from the table view in HABmin that the oldest data point is now later than my defined epoch. Consequently I have to move my epoch forward in time, and then all works well again - for a period of time.

What I really want to do is tell the .averageSince() method to start with the first available data point (historic state) in the persistence set, instead of giving it a specific date and time. Sadly, I have not found a way to do this.

Does anyone know how to do this, or can anyone suggest a different approach?

It is odd that averageSince returns null when you ask for data older than what is in the database. That is kind of a bummer because it would be far easier to ask for something like now.minusYears(30) or create a DateTime with 0 as the time (00:00 Jan 1, 1970).

Unfortunately you are running into some of the tradeoffs you get when you use rrd4j. When it “compresses” the older data the datetimes change. I think you will continue to see this as your data gets older and older and rrd4j culls and compresses the older data. If you are truly interested in this data you might consider using something like db4o or InfluxDB, which don’t monkey with your data as it ages, though the DB will continue to grow forever.

It is reasonable to use these DBs for only those items you want to do charts and calculations like this on and rrd4j for everything else.

Exactly my point! I was quite unhappy when I discovered this behavior, :frowning:

Yes, from what I’ve seen so far I expect this to be a continuous problem going forward.

On a separate note, for this particular function (getting average “since dawn of times”) I am quite happy to not have the exact data to play with (even though the average might then be slightly off) so I could live with using rrd4j if I could just find a way to avoid the problem with the “moving epoch”.

Anyway, I guess I must either move over to another persistence service, or maybe roll my own running calculation of average by keeping track of the total (sum of all samples) and the number of samples…
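
For illustration, the "roll my own" idea could look roughly like the sketch below. It is plain Java rather than a rule, and the class name RunningAverage is made up for this example. It only keeps the sum and the number of samples, so every sample counts equally; with irregular sample intervals the result will differ from a time-weighted average.

```java
/**
 * Hypothetical helper (not part of openHAB): an incremental mean that
 * stores only the sum of all samples and how many samples were seen.
 */
public final class RunningAverage {
    private double total = 0.0; // sum of all samples so far
    private long count = 0;     // number of samples so far

    /** Record one new sample, e.g. one power reading in watts. */
    public void add(double sample) {
        total += sample;
        count++;
    }

    /** Unweighted mean of all samples recorded so far. */
    public double mean() {
        if (count == 0) {
            throw new IllegalStateException("no samples recorded yet");
        }
        return total / count;
    }

    /** Number of samples recorded so far. */
    public long count() {
        return count;
    }
}
```

In a real setup the total and the count would also have to survive restarts (for example by keeping them in persisted items), which this sketch does not cover.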

Maybe move your beginning of time in by a week or a month. If you aren’t too worried about the averages being off I suspect you would be OK with skipping the first week or month’s worth of data. Then even as the first value keeps moving around in time it won’t mess up your query.

Interesting thought, but wouldn’t this just buy me some more time before I need to update my epoch again? I mean, eventually also data that is a week or a month “newer than the oldest” would also be thrown out, right? Or am I misunderstanding how rrd4j works?

According to the wiki page for rrd4j, at ten years old the data granularity is every seven days for numbers (i.e. not switches or contacts). So if you skipped the first two weeks (just to be safe) you would probably be OK forever, or at least for the life of your current HA configuration.

The granularity at ten years old for switches and contacts and such is every one day.

I bet that by the time you get to data older than 10 years old the ability to configure this will be implemented so you can make sure it doesn’t mess you up.

For some reason, I’m not able to reproduce this problem. I only have about a month of data persisted. However, I can do an averageSince with DateTime(0) as the start time. This works for both rrd4j and mysql persistence providers although I get different averages because of rrd4j’s data compression for older data points.

Interesting…

I tried with the following line in my rule:

var Number PowerAverage = nPower.averageSince(DateTime(0)) as DecimalType

but this produces the following error:

2015-10-21 22:50:23.834 [ERROR] [o.o.c.s.ScriptExecutionThread ] - Error during the execution of rule 'Electricity - Power - Average consumption': The name 'DateTime(<XNumberLiteralImpl>)' cannot be resolved to an item or type.

Ps! I do have the following as well:

import org.joda.time.*

and I am using DateTime() with values different from 0 to define my epoch, so DateTime() as such should work.

I gotta say I hate messing with DateTime.

From the error I’m thinking it can’t figure out that your ‘0’ is a long.

Try the following:

val long zero = 0
var Number PowerAverage = nPower.averageSince(DateTime(zero)) as DecimalType

You could also make absolutely sure you are using the joda date time by using the fully qualified name of org.joda.time.DateTime on the averageSince line.

Beyond that I’m out of ideas. That’s what I would try.

@steve1 : Do you mind sharing your working code so that I can sanity check my own?

BTW; As I understand it from the documentation (Jodatime) DateTime(0) is basically creating a DateTime object referring to 1970-01-01:00:00…, right?

@rlkoshak : I tried your suggestion (explicitly stating the type to be long) but unfortunately I get the same error.

As it turned out, the above line should be:

var Number PowerAverage = nPower.averageSince(new DateTime(0)) as DecimalType

Note the ‘new’ keyword.

This got rid of the error - and more importantly: it got rid of my original problem!!

I am not sure what I did wrong before, but using new DateTime(0) - which translates to 1970-01-01T00:00:00Z - as reference point for the .averageSince method now gives me a proper answer - regardless of when in time the first data point is.

Thanks for your help! Support case is now closed, :slight_smile:

Hello,

For a few years I have used the expression

.averageSince(new DateTime(0)) as DecimalType

in my rule with my rrd4j persistence service without any problem. But since my move to OH3 this no longer works (DSL rule with NGRE in Main UI). I get the error:

DateTime cannot be resolved.; line 26, column 2213, length 8

I have no clue what to change… Can somebody give me a hint?

Thanks in advance.

Persistence methods of OH3 under Java11 no longer use DateTime objects.
You’ll need to give it a ZonedDateTime instead.
Searching this forum should offer some help.

If I understand it right, I can’t give rrd4j an actual date, which is why I had to use the

new DateTime(0)

expression. Is there an equivalent ZonedDateTime expression without using the translated date?

You’re only trying to set “a long time ago”. There will be six different ways to do this, but
now.minusYears(20)
should do it.
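
As a rough sketch of two of those ways, here they are in plain Java using java.time, which is where ZonedDateTime comes from. The class and method names (LongTimeAgo, epoch, yearsAgo) are made up for this example. The first method is the closest counterpart to the old Joda new DateTime(0); whether rrd4j is happy with a start time that far back is not something this sketch can confirm.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

/** Hypothetical helper showing two ways to express "a long time ago". */
public final class LongTimeAgo {
    private LongTimeAgo() {}

    /** The Unix epoch (1970-01-01T00:00:00Z) in the system time zone. */
    public static ZonedDateTime epoch() {
        return ZonedDateTime.ofInstant(Instant.EPOCH, ZoneId.systemDefault());
    }

    /** A point in time the given number of years before the current moment. */
    public static ZonedDateTime yearsAgo(long years) {
        return ZonedDateTime.now().minusYears(years);
    }
}
```

In a DSL rule, now is already a ZonedDateTime, so now.minusYears(20) corresponds to yearsAgo(20) here.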


Thank you, that did the trick. I had thought that a fixed time step back would cause an error with rrd4j when nothing is persisted that far back. For example, with now.minusYears(50) I get the same error as mentioned before.