Timestamps and persistence

Something that I would like to see added to openHAB core is support for contextual information on the Item state and on emitted events, which can be made accessible to bindings, rules, the API and UI expressions.

  • Most importantly timestamps for last change and last update
    • ideally with support for setting these timestamps by a binding or the rest API if this info is available from the real-world device or service.
  • secondarily, the source/reason of the state change or update (UI, automation, api, binding, optionally annotated with the specific user/token/rule_id/binding).

There have been open issues and discussions around this for a very long time, but no progress has been made due to differing views on the core architecture, on how to do it without breaking bindings and rules, etc. Some links:

In my opinion, the time of last change/update is a fundamental part of virtual state. The system should have the information whether the virtual entity state reflects the most recent and accurate information from the physical world. This information could be used for instance in rules to help avoid incorrect decisions based on outdated or incomplete information.

We can express the probability or confidence P(t) that the current virtual state represents the state of the actual sensor as a function of the time passed since the last update through an exponential decay function:

P(t) = exp(-λt)

where λ is the decay constant, which represents the rate at which the probability or confidence decreases as time passes. The larger the value of λ, the faster the probability or confidence decays.

For example, if λ = 0.1 per minute, then after one minute P(t) = exp(-0.1) ≈ 0.905. This means that the probability or confidence that the virtual state represents the state of the actual sensor is about 90% after one minute has passed since the last update. After 5 minutes this probability drops to about 61%.
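A minimal Python sketch of this decay function, in case it helps to play with the numbers. The function names, the per-minute unit for λ and the 0.5 threshold are my own assumptions for illustration, not anything that exists in openHAB.

```python
import math


def state_confidence(seconds_since_update: float, decay_per_minute: float) -> float:
    """Return P(t) = exp(-lambda * t), with t converted to minutes."""
    if seconds_since_update < 0:
        raise ValueError("time since last update cannot be negative")
    minutes = seconds_since_update / 60.0
    return math.exp(-decay_per_minute * minutes)


def is_trustworthy(seconds_since_update: float,
                   decay_per_minute: float = 0.1,
                   threshold: float = 0.5) -> bool:
    """Hypothetical helper: treat the state as usable while P(t) >= threshold."""
    return state_confidence(seconds_since_update, decay_per_minute) >= threshold
```

A rule could call `is_trustworthy(age_in_seconds)` before acting on a sensor value, where the age comes from whatever last-update timestamp is available.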

Besides the need for accessing these timestamps, there is a need to ingest historical (or future) sensor values. This has been made partially available with ModifiablePersistenceService, but there is currently no support for doing this from bindings, and you cannot set the current state and the historical state without introducing duplicates for the last value. (But maybe this is a separate, albeit related, topic.)

Therefore, in my opinion we need these timestamps in order to be able to perform these kinds of calculations in automation rules. And we need to support the ingestion of actual timestamps if a binding has this information.

I can’t comment on bindings here but I can’t see how it would be useful in rules beyond what Persistence already provides.

When a rule is triggered by an Item event, it means that the event occurred now, or at least as close to now as you ever get in computers, so let’s say within the past 100 msec (if it’s longer ago than that, there is probably something wrong or you are running on a particularly slow machine). This could be measured on your particular machine.

Once you are inside the rule, if you have an everyChange strategy for an Item then lastUpdate will give you the timestamp when it changed (it’s frankly a poorly named method). If you have an everyUpdate strategy for an Item then lastUpdate will give you the timestamp when it was last updated. Unfortunately, if you have a periodic strategy (e.g. everyMinute) you can’t rely on that timestamp.

So in a rule, as far as I can tell, you can apply your λ just as you require even now.

I’d second that.
I agree it’s sometimes cumbersome to e.g. get the right value and to think of proper logic to ensure it’s up to date, but a timestamp won’t change much about that.

Also remember that any feature has its cost.
Very very few people need to (or rather: want to) do time critical or complex math operations in the automation language, and directing development resources there means they will be lacking elsewhere, with questionable benefit for the OH project as a whole as it’s in the end just, well, only very very few people that benefit.
And those are 100% programmers that can also think of alternative ways how to handle this stuff.

My POV on that is KISS: if there’s a low hanging fruit, ok, grab it.
But if this requires substantial changes to the core, better not.
The cost(+risk)-to-benefit ratio is too bad.

And for those who want it, there’s options. There’s Influx and other DBs that offer all sort of number crunching. There’s profiles that provide a last change timestamp, there’s persistence to offer changedSince(), there’s a PWM automation, and I recall there was a great proof of concept to handle even Bayesian optimization on combinations of item values (btw @cweitkamp any update on that?) and just like with quite many wishes to appear here, there often already exists a way to handle this outside of the core or even outside of OH and it’s just about finding it.

The timestamp question is a little bit more complex. IF we add a timestamp, we could also use it for providing future values (which is e.g. useful for energy calculations) or catch up with historical events after a re-connect. Adding timestamps is a very low hanging fruit, there are however concerns because the timestamp we add is not necessarily the timestamp of the “real” event and also it might change the strict linear order of events in the SSE event stream.

Yeah, when posting I thought of exactly that, too. But there, too, frankly I’m still undecided if that’s a good idea conceptually.
For example, any intent to make power available tomorrow 11:00-12:00 at say 10 ct is just that, an intent that can still be changed by then so it isn’t the same quality like an event at 11:00 that says the price IS 10ct from now on. Remember you might notice or not should the intent change meanwhile as you might be offline. That can lead to all sorts of friction in application programming.
Using the same item for intent and events therefore is, well, conceptually cringe IMHO.

I myself am also struggling to display both, current and tomorrow’s energy prices in one chart,
but there’s options to do it without future timestamping, you can use proxy items or charting offsets.

Then again, the ability to provide timestamps on .persist() other than the implicit <now> would ease programming quite a bit, so if, as you say, it’s low-hanging, why not.
(I wasn’t involved in the discussion and I don’t wanna start any discussion here so forgive me this comment but I don’t understand what that has to do with ordering in the event stream)

The problem is that currently we assume that if event2 comes after event1 then it also happened after event1. This is not necessarily the case when we allow timestamps (historical or future), because event2 could have a timestamp that is older than the timestamp of event1 (either because it happened before and was created retrospectively, imagine a binding that can retrieve past values, or because the previous event was a “future” event).
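To make the ordering concern concrete, here is a small Python sketch. `TimedEvent` and `out_of_order` are hypothetical names of mine and are not openHAB classes; the sketch only assumes each event carries an arrival sequence number and the timestamp its state refers to.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class TimedEvent:
    sequence: int        # position in the event stream (arrival order)
    timestamp: datetime  # point in time the state refers to
    state: str


def out_of_order(events: List[TimedEvent]) -> List[TimedEvent]:
    """Return events whose timestamp is older than that of an earlier arrival."""
    late = []
    newest = None
    for event in sorted(events, key=lambda e: e.sequence):
        if newest is not None and event.timestamp < newest:
            late.append(event)  # arrived later, but refers to an earlier time
        else:
            newest = event.timestamp
    return late
```

Both cases from the paragraph above show up here: a retrospectively created event is flagged, and so is a perfectly normal "now" event that arrives after a "future" event.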

The problem with using items for predictions is that this requires one item per datapoint, so 48 items for hourly changing values for today and the next day. This is already a lot, but what if energy suppliers change to prices that change every 15 minutes? Nearly 200 channels and items for that is really a bad design. OTOH, we already have persistence and could enable it to store values including timestamps and could get this information from there with only one item. This also eases charting.
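A rough Python sketch of the arithmetic and of the single-item alternative. The helper names are made up for illustration, and the list of (timestamp, value) pairs only stands in for whatever a persistence service would return for one item.

```python
from bisect import bisect_right
from datetime import datetime
from typing import List, Optional, Tuple


def slots_needed(days: int, slot_minutes: int) -> int:
    """Number of datapoints (one item each today) for the given horizon."""
    return days * 24 * 60 // slot_minutes


def value_at(series: List[Tuple[datetime, float]], when: datetime) -> Optional[float]:
    """Return the value valid at 'when' from a series sorted by timestamp.

    The value of the latest entry at or before 'when' is returned,
    or None if 'when' lies before the first entry.
    """
    timestamps = [ts for ts, _ in series]
    index = bisect_right(timestamps, when) - 1
    return series[index][1] if index >= 0 else None
```

`slots_needed(2, 60)` gives the 48 items mentioned above and `slots_needed(2, 15)` gives 192, while `value_at` needs only one timestamped series regardless of granularity.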


Sure thing, and it’ll become even more as granularity will increase to 15min slots soon.
That’s unacceptable to deal with in user-side programming, I myself have been cursing that a lot already.
So I, too, consider a (future-)timestamped item series of values to be the better choice for this.

However, my comment is sort of independent of that.
We wouldn’t generate an event when time has elapsed and that future point in time we have persisted the data with has become “now”, would we ? Just the timestamp is there, part of persisted data.
Again, I’m not involved in those discussions but that’s why I was wondering what timestamping alone and all by itself has to do with the real ordering of events.

EDIT: on a sidenote, isn’t the timestamp already part of the persisted data ? It may not be possible to retrieve it (or is it?), but if it wasn’t there, how else would it work to retrieve data from a specified point in time ?
IIRC docs also say that if I use .averageSince(), it’ll compute a (time-)weighted average.
That wouldn’t work if the timestamp wasn’t there.

This might be, or not, what Rich meant when he said that it’s all there in persistence.
To be outright frank, I didn’t understand the point of @liaskt’s request anyway.
Even on bindings, if a binding changes an item through a channel, persistence if properly configured for “every change” will record it, too.

That’s right.

It helps me a little bit to separate the event from the Item’s state. To handle @liaskt’s use case right now, we can depend on persistence using only lastUpdate and the assumption that if we are processing an Item event, it occurred now. If persistence is configured properly that’s all you need to know how long ago an Item’s state was changed or updated. You don’t even need anything fancy. If that’s your only use case all you need is MapDB.

Timestamps are indeed stored in persistence but right now there is no way for bindings to say “this Item state represents a predicted value for 12:34 tomorrow” and have persistence save it with that timestamp instead of now. There are workarounds through the REST API for persistence, but bindings don’t have access to that, and rule access is awkward, and even there you’d need to get the timestamp from a separate DateTime Item or through a calculation.

I definitely see a need for that use case but I too have concerns over what that means to the event stream. What does it mean to a rule to be triggered by an event that is in the future or the past or out of order? Is that even the event or is it something else?

Thank you for your feedback. I have read your arguments and the ones in the open issues I linked previously, and they are all valid concerns. I am aware that there are some workarounds as well, some of which I am using myself at the moment.

I have compiled some use cases below, workarounds with current OH, some issues with these workarounds and some potential solutions (probably a separate post would be better).

Task: Access the item lastUpdated timestamp

Why:

  • Reason about whether the virtual state reflects the real physical state of a sensor.
  • In the UI: Show NULL/UNDEF (“unknown”) instead of the last stale state the system received from the sensor, or show the last update timestamp so that users can judge for themselves.
  • In rules:
    • Take a different decision based on the time passed since the last update (e.g. due to low confidence that this is the real state at this moment).
    • Attempt to refetch the state (if the binding/api provides an on-demand “pull” method)

How to achieve with OH3

  1. If the item update is the reason of a rule trigger, then assume it just updated/changed “now”. This may be some milliseconds off which in most cases is not an issue.
  2. Enable persistence services with an everyUpdate strategy. Then you can use persistence extensions such as <item>.lastUpdate and <item>.updatedSince(ZonedDateTime)
  3. For the UI, I am not aware of a solution to show NULL/UNDEF (“unknown”) instead of the last stale state (use the “expire” profile maybe?). In order to show the lastUpdate from persistence so that users can judge for themselves, is it possible to show it with a custom widget?

Issues with the current approaches

  • Issues with solution 1
    • This does not work if you want to reason about the state of an item which is not part of the rule trigger (rules that run on schedule, or rules that are triggered by a different item change or multiple items).
  • Issues with solution 2
    • OpenHAB is already a “stateful” framework which keeps the Items registry and their state in memory. You shouldn’t need persistence services enabled in order to access just the last updated timestamp of an item. If the timestamps were stored in-memory together with the state you could just access them by .lastUpdate or a similar interface.
    • Persistence services have an overhead (disk space, disk writes, query latency to access the timestamp) which may be an issue in some use cases, such as with high frequency updates or high frequency checks of the state. You have to enable the “onEveryUpdate” strategy in order to be able to retrieve the last update timestamp, which increases the overhead of the persistence service.

Proposed solution

  • Save lastChange and lastUpdate in-memory along with the Item state so it is available in rules and the UI without persistence queries.
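As a rough illustration of what "timestamps next to the state" could look like, here is a Python sketch. `TimestampedItem` is a made-up class and does not mirror the real openHAB Item API; it only assumes that an update always refreshes lastUpdate, and refreshes lastChange only when the value differs.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class TimestampedItem:
    name: str
    state: Optional[str] = None
    last_update: Optional[datetime] = None
    last_change: Optional[datetime] = None

    def update(self, new_state: str, when: Optional[datetime] = None) -> bool:
        """Store new_state and its timestamps; return True if the state changed."""
        when = when or datetime.now(timezone.utc)
        changed = new_state != self.state
        self.state = new_state
        self.last_update = when      # every update refreshes lastUpdate
        if changed:
            self.last_change = when  # only a different value refreshes lastChange
        return changed
```

With something like this, a rule or the UI could read `item.last_update` directly from memory, without a persistence query.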

Task: Ingest the real lastUpdate/lastChange timestamp from a binding or the REST API.

Why:

  • a binding/script may be configured to send the last state change/update to OH on schedule (e.g. every hour). The same binding may have access to the real timestamp of the last change/update (e.g. a specific timestamp within the previous hour).
  • For the use-cases mentioned in the previous task (what to show in the UI and what decision to take in rules) it would be helpful if the real timestamp was available instead of “ingestion/system” time.

Use-cases:

  • Battery-powered devices or devices with data-caps (e.g. NarrowBand, special SIM cards for trackers etc.) may be configured in this way to save battery or data transmission volume.
  • mobile devices that keep the last state in a buffer until network connection becomes available.

How to achieve with OH3

  1. Use a different channel and different item to retrieve and store the value of this timestamp. E.g. a DateTime-type Item named “realLastChange” could be used if such a channel is made available by the binding (or with the REST API from a rule if this info is available by other means).
  2. Use the REST API /persistence endpoint along with a ModifiablePersistenceService to store the state with the real lastUpdate/lastChange timestamp in the database. This is also available in Habapp (uses the same endpoint).

Issues with the current approaches

  • Issues with solution 1
    • Multiple items for attributes of the same state update/change which may not update atomically and are saved with different timestamps in the persistence database. You have to perform a 2nd persistence query to retrieve the DateTime item state but this is not guaranteed to have the same timestamp with the primary item so it is difficult to join reliably.
  • Issues with solution 2
    • The state kept in the OH in-memory registry does not reflect the state stored in the database since OH was bypassed in order to save the correct timestamp in the database. Therefore, the last value shown in a chart may be different from what is shown in an Item card. If you attempt to also send a postUpdate or sendCommand to an item, then duplicates will be written to the database (one with the real timestamp and another with the system time).
    • Another issue is that currently this can only be used with the REST API and not with an official OH interface used by the bindings (the argument is that bindings, by design, should only update things and channels and not items).

Proposed solution

  1. Allow bindings (and optionally rules) to send the real timestamp of lastChange and lastUpdate to OH through standard interfaces (e.g. extra argument in postUpdate or similar methods).
  2. Store these timestamps in-memory along with the Item state
  3. use these real timestamps instead of “now()” in persistence services that support ModifiablePersistenceService.

Questions:

  • What to do if the new timestamp is before the currently stored lastUpdate/lastChange timestamp?
    • Only persist: I would choose to persist this timestamp,state pair but I would not update the current state of the item.
  • What to do when the new timestamp is in the future?
    • Only persist: I would choose to persist the value in the DB but as above I would not update the current state of the item.

Therefore, if newMsg.lastUpdate <= item.lastUpdate || newMsg.lastUpdate > now then just persist, otherwise both persist and update the current state (and timestamps).
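The condition above can be sketched in Python as follows. `Action` and `decide` are hypothetical names, and treating an item with no previous lastUpdate as always updatable is my own assumption.

```python
from datetime import datetime
from enum import Enum
from typing import Optional


class Action(Enum):
    PERSIST_ONLY = "persist only"
    PERSIST_AND_UPDATE = "persist and update"


def decide(new_timestamp: datetime,
           item_last_update: Optional[datetime],
           now: datetime) -> Action:
    """Apply the rule: old or future timestamps are persisted but do not
    change the current state; anything else is persisted and applied."""
    if new_timestamp > now:
        return Action.PERSIST_ONLY  # future value, e.g. a forecast
    if item_last_update is not None and new_timestamp <= item_last_update:
        return Action.PERSIST_ONLY  # historical value, state is already newer
    return Action.PERSIST_AND_UPDATE
```

Passing `now` in explicitly keeps the decision testable and avoids reading the clock twice.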

Task: Ingest multiple historical states from a binding or the REST API.

Why: (extension of the previous use-case)

  • A mobile device has lost internet connection but in the meanwhile it is able to read the state from connected sensors (e.g. GPS). It has the capability to keep these states in a local buffer and send them to OH when the network is available. It would be helpful to be able to send these updates to OH and persist them in the database in order to:
    • visualize in the UI
    • troubleshoot by having access to the whole measurements history (through the logs or the database)
    • use in advanced rules that take into account multiple historical states (e.g. train/extrapolate).

How to achieve with OH3

  1. Use the REST API /persistence endpoint along with a ModifiablePersistenceService to store the historical states with the real lastUpdate/lastChange timestamp in the database. This is also available in Habapp (uses the same endpoint).

Issues with the current approaches

Same issues as previously but mainly for the last value:

The state kept in OH in-memory registry does not reflect the state stored in the database since OH was bypassed in order to save the correct timestamps in the database. If you attempt to also send a postUpdate or sendCommand to an item (for the last “current” state) then duplicates will be written to the database (one with the real timestamp and another with the system time).

Another issue is that currently this can only be used with the REST API and not with an official OH interface used by the bindings (the argument is that bindings, by design, should only update things and channels and not items).

Proposed solution

Same as the proposed solution for the previous task.

Task: Ingest future states from a binding or the REST API.

Use cases:

  • Next day/week hourly electricity prices which can be used to schedule activities with the lowest cost
  • Predictions from bindings or rules

How to achieve with OH3

  1. Use the REST API /persistence endpoint along with a ModifiablePersistenceService to store the future states in the database.

Issues with the current approach

This can only be used with the REST API and not with an official OH interface used by the bindings.

Proposed solution

Same as the proposed solution in the previous 2 tasks (just persist if timestamp is in the future).


Apologies for the long post, it would probably be better to make a separate thread.

For electricity prices, this is exactly the same quality. The price is settled, it will not change.

However, for a weather forecast, your point is valid. This seems more complex to me, as there may not even be a migration path from “forecast” to “actual”, i.e. when time passes the forecasted data, it should be invalidated, but there may not be a source of “actual” data to replace with. At least, I could imagine a binding forecasting weather, but not providing any historic measurements. Also, even for a channel that would be a pure forecast (not mixed with actual), how would we replace the forecasted data points?

Example - forecast 1, 02.04.2023 22:00:

  • At 03.04.2023 06:00 the wind will blow 5 m/s in area X.
  • At 03.04.2023 07:30 the wind will blow 6 m/s in area X.

Forecast 2, 03.04.2023 00:00 - now adjusted:

  • At 03.04.2023 05:00 the wind will blow 5 m/s in area X.
  • At 03.04.2023 06:45 the wind will blow 6 m/s in area X.

We would not like to end up with:

  • 03.04.2023 05:00 5 m/s
  • 03.04.2023 06:00 5 m/s
  • 03.04.2023 06:45 6 m/s
  • 03.04.2023 07:30 6 m/s

Forecast 2 should replace forecast 1, i.e. those values must be deleted. If an item is updated with a timestamp, would that be enough to decide to delete everything after that timestamp? When also storing historic values, this could end up in some undesired deletions, I guess.
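One possible replacement policy, sketched in Python: drop every stored point at or after the earliest timestamp of the new forecast, then add the new points. This policy and the name `replace_forecast` are assumptions for illustration only; they are not how openHAB persistence behaves today.

```python
from datetime import datetime
from typing import Dict


def replace_forecast(stored: Dict[datetime, float],
                     new_forecast: Dict[datetime, float]) -> Dict[datetime, float]:
    """Return a new series where new_forecast supersedes all stored points
    at or after its earliest timestamp; earlier points are kept."""
    if not new_forecast:
        return dict(stored)
    cutoff = min(new_forecast)
    kept = {ts: value for ts, value in stored.items() if ts < cutoff}
    kept.update(new_forecast)
    return dict(sorted(kept.items()))
```

Applied to the example, forecast 2 starts at 05:00, so both points of forecast 1 are dropped and only 05:00 and 06:45 remain. The weakness is also visible: if a later forecast only started at 06:45, the stale 06:00 point would survive, and if historic values are later written with older timestamps the same cutoff logic would wipe newer data.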

Yes, Expire is exactly for this purpose. But in that case, you will be changing the Item’s actual state to UNDEF/NULL, in which case in your rules you don’t necessarily even need the timestamp from persistence. You can just check if the Item’s state is UnDefType and if so you know you can’t use it.

Why doesn’t your “2. Enable persistence services … lastUpdate” cover the other Items and these other cases?

Note that your three solutions are not mutually exclusive. In fact they very likely need to be used together.

It depends on the behavior of the binding. Some update the Item frequently while others only update when there is a change.

That too comes with an overhead, one that could potentially negatively impact the performance of the UI if these timestamps are part of the events.

Do we have actual examples of these that are supported by OH? How are their clocks synchronized? What happens if they are not synchronized?

It’s also worth mentioning that OH is a home automation framework, not a data collection and analysis framework. I’m not aware of a lot of home automation use cases where there are devices that go off and collect data and then dump them when reconnected that would be home automation relevant. And I’m unaware of any battery powered home automation devices that even have a clock, let alone report the time along with the data. Maintaining the current time is itself a battery drain.

Famous last words, spoken first Thursday Oct 24, 1929, renewed by Lehman Brothers, Credit Suisse et al since.
SCNR :slight_smile:


The timestamp-change (and timestamp-update) profile is another solution for the 1st “Task: Access the item lastUpdated timestamp”, thanks for pointing this out.

In order to achieve the next two tasks (“Ingest the real lastUpdate/lastChange timestamp” and “Ingest multiple historical states”) I have currently bypassed the native MQTT binding and instead ingest with Habapp and a set_persistence_data (with ModifiablePersistenceService), therefore it does not fit my setup (since I don’t use the native MQTT channels). It should work well though for someone who just wants to know if the sensor sent a recent update and wants to avoid relying on persistence extensions.

Yes. GROHE ONDUS - Bindings | openHAB. The water sensors (“Sense”) collect temperature and humidity every hour or every 15 minutes (with timestamp), and report the time series online once a day to save battery. The official binding just reports the same, outdated value to openHAB over and over again. I have a local version that mitigates this, however in a way that is unacceptable according to the architecture rules.

Another use-case is remote openHAB instances (“slaves”) that lose contact with the master openHAB installation. Anything happening with connected sensors on the slaves while disconnected just gets lost.

I am strongly in favour of adding an official and generic support for historic and future time series in openHAB.

About clock sync; this is handled in this case by Grohe.

Cheers

Another example which is popular among home users for presence detection or other activities is OwnTracks. OwnTracks can keep locations with timestamps in a local sqlite (e.g. if you are on a flight with airplane mode enabled) and send the locations to MQTT/HTTP when the network is available.

Other use cases mentioned in these threads:

That sounds like a feature request on the binding. On connection, the states of the remote Items should be sent to the connecting OH instance. Or at least have an option for that.

The MQTT Event Bus rules do not have this problem. Item states are published retained so upon connection the latest value gets published.

Hello @liaskt, thank you for bringing this aspect up and for the description of possible use cases. I have observed some of them myself and know, for example, that BACnet has its own concept of “reliability” of reported values. Having a similar concept in openHAB, even in a basic form based on a timestamp, would indeed be very helpful for debugging some of the “lost communications”.
I often find myself looking at the history of items in order to find out whether they became “stale” or not. I began working on an intermediate solution for the bindings I maintain, which I called communication-watchdog. However, it is a low level component employed on the binding side to ensure that channels are being updated based on a basic time interval policy (code example). It does not solve the end user’s trouble, nor does it provide a user controllable reliability policy.

I agree that timestamp-change and timestamp-update are just not going to fly, simply because various channels and items may have different update cycles. Effectively you would need to duplicate each item in order to get that working.
The only viable solution is, as you indicated, making sure that we keep the timestamp next to the received state. I think that “Enable binding to store historic states” by altaroca (Pull Request #3000, openhab/openhab-core on GitHub) is fairly close to your intent; however, we haven’t reached a consensus on how to approach that change.

To be fair, in my personal perception the largest part of the trouble, at least from the core point of view, is whether new kinds of state will be introduced or whether we’ll amend existing ones. Each of these has pros and cons. I’m more in favor of creating new state kinds which open up the framework, however it comes at the cost of extra complexity in the UI and in all other elements involved in processing (and there are a lot of them).

Speaking of actual use cases - I do actually have a customized persistence service which allows storing states ahead of time in order to make a single record per given period (day/month/year). In order to satisfy that requirement I had to mock a lot of stuff, including surgery of jdbc persistence, because its handling of update and store time was leaky.

Thanks @splatch for your insights. Your thing-level communication watchdog looks like a nice alternative approach to look at the reliability problem.

You phrased it better than me: “keeping the timestamp next to the received state”.

Then, the questions are (1) which timestamp(s), (2) where/how they are currently saved or made available.

The available timestamps are:

  1. “source” timestamp¹ : real timestamp of the moment the device state was sampled - if available
  2. “ingestion” timestamp : system timestamp OpenHAB received the state from binding or API
  3. “processing” timestamp : system timestamp a listener (Observer) received/processed the state. This is different for each observer (e.g. profile, persistence addon, rule etc.)
  4. “database” timestamp : the database system time (now() in SQL) at the time of DB insert.
  5. “access” timestamp: the time someone accesses the virtual state.

OH currently ignores timestamps (1) and (2), considers (2) and (3) equal by not saving (2), uses timestamp (4) for persistence, and makes it hard to use anything other than timestamp (5) in the UI.

  • OH does not keep track of timestamp 2 (“ingestion” time). It could if it was possible to save it “next to state” in the State objects and forward it through Event objects.
  • OH currently persists timestamp 4 (“database ingestion time”) next to the state. e.g., in JDBC sqltype.tablePrimaryValue: NOW()
    • So, you get access to timestamp 4 (not 2 or 3) in charts and in rules when using persistence extensions such as <item>.updatedSince(ZonedDateTime)
    • Timestamp 4 should be some milliseconds off but close to timestamp 3 if the DB and OH system clocks are properly synchronized (or same system). Persistence addons can be easily refactored to use timestamp 3 by using Java time instead of database server time.
  • OH recently made it possible to use timestamp 1 (“source” time) but only through (Modifiable) persistence and only through the REST API (not available to bindings).
  • In the UI, users only get access to timestamp 5 unless they duplicate each Item with a DateTime item (with timestamp 3). With the exception of charts (timestamp 4).

I understand the complexities arising from modifying existing State and Event objects or introducing new ones.
I am just trying to make the case for this aspect to be considered in the roadmap of some future release (4.0 or later) and I would be happy to contribute with code and/or testing.


¹ I initially used the term “event” timestamp instead of “source” timestamp but changed it to avoid confusion with OpenHAB events. “Event-time” is a common term in some frameworks (e.g. 1, 2). PhenomenonTime in some OGC standards and W3C ontologies.

I think the last or current state is recovered, and I believe on the master side that the event is persisted as happened “now” and not in the past as is actually the case.

I do not understand how either the remoteopenhab or the mqtt binding is able or allowed to store any past data/time series (except for the last state, most likely with the wrong timestamp). If this was to be a feature request on these bindings and they would be allowed to do this, why would the same functionality in other bindings be rejected by the maintainers?

Cheers

I believe that most of the issues boil down to the lack of a reliable source timestamp, which leads each part of the processing pipeline to assume its own. For example, when you look at how persistence services resolve time, they assume “now” while storing received states. I might be wrong, but at least several months ago, they did not utilize the ingestion time set on the event sent via the openHAB event bus. Effectively they relied on the database timestamp, sometimes leading to situations where the system timezone settings differed from openHAB’s.

I believe leaving it as is is wrong, primarily for clarity reasons. Any computing system is subject to more or less predictable latency. This means that a binding could start polling at 23:59:59.876 and receive an answer at 23:59:59.989, but then, going over the event bus towards the persistence service, that state ends up being stored for the next day, because the database now() resolves to 00:00:00.104. Effectively you never know what the source time was; you only know the database timestamp. Some persistence services (rrd4j) will not care about that time, but rrd4j is a special case anyway.
I know it is not a major issue, it could always be arranged through other tools, however the less consistency we have across multiple areas, the more messy it gets.
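The midnight example can be replayed with plain Python datetimes. The calendar date is arbitrary and chosen only for illustration; `day_bucket` is a hypothetical helper standing in for any per-day aggregation.

```python
from datetime import date, datetime

# Times from the example above; the date itself is an arbitrary assumption.
poll_started = datetime(2023, 4, 2, 23, 59, 59, 876000)
answer_received = datetime(2023, 4, 2, 23, 59, 59, 989000)
database_now = datetime(2023, 4, 3, 0, 0, 0, 104000)


def day_bucket(timestamp: datetime) -> date:
    """Return the calendar day a value would be aggregated under."""
    return timestamp.date()


# Latency between receiving the answer and the database insert.
pipeline_delay = database_now - answer_received

# True when the value lands in a different day than the one it was measured in.
stored_in_wrong_day = day_bucket(database_now) != day_bucket(answer_received)
```

A delay of roughly a tenth of a second is enough to move the value into the next day's bucket when only the database timestamp is kept.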

I recently came across OPC UA; it has a “reliability” aspect built into the protocol. The socketcan interface may also attach a hardware timestamp to received CAN messages (although it is not mandatory).
I would not go as far as making it mandatory for bindings to calculate the exact time, however giving them the option to provide a time when they create a state would form a valid input for further work. Then, another aspect is the propagation of that timestamp through rules, transformations (good luck!) and profiles.

I don’t mind a unification under which we start using “event time”, however then the question is what the system should do if it receives a future or past time. This gets us back to the initial GitHub issue you linked.