Disable the default restoreOnStartup in the channel definition

I don’t remember setting up anything related to persistence but apparently I did, or maybe rrd4j is now enabled by default. I’m working on some changes to the powermax binding and I noticed that after a restart, all non-string items are restored to their pre-shutdown values, courtesy of rrd4j. (This seems to be the documented default behavior.)

The problem is that this binding’s values shouldn’t be restored on startup. The binding is attempting to maintain an internal state which mirrors the state of the alarm system, and if you shut down OH for any length of time, you can’t assume that the previous state is still valid. You need to query the alarm system and rebuild from scratch. That can take some time though, and at startup, you don’t want a window where the channels have stale values. The only correct behavior is for all channels to start NULL and stay that way until the binding has successfully queried the panel.

How do I mark a channel as non-restorable though, in the xml or otherwise? I couldn’t find anything relevant in the thing-description schema file. I don’t want to edit my own rrd4j.persist because that would only be for me; I want the binding to have some instruction that these channels should never be restored.

I tried to add an initializer for all of my channels to override any values restored from persistence, but @Lolodomo didn’t like that approach. :slight_smile:

What is the correct way to do this?

Justr to clarify: the binding is already initializing 95% of the channels during the init sequence. This was my initial intention as I did not want to rely on persistence. There are exceptions like the trip timestamps which will not be initialized until the next time your door/window is opened (make sense).

If you enabled persistence on your items linked to this binding and you don’t want restore on startup, you just have to disable (or more exactly not to enable) restore on startup for these items. This is something to be adjusted in your persistence strategy.

As an example, here is the content of my rrd4j.persist file:

// persistence strategies have a name and a definition and are referred to in the "Items" section
Strategies {
	// for rrd charts, we need a cron strategy
	everyMinute : "0 * * * * ?"

Items {
	GTemperature*, GHumidite*, GSuiviConsoElec* : strategy = everyChange, everyMinute, restoreOnStartup
	GVirtualData*, GParametres* : strategy = everyChange, restoreOnStartup

None of my powermax binding items are covered by the persistence.

My opinion is that there is no valid reason that a binding should implement something specific because a user enables persistence and restore on startup on binding items. Persistence should be only managed in the persistence configuration and not in every binding.

1 Like

So, the only problem I think with that approach is this:

The rrd4j persistence services comes with a default persistence strategy which persists every Item on every state change and at least once a minute. Additionally, it restores the last stored value at system startup.

Just by installing rrd4j, you get default persistence and restoreOnStartup for everything. It’s left to the user to decide which values should NOT be restored and override this default behavior.

Personally, I think the binding should be able to specify the default. As a binding author, you know which values are transient and which are meaningful after an interruption. I don’t think the burden should be on every user to figure that out for themselves.

Alternative view; What happens to Items is none of the binding’s business. It just offers channels. If you have no data to offer, offer none.

You’ve no idea how the user is going to use their Item, there may be links to other bindings for example.

There is another option here; if the binding knows a value is broken e.g. out of range or corrupt, it can make UNDEF updates to state channels.
That’s not intended for “I don’t know yet”, it’s more “Tried to find out but it has gone wrong”.

Maybe it would clarify to think about what happens if the communication fails in service, rather than focusing on startup. How will that stale data be dealt with?

1 Like

Yes, we are talking about items but you’re right, in fact we are really talking about channels.
What is doing the binding is what you describe, it sets channels with values it gets from the device (alarm panel setup). If no data is retrieved for a particular channel, the channel state is not updated.

What would like @ronisaacson is to to add a first step in the binding that will set all channels to a default state and then do the normal stuff (setting states appropriarly). This first step is, as I understand, to override the “restore on startup” run by the persistence service for the same items. It should be useless for all channels which are set during the startup sequence, that is most of them.

@ronisaacson : can you explain exactly what happens and in which order for a specific channel of the binding ? Because I don’t understand where is your problem, even if restore on startup is applied ?
Would it be that “restore on startup” is applied after the binding is fully initialized ?

By the way, having “restore on startup” by default enabled for all items is a strange default setting.
I don’t have this problem because I have the same rrd4j config since many years and so I don’t use the default configuration setting.

And the additional problem with @ronisaacson solution is that a channel state can be updated twice at startup, first to a default state and then finally to the real state. That could trigger unexpected rules.

I agree default restore-everything is … not always best. But it’s what OH3 does, for us to live with.

Not sure about the trigger’twice thing; I thought restore-on-startup was accomplished silently without appearing as an event. But in any case usually completed before rules run.

We’ve had binding vs. restore fights before. The original V2 MQTT set everything UNDEF until a message came in (so destroying restores).
That was deemed undesirable, on the basis that when the binding knows nothing it should say nothing.

At the end of the day, the simple solution here is advice in the binding docs. If it’s important, get the users to make the conscious choices.

My two cents in this discussion.
RestoreOnStartUp is a strategy used by a majority of users, which is one of the reasons it is set as a default even if no persistence is installed.
IMHO the need to explicitly not use such strategy is an edge case.

As @Lolodomo described by the use of a custom persistence strategy for the items in concern this problem can be solved locally by the user. Adding such to the documentation would be necessary.

Well I’ll describe the very real case that happened to me, which is why I added the initializer:

I had to reboot my server for whatever reason, and after OH came back up, it wasn’t able to reestablish communication with the panel. This is a known issue with this panel; it’s very particular about serial timings. Some dongles work better than others, and sometimes you need to physically unplug it and plug it back in before the panel will talk to you. So the binding was dead in the water, with no link to the panel.

However, the default persistence scheme had restored all of my zone states / timestamps / etc. from pre-shutdown. So when I looked in the openhab UI, it looked like everything was working normally. There was no indication that those values were all stale and no longer reflected reality.

In the latest PR, I added better error handling to the binding, so getThingStatusInfo("powermax:xxx") properly reports Online/Offline and any error message, and I exposed that in the UI. But I don’t think a user should need to look at the thing status to know if the rest of the channels have valid values.

If the Status channel says Armed, that should only be because the binding knows with certainty that the panel is armed, and not because it WAS armed at some point in the past. It’s never valid to restore this from persistence.

@Lolodomo to answer your question… there are three scenarios:

  1. Binding starts and restores previous values from persistence. Then it connects to the panel and updates all channels. This is mostly ok but there’s a race condition where stale values are being shown.
  2. Binding starts and restores previous values from persistence, then is unable to connect to the panel. The stale values will stay indefinitely.
  3. Binding is working normally and everything is fine, but then communication with the panel is lost.

To prevent #1 and #2, I think the binding should be able to indicate that these channels are not good candidates for persistence. For #3, it should reset all channels to a null/unknown state.

Maybe a compromise would be to ignore the startup window in #1, but explicitly reset all channels after #2 or #3?

Why is that different to from the Item storing the state it was a little while ago? When we update an Item’s state, it is instantly stale. It’s just a question of how long you trust any Item. (see also ‘expire’ feature) This will come down to user preferences.

Pedantry; restore (or not) is nothing to do with the binding.
This is important because what you are asking for is the binding to reach out and interfere with other openHAB sub-systems. I suspect such requests will simply be rejected.
Everything you want to do is already within your control, but will involve you configuring those other sub-systems.

Please don’t misunderstand; I think we all see your issue, it is important not to present false security data. Because it is important, the user can take special actions. Because it is important, it is maybe not so smart to rely on automating those actions instead.

Yes of course. Wouldn’t the developer of the binding be in a better position to make that call though? If your binding is polling, you know how long the polling cycle is and when something goes wrong. If it’s event-driven then maybe you have an idle timeout. In the case of the powermax binding, I know with certainty when a stale value is still likely to be valid and when it’s not, because it’s a function of how the binding communicates with the panel. For any random user, I wouldn’t expect them to know this, nor should they have to.

I’d argue that rrd4j already did that, and not in a very friendly way, by persisting and restoring everything (with no expiry, best I can tell) without taking any input from the relevant bindings. I understand that zero-touch user-friendly persistence can be a good thing, but I think that binding authors should at least be able to build in persistence hints to get the best possible default behavior.

I guess the biggest question is who is “the user” in this case. I don’t even have an rrd4j.persist file, and before this thread, I didn’t know that was a thing. So I can customize the persistence of these values for myself, and add something to the documentation suggesting that other users should do the same. This seems clunky though.

What do you think of my compromise proposal, to ignore what happens on startup, and reset all values to null/unknown only when you’ve tried and actively failed to communicate with the panel?

I’d expect the binding to update the relevant Channels (and therefore the Items) with UNDEF in that case. That’s one of the purposes of UNDEF after all. The Thing is OFFLINE or in an error state so we can’t know what state the Channels are.

Not necessarily. Often it is the user who is in best position to make that call because they have a much wider understanding and context to understand what that Item is being used for. Most of the time most users will want a default behavior but there will always be some edge case users that have multiple Channels with profiles, Expire, rules or other stuff going on where they do need the Item’s state restored and you as the binding developer is in the worst position to make that call in those cases.

So two approaches are possible, make the decision for the user which makes the job of many/most users a little easier but completely breaks those edge case users, or support the edge case users.

But I think this is mostly an argument over approach more than anything. You’ve stated that your binding knows when something is wrong. So when it knows something is wrong, post UNDEF to the Channels. That will overwrite anything that may have been restored on startup. The user won’t be seeing stale info and there is no need to binding authors to "break* user’s current persistence experience nor the need to add new features to core to provide a way for bindings to mess with persistence.

I’m not certain that’s a compromise so much as how it’s supposed to work. But use UNDEF, not NULL. NULL is supposed to mean never initialized. UNDEF means the binding (or whatever) just can’t know the state of the Item. Note that this is beyond just startup too. Even if it’s been weeks since OH restarted, if the binding determines there is a problem communicating with the device, it should post UNDEF to the Channels and maintain that state until communication is restored.

I understand now that @ronisaacson problem has nothing to do with restore on startup. What he wants is just to have UNDEF channel states when the device is OFFLINE. That makes sense and this is something very common done by bindings. I am even surprised that it is not implemented in this binding because I already implemented several times in other bindings.
So what is expected is to set channels to UNDEF in case the binding is not able to connect to the device (or detect a loss of connection). It will avoid having channels set to an outdated state.
Can you confirm @ronisaacson that it is what you expect ? In that case, I am 100% ok with that, you just have to set channels to UNDEF when the binding sets the thing status to OFFLINE. In fact, I even thought you already did that in one of your previous changes.

Edit: it looks like it is already done by the method setAllChannelsOffline. So I am totally lost…

This is not the binding which restores the previous values, this is another openHAB service, the rrd4j service.
What race condition are you talking about ? Do you mean the rrd4j stuff (restore) and the binding stuff (init of channels) are done in parallel and so with undetermined result ?

Normally, this case is covered by the use of your method setAllChannelsOffline which reset all channels to UNDEF. Isn’t it working ?

Same thing, isn’t setAllChannelsOffline called in this case ?

The error handling on failure is only part of it… there’s definitely an issue on startup where the default rrd4j behavior is interfering with the proper initialization of the various channels. This is the race condition I was referring to. It’s been a few months so I don’t remember exactly what happens but I know I needed to add the initializer for a reason. :slight_smile:

Since you have a non-default persistence setup, it makes sense that you wouldn’t see the same issue. Let me retest and I’ll come back with a better description of the conflict.

In case the problem is a race condition between rrd4j and a binding, it has to be fixed globally, not in every binding.
But I have a vague reminding that I detected such problem before OH3 was released and @Kai fixed it. @Kai can you tell us if restore on startup is run before any binding is started? Or is it done in parallel?

No, please don’t! If this is done by bindings, please help removing such code from the codebase.
If a Thing is OFFLINE it cannot tell anything about the states and therefore should not do any state updates at all. FTR: UNDEF is not saying “I don’t know”, but rather “it is a value that I cannot reasonable map to the item”.

In case the problem is a race condition between rrd4j and a binding, it has to be fixed globally, not in every binding.

I agree, but I haven’t understood if we really have any race condition here.

@Kai can you tell us if restore on startup is run before any binding is started?

The start levels can be seen here, so the restoreOnStartup happens on level 30. I am not 100% certain that handlers are started only after that, but I haven’t experienced another situation yet. One measure to prevent any issue here might be to do a NULL check on the item state and only apply restoreOnStartUp if no other state has yet been set.

For @ronisaacson’s problem, tough: You should consider the restoreOnStartup feature as a measure to make a “reboot” invisible - it just makes sure that the item states are the very same as if the reboot hasn’t happened. Therefore I do not see any need to be able to deactivate this feature for certain channels. It is rather up to the binding to make sure that it behaves nicely, whenever its handler is stopped and started. So if there is an issue if you stop the handler and restart it, because you cannot successfully reconnect to the panel, you have the very same issue and it is independent of restoreOnStartup. Whenevery communication problems occur, the Thing must go OFFLINE and this is actually the situation you want to inform your admin user about, so that he knows that there’s something wrong in your setup.
For defining a validity of item states, the expiry functionality should be used. You could have all your alarm panel items be set to NULL, if there are no updates within one minute. One feature we could maybe also think about is to allow item state expiry as soon as a Thing is OFFLINE. This would imho be much better than implementing this in every binding again and again (see above).

1 Like

Is that even desirable though? If OH shuts down at t0 and restarts at t1, there’s some amount of time that passes in between. Is it safe to assume that nothing changed externally in that time? Sometimes the new state of a channel can be determined by the binding, but sometimes not.

I’ll give you an example: the powermax binding gives you a set of alarm zones, and for each zone there’s a Last Trip time. The panel doesn’t keep this state; the binding sets it when it sees that a zone is tripped. It’s really useful. I have a rule that tells me if I’ve accidentally left the garage door open for too long, for example.

What is the proper value of this channel when OH starts up? Is it the previous Last Trip time from before the shutdown? What if the zone was tripped while OH was down? There’s no way to know. Restoring this channel gives you a value that could be false / misleading. All OH can tell you is “well, it hasn’t tripped since I’ve last been restarted” and therefore the value should be null.

Some other channels are not like that. For some channels, it’s possible to query the panel and get a guaranteed value which overwrites whatever was restored. But I’d argue that the binding author, not the user, is the best person to say whether that’s the case. The binding author knows which values can be set with high confidence.

This is one reason I proposed a change to the powermax binding which forces many channels to null on startup. Making the reboot invisible is, IMHO, exactly the wrong thing to do.

This is counter to how many/most of us here on the forum have understood UNDEF to work. This was perhaps driven by the behavior of the MQTT binding and others.

Sometimes it’s desirable and sometimes it’s not. A default behavior must be chosen though and the default is to restore everything. In those cases where it’s not, the users have the ability to pick and choose which Items are restored.

It would have been equally valid to choose the other way and have no Items restoreOnStartup and give the users the option to configure that should they choose.

Either way there is a block of users who have extra work to do. OH chose to go with the former approach in OH 3. They probably looked at how they use OH and how others have reported how they use OH and determined over all it’s less work on the end users is everything is restored by default. But as I said, in either approach end users will have extra work to do if the default doesn’t work for them.

And you have the option to configure it like that. And there have been several alternatives suggested here

The user can too with proper documentation on the Channels. And it’s the user who knows everything else about his Items such as whether Expire is configured, more than on Channel is linked to it, what rules they’ve written involving that Item and its state. The binding author knows none of that and if the binding author decides they know best with their limited context they might break perfectly valid configurations.

Perhaps some users don’t care in this case and would rather that Item retain the last value seen by OH rather than lose the value entirely. When the binding makes decisions like this it takes options away from the users. Often that can be a good thing as it leads to simplicity. But in a platform like openHAB I’m not sure that it is a good thing because it reduces the types of problems users can solve with openHAB, making it less useful.

I do like the idea Kai raised of having a sort of automatic Expire timeout for when the Thing goes OFFLINE. But I also like the use of UNDEF sent to the Items when that happens too. It provides a very nice way for even the regular users to see something is wrong and perhaps their actions won’t be acted upon (e.g. if all the Thermostat Items read - then turning on the AC isn’t going to work). Given how much of a pain it is to get an alert when a Thing does go OFFLINE, that is often the first that even the admin users see there is a problem. But that’s a problem for another thread.

1 Like

All this debate is for only one channel, the last trip timestamp. If I am not wrong (I will check again), all other channels are set by the powermax binding with their current value retrieved from the device during the init sequence.