Disable the default restoreOnStartup in the channel definition

Alternative view; What happens to Items is none of the binding’s business. It just offers channels. If you have no data to offer, offer none.

You’ve no idea how the user is going to use their Item, there may be links to other bindings for example.

There is another option here; if the binding knows a value is broken e.g. out of range or corrupt, it can make UNDEF updates to state channels.
That’s not intended for “I don’t know yet”, it’s more “Tried to find out but it has gone wrong”.

Maybe it would clarify to think about what happens if the communication fails in service, rather than focusing on startup. How will that stale data be dealt with?

1 Like

Yes, we are talking about items but you’re right, in fact we are really talking about channels.
What is doing the binding is what you describe, it sets channels with values it gets from the device (alarm panel setup). If no data is retrieved for a particular channel, the channel state is not updated.

What would like @ronisaacson is to to add a first step in the binding that will set all channels to a default state and then do the normal stuff (setting states appropriarly). This first step is, as I understand, to override the “restore on startup” run by the persistence service for the same items. It should be useless for all channels which are set during the startup sequence, that is most of them.

@ronisaacson : can you explain exactly what happens and in which order for a specific channel of the binding ? Because I don’t understand where is your problem, even if restore on startup is applied ?
Would it be that “restore on startup” is applied after the binding is fully initialized ?

By the way, having “restore on startup” by default enabled for all items is a strange default setting.
I don’t have this problem because I have the same rrd4j config since many years and so I don’t use the default configuration setting.

And the additional problem with @ronisaacson solution is that a channel state can be updated twice at startup, first to a default state and then finally to the real state. That could trigger unexpected rules.

I agree default restore-everything is … not always best. But it’s what OH3 does, for us to live with.

Not sure about the trigger’twice thing; I thought restore-on-startup was accomplished silently without appearing as an event. But in any case usually completed before rules run.

We’ve had binding vs. restore fights before. The original V2 MQTT set everything UNDEF until a message came in (so destroying restores).
That was deemed undesirable, on the basis that when the binding knows nothing it should say nothing.

At the end of the day, the simple solution here is advice in the binding docs. If it’s important, get the users to make the conscious choices.

My two cents in this discussion.
RestoreOnStartUp is a strategy used by a majority of users, which is one of the reasons it is set as a default even if no persistence is installed.
IMHO the need to explicitly not use such strategy is an edge case.

As @Lolodomo described by the use of a custom persistence strategy for the items in concern this problem can be solved locally by the user. Adding such to the documentation would be necessary.

Well I’ll describe the very real case that happened to me, which is why I added the initializer:

I had to reboot my server for whatever reason, and after OH came back up, it wasn’t able to reestablish communication with the panel. This is a known issue with this panel; it’s very particular about serial timings. Some dongles work better than others, and sometimes you need to physically unplug it and plug it back in before the panel will talk to you. So the binding was dead in the water, with no link to the panel.

However, the default persistence scheme had restored all of my zone states / timestamps / etc. from pre-shutdown. So when I looked in the openhab UI, it looked like everything was working normally. There was no indication that those values were all stale and no longer reflected reality.

In the latest PR, I added better error handling to the binding, so getThingStatusInfo("powermax:xxx") properly reports Online/Offline and any error message, and I exposed that in the UI. But I don’t think a user should need to look at the thing status to know if the rest of the channels have valid values.

If the Status channel says Armed, that should only be because the binding knows with certainty that the panel is armed, and not because it WAS armed at some point in the past. It’s never valid to restore this from persistence.

@Lolodomo to answer your question… there are three scenarios:

  1. Binding starts and restores previous values from persistence. Then it connects to the panel and updates all channels. This is mostly ok but there’s a race condition where stale values are being shown.
  2. Binding starts and restores previous values from persistence, then is unable to connect to the panel. The stale values will stay indefinitely.
  3. Binding is working normally and everything is fine, but then communication with the panel is lost.

To prevent #1 and #2, I think the binding should be able to indicate that these channels are not good candidates for persistence. For #3, it should reset all channels to a null/unknown state.

Maybe a compromise would be to ignore the startup window in #1, but explicitly reset all channels after #2 or #3?

Why is that different to from the Item storing the state it was a little while ago? When we update an Item’s state, it is instantly stale. It’s just a question of how long you trust any Item. (see also ‘expire’ feature) This will come down to user preferences.

Pedantry; restore (or not) is nothing to do with the binding.
This is important because what you are asking for is the binding to reach out and interfere with other openHAB sub-systems. I suspect such requests will simply be rejected.
Everything you want to do is already within your control, but will involve you configuring those other sub-systems.

Please don’t misunderstand; I think we all see your issue, it is important not to present false security data. Because it is important, the user can take special actions. Because it is important, it is maybe not so smart to rely on automating those actions instead.

Yes of course. Wouldn’t the developer of the binding be in a better position to make that call though? If your binding is polling, you know how long the polling cycle is and when something goes wrong. If it’s event-driven then maybe you have an idle timeout. In the case of the powermax binding, I know with certainty when a stale value is still likely to be valid and when it’s not, because it’s a function of how the binding communicates with the panel. For any random user, I wouldn’t expect them to know this, nor should they have to.

I’d argue that rrd4j already did that, and not in a very friendly way, by persisting and restoring everything (with no expiry, best I can tell) without taking any input from the relevant bindings. I understand that zero-touch user-friendly persistence can be a good thing, but I think that binding authors should at least be able to build in persistence hints to get the best possible default behavior.

I guess the biggest question is who is “the user” in this case. I don’t even have an rrd4j.persist file, and before this thread, I didn’t know that was a thing. So I can customize the persistence of these values for myself, and add something to the documentation suggesting that other users should do the same. This seems clunky though.

What do you think of my compromise proposal, to ignore what happens on startup, and reset all values to null/unknown only when you’ve tried and actively failed to communicate with the panel?

I’d expect the binding to update the relevant Channels (and therefore the Items) with UNDEF in that case. That’s one of the purposes of UNDEF after all. The Thing is OFFLINE or in an error state so we can’t know what state the Channels are.

Not necessarily. Often it is the user who is in best position to make that call because they have a much wider understanding and context to understand what that Item is being used for. Most of the time most users will want a default behavior but there will always be some edge case users that have multiple Channels with profiles, Expire, rules or other stuff going on where they do need the Item’s state restored and you as the binding developer is in the worst position to make that call in those cases.

So two approaches are possible, make the decision for the user which makes the job of many/most users a little easier but completely breaks those edge case users, or support the edge case users.

But I think this is mostly an argument over approach more than anything. You’ve stated that your binding knows when something is wrong. So when it knows something is wrong, post UNDEF to the Channels. That will overwrite anything that may have been restored on startup. The user won’t be seeing stale info and there is no need to binding authors to "break* user’s current persistence experience nor the need to add new features to core to provide a way for bindings to mess with persistence.

I’m not certain that’s a compromise so much as how it’s supposed to work. But use UNDEF, not NULL. NULL is supposed to mean never initialized. UNDEF means the binding (or whatever) just can’t know the state of the Item. Note that this is beyond just startup too. Even if it’s been weeks since OH restarted, if the binding determines there is a problem communicating with the device, it should post UNDEF to the Channels and maintain that state until communication is restored.

I understand now that @ronisaacson problem has nothing to do with restore on startup. What he wants is just to have UNDEF channel states when the device is OFFLINE. That makes sense and this is something very common done by bindings. I am even surprised that it is not implemented in this binding because I already implemented several times in other bindings.
So what is expected is to set channels to UNDEF in case the binding is not able to connect to the device (or detect a loss of connection). It will avoid having channels set to an outdated state.
Can you confirm @ronisaacson that it is what you expect ? In that case, I am 100% ok with that, you just have to set channels to UNDEF when the binding sets the thing status to OFFLINE. In fact, I even thought you already did that in one of your previous changes.

Edit: it looks like it is already done by the method setAllChannelsOffline. So I am totally lost…

This is not the binding which restores the previous values, this is another openHAB service, the rrd4j service.
What race condition are you talking about ? Do you mean the rrd4j stuff (restore) and the binding stuff (init of channels) are done in parallel and so with undetermined result ?

Normally, this case is covered by the use of your method setAllChannelsOffline which reset all channels to UNDEF. Isn’t it working ?

Same thing, isn’t setAllChannelsOffline called in this case ?

The error handling on failure is only part of it… there’s definitely an issue on startup where the default rrd4j behavior is interfering with the proper initialization of the various channels. This is the race condition I was referring to. It’s been a few months so I don’t remember exactly what happens but I know I needed to add the initializer for a reason. :slight_smile:

Since you have a non-default persistence setup, it makes sense that you wouldn’t see the same issue. Let me retest and I’ll come back with a better description of the conflict.

In case the problem is a race condition between rrd4j and a binding, it has to be fixed globally, not in every binding.
But I have a vague reminding that I detected such problem before OH3 was released and @Kai fixed it. @Kai can you tell us if restore on startup is run before any binding is started? Or is it done in parallel?

No, please don’t! If this is done by bindings, please help removing such code from the codebase.
If a Thing is OFFLINE it cannot tell anything about the states and therefore should not do any state updates at all. FTR: UNDEF is not saying “I don’t know”, but rather “it is a value that I cannot reasonable map to the item”.

In case the problem is a race condition between rrd4j and a binding, it has to be fixed globally, not in every binding.

I agree, but I haven’t understood if we really have any race condition here.

@Kai can you tell us if restore on startup is run before any binding is started?

The start levels can be seen here, so the restoreOnStartup happens on level 30. I am not 100% certain that handlers are started only after that, but I haven’t experienced another situation yet. One measure to prevent any issue here might be to do a NULL check on the item state and only apply restoreOnStartUp if no other state has yet been set.

For @ronisaacson’s problem, tough: You should consider the restoreOnStartup feature as a measure to make a “reboot” invisible - it just makes sure that the item states are the very same as if the reboot hasn’t happened. Therefore I do not see any need to be able to deactivate this feature for certain channels. It is rather up to the binding to make sure that it behaves nicely, whenever its handler is stopped and started. So if there is an issue if you stop the handler and restart it, because you cannot successfully reconnect to the panel, you have the very same issue and it is independent of restoreOnStartup. Whenevery communication problems occur, the Thing must go OFFLINE and this is actually the situation you want to inform your admin user about, so that he knows that there’s something wrong in your setup.
For defining a validity of item states, the expiry functionality should be used. You could have all your alarm panel items be set to NULL, if there are no updates within one minute. One feature we could maybe also think about is to allow item state expiry as soon as a Thing is OFFLINE. This would imho be much better than implementing this in every binding again and again (see above).

1 Like

Is that even desirable though? If OH shuts down at t0 and restarts at t1, there’s some amount of time that passes in between. Is it safe to assume that nothing changed externally in that time? Sometimes the new state of a channel can be determined by the binding, but sometimes not.

I’ll give you an example: the powermax binding gives you a set of alarm zones, and for each zone there’s a Last Trip time. The panel doesn’t keep this state; the binding sets it when it sees that a zone is tripped. It’s really useful. I have a rule that tells me if I’ve accidentally left the garage door open for too long, for example.

What is the proper value of this channel when OH starts up? Is it the previous Last Trip time from before the shutdown? What if the zone was tripped while OH was down? There’s no way to know. Restoring this channel gives you a value that could be false / misleading. All OH can tell you is “well, it hasn’t tripped since I’ve last been restarted” and therefore the value should be null.

Some other channels are not like that. For some channels, it’s possible to query the panel and get a guaranteed value which overwrites whatever was restored. But I’d argue that the binding author, not the user, is the best person to say whether that’s the case. The binding author knows which values can be set with high confidence.

This is one reason I proposed a change to the powermax binding which forces many channels to null on startup. Making the reboot invisible is, IMHO, exactly the wrong thing to do.

This is counter to how many/most of us here on the forum have understood UNDEF to work. This was perhaps driven by the behavior of the MQTT binding and others.

Sometimes it’s desirable and sometimes it’s not. A default behavior must be chosen though and the default is to restore everything. In those cases where it’s not, the users have the ability to pick and choose which Items are restored.

It would have been equally valid to choose the other way and have no Items restoreOnStartup and give the users the option to configure that should they choose.

Either way there is a block of users who have extra work to do. OH chose to go with the former approach in OH 3. They probably looked at how they use OH and how others have reported how they use OH and determined over all it’s less work on the end users is everything is restored by default. But as I said, in either approach end users will have extra work to do if the default doesn’t work for them.

And you have the option to configure it like that. And there have been several alternatives suggested here

The user can too with proper documentation on the Channels. And it’s the user who knows everything else about his Items such as whether Expire is configured, more than on Channel is linked to it, what rules they’ve written involving that Item and its state. The binding author knows none of that and if the binding author decides they know best with their limited context they might break perfectly valid configurations.

Perhaps some users don’t care in this case and would rather that Item retain the last value seen by OH rather than lose the value entirely. When the binding makes decisions like this it takes options away from the users. Often that can be a good thing as it leads to simplicity. But in a platform like openHAB I’m not sure that it is a good thing because it reduces the types of problems users can solve with openHAB, making it less useful.

I do like the idea Kai raised of having a sort of automatic Expire timeout for when the Thing goes OFFLINE. But I also like the use of UNDEF sent to the Items when that happens too. It provides a very nice way for even the regular users to see something is wrong and perhaps their actions won’t be acted upon (e.g. if all the Thermostat Items read - then turning on the AC isn’t going to work). Given how much of a pain it is to get an alert when a Thing does go OFFLINE, that is often the first that even the admin users see there is a problem. But that’s a problem for another thread.

1 Like

All this debate is for only one channel, the last trip timestamp. If I am not wrong (I will check again), all other channels are set by the powermax binding with their current value retrieved from the device during the init sequence.

Ok, I finally got some time to test this and I think @Lolodomo is right. While I was making these changes, I had a lot of trouble with channels picking up stale values after restart, due to what seemed like a race condition with the persistence layer. I wasn’t able to reproduce it this weekend though. So it’s possible there was a problem with this previously and it got fixed (I’ve upgraded to 3.1.0-final since then), or maybe it was a different issue in my dev environment. In any case:

After a restart now, all of the channels (including zones violated, arm state, etc.) will start with their persisted values, but those are all set even before the binding starts initializing. Once it goes Online, they get set to the current values as queried from the panel. And if this fails, the old values remain but the binding never goes Online. The only values that can’t be queried are the Last Trip timers, which will be wrong if a zone was tripped while OH was down, but I’m not going to stress about this.

The net result is that the alarm channels can be considered safe to use if the binding is Online, and not otherwise. I created a channel to expose this, and I don’t let any of my rules fire unless the binding is Online:

// String Powermax_BindingStatus

rule "Alarm Connection Status"
when
    Thing "powermax:serial:alarm" changed
then
    val status       = getThingStatusInfo("powermax:serial:alarm")
    val short_status = status.getStatus().toString()

    Powermax_BindingStatus.postUpdate(short_status)
end

I still think this has the potential to be confusing and non-intuitive for users with the default persistence strategy. @rlkoshak I understand that you need to provide sensible defaults, and some users might want different behavior than the default. I don’t think it has to be all-or-nothing though. My suggestion is that the binding author should be able to provide persistence hints that give a reasonable default behavior for each channel. Any user who wants a different behavior could customize it of course. But I don’t think “persist everything” and “persist nothing” need to be the only out-of-the-box options. That’s probably a discussion for a different thread though.

From what I recall from my most recent trip into persistence service configuration, the underlying model has a notion of “filter”. I suppose these are rather for excludes since items or groups are specified to be included. Since handling of filters/excludes is currently missing in persist files and persistence manager I did my own experiments.
Based on that I coded my own persistence configuration mechanism which allows to manage exclusion filters:

I still do verify actual behavior to be sure it works. So, in theory with a lot of caution there is a way to exclude some of items from persistence policies.

2 Likes

I had a similar problem with the rrd4j default configuration, as there is no restore for strings. Which is quite strange for a default user.

Is there any reason mapdb is not included per default in a way where it only affects strings? I guess it’s quite an overhead ov two services need to react on everychange of everything. One is enough.

At least would have been nice to have a config file representing the default config of rrd4j to just remove “restoreonstartup”, instead of having to recreate the config file based on a short textual description in the docs what rrd4j supposedly does on default. This would have made it easier for the OP as well to just change it so some items don’t get restored.

There is no database that meets all the requirements for a good default persistence:

  • embedded, no external services required
  • limits storage so it doesn’t grow forever
  • stores historic data so anything that can be charted in MainUI will be chartable by default

rrd4j comes the closest to meeting all of those requirements.

No, this is not technically feasible at this time. It could be configured to do that but that configuration requires knowledge only you know (i.e. the names of the Items that are Strings). There is no way to tell persistence to “save all String Items”.

There are open PRs being worked to add the ability to configure persistence strategies through MainUI. When those get merged it will be as simple as changing the existing config.