Persistence, or lack thereof

Win 10, OH3.4.2 and OH4.0 (swapping JAVA pointers)

So - my little mess I made without thinking by doing 4.0 and 3.4.2 (not directly a problem) and then grabbing my Zigbee Coordinator in 4.0. Well… Of course, 3.4.2 no longer HAD it. Grrr.

Anyway - I got the coordinator running again in 3.4.2. All my Z things and Z items were still there. Most still eventually came up as ONLINE! Until you tried to do something.

In the end, I had to recreate things and items. I’m assuming this is what loss of persistence looks like? I did not RECREATE the Coordinator. I disconnected and reconnected it. So, the long alphanumeric ID should not have changed?

No, Persistence (the OH term, I tend to capitalize these to distinguish between the OH technical term and the common usage) has to do with Items only. Losing Persistence would look like Items failing to restoreOnStartup, charts coming up empty, and rules that use persistence Actions suddenly returning null.

Persistence has nothing to do with Things, Widgets, etc. In this case you are talking about Things.

If you are careful to recreate the Thing with the same ID (when creating the Thing or discovering the Thing you have the option to replace the randomly generated string with your own) then your Links will remain valid and therefore your Items will link right back up with the Channels. No changes required.

If you do recreate the Thing with a new ID, your Items are still fine. You just need to delete and recreate the Links.

Things that fall under a Bridge Thing (like the Zigbee coordinator) inherit part of their UID from the bridge, so yes, the new Things should have had the same UID once recreated. But there are a lot of steps that you’ve taken here. There are far too many assumptions we have to make just to know what you’ve done, let alone where things broke down.
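To make the UID point concrete, here is a minimal Python sketch that splits a Thing UID on its colons. The function name `split_thing_uid` and the returned field names are made up for illustration, and the two UIDs are the ones posted in this thread. It only shows that the coordinator’s ID appears as a segment of the child Thing’s UID; it is not how OH itself parses UIDs.

```python
def split_thing_uid(uid: str) -> dict:
    """Split a Thing UID into binding, type, optional bridge IDs, and Thing ID."""
    segments = uid.split(":")
    if len(segments) < 3:
        raise ValueError(f"not a Thing UID: {uid!r}")
    return {
        "binding": segments[0],
        "type": segments[1],
        "bridge_ids": segments[2:-1],  # empty for a Thing without a Bridge
        "id": segments[-1],
    }


coordinator = split_thing_uid("zigbee:coordinator_ember:72461121e7")
sensor = split_thing_uid("zigbee:device:72461121e7:00124b0029100dc0")

# The child Thing carries the coordinator's ID as one of its segments.
shares_bridge_id = coordinator["id"] in sensor["bridge_ids"]

print(coordinator)
print(sensor)
print(shares_bridge_id)
```

So if the coordinator keeps the same ID, a rediscovered device with the same MAC address should end up with the same UID, and existing Links would still point at it.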

Of course. A Thing is OH’s internal representation of a device. It doesn’t disappear when the hardware becomes inaccessible. It just marks it as OFFLINE.

But each technology is different. In the case of Zwave, almost everything that OH needs lives on the USB dongle, so moving that from one machine to the next is no big deal. It just takes OH a bit of time to rediscover what each node in the network is and create the appropriate Thing for it.

For Zigbee, the binding is much more involved. So it seems not unreasonable to me that moving the dongle from one machine to the next might require rediscovery of the devices, particularly if the PAN IDs, or security Keys are different in the coordinator Bridge Thing. That’s like changing the password without telling the devices that the password changed.
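If you want to check whether the PAN ID or keys actually differ between two instances, one option is to copy the coordinator’s Code tab from each and compare them. The sketch below is only an illustration: `parse_flat_config` and `changed_keys` are hypothetical helpers, the parser only understands simple `key: value` lines (it is not a real YAML parser), and the values are invented.

```python
def parse_flat_config(text: str) -> dict:
    """Parse simple 'key: value' lines; skip blanks and lines without a value."""
    config = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and value.strip():
            config[key.strip()] = value.strip()
    return config


def changed_keys(old: dict, new: dict) -> list:
    """Return the sorted names of settings whose values differ or are missing."""
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))


# Invented values, standing in for the config copied from each OH instance.
before = parse_flat_config("""
zigbee_channel: 11
zigbee_panid: 48188
zigbee_extendedpanid: AAAAAAAAAAAAAAAA
""")
after = parse_flat_config("""
zigbee_channel: 11
zigbee_panid: 12345
zigbee_extendedpanid: AAAAAAAAAAAAAAAA
""")

differences = changed_keys(before, after)
print(differences)  # ['zigbee_panid']
```

An empty list would suggest the visible configuration matches, which would point at something stored elsewhere (on the stick or in the binding’s own files).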

Guess who is going to start creating their own unique ID’s now!

Ok. I have not redone everything. Starting at the top, my Sonoff Coordinator is still there, running, same parameters, same unique ID.

Below that I have several Sonoff Temp/Humidity sensors and Door Contacts, all of which show offline.

I went so far as to look at the CODE for each thing and each item, and in all cases, the pointers to the unique ID’s are correct.

I took one TH sensor and unlinked/relinked the item (temp). No luck.

I removed the item and added it back to the thing. No luck.

Of course, since it is the THING that shows offline, none of that is surprising.

If I delete and re-add the THING then everything works.

So - I have not messed with a lot of stuff since I would like to learn how to “fix” things like this. The only “corrupted” stuff is my Zigbee switched outlets (removed and re-added) and my 1st Temp/Humidity sensor.

Note: Here is my “code” for each item in a chain. ZB Coordinator (Thing), Sonoff ZB TempHumid sensor (Thing), Temperature (Item under the above “thing”).

This is an example of the “chain” that does not go online (which implies a Thing > Coordinator issue).

UID: zigbee:coordinator_ember:72461121e7
label: Ember Coordinator
thingTypeUID: zigbee:coordinator_ember
configuration:
zigbee_port: COM3
zigbee_channel: 11
zigbee_initialise: false
zigbee_concentrator: 0
zigbee_trustcentremode: TC_JOIN_SECURE
zigbee_extendedpanid: E6363BEAA4B09B6D
zigbee_baud: 115200
zigbee_flowcontrol: 2
zigbee_panid: 48188
zigbee_powermode: 1
zigbee_txpower: 0
zigbee_networksize: 25
zigbee_linkkey: 5A6967426565416C6C69616E63653039
zigbee_childtimeout: 86400
zigbee_networkkey: E722611B0D5D3302B478E69F128CF7BE
zigbee_meshupdateperiod: 86400

UID: zigbee:device:72461121e7:00124b0029100dc0
label: sns_TH02
thingTypeUID: zigbee:device
configuration:
zigbee_macaddress: 00124B0029100DC0
bridgeUID: zigbee:coordinator_ember:72461121e7
location: Master Bathroom

If I unlink and try to relink, it never finds the channels to add since… the THING is offline!

Debug info for ITEM:

  • Url: /settings/items/MastBath_Temp/links/edit/zigbee:device:72461121e7:00124b0029100dc0:00124B0029100DC0_1_temperature
  • Path: /settings/items/MastBath_Temp/links/edit/zigbee:device:72461121e7:00124b0029100dc0:00124B0029100DC0_1_temperature
  • Hash:
  • Params:
  • Query:
  • Route: (.*)

Battery or mains powered?

Do you see anything interesting in openhab.log?

Note, the ONLINE/OFFLINE status of a Thing has nothing to do with the Link or the Item. Essentially you are shutting the barn door after the horses got out. Links and Items are down stream from the Things. There is nothing you can do with those that will change the ONLINE/OFFLINE status of the THING.

There must be something going on but it’s definitely technology specific (i.e. only applies to the Zigbee binding). I’ve never tried to move a coordinator between different instances of OH so I’m afraid I don’t have much else to offer. I know that the binding is more involved with Zigbee than it is with Zwave so there must be additional data either on the stick or in the $OH_USERDATA/zigbee folder or the JSONDB files that is not preserved when moving from one machine to the next.

You may have stumbled upon the solution. Deleting and recreating the Things is a pretty standard task to need to do when there are certain changes in some bindings. You’ve moved from 3.4.2 to 4.0.0? So there must have been such a change. Often an upgrade process will catch that and make the necessary changes for you, but in this case there was no upgrade, right?

Please use code fences.

```
code goes here
```

While YAML is better than JSON or XML for human readability, it suffers from that feature dreaded by programmers everywhere: the whitespace is meaningful. Code fences preserve the whitespace.

It’s not that big of a deal, but you probably don’t want to post the network or link keys. Those are the keys someone would need to decrypt your Zigbee traffic. It’s a little like posting your password. Obviously the opportunity to attack you is much less than posting your email password (for example) but it’s still a good idea to keep secrets secret.
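If it helps, here is a small Python sketch for blanking those two values before pasting a config into a post. The helper name `redact_keys` is made up, the key values in the sample are invented, and it assumes the keys appear one per line as `zigbee_linkkey: ...` and `zigbee_networkkey: ...` like in the Code tab.

```python
import re

SECRET_KEYS = ("zigbee_linkkey", "zigbee_networkkey")

# Match a whole line that starts with one of the secret setting names.
_SECRET_LINE = re.compile(
    r"^([ \t]*(?:%s)[ \t]*:[ \t]*).*$" % "|".join(SECRET_KEYS),
    re.MULTILINE,
)


def redact_keys(text: str) -> str:
    """Replace the values of the secret settings, keeping indentation and names."""
    return _SECRET_LINE.sub(r"\g<1><redacted>", text)


# Invented values, not real keys.
sample = (
    "zigbee_channel: 11\n"
    "zigbee_linkkey: 00112233445566778899AABBCCDDEEFF\n"
    "zigbee_networkkey: FFEEDDCCBBAA99887766554433221100\n"
)

safe_to_post = redact_keys(sample)
print(safe_to_post)
```

Everything else in the config is left untouched, so the post still shows the channel, PAN ID, and so on.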

Note, under “Edit” there is now a “Code” tab for Items too (finally) in the OH 4.0 snapshots. If you’ve got it, post the code from it when you want to show your Item config. But in this case it’s kind of irrelevant to the problem.

Actually, this is all from 3.4.2 still. And it is indeed possible that something “not exposed” got changed on the coordinator when I went 4.0 and back again.

  • Battery powered. Although the Mains powered plugs had to be readded as well. I intentionally have not touched the other “things” yet (well, two are messed up now due to experimenting).
  • I will look at the log. I suspect not.
  • Re-adding the thing indeed fixes it all. And, I agree. Offline/Online has to be a link issue from the thing to the coordinator. If the thing won’t talk, the item certainly won’t! And as long as I use the same names, things like the scheduler still work. I was just trying to avoid recreating all my things. Part of life in the hobby world.
  • Hah! That’s how to mark “code”. Thanks.
  • I’ll check in the zigbee folder and JSON to see if there is added info that has changed that can easily be fixed.
  • And this: zigbee:coordinator_ember:72461121e7 is what you are saying can be “anything” as long as it is unique? Although now that I’m understanding more, it’s less important. Maybe.

This kind of stuff is to be expected. Hence - my sandbox approach to start.

There are also some buttons on the edit page.

For some things you can get better highlighting by putting the language after the first set of backticks.

```javascript
// JavaScript code goes here
```

Yes, the bolded part can be anything. I named mine zigbee:coordinator_ember:zg_coordinator and my Zwave controller is named zwave:serial_zstick:zw_controller. But you only get one shot at changing it when you first create it/accept it from the Inbox. After that the only way to change it is to delete and recreate the Thing.

As a rule of thumb, I always replace the randomly generated IDs with something more meaningful.

As I’m learning, it appears that my ID (for me) needs to reflect the brand/device:
Sonoff_zb_coordinator (I’ll always know what that is) or Sonoff_zb_SNZB02_01. (the 01 since I have multiples)
The NAMES can be…my coded names for internal use - the Sonoff_zb_SNZB02_01 would become my sns_TH01.

Just to throw a lot more fun in the mix.

My Ubiquiti router comes today so I’ll be firewalled. And all my WiFi stuff (for automation) will be on its own firewalled VLAN. Using a Ubiquiti Unifi Pro 6.

This is insanely fun! (and a little expensive).

It is fun indeed.

Now’s a good point for you to sit down and really think about backup and restore strategies though. As someone who just had to rebuild two of his servers I’m super happy that I had automatic backups of all my services and either restoration scripts or detailed instructions.

Unfortunately my backups didn’t include OH persistence so I lost all my historic sensor readings, but I wasn’t using them much anyway. I am adjusting my auto backups to include OH persistence from now on, though, as there are a few Items where I depend on restoreOnStartup for my rules to run properly.

Given I was running openHAB, Mosquitto, Zabbix, Wyze-Bridge, PostgreSQL, Redis, ElasticSearch, Gitlab (the restore of this was touch and go and would have been near disastrous if I did fail), Librephotos (which I replaced with PhotoPrism instead of restoring), Plex, Calibre, Nextcloud, and a Minecraft server (for the ten-year-old and I to play together in) on these two VMs and all I lost was my MapDB and rrd4j data in OH, I consider this a huge success.

And the way I deploy and configure my machines (Ansible) meant that I could start over on Proxmox instead of an old unsupported version of ESXi and fresh installs of Ubuntu 22.04 instead of three times upgraded from Ubuntu 16.04 (IIRC) with all the cruft that built up from that.

Ouch. Backup? lol

If I don’t care about persistence…and I restore to the exact same version in the same location, can I simply mass backup my entire OHHome directory and restore? Actually, would that also keep my persistence data?

This goes back to – what does backup MEAN? Well, what does backup have to INCLUDE (for Windows).

Use openhab-cli backup to make OH backups which grabs everything you need but none of the stuff you don’t. On Windows there’s a backup.bat under runtime/bin IIRC. Essentially you need all of config and most userdata excluding userdata/cache and userdata/tmp (those will get repopulated automatically). If you’ve installed add-ons manually you might want to grab the runtime/addons folder as well.
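To illustrate the “all of config and most of userdata” rule, here is a rough Python sketch that lists which files would be kept. It is not a replacement for openhab-cli backup or backup.bat. The function name `files_to_back_up` is made up, and it assumes a single OH root folder with `config` and `userdata` directly under it; the demo runs against a throwaway temp folder with placeholder files.

```python
import tempfile
from pathlib import Path

# Folders that OH repopulates on its own, so they can be skipped.
EXCLUDED = {("userdata", "cache"), ("userdata", "tmp")}


def files_to_back_up(oh_root: Path) -> list:
    """List files under config/ and userdata/, skipping userdata/cache and userdata/tmp."""
    selected = []
    for top in ("config", "userdata"):
        base = oh_root / top
        if not base.is_dir():
            continue
        for path in sorted(base.rglob("*")):
            rel = path.relative_to(oh_root)
            if path.is_file() and rel.parts[:2] not in EXCLUDED:
                selected.append(rel.as_posix())
    return selected


# Demo on a temporary folder with placeholder files (not a real OH install).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in (
        "config/items/demo.items",
        "userdata/jsondb/demo.json",
        "userdata/cache/bundle.bin",
        "userdata/tmp/scratch.txt",
    ):
        target = root / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("placeholder")
    backup_list = files_to_back_up(root)

print(backup_list)
```

The resulting list could be handed to whatever archiving tool you prefer; manually installed add-ons under runtime/addons would need to be added separately.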

Only MapDB and rrd4j. It won’t include external databases like InfluxDB.

I personally run OH in a Docker container so I have just the one root folder I can grab all at once (and there’s no openhab-cli). Essentially, every time you create a new container it’s like moving everything from one machine to a new machine. So if I just grab everything I mount to the container, I’ve got everything.

One level of backup is handled because I use git to version control my OH configs. Normally a restore would just be a git clone and I’m good. But that only works when:

  1. everything is checked in to git and I exclude persistence
  2. the git server isn’t also offline

The reason GitLab not being restorable would have been a disaster is that it was my backup for OH (among other things). Bad Rich! Bad Bad! I know better. And checking changes into git isn’t automated either so it can go a long time between commits.

Both GitLab and openHAB were running on the same physical machine and the same physical hard drive! It’s not much of a backup if it’s on the same hard drive, even if it’s in a separate VM. If it’s important it shouldn’t even be backed up in the same building. If it’s really important it should not even be backed up in the same city.

cleared cache and got my things back. Now working on my items.

  • Url: /settings/model/links/edit/zigbee:device:72461121e7:00124b00290f118f:00124B00290F118F_1_temperature
  • Path: /settings/model/links/edit/zigbee:device:72461121e7:00124b00290f118f:00124B00290F118F_1_temperature

These do not seem to be literal paths. NONE of the path segments exist. Thoughts? I’m going to check again, but I think all the UID segments are correct.

Where are you seeing these? Except for the “edit” part they look like a reasonable value that might be in a Link.

It shows up after I unlink and link. I get this. (Cache still?)

If I delete and recreate the item it works. I’m going to compare some before and afters…

An interesting thing of note. A zigbee directory. And the only thing in it is the coordinator. No other devices.

All the useful code I’ve found is JSON or in the huge zigbee directory.

I WILL figure this out.

Shows up where? I don’t have any idea what I’m looking at.

Shouldn’t be related to cache. That’s basically just where add-ons get installed to. When you clear the cache it causes OH to download and reinstall the selected add-ons.

I don’t know what the folder is for but I only have a firmware and my coordinator in that folder. It’s created and managed by the binding.

JSON is the lingua franca of OH. You’ll find JSON pretty much everywhere you look.

As soon as I click Link, select the property and OK it pops up a box with this text in it.

Select which property?

This is an existing Link or creating a new Link?

Selected from the Item to the Link or from the Channel to the Link?

I can’t find anything similar in my system.

Maybe screen shots or animated GIF showing how you get to the link and what you click on to get the dialog would be informative.

I captured a bunch of screenshots. But, I think it happens after a “loop” of opening and reopening dialogs and clicking SAVE.

ie (using my item names): Click the Carport Temperature ITEM.
Click the sns_TH06 Channel Link (offline)
Click the item again (I think I’m creating a stack that is a bit… over stacked?)
Edit
Save
Back

and I get this screen

Bottom line?
Since the channel is not seen, I tried to unlink, then recreate a link to the same channel. Doesn’t fix anything.
If I delete the ITEM and recreate (along with the channel) everything is happy.

This is more a curiosity thing at this point. Especially since, as Matt pointed out in the Reolink thread, 4.0 M3 is due out soon. And any fixes are more likely to be done in 4.

And, I just noticed that my items that I have not fully recreated are offline again. Just easier to spend 15 minutes and recreate them.

OK, the “Not found” part is pretty critical information.

Are you hitting the back icon or back through the browser?

If you navigate to it some other way (e.g. go to Things → Thing → Channel → Link) do you get the same error?

If you query for the Link with that UID (everything after “edit/” I think is the Link UID) is there anything returned? Can you find that Link in $OH_USERDATA/jsondb/org.openhab.thing.link.ItemChannelLink.json?
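If you would rather not eyeball the JSON file, a small Python sketch like this can search it for the channel UID. It walks any JSON structure and reports where the text occurs, so it makes no assumption about how the file is laid out. `find_string` is a made-up helper, and the sample document below is invented purely for the demo (it is not the real layout of the Link file); only the channel UID comes from this thread.

```python
import json


def find_string(node, needle, path="$"):
    """Return the JSON paths of every key or string value that contains needle."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}[{key!r}]"
            if needle in key:
                hits.append(child)
            hits.extend(find_string(value, needle, child))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits.extend(find_string(value, needle, f"{path}[{index}]"))
    elif isinstance(node, str) and needle in node:
        hits.append(path)
    return hits


# Invented document for the demo; a real run would json.load() the Link file.
sample = json.loads(
    '{"entry1": {"itemName": "MastBath_Temp", "channel": '
    '"zigbee:device:72461121e7:00124b0029100dc0:00124B0029100DC0_1_temperature"}}'
)

matches = find_string(sample, "00124B0029100DC0_1_temperature")
print(matches)
```

No matches at all for a channel you believe is linked would be a strong hint that the Link was never saved or was removed.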

Perhaps, but you can end up with orphaned Links if you are not careful which can sometimes cause problems. It might be worthwhile just clearing them out. If you log into the Karaf Console you can issue the following command to see if you have any orphaned links:

openhab:links orphan list

And the following command to remove them:

openhab:links orphan purge