openHABian + RPi4 + Z-Wave Gen5 falls offline on restart

Good catch, @Wolfgang_S. It’s somewhat amusing that @Andrew_Rowe and I were both involved in that similar conversation last year, but in a slightly different context that made it unrecognizable.

Reading further into that thread, a fix to remove the stale lock file was introduced in 3.4, but maybe it only works for containers?

Thanks. It’s nice to know I’m not the only one seeing this issue and there’s a known workaround!

I also opened an issue on GitHub and Chris responded. I'm curious whether the binding could clear out the lock files when it initializes. Or, if it wants to be nice, it could clear only those that are older than the system uptime. I'm not sure whether the binding can determine the exact path where the lock files live through the Java libraries.
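To make the "older than the system uptime" idea concrete, here is a rough sketch in Python rather than in the binding's Java. It is not how the binding or NRJavaSerial works. The function names, the `LCK*` glob pattern, and the use of `/proc/uptime` are my own assumptions for illustration, and the lock directory is passed in rather than hard-coded.

```python
import time
from pathlib import Path


def boot_time_epoch(uptime_path="/proc/uptime", now=None):
    """Return the boot time in seconds since the epoch (Linux only).

    The first field of /proc/uptime is the number of seconds since boot.
    """
    now = time.time() if now is None else now
    uptime_seconds = float(Path(uptime_path).read_text().split()[0])
    return now - uptime_seconds


def find_stale_locks(lock_dir, boot_epoch, pattern="LCK*"):
    """Return lock files that were last modified before the current boot."""
    return sorted(
        p for p in Path(lock_dir).glob(pattern)
        if p.is_file() and p.stat().st_mtime < boot_epoch
    )


def remove_stale_locks(lock_dir, boot_epoch, pattern="LCK*", dry_run=True):
    """Delete stale lock files; with dry_run=True only report them."""
    stale = find_stale_locks(lock_dir, boot_epoch, pattern)
    if not dry_run:
        for p in stale:
            p.unlink()
    return stale
```

Calling `remove_stale_locks("/var/lock", boot_time_epoch())` would only list candidates until `dry_run=False` is passed. One limitation of this check: a lock left behind when openHAB is restarted without a reboot is newer than the boot time, so it would not be removed.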

FWIW I think zram slightly masks the bug because (and I’m guessing here) the lock files don’t always get committed out of the RAM cache. I tried moving OH to a USB-mounted device on the Pi and saw consistent restart failures. Then I tried OH on a VirtualBox Ubuntu Server on a NUC and also saw consistent restart failures. Then I went back to the Pi, turned off zram on the SD card file system, and saw consistent failures.

No, that wouldn’t be the problem. The folder where the lock resides, /var/lock, isn’t put into zram. That’s an operating system controlled file system and not one that openHABian messes with.

In fact, if the lock files were in zram, they would be lost when you pull the power, because they wouldn’t exist on disk.

When you pull the power, you prevent the cleanup that removes the lock file during an orderly shutdown. That’s why it fails on all those different platforms and configurations too.

No, the first link in this post goes back to the original NRJavaSerial issue, raised by a contributor to that library. Wouter forked NRJavaSerial and patched it for us, but this is a known bug in the library.

Andrew - is the upshot that deleting the /var/lock/LCK…device files is not a reliable way to restore the connection?

Seems like this issue has been around for years and may merit a fix. How can I help?

No, that works. This bug has been in NRJavaSerial for years. This post is from a (closed) GitHub issue in the NRJavaSerial repository that was started in November 2020. In this exact post, MrDos (who is a maintainer of that repository) explains that he knows there is a file leak and really doesn’t know how to fix the problem. I’ve seen this issue manifest itself in many odd ways. Sometimes seemingly unrelated problems end up being a result of this bug.
I think it also affected the modbus binding. If you search, there is a thread about using an alternative serial library.

Know Java?

I can hack my way around in Java.

In the meantime, I could see openHABian adding “rm /var/lock/LCK…*” to the startup script.
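For anyone who would rather see what gets removed than run a bare rm, here is a minimal Python sketch of the same unconditional cleanup. The function name and the `LCK*` pattern are assumptions of mine, not anything openHABian ships, and the directory is passed in explicitly.

```python
from pathlib import Path


def clear_lock_files(lock_dir, pattern="LCK*"):
    """Delete every regular file in lock_dir matching pattern.

    Returns the names of the files that were removed, so the caller can log them.
    """
    removed = []
    for path in sorted(Path(lock_dir).glob(pattern)):
        if path.is_file():
            path.unlink()
            removed.append(path.name)
    return removed
```

Something like this would have to run before openHAB starts. If it ran while openHAB already had the port open, it would delete a lock that is still legitimately in use.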

I would also add a note to the binding docs (at least zwave, modbus and serial) that there’s a known bug with stale lock files on Linux, and suggest removing them manually.

Hi Asher - I am still experiencing a similar issue but have not rebooted in a while, so it’s on the back burner … here is my thread on this … I also posted in the GitHub thread for this issue.

Thanks. From the various threads it sounds like there may be a fix but it’s not clear it’s integrated in OH 3.4.4. Does anyone know how to message Wouter since he seems to own this integration?

FWIW I just looked at the NRJavaSerial library code and I think the LCK files are created in a C library that the Java library wraps. So this is an ancient anomalous behavior. (I’m trying not to use the word “bug”.)

In the one thread (not the git issue, the one in our forum) I believe there is a script someone posted to delete all the stranded lock files.

it’s a bug
