Oh no, I am not suggesting that! I also think (guess) that the issue is in the oH-supplied service. I always thought this was changed somehow in early version 3: either switched to nrjavaserial from something else, or nrjavaserial was upgraded, etc.
I also only have the zwave stick attached to this box, no other serial sticks etc to compare with.
That is not my understanding.
Full disclosure: I am not a java programmer. I can only give my interpretation of the forum posts I’ve read.
The issue with nrjavaserial is that it leaks lock files. This manifests as the problem Ardanedh and Cplant are having: when they restart openHAB, their USB zwave dongles cannot use the USB port because the port is blocked by a lock file. When nrjavaserial assigns a device to a particular USB port, it creates a lock file. When it shuts down, or a device is unplugged and no longer using the port, the lock file is supposed to be deleted, freeing up that port for further use. Because of the bug in the software, these lock files are not deleted, and when a new device attempts to use the port, it cannot because of the lock file. Over time, many lock files often accumulate.
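To make the "leaked lock file" idea concrete, here is a rough sketch of how one could check whether a lock file is stale. It assumes the lock file holds the PID of the process that created it as its first field (my understanding of the RXTX-style lock format, not something confirmed in this thread), and it assumes Linux with /proc available. The function name lock_is_stale is made up for illustration. Inside a Docker container the PID is only meaningful in that container's own PID namespace, so run the check where the lock was created.

```shell
#!/bin/sh
# Sketch only: decide whether a serial lock file looks stale.
# Assumption: the first field of the lock file is the owner's PID.
# lock_is_stale is a hypothetical helper name, not part of nrjavaserial.
lock_is_stale() {
    lock_file="$1"
    # Take the first whitespace-separated field of the first line.
    # A missing or unreadable file yields an empty PID.
    pid=$(awk 'NR == 1 { print $1 }' "$lock_file" 2>/dev/null)
    case "$pid" in
        # Empty or non-numeric content: treat the lock as stale.
        ''|*[!0-9]*) return 0 ;;
    esac
    # On Linux, /proc/<pid> exists only while that process is alive.
    if [ -d "/proc/$pid" ]; then
        return 1    # owner still running, lock is live
    fi
    return 0        # owner gone, lock is stale
}
```

For example, `lock_is_stale /var/lock/LCK..ttyACM0 && echo stale` would print "stale" only when no running process matches the PID in the file.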
The script Ardanedh has placed in his init file deletes all the lock files before the container is started. Stopping the host that the container is running on will often delete the files as well. Again, this is just my understanding from reading the forum posts.
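I have not seen Ardanedh's actual script, so the following is only a guess at what such a cleanup could look like. The function name clean_serial_locks and the default directory /var/lock are my own assumptions; the LCK..tty* file name pattern is taken from the lock file names mentioned in this thread.

```shell
#!/bin/sh
# Sketch only: remove leftover serial lock files from a lock directory.
# clean_serial_locks is a hypothetical helper name; the directory defaults
# to /var/lock (an assumption) and can be passed as the first argument.
clean_serial_locks() {
    lock_dir="${1:-/var/lock}"
    for f in "$lock_dir"/LCK..tty*; do
        # When nothing matches, the pattern stays unexpanded; skip it.
        [ -e "$f" ] || continue
        echo "removing lock file: $f"
        rm -f -- "$f"
    done
}
```

Calling `clean_serial_locks /var/lock` on the host before the container starts would clear every serial lock in that directory, so it only makes sense when nothing else on the box is legitimately using a serial port at that moment.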
The one thread I linked above has posts from one of the developers of nrjavaserial in which he states how the lock files are created, how they are supposed to be destroyed, and which portion of the code does not seem to be working as it should. He goes on to explain the steps he has taken, so far unsuccessfully, to fix the issue and what steps he thinks may still be needed to fix it.
Edit:
I dug around on git and found some commits to core concerning nrjavaserial. A recent one by Wouter (here, Apr 8) has been merged. Should be in 3M
This is a fix for this Modbus issue which includes a very long discussion
OK, maybe I should rephrase; the issue I am seeing (I should really only speak about my own issues) is probably some kind of race condition. Yes, in my setup lock files are also created and sometimes not deleted as they should be. But that is only part of the issue. I have deleted many lock files on my not-so-lucky days, over many attempts at starting oH. In my case, it is not as easy as making sure there are no lock files before starting oH.
For me, stopping, starting, or unplugging/replugging/changing the port in the UI makes it start after a while. But also, like I said, sometimes it just starts at the first go. I have not been able to see a pattern in when it starts and when it does not.
The thing @chris mentioned (testing the serial port by opening it), if still present in nrjavaserial, could be part of the issue.
OK Micael, it actually sounds like the same issue may be the problem for you as well. I have been digging through git and found one very recent commit by Wouter.
Most importantly this fixes a file descriptor leak when checking lock dir permissions.
Please see my edit in my above post for links. Apparently core is running a patched version of nrjavaserial. As of Apr 8, there should be a fix. What versions is everyone running?
So to summarize, this is not a Zwave binding issue. Nor is it an Aeotec Gen5 stick issue. Modbus binding users are having problems as well. Please see this post by ssalonen from Mar 2021, which includes links to other discussions concerning nrjavaserial
Thanks for digging into this Andrew!
Any kind of poking in nrjavaserial raises my hopes for the new 3.3 version. I am still on 3.1, but have decided to upgrade to 3.3 as soon as time permits.
Just for the sake of completeness (or more like: as documentation for myself when I bump into the problem next time, and until I’ve got around to doing @Ardanedh’s init-file fix):
Log into the openHAB docker container: docker exec -t -i openhab /bin/bash
Access the respective folder with the lock file that shouldn’t be there: cd /var/run/lock
Delete the respective file that shouldn’t be there: rm -f LCK..ttyACM0
Restart the openHAB container (in my case via Portainer)
Done.
Worked 100% of the time for me, and is at least a bit more elegant than re-creating the entire container.
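The same steps can probably be done from the host in one go, without opening a shell in the container. This is only a sketch: the container name openhab and the lock path are taken from the steps above, `docker restart` stands in for the Portainer restart, and the function name clear_lock_and_restart as well as the DOCKER override variable are made up for illustration.

```shell
#!/bin/sh
# Sketch only: delete one lock file inside a running container, then restart it.
# clear_lock_and_restart is a hypothetical helper name.
clear_lock_and_restart() {
    container="${1:-openhab}"
    lock_file="${2:-/var/run/lock/LCK..ttyACM0}"
    # DOCKER is a hypothetical override (e.g. DOCKER=podman); defaults to docker.
    docker_cmd="${DOCKER:-docker}"
    # Only restart if the delete command inside the container succeeded.
    "$docker_cmd" exec "$container" rm -f "$lock_file" &&
        "$docker_cmd" restart "$container"
}
```

Run as `clear_lock_and_restart openhab /var/run/lock/LCK..ttyACM0`; adjust the device name to whatever your stick shows up as.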
I also recall there were permissions issues surrounding this file in past posts about Docker. I’m on openhabian on a Pi4 and the file is set up so openhab can read and write it. Don’t know if that is relevant, but I have seen notes in the log that the stale file was removed by openhab on a restart.
Quick question on this, since, from what I can see, the nrjavaserial bug appears to be still open: any chance of setting some incentives (e.g. on Bountysource) to get this fixed? Or is this naive?
Maybe let’s start something on Bountysource for this. Being able to reboot my openHAB Raspi without the need to manually fiddle around with my Aeotec Z-Wave stick would be something of actual value for me.
FYI … not sure if you have the same experience, but with mine it seems that if I don’t turn off the Pi and just do a Stop / Start, it comes up reliably every time (not enough testing to say this for sure) … also, if I turn it off, unless I remove the ethernet cable and let the unit sit for a while before rebooting it, it seems to always have this issue. (If that is true, I suspect some HW bit getting stuck in a state.) But again, very little data to make any conclusive statement. The only thing conclusive is that it’s a total pain in the ass to go through several boots to get it to come up (I’m on the dual-usb zwave/zigbee), and until zigbee is all Online, zwave is stuck.
If you use Docker and can fix it every time by manually removing those lock files… we could also automatically remove the lock files when the container starts. Normally they would also have been removed by the OS on startup.
Interesting. I also don’t have that much data (maybe n=15 or so over the past 6 months), but in my case it is exactly the opposite, since I never turn off the Raspi. So most likely no connection to that.
This one here helps me every time btw. Quite annoying, but at least it brings the stick back online.
I am running openhab-alpine:3.4.1 and still have problems with Serial input via /dev/ttyUSB0. I always need to delete the lock file (created by openhab user) at /var/lock/LCK..ttyUSB0 and reinitiate the Smart Meter Thing to get things working for a couple of minutes. Any ideas?
I have the same issue, but on an rpi4 running Openhabian with an Arduino connected.
When trying @Cplant’s solution, the file isn’t removed. Any ideas? Thanks
Sorry for being unclear.
I couldn’t delete the files without sudo, for access reasons. And with sudo I don’t get an error, but the files aren’t removed.
For now, after a few restarts, the problem is gone, but I have had it various times and it’s unclear when restarting helps and when it doesn’t.