Summary
I lost my OH2.4 system to an HDD failure 2 weeks ago and, after several rebuilds, have been trying to diagnose the issue ever since. I sincerely hope I’m overlooking something very simple, perhaps some config/setting made years ago. Looking for suggestions, as I’ve been troubleshooting (probably over-analyzing) this for 2 weeks now.
It doesn’t make logical sense that the binding is at fault, as I’ve started from a clean install of OH2.4.0/Zwave Binding 2.4.0, the same version I’ve been on for over 9 months. However, since the rebuild, when OH is started, the ZWave stick status LED (which normally cycles through a color pattern every ~0.5 seconds) looks like it “freezes”, but if you wait long enough, it will cycle to the next color. The delay varies, but I’ve timed it at ~12 minutes, as an example. The response of Z-Wave devices seems to follow suit (for example, if I switch a device on/off in Paper-UI, the delay before it actually happens seems to coincide with a similar timing). Using the Sigma “Z-Wave PC-Controller” v4.78 software (aka ZenSys Tools), I can switch devices on/off immediately and leave it running for hours/overnight with no issues that I can tell.
I have switched PCs, Z-Sticks, software and everything else I can think of to rule out as many variables as possible (as far as I can without the benefit of the original drive). I’ve tried to capture all of this below, which hopefully helps identify some “ah-ha” thing I’m missing or determine whether this is actually a bug somewhere.
Details / Troubleshooting / Observations
(not necessarily in order)
-
Prior to failure:
- OH 2.4.0 Stable Release
- 2 core/ 4GB RAM dedicated system. (Small Form Factor PC)
- Aeotec G5 ZWave Stick
- Ubuntu 14.x LTS (not 100% sure now)
- Had been running with no issues since 2.4 was released, I believe last December
-
HDD issue occurred when the OH system was inadvertently powered off.
- ironically, while the breakers were off to install 14 new Z-Wave switches / devices
- (Yes, it was on an APC UPS, and yes, I know the battery was overdue to be replaced)
-
Rebuilt PC with new SSD
- (had already planned to replace the spinning HDD with an SSD many months ago, so again ironically, already had the new SSD waiting for this)
- Ubuntu 18.04 LTS
- Java OpenJDK 8 JRE/JDK Headless (build 1.8.0_222-8u222-b10-1ubuntu1~18.04.1-b10)
- Initially installed OH 2.5.0.M3; when I noticed this behaviour, I removed it and did a clean install of 2.4.0
-
When OH 2.4.0 (and 2.5.0.M3) starts, Aeotec G5 stick appears to freeze during each OH initialization…
- Z-Stick LED, which typically sequences through red/blue/yellow every ~0.5 seconds, can then take ~12 minutes to advance to the next color.
- After OH2.x initialization and the lock-up occur, reinserting the Z-Stick (power cycle) returns the LED status sequence to normal
-
No obvious errors/issues (to me) observed with debug logging enabled in OH for “org.openhab.binding.zwave” (and at ROOT level).
-
Using Chris’s z-wave log viewer with zwave at debug level, nothing obvious (to me) either.
-
Devices do respond after some time, anywhere from immediately to ~15 minutes (I believe this could be related to the timing window above)
- (appears related to the zwave stick led status sequence timing)
-
Monitoring/Operating Z-Wave stick (via Sigma Z-Wave PC Controller) for long periods of time:
- No z-stick “lock-ups” (LED sequence / cycle normal)
- I see what appear to be typical Z-Wave packets… data received, etc.
- basic on/off control operations sent to devices are immediate
-
Originally thought the Aeotec Home Energy Monitor (new device) might be “flooding” the Z-Wave network, based on reports in the forums.
- unplugged it initially
- then completely removed the node from the Z-Wave G5 stick.
- it remains powered off / not joined
-
OH Version
- Clean new install of 2.5.0.M3
- Downgraded to Stable 2.4.0
- Clean new install of Stable 2.4.0 (Current state)
-
Same behaviour with original/current and new physical Aeotec g5 Z-Sticks
-
OS/Ubuntu versions
- As mentioned above, new install on 18.04LTS
- Tested another clean OH2.4 install on 16.04LTS
- with all newly added items removed from Z-Stick
-
Permissions: tried running OH as root instead of the usual openhab user
-
Removing all recently added Z-Wave nodes (>= node ID: 45)
- incrementally removed Z-Wave node IDs from 58 down to 44 (reboot/reset between each)
-
At various stages, I gave the OH system lots of time (many hours / overnight) to let the Z-Wave network healing processes, etc. do their thing.
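Since nothing jumps out of the debug logs by eye, one thing that might help is scanning them for long silent periods, to see whether they line up with the ~12 minute LED stalls. Below is a minimal Python sketch of that idea. It assumes each log entry starts with a timestamp of the form `YYYY-MM-DD HH:MM:SS.mmm` (adjust the pattern if your log looks different). The function name `find_gaps` and the optional `needle` filter are hypothetical, made up for illustration; they are not part of openHAB or the binding.

```python
import re
from datetime import datetime

# Assumed timestamp prefix of each log entry: "YYYY-MM-DD HH:MM:SS.mmm"
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})")


def find_gaps(lines, min_gap_seconds=60.0, needle=None):
    """Return (gap_seconds, line_before, line_after) for every pause
    of at least min_gap_seconds between consecutive timestamped lines.

    If needle is given, only lines containing that substring
    (case-insensitive) are considered.
    """
    gaps = []
    prev_ts = None
    prev_line = None
    for line in lines:
        match = TS_RE.match(line)
        if not match:
            continue  # continuation lines (e.g. stack traces) have no timestamp
        if needle is not None and needle.lower() not in line.lower():
            continue
        ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        if prev_ts is not None:
            delta = (ts - prev_ts).total_seconds()
            if delta >= min_gap_seconds:
                gaps.append((delta, prev_line.rstrip(), line.rstrip()))
        prev_ts, prev_line = ts, line
    return gaps


# Example usage (the log path is an assumption; point it at your own file):
# with open("/var/log/openhab2/openhab.log", errors="replace") as f:
#     for seconds, before, after in find_gaps(f, min_gap_seconds=300):
#         print(f"{seconds:8.1f}s gap\n  before: {before}\n  after:  {after}")
```

If the gaps it reports cluster around the same duration as the LED stalls, the log lines just before and after each gap would be the ones to look at (or post) first.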
Again, hope I’m missing something simple!! Thanks all!