AeonLabs Z-Stick starts killing off nodes for no reason

My Z-Stick has been super reliable for well over two years now. I recently moved my openHAB server to a Proxmox hosted openvz container. For the most part everything has been running great, after getting my head around the USB device pass-thru for openvz containers (but that is another story!).

Anyway, this afternoon I got a notification saying a Z-Wave node had died. Attached are the logs (it all happened around 4:06pm). I can’t see anything out of the ordinary: all of a sudden node 11 failed to respond and then there was an avalanche of node deaths.

I attempted to restart openHAB but the stick failed to recover. I resorted to unplugging and re-inserting the stick and restarting my openvz container, and it seems to be ok again.

Has anyone had any issues running openHAB + Z-Wave USB stick in an openvz container?

@chris - if you get a chance can you run the attached log thru your analyser and see if there is anything obviously wrong? From what I could tell there was nothing there to indicate any problems so I am at a bit of a loss as to what has gone wrong here.

Cheers,
Ben

Hmmm - can’t seem to attach anything to this post…will try in a follow up…

Here is my log… http://pastebin.com/038CgQLe

Hmmm - seems I spoke too soon. Looks like I have a major problem - after restarting the container and openHAB the binding starts up ok but nothing is responding and eventually all nodes are marked dead.

I ended up having to reboot the entire host machine in order to get things back working. Not sure this was related to the binding at all now, seems like the host machine got itself in a bit of a state. A bit of a concern as the whole point of moving to a VM environment was to make things more stable and easier to recover from disasters!

Hey Ben,
You said your stick has been ok for “2 years” - is this still the v2 then? (I thought I saw a post from you saying you’d changed to the Gen5?).

As I’m sure you know, the V2 stick has some problems - this sounds like it might be related (but that’s a guess). I don’t know if I’ve seen reports of the same problems with the Gen5 (which I’m also using here now :smile:).

Let me know if the issues continue - I’ll take a look at the log a bit later…

Cheers
Chris

I have seen similar symptoms in a non-virtual environment (Pi 2) once or twice. OH looks like it’s running from the console, web UI and iPhone UI, but nothing is happening in the real world. Bus event stream looks normal. Z-stick lit up normally. Fixed by a cold start of the machine.

My theory is some edge-case bug in low-level access (i.e. at the OS or driver level) to the COM/USB port.
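For anyone wanting to rule out the simplest version of that theory, here is a minimal sketch of a check that the serial device node is still visible and openable from inside the container or VM. The device path (for example `/dev/ttyUSB0`) and the function name `probe_port` are my assumptions, not something from this thread; use whatever port your binding is configured with. Note that a stick which is lit up but silent may well pass this check, so it only catches the case where the device has vanished or is locked.

```python
import errno
import os


def probe_port(path):
    """Classify a device path as 'missing', 'busy-or-denied', 'error' or 'openable'.

    `path` is whatever serial port the binding uses (hypothetical example:
    /dev/ttyUSB0). This only tests that the node exists and can be opened;
    it does not talk Z-Wave to the stick.
    """
    if not os.path.exists(path):
        return "missing"
    try:
        # O_NONBLOCK avoids hanging on a port that is waiting for carrier;
        # O_NOCTTY stops the port becoming our controlling terminal.
        fd = os.open(path, os.O_RDWR | os.O_NOCTTY | os.O_NONBLOCK)
    except OSError as exc:
        if exc.errno in (errno.EACCES, errno.EPERM, errno.EBUSY):
            return "busy-or-denied"
        return "error"
    os.close(fd)
    return "openable"
```

Run it with openHAB stopped, e.g. `print(probe_port("/dev/ttyUSB0"))`. A result of `missing` after a failure would point at the USB pass-through rather than the binding.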

@chris - I did test a new Gen5 stick a few months back but the main
controller at home is still the original S2. Didn’t fancy migrating all 30
devices across!

So after everything seemed ok after a reboot, it began failing again after
about 20mins. Now all nodes are dead although I have noticed every now and
again a command gets thru. Generally speaking tho everything is dead.

This has been running ok in my openvz container for a month before this
happened so I am at a bit of a loss.

Any hints in those logs?

Update: deleted all nodeXX.xml files from /etc/zwave and restarted the binding and so far (10 mins) everything is looking good. Will keep a close eye on it and report back if this changes.
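If you want to try the same thing without losing the files, here is a minimal sketch that moves the node XML files into a timestamped backup folder instead of deleting them. The `/etc/zwave` location and the `nodeXX.xml` naming come from the post above; the function name `stash_node_files` and the backup location are hypothetical, and I am assuming openHAB is stopped before this runs.

```python
import shutil
import time
from pathlib import Path


def stash_node_files(zwave_dir, backup_root):
    """Move node*.xml files from zwave_dir into a timestamped folder.

    Returns the list of destination paths (empty if nothing matched).
    Other files in zwave_dir are left alone.
    """
    matches = sorted(Path(zwave_dir).glob("node*.xml"))
    if not matches:
        return []
    # One folder per run so repeated attempts do not overwrite each other.
    dest = Path(backup_root) / time.strftime("zwave-nodes-%Y%m%d-%H%M%S")
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    for node_file in matches:
        target = dest / node_file.name
        shutil.move(str(node_file), str(target))
        moved.append(target)
    return moved
```

Something like `stash_node_files("/etc/zwave", "/root/zwave-backup")` would do it, where the second path is just an example. If clearing the files makes no difference, they can be copied back from the backup folder.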

I can’t imagine it is related, however. I am predicting it will die again in the next hour or so.

Well, 8 hours on and everything is fine. I am thinking it was just a coincidence that after clearing out my node XML files everything started up ok, but I thought I better note it down here in case someone comes across this thread with the same/similar issue and wants to give it a try.

A little concerning as I am about to head away for 3 weeks and leave the house with a pregnant wife! I will not be popular if everything ceases to work again while I am away…

Hi Ben,
I had a look through the log - it doesn’t really show too much. Everything is timing out - including requests to the controller! This presumably means a fundamental breakdown in communications - either in the HABmin end, or the stick end… My guess is that this is associated with the z-stick since this is reasonably indicative of what we’ve seen in the past when the stick goes AWOL (and what people have reported over on the OZW list).

Not too helpful - sorry.

Cheers
Chris

Thanks Chris. Appreciate you having a look. If it happens again I might
have to look at migrating to the Gen5…

I seem to have the same problem.
I am renaming the xml files in etc… let’s see

nope. renamed, restarted. node grey in habmin

Grey node doesn’t mean the binding/stick is borked - it usually just means a battery powered device hasn’t finished initialising yet.

Thanks…I realized that after reading the bindings doc.
Will have to reinclude the sensor.
What is strange is that the first time I tried it, it was instantaneously in and green.

Unfortunately my stick has done it again - about a week to the minute that it happened last time. Very strange coincidence that it has happened on a Friday around 4pm both times…

Anyway - my Saturday is now going to be spent migrating all my nodes across to a new Gen5 Z-Stick in the hope that will resolve these issues.

Hi Ben,

Did it work when you migrated to Gen 5? I am having the same issue with this particular stick, both in openHAB 1.x and openHAB 2 on two different VMs (ESXi).

I only found this thread now, so I have already posted my log etc in this thread: All z-wave nodes go from online to offline (not communicating with controller)

Yes the Gen5 stick has been rock solid for over a year. Highly recommend.

Ok. I am having the problem with the Gen5. I am guessing my problem lies somehow with the fact that it runs on a VM. The stick functions normally when I run it bare metal on an old laptop. The strange thing, though, is that it has the same symptoms as your S2.