2.4.0-SNAPSHOT Build #1447
Aeotech Z-Wave Gen 5 stick
~65 Z-Wave nodes
@chris could you please take a look at the debug log below?
node3 was flicked (in sitemap), but took 28s to respond. I see this throughout my whole network.
When entering a room it takes longer than usual for my Squeezebox to come on, and even longer for the lights. Delay is not consistent, but is there more often than not.
When tailing the zwave debug log, there are no frantic continuous activity.
When this log was taken, I had just power cycled my PC and also unplugged/re-plugged the Z-Wave stick.
A top looks ok. Java/OH only consumes 5-9% CPU
Thanks @mhilbush, for chiming in.
I checked node46 and it’s a Fibaro ‘The Button’ a battery device, but when I checked just now, it was quiet.
But I will monitor my /var/log/openhab2/zwave.log for timeouts.
I found that one node no longer in the net was getting Timeout retries every 8s, but that node is long gone, no longer in the net, so after a marked as fail and remove from controller in habmin that stopped.
I still got timeouts from 5-6 nodes and 3 in particular, and re-init/heal of those did not work. (stopped communicating)
I power cycled them, and viola, no timeouts for 30 minutes now, and response in net is very good.
How many timeouts are acceptable?
How often does it occur for others? (interesting to know)
Well, to sum up:
Yesterday, I became aware of a node (an Aeon Multisensor 6) that was spamming my net with BURGLAR events almost every 100ms. I think that was because it was placed in the roof looking down on my router lights and the LED display of 2 thermostats. Relocating it made it shut up.
I’m thinking that period of spam (was probably several days) could have left some of the meshing/relaying nodes in a peculiar state, such as those 3 I had to power cycle.
Also the zombie node (off net for ~3 weeks) was getting timeouts every 8s.
So, this shows that it is a good idea to monitor the network looking for talkative nodes and timeouts in the Z-Wave logs with DEBUG on. (for Z-Wave logging to own file, have a look here.)
Would it be possible to monitor these things and report it via some sort of Network Health?
Logging Z-Wave DEBUG severity on a regular basis should perhaps be set up to use a RAM/tempfs disk…
Just catching up with messages - it looks like you’ve worked this out now?
This would potentially be possible - I’m not sure completely how this would look like, but we could monitor things like a high level of SPAM and provide some sort of report or channel. I’m open to ideas about this as it is something that devices do seem to do occasionally .
Well, it should be low. The occasional timeout is ok, but I would expect the timeouts to be less than 1% of traffic (or in that range). As you see - the more timeouts there are, the more the network will slow down.