WakeupTimerTask -- what is it?

tarcus · December 2, 2018, 4:12pm

There was a debug log posted earlier in this thread. The problem I’ve got is that the network seems to go through this every night, and once it completes, devices that were responding one day no longer respond the next. I’ve updated to the latest snapshot, disabled network healing, and will be keeping a closer eye on it this week to see if I can narrow down the circumstances a bit.

mhilbush · December 2, 2018, 4:13pm

Sorry, I missed that. Let me take a look…

tarcus · December 2, 2018, 4:14pm

Thanks. The debug log above was posted after disabling network healing. I have no idea whether healing can have this kind of effect or not. Unfortunately, the logs from before I disabled healing were rotated out and lost.

mhilbush · December 2, 2018, 4:17pm

What are some of the node numbers that are problematic, and what specific devices are they?

mhilbush · December 2, 2018, 4:21pm

I don’t know what node 64 is, but this doesn’t look good every 5 seconds…

5iver · December 2, 2018, 4:24pm

Search the forum for the command “REFRESH” for details. You can send this command to an item to poll the channel.

mhilbush · December 2, 2018, 4:24pm

Timeouts on nodes 3 and 58 are not good either.

(Screenshot: ZWave Log Viewer, 2018-12-02 11-23-34)

tarcus · December 2, 2018, 4:26pm

Node 64 is a mains-powered external motion sensor. This kind of thing happens only during the morning “storm” of Z-Wave activity, which is what that log is from. During the rest of the day everything, including all the 30-minute radiator polling, is working fine now. Tomorrow morning it’ll probably go through another storm.

As for problematic nodes, it’s essentially all of them. The Z-Wave network becomes extremely unresponsive in the mornings for all devices, battery-powered or mains-powered, which is what I’m trying to track down. I was looking to see if there was some event that happens every day which might trigger the network traffic.

mhilbush · December 2, 2018, 4:31pm

I see:

80 timeouts in the log
216 APPLICATION_BUSYs in the log, all or most of which are node 64

If it were my network, I’d exclude node 64, then observe the behavior. That will show whether it’s the victim or the cause.
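For anyone who wants to tally this kind of thing from their own debug log, here is a rough Python sketch. It assumes each relevant log line carries a `NODE <n>:` prefix and that the strings `APPLICATION_BUSY` and `Timeout` appear literally on the lines of interest. The helper name `count_by_node` is hypothetical, and the sample lines are invented for illustration, not real binding output, so adjust the search strings to whatever your log actually contains.

```python
import re
from collections import Counter

# Assumption: lines of interest contain "NODE <number>:" somewhere in them.
NODE_RE = re.compile(r"NODE (\d+):")


def count_by_node(lines, needle):
    """Count lines containing `needle`, grouped by the node number on the line.

    Lines that contain `needle` but no node prefix are counted under None.
    """
    counts = Counter()
    for line in lines:
        if needle not in line:
            continue
        match = NODE_RE.search(line)
        node = int(match.group(1)) if match else None
        counts[node] += 1
    return counts


# Invented sample lines, only to show the shape of the result.
sample = [
    "16:21:01.000 NODE 64: APPLICATION_BUSY",
    "16:21:06.000 NODE 64: APPLICATION_BUSY",
    "16:21:07.000 NODE 3: Timeout",
    "16:21:09.000 NODE 58: Timeout",
]

busy = count_by_node(sample, "APPLICATION_BUSY")
timeouts = count_by_node(sample, "Timeout")

# Nodes sorted by how often they show up, worst offender first.
print(busy.most_common())
print(timeouts.most_common())
```

To run it against a real file, pass an open file object as `lines` instead of the sample list. A node that dominates the `APPLICATION_BUSY` count is the first candidate to exclude or power off.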

tarcus · December 2, 2018, 4:37pm

OK, I’ll have a go. It’s on a switch, so I can just turn it off and see if it helps at all. I had these problems before node 64 was part of the network though, so I suspect it’s a symptom. The node has a LUX sensor, and at that time in the morning the light is passing from darkness to daylight, so that’s when the device is regularly trying to send new LUX readings. Once the sun gets high enough it maxes out the LUX sensor, so it stops reporting changes. I’ll check the insanely complex and ambiguous documentation to see if I can restrict it to send updates more slowly.

edit: I just had a quick zip through the logs. The luminance sensor is sending updates once a minute during the change from day to night and back again, which doesn’t seem excessive.

tarcus · December 2, 2018, 4:50pm

I just had a look through the rest of the logs. The APPLICATION_BUSY messages started at 1 AM on Saturday and stopped this morning at 7:20 AM. I’ll have a good poke around and see if I can figure out what’s causing this.

rossko57 · December 2, 2018, 5:24pm

While the openHAB framework allows for this useful command, it’s up to each binding to implement it. I don’t know whether zwave does - for instance it makes no sense to try to REFRESH (poll) battery powered devices that are asleep.

tarcus · December 4, 2018, 6:37am

While yesterday morning was fine, this morning the delay and network storm were back, with node 64 being the culprit again.

Does this log make it look like this device really doesn’t like polling? Is there a way to turn polling off completely for a device?

I’ve already set post-command polling to “disabled” and have tried to set the polling period to 10 days via HABmin, but Paper UI seems to disagree on what the polling period is. “things show” in Karaf shows “binding_pollperiod : 86400”, which is one day, not 10. It also shows “binding_cmdrepollperiod : 0”.
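As a quick sanity check on those numbers, here is a small Python sketch of the arithmetic. It assumes the poll period is expressed in seconds, which is what the 86400 figure suggests; the function names are just for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def days_to_seconds(days):
    """Convert a period in days to seconds."""
    return days * SECONDS_PER_DAY


def seconds_to_days(seconds):
    """Convert a period in seconds to days."""
    return seconds / SECONDS_PER_DAY


# A 10-day period would be 864000 seconds, ten times the value shown in Karaf.
print(days_to_seconds(10))
print(seconds_to_days(86400))
```

So a stored value of 86400 corresponds to one day, and a 10-day setting would need to show up as 864000.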

Debug log attached.

zwave.log.gz.xml (299.7 KB)

tarcus · December 4, 2018, 7:38am

Poking around a bit more, what I suspect is going on here is that openHAB is polling the luminance sensor (endpoint 2), but when there’s no light falling on it (luminance 0) it returns “endpoint busy”. So either I need to disable polling for this device (I just don’t need it), or I need to restart openHAB during daylight hours so the daily poll of the device happens during daylight, or I need to wait for summer!

Is there a way to disable polling for a device? I tried setting the polling period to 0, but it just started polling at 15-second intervals instead. I’ve also tried setting it to a second count longer than one day, but monitoring the debug logs shows that it gets capped at one day.

I’ve re-opened my previously closed issue on this problem now I know what’s causing it:
https://community.openhab.org/t/unwanted-polling-loop-traffic-causes-z-wave-slowdown/52531

tarcus · December 6, 2018, 10:23pm

If anyone knows how to disable polling for a device, please tell me. It’s brought my network down again with a polling loop.

Log here. It’s Node 64.

zwave.log.gz.xml (527.3 KB)

tarcus · December 6, 2018, 11:01pm

Looking through the log, the Z-Wave binding is sending a SENSOR_MULTI_LEVEL_GET message to the node every 5 seconds, which seems excessive under any circumstances, particularly as command polling is disabled and the polling period is set to 1 day. Is it not handling the “APPLICATION_BUSY” return message properly and just hitting the node again indefinitely?

I did a reinitialise on the device, and during the reinitialise the binding was still hitting it with polling messages every 5 seconds. I turned the device off until the network quietened down again. Now that it’s turned back on, the network is quiet and polling the device works fine, but in the past this has proven to be a temporary fix.

rossko57 · December 7, 2018, 1:01am

So far as I can tell, the APPLICATION_BUSY status generated by the device is only intended to be a transient state. The most sensible course of action is to retry after a few seconds.

You might be interested in
http://forum.micasaverde.com/index.php?topic=2021.0
which suggests that the device itself could supply a suggested retry time. I don’t know whether OH binding honours that, or does its own retry thang.

It might be possible to enhance the zwave binding to abandon retries after N such responses or, perhaps better, to double the retry delay each time and push the request to the back of the queue.
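To make that idea concrete, here is a minimal Python sketch of "double the delay each time, give up after N". It is not how the binding is implemented; the function names, the 5-second base delay, the retry limit, the cap, and the `"BUSY"` return value are all assumptions chosen for illustration.

```python
def retry_delays(base_seconds=5.0, max_retries=5, factor=2.0, cap_seconds=300.0):
    """Return the wait before each retry: base, base*factor, ... up to a cap."""
    delays = []
    delay = base_seconds
    for _ in range(max_retries):
        delays.append(min(delay, cap_seconds))
        delay *= factor
    return delays


def poll_with_backoff(send, delays, sleep):
    """Call send() until it stops returning "BUSY" or the delays run out.

    `send` and `sleep` are injected so the logic can be tried without a
    real device or real waiting. Returns True if the node eventually
    answered, False if the request was abandoned.
    """
    if send() != "BUSY":
        return True
    for delay in delays:
        sleep(delay)
        if send() != "BUSY":
            return True
    return False


# Simulated node that is busy twice and then answers.
responses = iter(["BUSY", "BUSY", "OK"])
waits = []
answered = poll_with_backoff(lambda: next(responses), retry_delays(), waits.append)
print(answered, waits)

# Simulated node that never stops being busy: the request is abandoned.
stuck_waits = []
stuck = poll_with_backoff(lambda: "BUSY", retry_delays(max_retries=3), stuck_waits.append)
print(stuck, stuck_waits)
```

With these made-up defaults, a permanently busy node costs five retries spread over a couple of minutes rather than one request every 5 seconds indefinitely. A real implementation would also re-queue the request behind other traffic instead of sleeping in place.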

tarcus · December 7, 2018, 1:21am

Yes, that thread in the MCV forums (I used to have a Vera – never again) looks similar to what I’m getting. I don’t know if there’s a way to simply increase the polling delay easily; I might have to raise that on the other thread.

rossko57 · December 7, 2018, 11:56am

Maybe I’m looking at it wrong, but it doesn’t seem to be polling that is the problem as such.
An otherwise routine poll gets this APPLICATION_BUSY response, and deals with that according to “rules of zwave” by retrying after a short period.
I’d call it a retry loop.

Reducing polling would reduce your chances of triggering the whole show, but once it’s looping, it’s looping. It does not sound like the right avenue to pursue.

Not polling at all might never trigger the problem - that’d depend on why the device does this. But it is just a workaround.

tarcus · December 7, 2018, 1:06pm

I would love it if everything worked the way it ought to, but sadly workarounds are the name of the game. I’m currently trying to get openHAB to complete initialisation for various devices on restart, rather than having significant numbers of my battery-powered devices remain in the “Initializing” state perpetually. A restart just changes which devices don’t complete initialisation. I’m also having problems with devices sometimes losing their associations, and neither GUI shows associations that are clearly visible in the config of devices (which are also correctly sending alerts back to the lifeline, so are plainly properly associated). Both GUIs tend to be out of date with current settings, so I have to hit reload on a page to have some changes show up, and so on. Basically, quite a bit of my home automation work is spent working around problems with openHAB. I’d love it if I could just get it all to work the way it ought to, but it’s just not that kind of world.