Group of 8 Z-TRM2 modes sometimes only updates part of the group

zwave
Tags: #<Tag:0x00007f0e9050b338>

(Thomas Binder) #3

Do you use OH2.3+? There’s a neat new way to cope with changes of items within a group:
Member of hvacMode changed

see https://www.openhab.org/docs/configuration/rules-dsl.html#member-of-triggers for more info.
This way you could easily find the item which changed. Should work better?


(Scott Rushworth) #4

Even though it looks like you are using a recent version of the binding, the snapshots really only have had device database updates for quite some time. Before putting much effort into troubleshooting, I suggest you update to the development version of the binding, which has a lot of improvements and fixes, or wait for it to get into the snapshots…


(Ole Morten Rønning) #5

Thanks for chiming in.
@hr3 it is initiated manually in the sitemap,yes. Rules, I guess, triggers on both.

@binderth Will give it a try, but I doubt it is the problem. The display is dark on the device itself, so it has not received the command.

@5iver have been using the dev binding since (my) day 1


(Scott Rushworth) #6

Oops, sorry, I missed that! Guess it’s good to check though… :wink:

Do you have a log with the zwave binding in debug while this occurs? Is it always HVAC1Mode, or does it occur with other devices too? Does it always work if you send a single command directly to HVAC1Mode? You can do this through Karaf, without fiddling with a sitemap…

smarthome:send HVAC1Mode 1

Every now and then, it seems like I’ve had some lights not responding to commands sent while iterating through a group. This is not quite what you are doing, but similar, since we’re both rapidly sending multiple commands to the binding. This is very intermittent and I haven’t tried tracking it down yet. I’m currently using snapshot 1341 and the latest dev binding.


(Ole Morten Rønning) #7

@binderth changed my rule to:

rule 'HVACMode'
when Member of hvacMode changed
then
    logInfo(name12, "Update: hvacMode triggerd by " + triggeringItem.name + " Command=" + triggeringItem.state)
end

It worked nicely. See log below:

@5iver

Always! I did as you suggested and issued the group commands from Karaf.
Just as I was about to give up getting it to fail, it happened.

Log

I did these commands in Karaf:

openhab> log:set debug org.openhab.binding.zwave  @2018-09-03 22:49:44.372
openhab> smarthome:send hvacMode 0                @2018-09-03 22:49:47.354
Command has been sent successfully.
openhab> smarthome:send hvacMode 1                @2018-09-03 22:49:55.544
Command has been sent successfully.
openhab> log:set info org.openhab.binding.zwave

I have not run this throug Chris’s log viewer since I’m not sure hvat to look for, but it looks to me that the:

openhab> smarthome:send hvacMode 0
Command has been sent successfully.

command was only sent to 5 out of the 8 nodes.

Sending single commands to a node at a time always work.


(Scott Rushworth) #8

Good test! I’m busy holding down a hammock, but I’ll take a look later tonight.

@OMR, I’m having trouble downloading your log file. It seems to just take me to a blank page. Never mind… had to find a Windows pc.


(Scott Rushworth) #9

The log is interesting, but I think a longer section, about 30s after sending the command, would be more helpful. There is a lot going on when sending these commands, and it takes a while for everything to settle. Also, sending 1 to hvacMode at 2018-09-03 22:49:55.544 makes it difficult to piece it all together.

But hold off until @chris has time to take a look at it… this may be enough.

How are are you confirming the command fails to be sent to the device? The display on the device lights up when the mode is changed?


(Ole Morten Rønning) #10

Right, but it always reflects the settled state of the mode displayed in the logs.
Issuing the command a second time puts the whole group to the correct mode.


(Ole Morten Rønning) #11

@chris, just did a new test on 2.4.0-SNAPSHOT Build #1361 (from yesterday). Z-Wave binding now also from snapshot.
This is still an issue. Sending a Mode change to a group of 8 devices fails 50% of the time.
Let me know if you need more than the log I made previously.
Thanks.


(Chris Jackson) #12

Sorry - I missed this.

Can you provide the XML file for node 85 please.

I also note node 30 is sending a lot of spam - you might want to reset it.


(Ole Morten Rønning) #13

Sure, here is node85.

network_e0b2d700__node_85.xml (38.4 KB)


(Chris Jackson) #14

Are you able to set the thermostat mode of node 85 outside of the group command?


(Ole Morten Rønning) #15

Affirmative. Both from paperUI and Karaf.
One thing though. Not all of these 8 ztrm2 has been deleted and re-included after the latest database update.
Could that be an issue?


(Chris Jackson) #16

No - it shouldn’t be a problem…


(Chris Jackson) #17

@OMR can you enable trace logging and try again please and send me the log…

Thanks.


(Ole Morten Rønning) #18

As before, seems like enabling debug logging has an effect.
This time I’m unable to get it to fail with debug on.

Tried 14 times from Karaf and 6 times from site map.

Could it be marginal timing somewhere?


(Chris Jackson) #19

Yes - I’m thinking it’s something related to the queue management as the commands probably all come in different threads. However I just went through this and checked that all queue interactions are synchronised, so I can’t see where this is if it is a synchronisation issue.

I’ve spent most of the afternoon looking at this - I think it’s soon time for beer! :confounded:


(Ole Morten Rønning) #20

Sleep on it. Maybe something pops to mind :slight_smile:


(Chris Jackson) #21

Looking over this some more, and based on this log, I’m not sure there is actually a problem at all.

Your network is VERY busy, and some devices are taking a long time to respond. This is blocking traffic. If we look at the first set of commands that come in -:

Here we see that just before the commands are received, a request (transaction 1) is sent to node 97. We don’t get the response to this for a few seconds, and then the final report is received after 5 seconds, and the next command (transaction 2) is sent. I suspect that there is quite a queue of transactions waiting, and these commands are just at the bottom of the queue.

If we look at the next set of commands -:

Here we see a similar problem - node 97 is again playing hard to get, and in this case, the transaction fails (we see transaction 3 with the NO ACK). The next transaction is again sent within a few milliseconds.

So, from this, I don’t think there’s a problem with the binding. I think that there is a LOT of traffic on your network, and some devices are slow to respond which is really killing things.

This is probably also why the problem doesn’t always appear - if the network is working ok, then there is no problem.

I would try and look over the logs to see if your system is configured properly. As I mentioned earlier, node 30 is looking like it has problems -:

Here we see a lot of humidity updates received in the space of a few 10s of milliseconds. This sort of issue may be caused by the device being incorrectly configured (maybe you have multiple associations configured - this is a common reason for multiple messages) or maybe the device needs resetting. My guess is it might be the latter as we also see a LOT of UV messages and I don’t think there are likely to be this many association groups! -:


(Ole Morten Rønning) #22

Thanks for looking into this.
Yes, my network has grown ~73 nodes now.
I really should pay more attention to unnecessary traffic.

I power cycled node30, and grep’ed my log for ultra(violet), a sensor I don’t use at all.
I never linked that channel, but that does not stop the node from sending it.

Loggs show there were quite a lot of ‘ultraviolet’ messages from Multisensor 6’s so a changed parameter 101 from 241 to 96, and that did shut them up.

Will check other parameters I’m not using as well and report back.