OH2 Z-Wave refactoring and testing... and SECURITY

It’s certainly not working as I expected -:

This shows that the second message that is rejected was sent 5ms after the first one. It should have been delayed (like the third TX message is). This seems to be common…

Good catch. There’s 200 ms between transmissions on node 68, but the message for node 85 goes out immediately after the rejection.

Ok, I’ve resolved the issue with the holdoff not working first time around - the request gets requeued before the timer is started, so that shouldn’t be a big issue.

As a matter of interest, what sort of computer are you running on? I’m wondering if I shouldn’t add a short delay between all transactions (say 20ms) to see if we can reduce the occurrences of failures in the first place…

Intel Core i5 overclocked to 4.2 GHz. :sunglasses:

It’s not linked to the node, so this is just coincidence… The holdoff is a total block on all sending.
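To illustrate the point about the holdoff being a total block, here is a minimal sketch of a send queue with a global holdoff. It is not the binding's actual code: the class name `HoldoffQueue`, its methods, and the use of plain strings as messages are all assumptions made for illustration. The holdoff timestamp is set before the rejected message goes back on the queue, so nothing (for any node) can be sent until the holdoff expires.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: a global holdoff that blocks all sending after a controller rejection. */
class HoldoffQueue {
    private final Deque<String> queue = new ArrayDeque<>();
    private final long holdoffMs;
    private long holdoffUntil = 0; // timestamp (ms) before which nothing may be sent

    HoldoffQueue(long holdoffMs) {
        this.holdoffMs = holdoffMs;
    }

    void enqueue(String message) {
        queue.addLast(message);
    }

    /** Called when the controller rejects a frame: start the holdoff first, then requeue. */
    void onRejected(String message, long nowMs) {
        holdoffUntil = nowMs + holdoffMs; // holdoff is active before the message is back in the queue
        queue.addFirst(message);          // the rejected message is retried ahead of everything else
    }

    /** Returns the next message to send, or null if the queue is empty or the holdoff is active. */
    String poll(long nowMs) {
        if (nowMs < holdoffUntil) {
            return null; // total block: applies to every node, not just the rejected one
        }
        return queue.pollFirst();
    }
}
```

Time is passed in as a parameter rather than read from the system clock, which keeps the sketch easy to test. A real implementation would more likely use a timer or scheduler.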

In all, this does look a lot better. I see one OFFLINE in this log -:

After this, node 68 works fine, so this is really caused by comms issues with the controller. With the first holdoff being skipped, we effectively only have a single 200ms delay here before we’re offline.

After your earlier comments about it always taking 2 attempts, I increased the holdoff from 200ms to 300ms. I was thinking about dropping it back down given the issue with missing the first hold, but I might leave it and try to eliminate this sort of OFFLINE if we can.

I wonder if that’s why some people have more problems than others? Maybe adding a standard delay between transactions of 5 to 20ms would help avoid overloading the controller when people are using fast computers (just thinking out loud - comments welcome).
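The idea of a standard delay between transactions could be sketched roughly as follows. This is only an illustration under assumptions: `TransactionPacer` and its methods are hypothetical names, not part of the binding, and the 20ms gap in the usage below is just the upper value floated above.

```java
/** Hypothetical sketch: enforce a minimum gap between consecutive transmissions. */
class TransactionPacer {
    private final long minGapMs;
    private boolean hasSent = false;
    private long lastSendMs = 0;

    TransactionPacer(long minGapMs) {
        this.minGapMs = minGapMs;
    }

    /** How long (ms) the caller should wait before sending, given the current time. */
    long delayBeforeSend(long nowMs) {
        if (!hasSent) {
            return 0; // nothing sent yet, so there is no gap to enforce
        }
        long earliest = lastSendMs + minGapMs;
        return Math.max(0, earliest - nowMs);
    }

    /** Record that a frame was handed to the controller at the given time. */
    void markSent(long nowMs) {
        hasSent = true;
        lastSendMs = nowMs;
    }
}
```

With `new TransactionPacer(20)`, a frame sent at t=100 means a send attempted at t=105 should wait another 15ms, while one attempted at t=130 can go out immediately. Whether such a gap actually reduces rejections would need testing on both fast and slow machines.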

@5iver and @digitaldan - what are you guys running (probably also something fast :wink:)?

Updated version that fixes the first holdoff is here. This uses a 250ms holdoff.

Looks much better! Only one node is offline, and that’s a battery-powered device.

Very few REJECTED messages, and when one occurs, it’s successful on the 2nd attempt.

Sending you a log.

Nothing too fancy here… I’m using 10+ year old junk from the recycling pile (AMD x2 5600+ 2.8GHz w/6GB DDR2). I nearly tossed it last week when the CPU fan mount spontaneously fractured. It got lucky… I found a model for it so I printed a replacement!

What’s going on here? This first bit looks OK…

But then there’s a barrage of multiple reports (off screen), then some more GETs, and then the node dies and comes back online…

Another strange one…

The first one was from a siren (battery powered frequently listening), the second from a dimmer, this one is from a WADWAZ-1 door sensor. I currently have 30 dead nodes and growing, all battery powered.


I’m shocked. I felt sure you’d be using a cryogenically cooled super-computer :smile:

Can you email me over the log?

On their way…

Great - thanks. The hold-off seems to be a big step forward :slight_smile: .

I will probably merge this into the dev binding tonight (I’ll take a look at Scott’s log first).

FTR -:

Hey guys, that is not fair, one of those is my main working computer (upgraded with an ssd and still working fine with Win10). And yes, I bought it in 2008 … :joy:

My POS has got one too! :kissing_heart:

I have always had issues starting up my Z-Wave network (~70 nodes), and I have OH running on my main server, a rackmount Core i7 at 4-something GHz, so I thought I should try the new updated version. Gut feeling is that it works much better than before, and looking at the logs also gives me the feeling that this is a step up. I have not been updating much lately, so maybe this was fixed at some point during the last month or so, but I just wanted to give my feedback!

I suspect that there is some “bad sh!te” happening at the network level. It’s a guess, as we don’t have any visibility of that at the binding level, but the multiple responses are indicative of lots of retries happening, or maybe the network being congested and retries getting queued. I thought that the controller only sends a few retries though (3, I believe). In this sequence the binding is sending 1 GET request, and we get 12 REPORTs.

The OFFLINE at the end I will look at. This is another case where I shouldn’t set the device offline. Here the frame is rejected by the controller, not the device, so we shouldn’t blame the device and set it offline.
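A rough sketch of that distinction is below: failures caused by the controller rejecting a frame are not counted against the node, while failures attributable to the device are. None of this is the binding's real code. `NodeStateTracker`, the `Failure` enum, and the failure threshold are all hypothetical, chosen only to show the idea.

```java
/** Hypothetical sketch: only device-side failures can take a node offline. */
class NodeStateTracker {
    enum Failure {
        REJECTED_BY_CONTROLLER, // the controller refused the frame; the device never saw it
        NO_RESPONSE_FROM_DEVICE // the frame was sent but the device did not answer
    }

    private final int maxDeviceFailures; // assumed threshold, not a value from the binding
    private int deviceFailures = 0;
    private boolean online = true;

    NodeStateTracker(int maxDeviceFailures) {
        this.maxDeviceFailures = maxDeviceFailures;
    }

    void onFailure(Failure failure) {
        if (failure == Failure.REJECTED_BY_CONTROLLER) {
            return; // controller problem: do not blame the device
        }
        deviceFailures++;
        if (deviceFailures >= maxDeviceFailures) {
            online = false;
        }
    }

    void onSuccess() {
        deviceFailures = 0;
        online = true;
    }

    boolean isOnline() {
        return online;
    }
}
```

With this split, any number of controller rejections leaves the node online, and only repeated missing responses from the device itself would mark it offline.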

Thanks - it’s really useful…