OH2 Z-Wave refactoring and testing... and SECURITY

Well, the controller we are using is licensed, right? It’s a standard controller, and it is apparently providing this information. However this is exactly why I decided to use the controller for this - they should have the best view of the network.

Why? So, the controller tells us it’s alive, and now we need to tell the controller it’s alive? My expectation is the controller already knows this as it is the one sending us the information from the device in the first place.

Sigma don’t release information on using the serial API. I talk directly to Sigma regularly so I’ll try and find out more, but given that this information is not publicly released, they might not be forthcoming with information.

That’s not correct - at the moment even if the controller thinks the device has failed (based on the information I have) we can still send data to the device. This was my point above - the controller should therefore know that the device is alive…

@chris, did something stand out for you in the log? I didn’t see anything strange.

Oh. I thought the controller blocked attempts to communicate with FAILED nodes.

Well, anyway, the problem as I see it is that the controller often seems to have erroneous information. For some weird reason it decides that nodes are dead that I know for sure are working. Many times I have seen data being sent from a node to the controller, and the logs show that the controller simply drops the data because it has marked the node as failed. How stupid is that? (This might have been in OZW, but I guess the software doesn’t matter in this case.)

No - nothing notable.

No - not in my experience.

I don’t know what error this is - can you provide a log, as I don’t know what would cause this in the current code?

As I said, this wasn’t in the current code, so maybe it isn’t relevant at all.

I’ll have a check this afternoon when I get home. It could be that the “failed” nodes simply haven’t sent anything to the controller since I restarted the binding. Seems unlikely though, as one of them is a power meter which usually fluctuates quite a lot.

Ok - I didn’t see that this was for different code. As I’ve said above, the current code is completely different, so we need to be clear to avoid wasting time.

Sorry. Understood.

What I can say about the current code though is that requesting the IsFailedNode status every 30 seconds seems to be kinda meaningless: even if we ask a thousand times, the controller still claims the node is failed. But I guess the controller will never actively tell us when a node comes to life, so we have to poll the status, right? Is there any way we could differentiate this behaviour between pure sensor devices (we will eventually get to know they’re alive anyway when they send us something) and other devices (where it is good to know whether they’re alive if we want to send them something)?

Why? I could change it to 60 seconds - what does it matter to the user if it’s 5 seconds or 60 seconds? It’s a period I chose to check the status of devices. If you think there’s a better time then I’m happy to discuss but this was chosen to provide timely updates without flooding the system in any way.

That’s not correct. At least in my experience, at some point the controller will decide that the device is online and it will return this.

Correct - this is exactly why I poll every 30 seconds.

No - that’s not correct. For sensors (I guess you mean battery devices) receiving data doesn’t mean they are alive. The only time to really know that the device is functioning properly in the network is during a wakeup period.

Ok… My experience is that nodes can be marked as failed for days and weeks and it’s not until the controller actually gets some communication from the node it may be marked as ok again.

No, I’m not talking about battery devices. I’m talking about devices whose primary function is to send data to the controller, as opposed to those (switches, dimmers, whatever) which also have channels that we wish to manipulate from the controller. I mean, do we really care if our temperature sensor is alive or not? When we get data from it we see that it is. But maybe I’m misunderstanding something here, since you’re saying that receiving data from a device doesn’t mean it is alive. I just don’t really understand how a dead node could be sending anything at all.

My issue/question is that for my door sensors I’m only getting trigger event (door alarm) reports. I’m not getting battery level reports from them. In the past with the test binding I have been able to get them in a constant connected state by triggering a manual wakeup when the controller was communicating with them and then get regular battery level updates. With the old OH2 binding I also saw more consistent battery level reports.

Thanks.

Well, yes, of course. If the device is really offline and not communicating, then I would assume that it stays marked as offline. The point is though that the only way to know this is to continue to poll the device and you are saying that this is “meaningless” which I don’t agree with.
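To make the polling idea concrete, here is a minimal sketch of a status poller that only reports a node when its failed/alive state changes between polls. It is not the binding's actual code: `FailedNodePoller` and its methods are hypothetical names, and the `IntPredicate` is just a stand-in for whatever request the binding uses to ask the controller whether a node is failed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntPredicate;

// Hypothetical sketch, not the binding's implementation.
final class FailedNodePoller {
    // Stand-in for asking the controller whether a node is marked as failed.
    private final IntPredicate controllerSaysFailed;
    // Last status seen for each node ID.
    private final Map<Integer, Boolean> lastKnownFailed = new HashMap<>();

    FailedNodePoller(IntPredicate controllerSaysFailed) {
        this.controllerSaysFailed = controllerSaysFailed;
    }

    /** Polls one node; returns true only when the status differs from the previous poll. */
    boolean pollAndDetectChange(int nodeId) {
        boolean failedNow = controllerSaysFailed.test(nodeId);
        Boolean previous = lastKnownFailed.put(nodeId, failedNow);
        // The very first poll has nothing to compare against, so it is not a change.
        return previous != null && previous != failedNow;
    }

    /** Returns the last polled status, or false if the node was never polled. */
    boolean isFailed(int nodeId) {
        return lastKnownFailed.getOrDefault(nodeId, false);
    }
}
```

In a real setup something like this could be driven by `ScheduledExecutorService.scheduleWithFixedDelay` with a 30 second delay. Repeated "still failed" answers then cost one small request each and produce no update, while the transition back to alive is picked up within one poll period.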

A node can send, but not receive - it’s unusable and will not work.

This probably means the device is not waking up, or the binding isn’t receiving the wakeup message. I would suggest checking the wakeup settings, then checking the log to see what is happening.

It is probably not related to the binding. The binding sends the polls every hour, or every time the device wakes up - whichever is longer.
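As a rough illustration of the "whichever is longer" rule, the sketch below picks the effective time between polls for a battery device. `PollInterval.effective` is a hypothetical helper written for this post, not something taken from the binding.

```java
import java.time.Duration;

// Hypothetical helper, not the binding's actual code.
final class PollInterval {
    private PollInterval() {
    }

    /** Returns the longer of the configured poll period and the device's wakeup interval. */
    static Duration effective(Duration pollPeriod, Duration wakeupInterval) {
        return pollPeriod.compareTo(wakeupInterval) >= 0 ? pollPeriod : wakeupInterval;
    }
}
```

So with an hourly poll and a device that wakes every 30 minutes, the device is still polled about once an hour. If the device never wakes up at all, no poll can be delivered, whatever the poll period is.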

It was set by default to once every 60 minutes. As a test I changed it to once every 30 minutes last night, but I just looked and that change is still pending, so I think you’re likely right that the device just isn’t waking up and sending battery reports, though it will still send an alarm report. What is the best way to generate meaningful data to see what might be causing this?

My thought for a data test would be to (turn on debug):

  • Change a configuration setting to see the communication between the controller / device
  • Then trigger an alarm event to get that communication

Any other actions that you would suggest as part of the test?

Ok, well in that case I guess I’m wrong then. I just really don’t like things being polled, but if that’s the only way, it feels more like a shortage in the serial api and I guess we can’t do much about that.

Ah, I see. When you put it that way it makes sense.

Starting to understand that you’re really doing everything you can from the binding’s point of view. I guess it’s not easy reverse engineering something and then getting it rock stable :neutral_face:

The device will need to wake up before you can send it the wakeup setting ;). So, to get this set if it’s not already, you’ll need to manually wake it.

Triggering an alarm event means nothing - it’s not related to wakeup in any way. You need to manually wake the device, or wait until it wakes up by itself.

Wouldn’t an alarm event make the device wakeup, or are they only outbound events and don’t trigger a regular wakeup action (still trying to understand all the nuances here)?

Polling is pretty standard stuff I’m afraid :wink: Even when something is sent unsolicited, polling is still needed, otherwise you don’t know if no message means everything is fine and there’s nothing to send, or everything has gone to hell and nothing is working at all… A low rate poll IMHO is in any case needed (low is a relative term - relative to the capacity of whatever you’re polling) - I poll the controller quite often, and devices much less often…

It’s not totally reverse engineered - I have the serial API documents, but they are quite old so have quite likely been updated and potentially improved, but as it’s all backward compatible it still should be applicable. This helps with understanding the frames, but not understanding what is really happening under the hood… I’m also hoping that in future we may be able to improve things through various means, but that’s for another day and another discussion…

No - not at all.

Yes - this is the point. An event is the device sending data, but it won’t listen for any response. A wakeup is the device sending a message to tell the controller it can receive messages for the next few seconds.
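The difference between an event and a wakeup can be sketched as a small queue for a sleeping node. This is only an illustration under my own assumptions: `SleepingNodeQueue` and its method names are hypothetical, and messages are plain strings rather than real Z-Wave frames.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of holding messages for a battery device.
final class SleepingNodeQueue {
    private final Deque<String> pending = new ArrayDeque<>();

    /** Holds a message (e.g. a configuration change) until the device can receive it. */
    void enqueue(String message) {
        pending.addLast(message);
    }

    /** An event such as an alarm report: the device is not listening, so nothing is sent. */
    List<String> onEventReport() {
        return Collections.emptyList();
    }

    /** A wakeup notification: the device listens briefly, so everything queued is released. */
    List<String> onWakeupNotification() {
        List<String> toSend = new ArrayList<>(pending);
        pending.clear();
        return toSend;
    }

    int pendingCount() {
        return pending.size();
    }
}
```

This would explain a setting that stays "pending": alarm reports keep arriving, but the queued change is only released once a wakeup notification is received.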