Multiple Fibaro Wall Plugs FGWP102 keep getting excluded from the network unexpectedly

robbert · February 16, 2020, 10:12am

Very well - as I do not use associations a lot, I am fairly sure these shouldn’t be there. But I will double check.

It has, second device in a row, on the same spot.

You are right that I did not mention this when posting the second occurrence of the issue. However, when the issue first occurred I did describe the trouble I had fixing the node, from which it can be concluded that the device was in a rather confused state, not just temporarily unreachable.

Yes, that’s what I wrote in my other reply in this thread. I had to set the node as failed and remove it using PC Controller.

Unfortunately I can’t check this any more as I have already removed the dead node.

I guess partly the “more information” you are referring to is already present in my other responses in this thread, and from those it can be concluded that this problem isn’t likely to be regular interference.

However, I realize I could have mentioned the device’s state in my follow-up response about the second occurrence as well. I’m sorry for that missing piece of info and I will do my best to be clearer in future.

Thanks for your help as always.

robbert · February 16, 2020, 10:14am

Exactly, this is the weird part.

This is correct.

chris · February 16, 2020, 10:17am

Again - I would suggest a word of caution! Don’t jump to the assumption that it is the same issue. It might be, but the number of times I see people make assumptions like this that force them to make prejudged decisions that are later shown to be incorrect is quite high.

It’s useful to consider, but I’d really not make any assumption at this point.

robbert · February 16, 2020, 10:24am

We are on the same page, Chris. I definitely agree with you: we shouldn’t jump to any conclusions.

I know I have been having issues in this spot for a while now, and have the feeling that all these events are related to the same issue. But a feeling does not help much, that’s why I recently started monitoring/logging this.

I now have permanent debug logging on and a Zniffer coming in, so time will tell…

Thanks for your help so far, @chris, @robmac, @Bruce_Osborne!

robmac · February 16, 2020, 10:26am

The phantoms on these devices are even weirder than that. A device with no association suddenly has node id 315 as an association for example.

Yes, I know only 232 should be possible, but there is a definite issue in these and a few other Fibaro devices. Some NVM corruption, at a guess.

They do not show in openHAB interface but if you query in PC Controller you see them.

You will see odd packets in Zniffer though when the device generates reports for a node it should know nothing about with possibly an out of range id.

chris · February 16, 2020, 10:29am

I can’t see how that is possible. Firstly, node IDs are a single byte - 8 bits as stored. This only physically allows numbers up to 255.

I don’t know what query you refer to? Do you mean getting the list of nodes, or what?

robmac · February 16, 2020, 10:32am

Indeed by design, but by Polish programming odd things happen. C is such a raw thing. With multiple endpoints, who knows how they have managed to get it wrong. Combine endpoint and node ID badly???

The association tab. I think I have an image of one somewhere. I will see if I can track it down.

chris · February 16, 2020, 10:35am

Well - by physics really. If there’s only 8 bits being used, then it can only be a number between 0 and 255. Anything else is a display problem as a byte can ONLY hold that number.

That would be interesting.

I would be careful not to assume there is no bug in the Sigma software either. That said, if it’s showing something like that, then there is clearly a problem elsewhere as well.

robmac · February 16, 2020, 10:48am

There is a bit of memory, and only 8 bits should be written and read. Nothing physical enforces that. This is C, not a nice protected language.

It is so bizarre that nobody believes it can be.

I have no illusion on that. Some of the comments in the code that is available are fun.

I will search the old thread on this. The original Fibaro RGBW was the main culprit, but these modules do it too.

chris · February 16, 2020, 10:56am

I’m still not really sure what you’re referring to, so it’s a little academic. At the end of the day, if it’s 8 bits, it can’t hold a larger number and that is independent of the language. If it’s displaying a larger number, then that could be a bug in the display/interpretation of the information.

Nobody believes WHAT can be? An 8 bit number will physically only use 8 bits and can only store 0-255. There can be a bug in any language to display something different, but if the protocol or memory allocation only physically allocates 8 bits, then it can physically only store a number from 0-255 as there is no room for anything else.

Apologies if I’ve misunderstood your point here.
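To make the 8-bit argument concrete, here is a minimal Python sketch. It is an illustration only, not code from the Fibaro firmware or PC Controller: the function names and the sample bytes are made up, and the "too wide" reader is just one hypothetical way a display or interpretation bug could produce a value like 315 even though each stored byte is limited to 0-255.

```python
MAX_ZWAVE_NODE_ID = 232  # upper limit mentioned earlier in this thread


def read_node_id(buf: bytes, offset: int) -> int:
    """Read the node ID as a single byte; the result is always 0..255."""
    return buf[offset]


def read_node_id_too_wide(buf: bytes, offset: int) -> int:
    """Hypothetical bug: read the ID byte plus its neighbour as a
    16-bit little-endian integer."""
    return int.from_bytes(buf[offset:offset + 2], "little")


def is_plausible_node_id(node_id: int) -> bool:
    """Check the value against the range the thread says should be possible."""
    return 1 <= node_id <= MAX_ZWAVE_NODE_ID


# Made-up sample data: node ID 59 (0x3B) followed by an unrelated byte 0x01.
payload = bytes([0x3B, 0x01])

correct = read_node_id(payload, 0)
too_wide = read_node_id_too_wide(payload, 0)

print(correct, is_plausible_node_id(correct))    # 59 True
print(too_wide, is_plausible_node_id(too_wide))  # 315 False
```

In this sketch the byte itself never exceeds 255; the out-of-range number only appears because the reader pulls in a neighbouring byte. Whether anything like this actually happens in the tools discussed here is not known from this thread.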

robmac · February 16, 2020, 11:32am

You have

You assume it is a language with a byte data type that is all managed, but it is C, and the programmer might just have malloc’d a bit of memory and then made a mess of the boundaries.

In the PC program they may have been lazy and just made it an int, but who knows what happens in the firmware other than the Fibaro dev. You also assume that the pointer is not corrupt and is pointing at the 8 bits that were allocated rather than some totally different bit of memory.

chris · February 16, 2020, 11:37am

No - I’m assuming that the protocol defines this as a byte. A byte is physically 8 bits and can only carry numbers of 0 to 255.

That’s why I said earlier that programs can have bugs, but the physics of a byte, and the allocation of a byte to a protocol can not provide a number of greater than 255 in a byte. That’s physics.

Anyway, it doesn’t really matter so I’ll close my input on this thread.

robbert · February 22, 2020, 4:20pm

So, I just managed to deliberately reproduce this issue (or: these symptoms).

Current OH version: 2.5.2 (release).

I reproduced this issue by quickly toggling the wall plug (while connected to some LED lights) on and off multiple times. EDIT: toggles were performed from the OH web UI and not using the physical B button. After doing this for a while, things start lagging, and eventually the device goes offline. These effects are not surprising thus far. However, now the device thinks it’s not included in the network anymore (signaling a red ring when plugged into the socket).

Note: Obviously I know that switching on/off quickly multiple times is not very normal user behaviour, but this problem also happens under normal usage (it only takes longer to trigger).

Anyway, the plug is now offline/excluded. Except for unplugging/replugging I haven’t done anything yet to fix it. If someone wants me to try something, let me know.

Logs are here.

At 16:24:03 it went offline, and then came back online at 16:25:01 after just replugging it.
At 16:30:54 it went offline again, and now it thinks it’s excluded.

Does anyone have any ideas as to why this happens?
Suggestions for me to try?

Bruce_Osborne · February 22, 2020, 4:40pm

That is normal behaviour. From the manual:

Exclusion Information To remove the device from the Z-Wave network:

Plug the device into a socket nearby the main Z-Wave controller.

The LED ring will glow green signalling being added (removing is not necessary otherwise).

Set the main controller into remove mode (see the controller’s manual).

Quickly, triple click the B-button located on the casing.

Wait for the removing process to end.

Successful removing will be confirmed by the Z-Wave controller’s message.

robbert · February 22, 2020, 5:09pm

Note that I didn’t touch the B button. These are commands sent from the web UI.

robbert · March 9, 2020, 6:42pm

Update: I have been continuing to troubleshoot this issue together with @robmac and @petergebruers (thanks guys!). Eventually I contacted Fibaro as we suspected an issue with the device firmware (or design). Fibaro have now acknowledged that indeed there is an issue and have notified me that they are working on a fix.