Raspberry Pi 4 8GB performance with openHAB 3.4

First, I hope you all had an awesome Xmas. :slight_smile:

Now, regarding my question. I have an RPi 4 8GB, I’m not using an SD card but an SSD instead (for performance and reliability), and I have installed a very stable OH 3.4.

I have no problems whatsoever using it. It idles at around 3-4% CPU, and when I’m using it remotely with RealVNC, scrolling inside OH menus, it peaks between 30% and 40%. When I send a command (clicking an item, for instance) it might peak at 50% or 60%, but only for a fraction of a second, and then it goes back down.

That said, I’m experiencing a very stressful situation with a Zwave network.

In this installation I’m using an Aeotec Z-Stick Gen5+ USB stick plus 4x Aeotec Range Extender 7 to make sure the Zwave network has a good signal all around the house.

All Zwave devices are from a single brand: Simon Tech.

These are light switches and blinds/curtains.

Initially, there was a problem where Simon had old buttons mixed with new ones. Their previous supplier used Zwave 5; that supplier apparently had a chip shortage, so Simon had to order from a different supplier, which apparently used a different version of Zwave. That caused problems with the network whenever devices from both suppliers were installed at the same time.

They then took all the light switches and updated them. The network improved, but there are still problems. These are the problems currently happening:

  1. If we try to turn on a light, it’s not instant; it takes some seconds, but it does turn ON
  2. If we try to move a blind or curtain, not only does it sometimes take almost a minute to move, but the whole Zwave network gets stuck. If in the meantime we try to turn a light ON or OFF, that also takes a very long time, and the light only switches once the blind or curtain does something (sometimes ON and OFF multiple times)
  3. Even after a long wait, the blinds/curtains behave very strangely: they sometimes go UP or DOWN multiple times in small steps (yes, I do have “Send command on release” activated on the widget)
  4. The weirdest thing is that when we physically press some blinds/curtains buttons, sometimes they do not respond and we need to perform a calibration to make them work again

Last but not least, we are talking about around 190 Zwave buttons in this house (with a total of around 230 items, including multiple MQTT, which all work very fast, almost instantly).

Also, I’ve already performed a FULL reset of my USB Zwave stick 3 times, and each time I take good care to first reset each button manually and then include it in the network again, starting with the ones closest to the stick. So yes, the network should be properly built (I have other houses with a similar structure and an identical server, and they work normally).

So, performance-wise, what’s your take on using this hardware for this? Is it too much for my little RPi4 to handle? Is it supposed to be fine? Should I move to a different setup in these cases to ensure it does not “drag”? If so, should I consider the new RPi5?

If your opinion is (like mine) that the current setup is enough for the job, would you point to the physical buttons as the culprit?

If I didn’t have any other similar installations, I would be more inclined to see the server as the main cause of this problem, but since I do have similar cases working well, I am pointing at the buttons themselves.

Simon have been very responsive and have handled the situation very professionally; I can’t complain about them. But I do need to be 100% sure that I am not doing anything wrong in this installation.

Thank you very much and I wish you all a Happy New Year. :slight_smile:

The OH 4 Zwave binding supports series 7 and series 8 controllers and devices, except for the new Long Range feature.

I don’t think the issue is the RPi. I think it’s the controller or the network somehow. The zwave binding debug logs and/or a zniffer might help identify issues or bottlenecks there. I don’t think a bigger RPi or other server is going to change anything. I think the issue is closer to the Zwave hardware, but not necessarily the buttons. It might be the controller or something else (zombie nodes, for example).

But does this mean that series 7 does not work with the Zwave binding on my OH version?

There are no zombie nodes because I reset the controller and created a fresh network. I really believe that the problem lies in the buttons, because this also happened with light switches and now it only happens with blind switches. Besides, the fact that some buttons sometimes do not work properly when they are physically pressed is a huge red flag for me.

With a zniffer, can I literally see at which node the communication is getting stuck?

Yes, support for 700 was only added in 4.x (4.2 IIRC). However, I don’t know if that means that 700 devices won’t work or if it’s just controllers that won’t work.

Theoretically, you’ll see all the network traffic.

It’s only controllers. For devices, in general, it isn’t really relevant what series of chip is inside. There are a couple of exceptions, but in general it’s not something you have to worry about for devices.

A zniffer is a good tool for a network of your size. However, if you haven’t already built one from a spare zstick, the zwave debug log should show the problem. You can use the color-coded viewer to help analyze the problem nodes: Z-Wave Log Viewer

If you haven’t done so already, I would consider disabling the network heal to see if that helps.

I thought this was almost a “must have” to be ON. Would you mind explaining why having it active can actually cause the network to be slow? Thank you so much!

There are a lot of messages with a heal, including every node sending a message to every other node to find its neighbors. If you are not moving the devices around and have Zwave+ devices, it has little value. Frankly, I never use it.
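As a rough illustration of why a heal is so chatty on a network this size: if every node really did probe every other node once, the count would grow with the square of the node count. The sketch below is only back-of-envelope arithmetic under that assumption. The function name is made up for illustration, and a real heal will differ depending on range and routing.

```python
def heal_probe_upper_bound(node_count: int) -> int:
    """Upper bound on node-to-node probes if every node probes every other node once."""
    if node_count < 2:
        return 0
    return node_count * (node_count - 1)


# Around 190 Zwave nodes are mentioned earlier in this thread.
print(heal_probe_upper_bound(190))
```

Halving the node count cuts this bound to roughly a quarter, which is one reason a heal hurts far more on a large network than on a small one.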

The other culprit could be several chatty nodes that are sending messages every few ms; electric meters can be bad actors.

As noted above, using debug and the viewer will find (or at least point to) the likely problems.
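Before opening the viewer, a quick count of log lines per node can hint at which nodes are chatty. This is a minimal sketch, assuming the binding's debug lines carry a `NODE <id>:` tag. The function names and the sample lines are invented for illustration and are not taken from the real log.

```python
import re
from collections import Counter

# Assumption: node-specific debug lines contain a tag such as "NODE 132:".
NODE_TAG = re.compile(r"\bNODE (\d+):")


def count_lines_per_node(lines):
    """Count how many log lines mention each node id."""
    counts = Counter()
    for line in lines:
        match = NODE_TAG.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


def busiest_nodes(lines, top=10):
    """Return (node_id, line_count) pairs, busiest first."""
    return count_lines_per_node(lines).most_common(top)


# Invented sample lines, only to show the shape of the input.
sample = [
    "11:44:21.718 [DEBUG] NODE 124: command sent",
    "11:44:22.001 [DEBUG] NODE 132: waiting for response",
    "11:44:24.002 [DEBUG] NODE 132: timeout",
    "11:44:24.100 [WARN ] some unrelated binding message",
]
print(busiest_nodes(sample))
```

To run it on a real file, pass the open file object instead of `sample`, since iterating over a file yields one line at a time. A node that dominates the count is only a candidate for a closer look in the viewer, not proof of a fault.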

Older post but still informational. [SOLVED] Unresponsive Z-Wave Network: Tools and Approaches to track down the issues - Tutorials & Examples / Solutions - openHAB Community

Yes, indeed each device sends electricity consumption information, so what you say makes sense.

Also, after analyzing the network I do see many nodes with OFFLINE (COMMUNICATION_ERROR) which explains the delays on the network.

As I said, I have already worked with other equally big zwave networks, even with multiple brands in them (Simon Tech included), and they work flawlessly and very fast. So something must be going on here.

Another thing I noticed was that, for some reason, curtains take a lot more time to respond than blinds, although the zwave devices are the same brand and model.

The Simon Tech salesman told me they have indeed had problems with curtains in other houses, due to the way the motors use power (especially Chinese brands, he said).

Can that actually be the culprit here?

I honestly do not know what more to do… there are no ghost nodes on the zwave network, other non-zwave devices (e.g. Shelly) work flawlessly, server is idling perfectly and very stable, logs show multiple devices with COMMUNICATION_ERROR as I mentioned, so my conclusion is that this must be something on the devices directly, not anything at all from my setup (server + zwave stick).

What’s your opinion on this?

Impossible to say without at least some debug logs. What do your logs look like in the debug viewer? Offline could be caused by too much traffic (see above), the radio waves not getting through, the device, or something else.

Since the debug log is huge, I link the log file with debug mode active for binding.zwave: openhab.log

OK, that might be one factor, yes, but as I said, I’ve had other similar-sized projects in the past, with the same Simon Tech brand, and they have been working perfectly fine for years now.

[Image: Floor 0 blueprint]

[Image: Floor 1 blueprint]

So, to make it easier to understand the architecture, here are the blueprints for Floor 0 and Floor 1. On Floor 1 (second image) the green part is fully open to Floor 0, so on the right that’s a Mezzanine (allowing wireless/radio signals to travel without any blocking).

Number 1 in red is an Aeotec Z-Stick Gen5+, which sits below the stairs (made of wood, not concrete) in the server cabinet, inside the garage. Signal strength is strong right above it on Floor 0.

Number 2, 3, 4 and 5 are Aeotec Range Extender 7.

And then there are ALL the Simon Tech light and blind switches, which also act as routing slave nodes.

Out of curiosity, here’s what the “map” of the zwave network looks like:

I suspect that might be the problem here. As you can see, in the map there’s one red item (that’s why I posted that image). This happens with other devices as well. But it’s kind of random, because I’ve seen this happening with devices that always worked well and all of a sudden, they stop responding for some reason.

There have also been strange behaviors on some Simon Tech switches, such as no longer working even when physically pressed (this has happened to 2 or 3 switches; right now there’s only one with that problem, and it’s in the Suite, which is the part of the house where we have the most problems).

This is the part that concerns and scares me… if we cannot be sure of what it is. :frowning:

As I said previously, the Simon Tech salesman told me that they had problems in the past with curtain motors due to how they use energy or whatever; he couldn’t explain it clearly. But he said that some Chinese brands do sometimes cause problems.

Thank you very much once again Bob.

Thanks for the log. It is harder to read than normal because you have an IP camera warning with an NPE (null) that keeps repeating every few seconds. I don’t know if that is contributing to the Zwave problems, but you should probably fix it. Or you could set up a separate zwave log file (see the very end of the Zwave Readme) if you don’t want to deal with that.

In the snippet, Node 132 has the most problems. What is odd is that when a command is sent to the controller, in many cases it takes too long for the controller to respond. Normally that is quick, at the ms level, but with Node 132 it times out at 2 seconds and the problems compound (it blocks other commands from getting through).

The other thing I noticed is an “itchy” finger. For example: at 11:44:21.718 Node 124 is commanded UP, then at 11:44:37.548 it is commanded DOWN, then at 11:45:20.089 it is commanded DOWN again, then at 11:46:51.239 it is commanded DOWN yet again. My guess is that you need to lengthen the command poll based on how long the actual device takes to go from UP to DOWN, but there may be another reason.
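To put numbers on that, the gaps between the four Node 124 commands quoted above can be computed directly. This is a small sketch with a made-up helper name; it only does the time arithmetic and assumes all timestamps fall on the same day.

```python
from datetime import datetime


def command_gaps(timestamps):
    """Seconds elapsed between consecutive HH:MM:SS.mmm timestamps (same day assumed)."""
    times = [datetime.strptime(stamp, "%H:%M:%S.%f") for stamp in timestamps]
    return [round((later - earlier).total_seconds(), 3)
            for earlier, later in zip(times, times[1:])]


# Command times for Node 124 as quoted from the log in this thread.
node_124_commands = ["11:44:21.718", "11:44:37.548", "11:45:20.089", "11:46:51.239"]
gaps = command_gaps(node_124_commands)
print(gaps)
```

The gaps come out at roughly 16, 43 and 91 seconds. Comparing them with how long the blind physically takes to travel would show whether the repeats arrive before the previous movement could have finished.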

One thought to reduce the load on the controller is to group blinds via the Association group 2. For instance if a rule triggers the blinds on the south side to go DOWN, add the other South side nodes to the group 2 for one of the blinds. It should trigger the others without the controller needing to send a command.

I did see that node 135 is not responsive. Also there is a node missing a number?

Disabled the Things; the errors stopped showing, but there was no improvement whatsoever, even after a full restart.

Now cleaner, here’s the new log file.

This is actually very random. Today, for instance, Nodes 82, 94 and 125 show a communication error. Yesterday it was Nodes 77, 132 and 113. Others have shown this error too, but randomly.
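One way to check whether it really is random, and to have something concrete for a report, is to note the offline nodes each day and look for repeats. This is a minimal sketch using the two days listed above; the dictionary layout and function name are just an assumption about how one might record it.

```python
from collections import Counter

# Nodes seen with COMMUNICATION_ERROR on each day, as listed in this thread.
offline_by_day = {
    "yesterday": {77, 113, 132},
    "today": {82, 94, 125},
}


def repeat_offenders(offline_by_day, min_days=2):
    """Node ids that were offline on at least `min_days` different days."""
    days_offline = Counter(
        node for nodes in offline_by_day.values() for node in nodes
    )
    return sorted(node for node, days in days_offline.items() if days >= min_days)


print(repeat_offenders(offline_by_day))
```

With only these two days there is no overlap at all, which fits the "random" impression. A node that keeps reappearing over a longer period would point to a specific device or location rather than general congestion.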

But isn’t Association group 2 the “Control”?


Because if I put other devices’ nodes there, whatever this device does will be replicated by the others. Right?

I hadn’t noticed this one, but I am not surprised since, as I said before, devices randomly stop responding (even to manual clicks).

It was most probably some UI bug; I just checked and all numbers appear.

You and I have very different views on what a clean file is. First, it is now 90% SQL errors, which could also be affecting performance. Second, it is not the separate Zwave file I asked for. Third, by the time you started the ZW debug there had been around 170 prior zwave commands, so it was already “borked”. I’m not sure I can help unless I get a zwave-only debug log from startup. Even then, no promises.

I suggest you ask your Simon contact. In 99% of ZW devices, Association group 1 is for the controller only, and Association group 2 should not have the controller as default but rather other blinds (devices) that respond when the mother device itself is activated. What “control” means is other child devices that are controlled; it has nothing to do with the mother device itself (if that makes sense).
I don’t think some of these devices are set up correctly in the ZW DB. The only blind set up the way I think they should be is this one. I changed the other 2, but could change them back if needed. Also, I don’t think it is the root cause, just an incremental reduction.

EDIT: Another incremental move that might help is to disable the command poll for the blinds, as they move slowly and appear to send a message anyway when they are done. This is from the ZW ReadMe:
The binding can perform a poll of the device shortly after sending a command to make sure that the command was implemented, and the binding has the correct view of the devices state. This is called "Command Poll Period" and may need adjustment for some devices that may update their state slowly (e.g. dimmers that have a slow transition). This is defined in milliseconds, and can be set to 0 to disable this polling.

“cleaner” != “completely clean” :slight_smile:

Fixed, that was due to Spotify Binding.

I know, I’m sorry. I will prepare that one next time I run more tests.

It was actually me who added all the Simon devices to the Zwave database. But I based it on their manual, and for blinds they have this in the Associations:

This is actually a master device, meaning it does not directly control any equipment. What it does is control other devices. They developed this button so that, for instance, inside a room we can click up or down and it activates both a blind and a blackout. Or inside a big living room with multiple blinds and blackouts, that one button can command all of them at the same time.

I’ve changed polling times to 20 seconds for every blinds controller. It actually helped a bit, but not that much.

I should point out that this is not the first or second house with multiple Simon devices that I have worked with, and this is not happening anywhere else.

Also, CPU usage on the RPi when idling is very, very small. I’m talking about 3-4% when idle. So I don’t think other processes are actually affecting performance. If I click multiple devices in OH, CPU usage reaches 50-60% for a “fraction of a second”.

The next thing I will try there is resetting (yet again) the entire network and then joining light switches only. If everything works flawlessly, I will stop my tests and request Simon’s assistance. I even suggested that Simon personnel go there and use their Hub to create the network in their own app and see if everything works perfectly, because if it does, we would at least know the problem is not in their devices. But they told me something similar had already happened in other houses, where the problem was due to how the curtain motors used energy, which interfered with their devices, so they asked the company that installed the motors to check that as well. And so here we are, in a “ping pong” game which is very frustrating for everyone involved.

I appreciate all your time and dedication in trying to help.

If you run any more tests, I can take a look.

You might want to put a second blind (or a couple of secondary blinds) in Assoc group 2 and see what happens to them when you activate the primary blind. I’d be curious to know if the secondary blinds move.

You also might want to try a 700 (or 800) controller if you are redoing it. The chip is a bit faster and the range is a bit longer. EDIT: If this is an older Aeotec Z-Stick, it needs to be in a USB 2.0 hub to work on an RPi4. There was a hardware issue you can google.

I don’t think CPU is a good indicator. The Zwave radio operates at 100kb/sec (at best). There could be issues on the serial port that will never affect the CPU. I’m assuming you have nothing else using serial communications?

Thank you for your suggestions. I did purchase the latest 700 USB stick from Aeotec and set up ONLY the basic switches (lights), because those were working “fine” (meaning, not as badly as the blinds). We are talking about almost 100 devices.

I even upgraded my openHAB from 3.4 to the latest version. Fortunately it went VERY well; I was surprised at how well the upgrade worked out. No errors, no big deal. Really nice.

Unfortunately, in a certain part of the house, devices still respond slowly.

Yes, I was referring to the possibility of process delays keeping commands from being sent at all. I do have the SSD for the OS on a USB port.

Now that you mention it, I am wondering whether the USB extension I have might be causing problems. But if that were the case, the WHOLE Z-Wave network would be slow, am I right? Not only one specific area of the house.

Since I now have an older 500 series Aeotec USB stick around, I will change the firmware and do a proper investigation of the network.

Any suggestions on how I can do that, to determine whether this problem is being caused by slow devices or not? I mean, right now I just want to be 100% sure whether the factory is the one to blame, and press them to change their devices, or whether there’s something else going on with my network.

Since I am using repeaters as I said, I suspect some devices are to blame. Also because some buttons are not working properly when we click (not even working at all, sometimes). And since I need to create some report to show the manufacturer where, specifically, the problems are occurring, I really need to do this right.

Thank you so much once again.

Assume you are talking about zniffer. I haven’t written a guide on that conversion, but there is information on the internet and on the Silabs web pages. After you have changed the firmware, the Simplicity suite has the app included. I think you will find it interesting and should be able to find the slow nodes.

I’ve tried to put the stick in boot mode but it never works. Maybe I have to do it via UART? The thing is, I’ve opened the stick and I do not see any TX and RX as I was expecting. I do see GND and VCC though. Can you help me here, please?

Also, I could not find any software to program the .hex file “sniffer_ZW050x_USBVCP.hex”. Where can I find it? And last but not least, is this file the correct one?

Thank you so much.

It has been a while, but I found this: Z-Wave 500 Series End Device Software Development Kit - Silicon Labs. I used the programmer in Windows and never opened the device. If you registered for Simplicity Studio, that should work here too. I can’t find the .hex file right now, but that one looks okay.

edit: one more: Z-Wave 500: Converting a UZB3 Controller to a Zniffer
