[SOLVED] openHAB 3 slow on commands (5 to 30 second delays)

I’m using a RPi4 8GB with Raspbian 10 (buster) and openHAB 3.2.0~M3-1, all running from an SSD via USB. I have dozens of lights, multiple sensors and multiple rules, and everything was running perfectly for some time. This was a fresh installation (not an upgrade from a previous version).

Before that I had a RPi3 with OH2.5 running for 2 years and it was flawless! Incredibly fast and reliable, although I have to admit it took me some time to reach a stable solution. It actually took me months!

So I have everything updated and upgraded to the latest versions (just did it right now).

I did it because I thought that, since I cannot find a clear answer to my problem, maybe updating and upgrading would get things working properly. But that did not happen!

My problem is that when I send a command to, let’s say, turn ON a light via MQTT, it takes anywhere from less than 1 second (rarely) to almost a minute (more common). The average though is around 5 to 10 seconds, which makes no sense at all!

I first started to notice this with Z-Wave devices. I have a Z-Wave Aeotec Z-Stick Gen5, connected through a USB extension because it was not working properly when plugged directly into my RPi4 8GB (I’ve read this was a solution and indeed it is). Now it has gotten worse and it’s happening with everything I have connected!

In my openhab.log file, the only thing I see every 60 seconds is this:

2021-10-25 23:33:40.080 [INFO ] [control.internal.WebSocketConnection] - Web Socket close 1005. Reason: null
2021-10-25 23:33:59.099 [INFO ] [.reconnect.PeriodicReconnectStrategy] - Try to restore connection to '192.*****'. Next attempt in 60000ms
2021-10-25 23:33:59.102 [INFO ] [.transport.mqtt.MqttBrokerConnection] - Starting MQTT broker connection to '192.*****' with clientid *********
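
For anyone who wants to put a number on how often these reconnects happen, here is a minimal Python sketch. It assumes the line format shown above (timestamp at the start of the line, and the text "Starting MQTT broker connection" on each attempt); the function names are made up for illustration and are not part of openHAB.

```python
import re
from datetime import datetime

# Matches the timestamp at the start of an openhab.log line (format assumed
# from the excerpt above).
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})")


def reconnect_times(lines, marker="Starting MQTT broker connection"):
    """Return the timestamps of every line that contains the marker text."""
    times = []
    for line in lines:
        match = TIMESTAMP.match(line)
        if match and marker in line:
            times.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f"))
    return times


def gaps_in_seconds(times):
    """Seconds elapsed between consecutive timestamps."""
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]
```

Passing an open file handle for openhab.log to `reconnect_times` and then calling `gaps_in_seconds` on the result should show whether the broker connection really drops on a fixed 60-second cycle or only now and then.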

Now, I’ve checked whether some binding was not working correctly or something like that, but all the bindings I have are working properly!

This is my /var/lib/openhab/config/org/openhab/addons.config file by the way:

:org.apache.felix.configadmin.revision:=L"15"
binding="amazonechocontrol,astro,mqtt,zwave,miio,ipcamera"
misc="openhabcloud"
package="standard"
persistence="jdbc-mysql"
service.pid="org.openhab.addons"
transformation="jsonpath"
ui=""
voice="googletts"

And this is it, I need the help of this great community once again. By the way, CPU usage is perfectly sitting around 5%.

There is a whole lot of stuff between clicking on a button and a device changing state. For example, there’s the communication between the browser UI and openHAB’s REST API, the processing of the REST API call to generate an event, the receipt of the event and a rule running or a binding acting on it, and the time for the binding to process the event and generate a message to the device. For something like MQTT there is also the communication between openHAB and the MQTT broker and then again between the broker and the subscribers. Delays can be caused at any one or all of those steps.

There are also things outside of OH that might cause problems. For example, your CPU is fine but how is your system load? That’s a measure of how many processes are stuck waiting for some resource. What about memory? Are you using a lot of swap space? What about the file system? Does it look healthy, and is it not being hit too heavily? How’s the network? The log messages you posted seem to indicate that openHAB is constantly losing its connection to the MQTT broker.

So the first thing to do is check whether the system itself is healthy in CPU, load, memory, and networking. If all that checks out, start trying to narrow down where the delay(s) are occurring. Make heavy use of events.log, as that will tell you when OH actually received and posted the Item events. If rules are involved, litter those with log statements to see if there is a delay starting a rule (i.e. look at the timestamp for an event and the timestamp for when the rule reacted to the event) or a delay processing the rule.
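
As a rough illustration of those checks, here is a small Python sketch that reads the load average and `/proc/meminfo` on Linux. The function names and the choice of fields are mine, not anything provided by openHAB, so treat it as a starting point only.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field name: value in kB}."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def health_summary(meminfo, load1, cpu_count):
    """Summarise load per core, memory in use and swap in use."""
    mem_total = meminfo.get("MemTotal", 0)
    mem_available = meminfo.get("MemAvailable", 0)
    swap_used = meminfo.get("SwapTotal", 0) - meminfo.get("SwapFree", 0)
    return {
        "load_per_core": load1 / cpu_count,
        "mem_used_pct": 100.0 * (mem_total - mem_available) / mem_total if mem_total else 0.0,
        "swap_used_kb": swap_used,
    }


def read_health():
    """Collect the summary from the running Linux system."""
    with open("/proc/meminfo", encoding="ascii") as handle:
        meminfo = parse_meminfo(handle.read())
    load1, _, _ = os.getloadavg()
    return health_summary(meminfo, load1, os.cpu_count() or 1)
```

Calling `read_health()` on the Pi gives three numbers to watch: a load per core that stays well above 1, or swap usage that keeps growing, would point to a problem outside openHAB even when CPU usage looks low.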
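
To make the events.log comparison less tedious, something like the following Python sketch could pair each command with the next state change of the same Item. It assumes events.log lines look like `<timestamp> [INFO ] [...] - Item 'Name' received command ON` and `... - Item 'Name' changed from OFF to ON`; check that against your own log before trusting the numbers. The function name is hypothetical.

```python
import re
from datetime import datetime

# Assumed events.log layout: timestamp, logger details, then the Item message.
LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*?"
    r"Item '(?P<item>[^']+)' (?P<what>received command|changed from)"
)


def command_to_change_delays(lines):
    """Return (item, seconds) pairs from each command to that item's next change."""
    pending = {}  # item name -> timestamp of the latest unanswered command
    delays = []
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue
        stamp = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S.%f")
        item = match["item"]
        if match["what"] == "received command":
            pending[item] = stamp
        elif item in pending:
            delays.append((item, (stamp - pending.pop(item)).total_seconds()))
    return delays
```

One caveat: if an Item has autoupdate enabled, openHAB may log the state change right away, before the device confirms it, so a short delay here does not by itself prove the device reacted quickly.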

Interestingly I’m experiencing something very similar. I’ve upgraded from 2.5 to latest stable. After that I’m experiencing those types of delays that you describe which I didn’t earlier.
I also have that same model of zwave stick. I use mosquitto for mqtt.

First, thank you once again for your help guys.

I know there are many steps involved. I own a software development company and I started developing around 20 years ago, so I totally get that.

System load is just great, with very few processes running.

As for memory, I am using a RPi4 with 8GB RAM, so I don’t think this should be a problem.

The file system is on a brand new Kingston 120GB SSD via USB with 95GB free space, so I don’t think this is the problem either.

The network, OK, this might just be the culprit here! I have a TP-Link Archer AX10 as the main router to manage my entire network, which has nearly 100 devices connected, between computers, IoT devices (a lot), and other appliances. Then to boost wifi range I have 3 repeaters, all connected by cable to the main router but sharing the same network (so I can have it all under the same network).

The weird thing is, when I had my old system running, everything was running smoothly, and only 2 or 3 new devices were added to the network after I changed my OH system, so I find it weird that adding just 2 or 3 devices would cause such instability.

For a couple of days so far the system has been stable and working without problems, but the instability is annoying.

OH3 has proper authentication for the REST API. You might suffer from additional authentication overhead. Try to turn it off to see if it improves the latency:

Run “ssh -p 8101 openhab@localhost”. Use password “habopen”.
bundle:stop org.openhab.core.io.rest.auth

Afterward you can go into settings to enable caching.

Remember OH rules are all JIT compiled. This could be why you see some radical delays. The first time through a code path it has to compile, and that can take a while on a Pi. And you have to do this every time you restart OH.

Switching over to the JSR223 plugin and converting my rules to Python was a tremendous performance boost, especially at startup, which went from 15 minutes to 2 on my HummingBoard.

I have a very similar system from hardware and software perspective. 3.2…0-M3

My recent sluggishness turned out to be caused by several ghost nodes in the Z-Wave network.
Once I cleaned that up, the system became very responsive.

As others pointed out, this could be caused by many things, so it may take some deep analysis of the logs.

I’m almost out of ideas now, the only reason I can think of that this might be so slow is due to my network!

As an example, I was accessing one of my Tasmota devices and it was so slow changing from one setting to another.

I bought a TP-Link AX1500 Wi-Fi 6 Router and two repeaters connected to the main router by cable. My computer is also connected to the router by cable.

The thing is, I have nearly 100 devices connected to the router, so I can only guess that the router is not able to handle this many devices.

My house is not big, but I have 7 wifi cameras and many other devices connected by wifi spread all over and that is why I had to put one repeater in my garage and the other one in a room since the main router is roughly in the middle of the house.

What is your opinion on this? Should I risk buying yet another router or even a mesh network?

Thanks!

Z-Wave has nothing to do with WiFi, although its mesh performance does vary from time to time, with latency that can sometimes reach seconds. In any case, your main connection is from where you send the command (presumably the cell phone) to the Pi. You can test this by isolating that connection from the rest of the network. If it works there, then it’s the router. If not, then it’s something else.
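
One simple way to compare the connection with and without the rest of the network in the way is to time a few plain TCP connects to the Pi from the machine sending the commands. This is only a sketch using Python’s standard library; the function name is made up, and the host and port you pass in are whatever applies to your setup.

```python
import socket
import time


def tcp_connect_times(host, port, attempts=5, timeout=3.0):
    """Time several TCP connects to host:port; None marks a failed attempt."""
    results = []
    for _ in range(attempts):
        start = time.perf_counter()
        try:
            # The connection is closed again as soon as it is established.
            with socket.create_connection((host, port), timeout=timeout):
                results.append(time.perf_counter() - start)
        except OSError:
            results.append(None)
    return results
```

For example, `tcp_connect_times("<address of the Pi>", 8080)` targets openHAB’s default HTTP port, and port 1883 is the usual MQTT broker port. On a healthy LAN each connect should take a few milliseconds, so values in the hundreds of milliseconds, or `None` entries, would suggest the network rather than openHAB.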

Regarding the problems with my MQTT devices (wifi), they were indeed due to a bad configuration of my main router. After I fine-tuned it, it seems to be perfect, even with so many devices. So I’m happy that my router can still handle this quantity of devices.

For those that might come across this and are curious, here are my wifi settings:

[image: router wifi settings]

Regarding my Z-Wave setup, it seems that it was indeed due to ghost nodes. Thank you for this @brianlay

So it seems that things are working flawlessly so far!