Hue Binding for API v2 [3.4.0.0;4.0.0.0)

Oh, you are right, I missed that. Perhaps you could disable all rules for a moment and test just that (i.e. without disabling any bindings)? If I understand it correctly, the issue is reproducible for you, so you might be able to rule out the rules (pun intended) by doing that?

With that number of rules I wouldn’t be surprised if they were related. Do you have any rules with Thread::sleep?

You could also have a look at the output from this console command:

threads --monitors --locks

Yes, as a first test I will try disabling all rules.
I’m not using sleeps anywhere, but I’m calling some other devices and cloud services via HTTP requests, which may have an influence (typically with a timeout of 5 seconds max).
In the past I had some trouble with the Telegram actions, which occasionally blocked the system. But that was months ago, it happened rarely, and it could typically be detected in the logs, which currently show no indication of that problem.

Thanks for the threads command. There are a bunch, 448 of them :wink:
Some of them have “Locked synchronizers” mentioned, which doesn’t necessarily seem to be bad.

@AndrewFG Do you have the former build of the Hue binding available for download somewhere? I didn’t save that one :wink:
I think the easiest way to switch to it would be to put it into the …/addons directory, after first removing the one from the marketplace via openHAB?

(Yes, I have a full backup of the openHAB VM, but switching to that would mean breaking my history in the DBs, because I’m storing a lot of data continuously…)

I’ll continue analysing this evening…

It might also be interesting to check how many of them are blocked:

threads --monitors --locks | grep BLOCKED

I am not fully acquainted with thread pools, but I can imagine the tasks have possible states queued, running, blocked or finished. So the Hue binding tasks may be delayed either a) because there are actually threads BLOCKED in the pool, or b) because too many tasks are QUEUED ahead of them. So logging only BLOCKED threads may not give the full answer. Perhaps you also need to log QUEUED tasks? Or something like that…
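
As a plain-Java illustration of that point (just a minimal sketch, not openHAB code; the class name, the 10-second sleeps and the task labels are made up), five blocking tasks are enough to keep a 5-thread pool busy, so a quick task submitted afterwards is merely queued, and therefore delayed, even though nothing shows up as BLOCKED:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PoolStarvationDemo {

    public static void main(String[] args) throws InterruptedException {
        // Same size as openHAB's default thingHandler pool.
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(5);

        // Five "slow device" tasks, each blocking its worker thread for 10 s
        // (a stand-in for an HTTP call that only returns after a long timeout).
        for (int i = 0; i < 5; i++) {
            int device = i;
            pool.execute(() -> {
                try {
                    Thread.sleep(10_000);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                System.out.println("slow device " + device + " finished");
            });
        }

        // A quick task (think: a Hue state update) submitted right afterwards:
        // it is only QUEUED, not BLOCKED, so it would not appear in a
        // "grep BLOCKED", yet it still has to wait for a free worker thread.
        long submitted = System.nanoTime();
        pool.execute(() -> System.out.printf("quick task ran after %d ms%n",
                TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - submitted)));

        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
    }
}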

There are not many BLOCKED or QUEUED threads:

openhab> threads --monitors --locks | grep BLOCKED
"pipe-grep BLOCKED" Id=952 in TIMED_WAITING on lock=java.io.PipedInputStream@7a2958f1
openhab> threads --monitors --locks | grep QUEUED
"pipe-grep QUEUED" Id=953 in TIMED_WAITING on lock=java.io.PipedInputStream@367dae05
openhab>

Question:
If I want to reproduce the problem with the former Hue binding, is it then sufficient to:

  • stop openHAB
  • go to the path /var/lib/openhab/marketplace/bundles/146799 and replace the newest Hue binding with the former one
  • and then start openHAB again?

I ask because I retrieved the former Hue binding from a backup and installed it as described above, but I could not reproduce the problem.

If not correct, what is the best way to switch to the former Hue binding?

Remove my version from the UI MarketPlace section, and add the version in the UI Official section.

> Remove my version from the UI MarketPlace section, and add the version in the UI Official section.

My goal is to use your former Hue binding, because I have already modified all items/things to use the Hue v2 API (I configure the items via text configuration files (*.items) and have already adapted all my rules to the new items, which are slightly different, e.g. no channel for dark, slightly different channel names etc.) :wink:

I shall revert the changes I made two days ago, as the OH core maintainers don’t like them. So you would need to solve the issue in the rest of your system. Perhaps it’s a trite comment, but is your PC simply too slow for your large system?

> I shall revert the changes I made two days ago, as the OH core maintainers don’t like them

That’s really sad… especially because other binding authors (ZigBee, see the reference above) solved the same problem in the same way…

My openHAB is a virtual machine (Debian) running under Proxmox on an Intel i5, and it’s not busy at all. CPU usage is always very low, and there is no memory pressure.

It would be good to have more detailed tracing for the core scheduler itself…
Without it, it’s quite hard to find the issue…
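
Such tracing does not exist in the core today, as far as I know, but just as a sketch of what it could look like (the TracingExecutor wrapper and the 1-second threshold are made up for this illustration), a wrapper around the pool could log every task that occupies a worker thread for too long:

import java.util.concurrent.Executor;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper, not part of openHAB core: it logs every task that
// occupies a pool thread for longer than a threshold, which would make
// "who is hogging the thingHandler pool?" visible without a thread dump.
public class TracingExecutor implements Executor {

    private static final long WARN_AFTER_MS = 1_000;

    private final Executor delegate;

    public TracingExecutor(Executor delegate) {
        this.delegate = delegate;
    }

    @Override
    public void execute(Runnable task) {
        delegate.execute(() -> {
            long start = System.nanoTime();
            try {
                task.run();
            } finally {
                long tookMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
                if (tookMs > WARN_AFTER_MS) {
                    System.err.println("Task " + task + " occupied a pool thread for " + tookMs + " ms");
                }
            }
        });
    }
}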

I’m still searching for the best way to quickly switch between the former version of the Hue binding (using the scheduler and causing the problem) and the fixed one (using its own thread), in order to find the root cause of the issue…

You could try to increase the framework’s Thing handler thread pool size to test if the problem goes away:

/etc/openhab/services/runtime.cfg:
org.openhab.core.threadpool:thingHandler=50

The default size is 5.

Update: I cannot prove it 100%, but I think I found the root cause of my issue.

It’s the Shelly binding I’m using, BUT ONLY if there are Shelly devices added as things/items which are located at the edge of the WiFi (with low signal strength). These Shelly devices are also slow if I try to connect to them directly via the web browser.

What I did:

  • switched to the former Hue binding (the one using the scheduler)
  • removed the Shelly binding → the effect of slow motion updates by the Hue binding was gone
  • then installed the Shelly binding again and disabled only the things/items representing the Shelly devices in my garden (new garden watering project ;)) at the edge of my WiFi → the effect of slow motion updates by the Hue binding was still gone!

I have to admit that I cannot reproduce the effect reliably every time, but currently it looks as described above. Additionally, it matches the date since which I have been seeing the slow motion updates with the Hue binding (one of the changes back then was adding a new Shelly device, sitting at the edge of the WiFi, to openHAB). This is not meant to blame the Shelly binding authors, quite the opposite! I’m very happy with this binding, too!

My guess is that the Shelly binding tries to communicate with the Shelly devices from time to time, but runs into timeouts or blocks if the devices do not answer in a timely manner.
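
To illustrate the general pattern (this is explicitly not the Shelly binding’s actual code; the class name, device address and intervals are made up), a polling task whose HTTP call is bounded by connect and request timeouts hands its pool thread back quickly even when a device at the edge of the WiFi never answers:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class BoundedPollDemo {

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(5);

        // Hypothetical device address; a device at the WiFi edge may answer
        // very slowly or not at all.
        URI device = URI.create("http://192.168.1.50/status");

        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();

        // Poll every 30 s; each attempt is capped at 3 s in total.
        scheduler.scheduleWithFixedDelay(() -> {
            HttpRequest request = HttpRequest.newBuilder(device)
                    .timeout(Duration.ofSeconds(3)) // hard upper bound per poll
                    .GET()
                    .build();
            try {
                HttpResponse<String> response =
                        client.send(request, HttpResponse.BodyHandlers.ofString());
                System.out.println("device answered: " + response.statusCode());
            } catch (Exception e) {
                // Timeout or connection failure: give the pool thread back
                // quickly instead of letting it hang on a silent device.
                System.err.println("poll failed: " + e);
            }
        }, 0, 30, TimeUnit.SECONDS);
    }
}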

As @fwolter says, OH normally has a pool of (only) 5 threads. So if you have 5 such Shelly devices, you could hit this issue. Perhaps follow @fwolter’s advice to increase the pool size so it is larger than the number of Shelly things.

The issue seems to be solved for now with an additional outdoor WiFi repeater :wink: This lets the Shelly devices respond quickly and keeps the Shelly binding happy as far as I can see.

Nevertheless I will continue to monitor this, and if there are new findings I will report back (and try increasing the thread pool size etc.).

Many thanks to all of you helping me out here and spending time! This is much appreciated! :slight_smile:

Perhaps you should officially open an issue in the openhab/openhab-addons repository on GitHub (note: there seem to be quite a lot of issues for Shelly already…)

Also, thanks for tracking this down and sharing your findings. I’m glad we didn’t end up with a workaround in the Hue binding for this, but actually found the root cause. This learning can also be applied in similar future cases.

@struppie / @laursen see this…

About the thread pool, I found this

Indeed. And that is why this issue has to do with the Shelly binding, and is not related to rules. (Apparently the Shelly binding blocks for long periods, which means the OH scheduler thread pool can become blocked, and thus block other bindings too.)

@struppie, the author of the Shelly binding thinks that this issue is NOT a problem in the Shelly binding (see here …), so if you disagree with him, please respond to his post. (I shall not respond further concerning that issue, as I don’t have Shelly devices.)