Hue Binding for API v2 [3.4.0.0;4.0.0.0)

Oh, you are right, I missed that. Perhaps you could disable all rules for a moment and test just that (i.e. without disabling any bindings)? If I understand it correctly, the issue is reproducible for you, so you might be able to rule out the rules (pun intended) by doing that?

With that number of rules I wouldn’t be surprised if they were related. Do you have any rules with Thread::sleep?

You could also have a look at the output from this console command:

threads --monitors --locks

Yes, as a first test I will try disabling all rules.
I’m not using sleeps anywhere, but I’m calling some other devices and cloud services via HTTP requests, which may have an influence (typically with a timeout of 5 seconds max).
In the past I had some trouble with the Telegram actions, which occasionally blocked the system. But that was months ago, it happened rarely, and it could typically be detected in the logs, which currently show no indication of that problem.

Thanks for the threads command. There are a bunch, 448 of them :wink:
Some of them have “Locked synchronizers” mentioned, which doesn’t necessarily seem to be bad.

@AndrewFG Do you have the former build of the Hue binding available for download somewhere? I didn’t save that one :wink:
I think the easiest way to switch to it would be to put it into the …/addons directory, after first removing the one from the marketplace via openHAB?

(Yes, I have a full backup of the openHAB VM, but switching to that would mean breaking my history in the DBs, because I’m storing a lot of data continuously…)

I’ll continue analysing this evening…

It might also be interesting to check how many of them are blocked:

threads --monitors --locks | grep BLOCKED

I am not fully acquainted with thread pools, but I can imagine the tasks have possible states queued, running, blocked or finished. So the Hue binding tasks may be delayed either a) because there are actually threads BLOCKED in the pool, or b) because too many tasks are QUEUED ahead of them. So logging only BLOCKED threads may not give the full answer. Perhaps you also need to log QUEUED tasks? Or something like that…
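
As a plain-Java illustration of that point (just a minimal sketch, not openHAB code; the class name, the 10-second sleeps and the task labels are made up), five blocking tasks are enough to keep a 5-thread pool busy, so a quick task submitted afterwards is merely queued, and therefore delayed, even though nothing shows up as BLOCKED:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PoolStarvationDemo {

    public static void main(String[] args) throws InterruptedException {
        // Same size as openHAB's default thingHandler pool.
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(5);

        // Five "slow device" tasks, each blocking its worker thread for 10 s
        // (a stand-in for an HTTP call that only returns after a long timeout).
        for (int i = 0; i < 5; i++) {
            int device = i;
            pool.execute(() -> {
                try {
                    Thread.sleep(10_000);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                System.out.println("slow device " + device + " finished");
            });
        }

        // A quick task (think: a Hue state update) submitted right afterwards:
        // it is only QUEUED, not BLOCKED, so it would not appear in a
        // "grep BLOCKED", yet it still has to wait for a free worker thread.
        long submitted = System.nanoTime();
        pool.execute(() -> System.out.printf("quick task ran after %d ms%n",
                TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - submitted)));

        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
    }
}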

There are not many BLOCKED or QUEUED threads:

openhab> threads --monitors --locks | grep BLOCKED
"pipe-grep BLOCKED" Id=952 in TIMED_WAITING on lock=java.io.PipedInputStream@7a2958f1
openhab> threads --monitors --locks | grep QUEUED
"pipe-grep QUEUED" Id=953 in TIMED_WAITING on lock=java.io.PipedInputStream@367dae05
openhab>

Question:
If I want to reproduce the problem with the former Hue binding, is it then sufficient to:

  • stop openHAB
  • go to the path /var/lib/openhab/marketplace/bundles/146799 and replace the newest Hue binding with the former one
  • and then start openHAB again?

I ask because I retrieved the former Hue binding from a backup and installed it as described above, but I could not reproduce the problem.

If not correct, what is the best way to switch to the former Hue binding?

Remove my version from the UI MarketPlace section, and add the version in the UI Official section.

> Remove my version from the UI MarketPlace section, and add the version in the UI Official section.

My goal is to use your former Hue binding, because I have already modified all items/things to use the Hue v2 API (I configure the items via text configuration files (*.items) and have already adapted all my rules to the new items, which are slightly different, e.g. no channel for dark, slightly different channel names etc.) :wink:

I shall revert the changes I made two days ago, as the OH core maintainers don’t like them. So you would need to solve the issue in the rest of your system. Perhaps it’s a trite comment, but is your PC simply too slow for your large system?

> I shall revert the changes I made two days ago, as the OH core maintainers don’t like them

That’s really sad… especially because other binding authors (ZigBee, see the reference above) solved the same problem in the same way…

My openHAB is a virtual machine (Debian) running under Proxmox on an Intel i5, and it’s not busy at all. CPU usage is always very low, and there is no memory pressure.

It would be good to have more detailed tracing for the core scheduler itself…
Without it, it’s quite hard to find the issue…
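
Such tracing does not exist in the core today, as far as I know, but just as a sketch of what it could look like (the TracingExecutor wrapper and the 1-second threshold are made up for this illustration), a wrapper around the pool could log every task that occupies a worker thread for too long:

import java.util.concurrent.Executor;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper, not part of openHAB core: it logs every task that
// occupies a pool thread for longer than a threshold, which would make
// "who is hogging the thingHandler pool?" visible without a thread dump.
public class TracingExecutor implements Executor {

    private static final long WARN_AFTER_MS = 1_000;

    private final Executor delegate;

    public TracingExecutor(Executor delegate) {
        this.delegate = delegate;
    }

    @Override
    public void execute(Runnable task) {
        delegate.execute(() -> {
            long start = System.nanoTime();
            try {
                task.run();
            } finally {
                long tookMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
                if (tookMs > WARN_AFTER_MS) {
                    System.err.println("Task " + task + " occupied a pool thread for " + tookMs + " ms");
                }
            }
        });
    }
}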

I’m still searching for the best way to quickly switch between the former version of the Hue binding (using the scheduler and causing the problem) and the fixed one (using its own thread), in order to find the root cause of the issue…

You could try to increase the framework’s Thing handler thread pool size to test if the problem goes away:

/etc/openhab/services/runtime.cfg:
org.openhab.core.threadpool:thingHandler=50

The default size is 5.

Update: I cannot prove it 100%, but I think I found the root cause of my issue.

It’s the Shelly binding I’m using, BUT ONLY if there are Shelly devices added as things/items which are located at the edge of the WiFi (with low signal strength). These Shelly devices are also slow if I try to connect to them directly via the web browser.

What I did:

  • switched to the former Hue binding (the one using the scheduler)
  • removed the Shelly binding → the effect of slow motion updates by the Hue binding was gone
  • then installed the Shelly binding again and disabled only the things/items representing the Shelly devices in my garden (new garden watering project ;)) at the edge of my WiFi → the effect of slow motion updates by the Hue binding was still gone!

I have to admit that I cannot reproduce the effect reliably every time, but currently it looks as described above. Additionally, it matches the date since which I have been seeing the slow motion updates with the Hue binding (one of the changes back then was adding a new Shelly device, sitting at the edge of the WiFi, to openHAB). This is not meant to blame the Shelly binding authors, quite the opposite! I’m very happy with this binding, too!

My guess is that the Shelly binding tries to communicate with the Shelly devices from time to time, but runs into timeouts or blocks if the devices do not answer in a timely manner.
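
To illustrate the general pattern (this is explicitly not the Shelly binding’s actual code; the class name, device address and intervals are made up), a polling task whose HTTP call is bounded by connect and request timeouts hands its pool thread back quickly even when a device at the edge of the WiFi never answers:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class BoundedPollDemo {

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(5);

        // Hypothetical device address; a device at the WiFi edge may answer
        // very slowly or not at all.
        URI device = URI.create("http://192.168.1.50/status");

        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();

        // Poll every 30 s; each attempt is capped at 3 s in total.
        scheduler.scheduleWithFixedDelay(() -> {
            HttpRequest request = HttpRequest.newBuilder(device)
                    .timeout(Duration.ofSeconds(3)) // hard upper bound per poll
                    .GET()
                    .build();
            try {
                HttpResponse<String> response =
                        client.send(request, HttpResponse.BodyHandlers.ofString());
                System.out.println("device answered: " + response.statusCode());
            } catch (Exception e) {
                // Timeout or connection failure: give the pool thread back
                // quickly instead of letting it hang on a silent device.
                System.err.println("poll failed: " + e);
            }
        }, 0, 30, TimeUnit.SECONDS);
    }
}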

As @fwolter says, OH normally has a pool of (only) 5 threads. So if you have 5 such Shelly devices, you could hit this issue. Perhaps follow @fwolter’s advice to increase the pool size so it is larger than the number of Shelly things.

The issue seems to be solved for now with an additional outdoor WiFi repeater :wink: This lets the Shelly devices respond quickly and keeps the Shelly binding happy as far as I can see.

Nevertheless I will continue to monitor this, and if there are new findings I will report back (and try increasing the thread pool size etc.).

Many thanks to all of you helping me out here and spending time! This is much appreciated! :slight_smile:

Perhaps you should officially open an issue in the openhab/openhab-addons repository on GitHub (note: there seem to be quite a lot of issues for Shelly already…)

Also, thanks for tracking this down and sharing your findings. I’m glad we didn’t end up with a workaround in the Hue binding for this, but actually found the root cause. This learning can also be applied in similar future cases.

@struppie / @laursen see this…

About the thread pool, I found this

Indeed. And that is why this issue has to do with the Shelly binding, and is not related to rules. (Apparently the Shelly binding blocks for long periods, which means the OH scheduler thread pool can become blocked, and thus block other bindings too.)

@struppie, the author of the Shelly binding thinks that this issue is NOT a problem in the Shelly binding (see here …), so if you disagree with him, please respond to his post. (I shall not respond further concerning that issue, as I don’t have Shelly devices.)