Hue Binding for API v2 [3.4.0.0;4.0.0.0)

AndrewFG · June 19, 2023, 7:52pm

What kind of conflict?

struppie · June 19, 2023, 8:11pm

It looks like this in the UI:

and I see this in openhab.log:

2023-06-19 22:03:34.474 [WARN ] [g.discovery.internal.PersistentInbox] - Cannot create thing. No binding found that supports creating a thing of type hue:bridge-api2.

I think this is, because the new Hue binding does not come up like expected?
Refer to above log-snippets - Hue binding sticks to status “Waiting”

The conflict occurs during the attempt to create a thing for the v2 api bridge

AndrewFG · June 19, 2023, 8:41pm

Is the regular ‘old’ Hue binding un-installed? If not then there is likely to be a conflict.

struppie · June 19, 2023, 8:56pm

No, it is not - thought it’s possible to run both at the same time. Thanks for the tip - will try to remove that one tomorrow and do further tests (after creating a backup of the system ;)) and report about the results

struppie · June 19, 2023, 9:15pm

couldn’t resist, just stopped the “old” binding and gave it a try again. Result: Success!
Now I have an OpenHAB inbox full of new, found hue devices and will continue tomorrow.
Thanks a lot for your help! Now I know how to continue!

struppie · June 20, 2023, 7:25pm

Here’s some feedback, after switching with my Hue motion sensors to the new API v2:

I have 14 Hue motion sensors installed
10 of them were automatically found and I could easily add them via the UI of OpenHAB as “things”
for 4 of them it was necessary to find out the resource IDs via openhab-cli and add them manually as hue:device for API v2
currently I’m using a mixture of v1 and v2 API, but only with the new Hue binding (uninstalled the “old” Hue binding completely). I connected all Hue sensor motion channels via the new v2 API, because this was the main motivation to switch to v2
Basically it’s working now

But:

After restarting OpenHAB it takes up to 2-3 minutes until the Hue v1 and v2 bridges switch to status online and all the Hue devices are also going to status “online”. All other bindings are way quicker and the “old” binding was also quicker. I’m using a virtual machine with 2 i5 cores assigned.
Main problem: Switching to v2 of the API did not fix the major issue I saw with the old binding. I’m still getting very late events for movement detections.
If I walk through my rooms I can see in the Philips Hue App and Apple HomeKit App very quick reports about detected motions. In most of the times this is also true for OpenHAB, but in about 10-20% of the cases it takes up to about half a minute or more to get a motion event via OpenHAB.

As reported in the other thread this wasn’t an issue for all the months before, only at about April/May something changed causing these slow events. My guess is still, that this has something to do with a firmware update of the Philips Hue device which were shipped and automatically installed on my devices during these months. I didn’t change anything at OpenHAB during this time.

Maybe Philips changed something how events are reported?
Seeing the quick events in the Philips Hue App and Apple HomeKit App proves, that it’s still possible to get quick events and the problem is not related to the number of devices etc. in my environment.

From my perspective it seems, that this issue is still not solved with API v2 usage: https://community.openhab.org/t/hue-slow-delay-after-3-4-0-update/142435/30?page=2

If I can help in anyway to track this down or provide traces or whatever, please let me know. Happy to assist and improve the binding
Many thanks for your support @AndrewFG !

AndrewFG · June 20, 2023, 8:31pm

Could you try disabling the old bridge? And see if that impacts the new one? The old bridge will still be polling like crazy, and that is probably blocking the Api v2 calls. The combination is almost definitely going to be worse than either Api on its own.

Can you please turn on log trace for this 2-3 minutes period?

struppie · June 20, 2023, 9:43pm

Sure, tried this, but did not have any effect. OpenHAB/Hue binding reports several times slow about detected movements, Hue App reports instantly.

Yes, turned on tracing via opencli this way:

log:set DEBUG org.openhab.binding.hue

Then searched for text “hue” and “clip” with regex and put the found lines into a file (will send you via PM).
What I did is:

23:03 - restarted OpenHAB service
23:05 - all other bindings are up and running
23:06 - hue bridge and things are up and running (this time the gab was not that huge)

struppie · June 20, 2023, 10:03pm

Maybe I found an interesting part in the trace with enabled Hue binding tracing:

2023-06-20 23:55:36.289 [DEBUG] [.hue.internal.connection.Clip2Bridge] - onEventData() data length:380
2023-06-20 23:55:41.679 [DEBUG] [.internal.handler.Clip2BridgeHandler] - onResourcesEvent() resource count 1
2023-06-20 23:55:41.679 [DEBUG] [e.internal.handler.Clip2ThingHandler] - 60a18e2f-b89c-44c1-be58-241d98c11f3a -> updateChannels() from resource id:6e8b9516-a149-4a8b-b295-41d376b98cc5, type:motion
2023-06-20 23:55:41.680 [DEBUG] [e.internal.handler.Clip2ThingHandler] - 60a18e2f-b89c-44c1-be58-241d98c11f3a -> updateState() 'motion' update with 'ON' (fullUpdate:false, isDefined:true)
2023-06-20 23:55:41.680 [DEBUG] [e.internal.handler.Clip2ThingHandler] - 60a18e2f-b89c-44c1-be58-241d98c11f3a -> updateState() 'last-updated' update with '2023-06-20T23:55:41.680090+0200' (fullUpdate:false, isDefined:true)
2023-06-20 23:55:41.680 [DEBUG] [e.internal.handler.Clip2ThingHandler] - 60a18e2f-b89c-44c1-be58-241d98c11f3a -> onResource() consumed resource id:6e8b9516-a149-4a8b-b295-41d376b98cc5, type:motion

I was moving in front of the motion sensor with resource id 60a18e2f-b89c-44c1-be58-241d98c11f3a.

At timestamp 23:55:36.289 it reports about onEventData(), which is exactly the timestamp I moved and the Hue App on my iPhone reports about this movement.

But only at 23:55:41.679 the data is consumed and classified as motion data in OpenHAB by the Hue binding. Means there is a gap of 5 seconds this time.

There is no other tracing in-between these first 2 lines of any other binding.
Hue bridge v1 was disabled, only bridge v2 enabled.

The events.log also reports at 23.55:41.685 about the changed move event.

AndrewFG · June 20, 2023, 10:58pm

I shall need log:set TRACE please. And please send me the whole log file verbatim without any filtering. It will be big…

AndrewFG · June 21, 2023, 2:11pm

struppie:

2023-06-20 23:55:36.289 [DEBUG] [.hue.internal.connection.Clip2Bridge] - onEventData() data length:380
2023-06-20 23:55:41.679 [DEBUG] [.internal.handler.Clip2BridgeHandler] - onResourcesEvent() resource count 1

The onEventData() method gets called when the HTTP2 connection receives data from the Hue bridge. That method then schedules the onResourcesEvent() method to be run as a separate task via the OpenHAB core thread scheduler.

In the log file that you sent me, the interval between onEventData() and onResourcesEvent() call is about 2…3 milliseconds – which is what one would expect from a scheduled task. By contrast in your later log entry quoted above the interval is (as you say) over five seconds. This means that the OpenHAB core thread scheduler is blocking.

This blocking could be due to other scheduled tasks within this binding, or it may be due to scheduled tasks in some other binding, or indeed in the core itself.

Note: it is therefore even possible that your poor performance with the ‘old’ binding could have been caused by other scheduled tasks blocking outside of the binding.

To resolve this we will need figure out which task threads are blocking. I am not a Linux expert but there may be some diagnostic tools to do that. (??) Or alternatively you will need to do it the ‘hard way’ by disabling all bindings and then re-enabling them one by one.

As mentioned earlier, I think running both API v1 and v2 bridges and things on the same binding could indeed be one reason for overloading the OH core task scheduler, so please disable the v1 bridge and things while doing the test on v2.

struppie · June 21, 2023, 2:27pm

Many thanks for your valuable comments!
I will now analyse a bit more into my system and the other bindings to find out, if I can find the reason for the current behaviour regarding the task scheduler and the gaps.
There are some other services/binding interacting with cloud services, which may cause these effects.
I don’t want to waste your time - I’m already impressed about you swift support and answers! Much appreciated!
I will definitely report about my findings during the next days!

struppie · June 23, 2023, 7:08pm

Hi @AndrewFG ,

here’s an intermediate update:

I’ve now completely switched off API v1 usage, everything migrated to API v2 usage
but still seeing the effect, that the duration between calls onEventData() and onResourcesEvent() is up to (rarely) minutes, but quite often several seconds.
I tried to disable several bindings, but no luck up to now (there are 23 bindings in use and a lot of rules, which may have an influence, too?)
CPU usage is low, the system itself seems to be fine. Other rules of OpenHAB executed every minute (cron) are starting perfectly fine each minute
Then I searched for more info and found this interesting thread/issue:
https://github.com/openhab/org.openhab.binding.zigbee/issues/392
and
https://github.com/openhab/org.openhab.binding.zigbee/pull/393

Therefore I grabbed the source from GitHub (click), but it seems, that these ones are for the binding for OpenHAB v4.x.y only.
Is there a version for OpenHAB v3.x.y available, or what needs to be modified to build the hue binding for 3.x.y?

AndrewFG · June 23, 2023, 7:52pm

Ok. I think properly we would need to get a fix in OH core. However knowing the speed that such changes may take … and given that we are both using v3.4.x … probably the fix is to use a binding specific task executor. I will have a look at it tomorrow…

EDIT: @struppie on reflection, I can fix the cases where the binding was relying on the OH core scheduler to execute a task instantaneously (and where the core is NOT executing it so). (That was anyway probably an improper use of a scheduler). HOWEVER I cannot fix the fact that something in your system is overloading the OH core scheduler. So even if I fix the binding to execute immediate tasks immediately, there may be some cases where intentionally delayed scheduled tasks will be delayed longer than they should be. So you anyway need to fix your system.

AndrewFG · June 24, 2023, 11:46am

@struppie there is a new build in the marketplace (so use UI to uninstall the prior marketplace version, and then install it again from marketplace).

struppie · June 24, 2023, 7:32pm

@AndrewFG Many thanks for the new binding! It works perfect!
After installing it and running through my house, each motion sensor’s event was handled right away! It’s now working perfectly fine with the motion sensor!

A big thank you for your work!!!

Regarding your additional comment (EDIT): Yes, I know. I will continue to search the root of the problem in my system…after finding it, I will report about…

laursen · June 25, 2023, 8:44am

Would it be possible for you to go back to the previous version where you had the issue and disable all bindings and rules to see if it works then?

You can then try to enable all bindings, and try again. If success, then enable all rules, and try again. If failure, you could enable bindings/rules in groups until the problematic binding or rule has been pinpointed.

struppie · June 25, 2023, 9:00am

@laursen
basically yes. That’s what @AndrewFG already suggested and I started with. The problem is, that I have > 20 bindings installed und > 6000 lines of rules and we (family) are highly depending on the functionality (charging electric car, light and roller shutter, grafana recording of all data, etc) and currently I’m busy with some other things, e.g. next project with OpenHAB: garden watering

But I will take care and try to track this down. It just may take some time…

Question:
Is there some more tracing available of OpenHAB core to switch on/off some more traces in this area or other things which may help?
I’m also thinking about creating a “dummy binding” just using the mechanism the old hue binding used to easier find the problem. I could then insert some tracing with timestamps throwing messages if the gap is > 5 seconds. Would be easier compared to running through the house and check if hue move events are handled correctly
I’m not familiar with java or building bindings with OpenHAB - so if that test binding is available somewhere which was mentioned in this thread (click), it would be very helpful

laursen · June 25, 2023, 9:06am

Oh, you are right, I missed that. Perhaps you could for a moment disable all rules and test just that (i.e. without disabling all bindings)? If I understand it correctly, the issue is reproducible for you, so you might be able to possibly rule out the rules (pun intended) by doing that?

With that amount of rules I wouldn’t be surprised if they could be related. Do you have any rules with Thread::sleep?

laursen · June 25, 2023, 9:32am

You could also have a look at the output from this console command:

threads --monitors --locks