Consistent 100% CPU use of safeCall-queue thread

The 100% CPU issue occurs when the underlying queue for a thread is empty. It resolves itself once enough items have been sent through the higher-level queue that one has to be passed on to the lower-level queue.

I asked for a backport to Java 17 in the PR with the fix. :crossed_fingers:

Wouter… you sir are a flippin’ rockstar!!!

Looking at the JVM bug: would it be possible to swap the collection implementation for another queue type? That way we would not need to wait for a JVM release.

Yes that would be the way to work around it. Maybe replace the buggy Java 17 implementation with the working Java 11 implementation. But it needs to be done everywhere the class is used. That seems to be openhab-core and jupnp for now. But it could also be used in add-ons or their dependencies.
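To illustrate the general idea of swapping a queue type, here is a minimal sketch. It is not code from openhab-core or jupnp, and it is not what a patched bundle would contain; the class `QueueSwapSketch` and the method `takeOrNull` are made up for this example. It only shows that code written against the standard `BlockingQueue` interface does not care which concrete queue sits behind it.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.LinkedTransferQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not code from openhab-core or jupnp.
public class QueueSwapSketch {

    // The consumer only knows the BlockingQueue interface, so the
    // concrete queue class can be swapped without touching this method.
    // Returns null if nothing arrives within timeoutMillis.
    public static String takeOrNull(BlockingQueue<String> queue, long timeoutMillis) {
        try {
            return queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return null;
        }
    }

    public static void main(String[] args) {
        BlockingQueue<String> original = new LinkedTransferQueue<>();
        BlockingQueue<String> replacement = new LinkedBlockingQueue<>();
        original.offer("task");
        replacement.offer("task");
        System.out.println(takeOrNull(original, 50));
        System.out.println(takeOrNull(replacement, 50));
        // The queue is empty now, so this waits for the timeout and yields null.
        System.out.println(takeOrNull(replacement, 50));
    }
}
```

A swap like this only works where the code sticks to the shared interface. Anything that relies on `LinkedTransferQueue`-specific methods such as `transfer()` would need a replacement that offers the same methods, and every place the class is used has to be changed.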

I made some bundles with patches to use the Java 11 LinkedTransferQueue on Java 17.
This will only work with OH 4.0.1.
Maybe someone wants to test it?

You can update the bundles on the Console with these commands:

bundle:install -s https://github.com/openhab/openhab-base-fixes/releases/download/base-fixes-20230809/base-fixes-1.0.0-SNAPSHOT.jar
bundle:update org.openhab.core https://github.com/wborn/openhab-core/releases/download/base-fixes-20230809/org.openhab.core-4.0.1-SNAPSHOT.jar
bundle:update org.jupnp https://github.com/wborn/jupnp/releases/download/base-fixes-20230809/org.jupnp-2.8.0-SNAPSHOT.jar

The org.jupnp update is optional; it may fail if you don’t use any add-on that depends on it.
After updating the bundles, openHAB needs to be restarted for it to work again.


To revert the changes use these commands:

bundle:uninstall org.openhab.base-fixes
bundle:update org.openhab.core https://openhab.jfrog.io/artifactory/libs-release/org/openhab/core/bundles/org.openhab.core/4.0.1/org.openhab.core-4.0.1.jar
bundle:update org.jupnp https://repo1.maven.org/maven2/org/jupnp/org.jupnp/2.7.0/org.jupnp-2.7.0.jar

Then again restart openHAB.

Good job! For convenience of the willing, perhaps you could also provide the commands for reverting back to the bundled version to be prepared in case of severe issues and/or before upgrading to 4.0.2?

They’ve been added to the post now.

Do I need to install Java 11 again?

No, keep using Java 17. My changes bring some code from Java 11, which did not have these issues, over to Java 17.

How do I know if I use a binding that uses jupnp?

You can check if the org.jupnp bundle is installed:

openhab> bundle:list -s | grep org.jupnp
245 │ Active │  80 │ 2.7.0                  │ org.jupnp

One last question, just out of interest: what exactly is the safeCall-queue doing in openHAB? And what is the issue here? It’s also fine if you can explain it in simple words.

SafeCaller is a construct in openHAB core that all parts of openHAB can use to wrap something in to prevent it from taking too long (or getting stuck forever). Some more basic examples would be when a rule is executed, when a request is made to an external service, etc. It’s a queue because the main code just adds a task to the queue to run on a different thread, and then when it’s done (successfully or not) the original code can continue.
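For those who think in code, here is a minimal sketch of that idea. It is not the actual SafeCaller implementation or API; the class `TimeoutCallSketch` and the method `callWithTimeout` are invented for illustration and use only the standard `java.util.concurrent` classes. The task is queued for a worker thread, and the caller waits for the result only up to a timeout.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration, NOT the openHAB SafeCaller API.
public class TimeoutCallSketch {

    // Tasks are queued here and run on a separate worker thread.
    private static final ExecutorService WORKER = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "sketch-queue");
        t.setDaemon(true); // do not keep the JVM alive for this demo
        return t;
    });

    // Runs the task on the worker thread and waits at most timeoutMillis.
    // Returns the fallback if the task times out or fails.
    public static <T> T callWithTimeout(Callable<T> task, long timeoutMillis, T fallback) {
        Future<T> future = WORKER.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the stuck task
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the task itself threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return fallback;
        }
    }

    public static void main(String[] args) {
        String fast = callWithTimeout(() -> "done", 1000, "timed out");
        String slow = callWithTimeout(() -> {
            Thread.sleep(5000); // simulates a call that hangs
            return "done";
        }, 100, "timed out");
        System.out.println(fast);
        System.out.println(slow);
    }
}
```

The first call finishes well within its timeout, so the caller gets the task's result. The second call is abandoned after 100 ms and the caller continues with the fallback value instead of waiting for the hanging task.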

Thanks Wouter
I have installed the patched bundles as per your instructions above after updating to 4.0.1.
My case is one of the most easily reproducible for this issue, so it won’t take too long to know if it is working. So far, the update went well and everything is running smoothly on 4.0.1. All JRuby rules are re-enabled, fingers crossed.

So far, this is working great for me. I was having issues with the items-queue thread.

This patch is a huge step with regard to performance.
Now my RasPi 2 is usable again for OH4.
Thanks!

@wborn - how do you interpret Doug’s latest comment in the PR? “Yes, it’s possible” or “yes, I’ll do it”? :slight_smile:

In either case, do you know if it will be possible for us to merge the Java 11 version of LinkedTransferQueue into our repository without any license problems or conflicts? I’m wondering if we could do this at least in 4.0.x, as the tests so far look positive.

Meanwhile I had a look at how OpenJDK operates, and it looks like other devs typically do the cherry-picking of fixes.

If you think openHAB has a formal process, you haven’t seen theirs yet.

Fixes first go to jdk17u-dev and then to the actual update repo jdk17u. So you can monitor the PRs of those repos to see if it gets cherry picked.

OpenJDK uses GPLv2 so that’s why I already put the code for the bundle in a separate repo.

If it were cherry picked and merged today it would still take some time before it gets released:

  • Tuesday, October 17 2023 GA; OpenJDK 17.0.9 released (tag: jdk-17.0.9-ga)

After that it may take more time for Debian/Zulu/Temurin to release builds.