Profiling results using JProfiler

Flole · January 7, 2023, 4:05pm

I am currently looking into why openHAB is starting relatively slow on my system with many items and rule files. So I used JProfiler to attach to the process and then had a look at what uses CPU and the results are pretty stunning:

Almost all of the CPU time is spent fpr the computeTypes stuff. If I open more of those entries I see lot of recursion there which might indicate that there is room for improvement there.

When the system is running I did some investigation into locks/monitors and their wait time and found another thing: In openhab-core/PersistenceManagerImpl.java at 8d64ecfd8db99844d79a776fef37cbe527c1f0c6 · openhab/openhab-core · GitHub we have a monitor. I believe we could use a ReadWriteLock here instead as it’s legal to have multiple readers access that HashMap at the same time. If needed we could add a critical section in the persistence service’s store method itself to avoid issues there. Some persistense services even have their own ReadWriteLock which takes additional time:
For example one thing I am seeing is this:

java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock()
org.mapdb.StoreWAL.update(long, java.lang.Object, org.mapdb.Serializer)
org.mapdb.Caches$HashTable.update(long, java.lang.Object, org.mapdb.Serializer)
org.mapdb.EngineWrapper.update(long, java.lang.Object, org.mapdb.Serializer)
org.mapdb.BTreeMap.put2(java.lang.Object, java.lang.Object, boolean)
org.mapdb.BTreeMap.put(java.lang.Object, java.lang.Object)
org.openhab.persistence.mapdb.internal.MapDbPersistenceService.store(org.openhab.core.items.Item, java.lang.String)
org.openhab.core.persistence.internal.PersistenceManagerImpl.handleStateEvent(org.openhab.core.items.Item, boolean)
org.openhab.core.persistence.internal.PersistenceManagerImpl.stateChanged(org.openhab.core.items.Item, org.openhab.core.types.State, org.openhab.core.types.State)
org.openhab.core.items.GenericItem.lambda$1(org.openhab.core.items.StateChangeListener, org.openhab.core.types.State, boolean, org.openhab.core.types.State)
java.util.concurrent.ThreadPoolExecutor$Worker.run()

In general in the persistence stuff I see some room for improvement, my system spends over 50% in MapDB or RRD4J persistence stuff while it’s running. I will come up with a PR to use mmap() in MapDB which (according to the MapDB documentation) can give between 10% and 300% performance increase.

I believe if some of these things are further looked into we can make OH 4.0 the fastest OH ever and significantly improve it’s performance and startup time. However, it won’t be easy and needs a lot of profiling and refactoring code. I have experience finding performance issues in the code, but I am not familiar with most of the openHAB codebase, so someone with more experience would need to help me, otherwise I will just break things trying to improve them.

J-N-K · January 7, 2023, 7:54pm

Allow managing persistence configurations and enable filters by J-N-K · Pull Request #2871 · openhab/openhab-core · GitHub contains a nearly complete rewrite of persistence handling with hopefully improved performance.

Flole · January 7, 2023, 8:13pm

I can check that once it’s merged and a new milestone is released. Thanks!

Flole · January 10, 2023, 5:59am

I’ve looked into what’s actually going on at the startup and apparently what’s taking so long is just the validation of the rules, which isn’t really necessary and could theoretically be skipped. However, I’m not sure if that just adds additional delays later on as rules still need to be parsed and so on or if it’s really entirely optional. If it’s really optional maybe an Environment variable could be used to disable that check if it’s not needed so if someone wants to benefit from faster startups that’s possible.

J-N-K · January 10, 2023, 6:36am

What type of rules?

Flole · January 10, 2023, 6:47am

As I’m only using DSL rules I can only assume that those are validated here during startup. I know that there’s also validation of items, sitemaps and things going on, it doesn’t seem like there’s a way to check what actually takes most of the time in my case though.

Is the validation where the computeTypes is taking part really necessary or just implemented to provide proper warnings? It appears like it’s just supposed to provide hints on errors/mistakes in the configuration? Maybe the results could be cached somehow so that the validation is only done once for each version of each file?

system · February 20, 2023, 10:47pm

This topic was automatically closed 41 days after the last reply. New replies are no longer allowed.