Performance issues with rrd4j

jlikonen · January 29, 2024, 7:08am

@splatch OK, thanks. It seems that rrd4j v. 4.1.0 is active.

splatch · January 29, 2024, 8:17am

The rrd4j library is embedded into the rrd4j persistence add-on. Checking the sources, I found that OH <= 3.4.5 used rrd4j version 3.8.1, while OH >= 4.0 uses rrd4j version 3.8.2.

However, looking at the changes between these two versions of the rrd4j library, I see no reason for a change in behavior. Hence rrd4j itself is unlikely to be responsible for the performance issues you face.

jlikonen · January 29, 2024, 8:26am

@splatch OK, thanks. Task Manager shows quite high disk utilization (see attached screenshots), so could this be the culprit?

splatch · January 29, 2024, 10:45am

As far as I remember, rrd pre-allocates its file upon first launch, which means the file does not grow over time. However, if your system uses virtual memory (so-called swap), copying data from memory to disk and back might interfere with processes that try to read from the disk.
Check whether your Windows installation uses virtual memory and how. It's been a while since I last used Windows, so my knowledge of how it works nowadays might be rusty.

rlkoshak · January 29, 2024, 2:29pm

Navigate to Developer Tools → API Explorer → Persistence and query for some data and see how long it takes.

And there have been no relevant changes to rrd4j that I know of that would account for this difference between OH 3.4 and 4.1, let alone between 4.0 and 4.1. In fact, there was only one PR for rrd4j between 4.0 and 4.1. It fixed a problem with restoreOnStartup by adding an additional condition to an if statement, which is not something that’s going to impact querying performance.

https://github.com/openhab/openhab-addons/pulls?q=is%3Apr+is%3Aclosed+rrd4j
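
The API Explorer check described above can also be scripted, which makes the timing easier to compare between runs. The following is a minimal Python sketch, not a definitive tool. It assumes openHAB answers at `http://localhost:8080`, that the persistence query is reachable at `/rest/persistence/items/{item}` with `serviceId`, `starttime` and `endtime` parameters (as the API Explorer displays it), and that no API token is required. The helper names `build_query_url` and `timed_fetch` are made up for this illustration.

```python
import time
import urllib.parse
import urllib.request


def build_query_url(base_url, item, service_id, start, end):
    """Build the persistence query URL for one Item and time window.

    The path and parameter names are assumptions based on what the
    API Explorer shows; verify them against your own installation.
    """
    params = urllib.parse.urlencode(
        {"serviceId": service_id, "starttime": start, "endtime": end}
    )
    return f"{base_url}/rest/persistence/items/{urllib.parse.quote(item)}?{params}"


def timed_fetch(url, opener=urllib.request.urlopen, timeout=60):
    """Return (elapsed_seconds, response_size_in_bytes) for one GET request.

    `opener` is injectable so the timing logic can be tried without a server.
    """
    started = time.perf_counter()
    with opener(url, timeout=timeout) as response:
        body = response.read()
    return time.perf_counter() - started, len(body)
```

For example, `timed_fetch(build_query_url("http://localhost:8080", "Nibe_S1255_UlkoT", "rrd4j", "2024-01-29T00:00:00.000Z", "2024-01-30T00:00:00.000Z"))` returns the elapsed seconds and the response size. The time window here is only an example.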

jlikonen · January 29, 2024, 4:11pm

@rlkoshak, OK, now I understand what you mean. I ran some tests (see attached screenshots).

In the first case (with rrd4j) there were 1440 data points and with InfluxDB 1467 points. When I tested rrd4j and kept pressing the Execute button, the response was immediate at first, but after ~5 attempts it became very slow, taking longer than ~10 s.

With InfluxDB the response was always immediate regardless of how many times I pressed Execute.

I don’t know what is causing this behaviour.

OK, I accept the fact that there haven’t been relevant changes to rrd4j but something is slowing down rrd4j in my case.

rlkoshak · January 29, 2024, 4:31pm

I think this last experiment is almost enough to file an issue. Put rrd4j and org.openhab.core.persistence (I think) into DEBUG or TRACE level logging and run the repeated queries against rrd4j again. See if you can find which parts seem to take longer between runs.

Gather those logs and a description of what you did and file an issue on openhab-core or openhab-addons, depending on where the slowdown appears to occur, which is what you need the logs for.
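
To make the repeated-query experiment reproducible while the logs are being captured, the runs can be timed in a loop. A minimal Python sketch follows. `time_runs`, `first_slow_run` and the 5-second threshold are hypothetical choices for illustration, and `run_query` stands for whatever callable issues one persistence query (for instance a REST request to the local openHAB instance).

```python
import time


def time_runs(run_query, runs=10):
    """Call run_query() repeatedly and return each call's duration in seconds."""
    durations = []
    for _ in range(runs):
        started = time.perf_counter()
        run_query()
        durations.append(time.perf_counter() - started)
    return durations


def first_slow_run(durations, threshold_s=5.0):
    """Return the 1-based index of the first run slower than threshold_s, or None."""
    for index, duration in enumerate(durations, start=1):
        if duration > threshold_s:
            return index
    return None
```

Noting the wall-clock time at which the first slow run starts makes it easier to find the matching section in the DEBUG or TRACE log afterwards.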

jlikonen · January 30, 2024, 4:44pm

@rlkoshak, I have now enabled debug level logging by typing in Karaf:

log:set DEBUG org.openhab.persistence.rrd4j

I’m wondering about one thing. It seems that rrd4j stores each Item twice, e.g. my Item Nibe_S1255_UlkoT is stored twice:

2024-01-30 18:42:00.679 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'Nibe_S1255_UlkoT' as value '0.1' with timestamp 1706632918 in rrd4j database (again)
2024-01-30 18:42:00.679 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'Nibe_S1255_UlkoT' as value '0.0' with timestamp 1706632919 in rrd4j database

The same happens with all my Items. Could this be the reason for the performance issues? I haven’t filed a bug report yet.

rlkoshak · January 30, 2024, 4:50pm

I doubt it. For one, the performance issues you are encountering are on read, and you’ve run a query and confirmed that you are not getting back twice as many values as expected.

jlikonen · January 30, 2024, 5:08pm

@rlkoshak OK, thanks. With trace level logging I get the following messages when running queries in the API Explorer:

2024-01-30 18:56:53.767 [TRACE] [d4j.internal.RRD4jPersistenceService] - Querying rrd4j database for item 'Nibe_S1255_UlkoT'

With debug level logging I get similar messages in the log but no error messages. I don’t think that these messages reveal anything. When running queries in the API Explorer I can run a query 4 times in a row without any problems, but the 5th query seems to take a very long time.

rlkoshak · January 30, 2024, 5:26pm

It’s the timestamps that matter. Are there two log statements where the time between them grows with the repeated queries? That will indicate where in the code the problem is.
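
One way to spot such a growing gap is to compute the time difference between consecutive log lines. The sketch below is only an illustration: it assumes the log line layout shown earlier in this thread (`timestamp [LEVEL] [logger] - message`), and the names `parse_line`, `gaps` and `largest_gaps` are invented here. Because openHAB logs from several threads, a large gap between two adjacent lines is a hint, not proof, of where the time is spent.

```python
import re
from datetime import datetime

# Assumed layout: "2024-01-30 18:56:53.767 [TRACE] [logger.name] - message"
LOG_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \[(\w+)\s*\] \[([^\]]*)\] - (.*)$"
)


def parse_line(line):
    """Return (timestamp, message) for a log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    stamp, _level, _logger, message = match.groups()
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f"), message


def gaps(lines):
    """Return (seconds_until_next_line, message) for each pair of consecutive log lines."""
    parsed = [entry for entry in map(parse_line, lines) if entry is not None]
    return [
        ((later[0] - earlier[0]).total_seconds(), earlier[1])
        for earlier, later in zip(parsed, parsed[1:])
    ]


def largest_gaps(lines, top=5):
    """Return the `top` largest gaps, biggest first."""
    return sorted(gaps(lines), key=lambda gap: gap[0], reverse=True)[:top]
```

Feeding it the lines of the openHAB log captured during the repeated queries (for example via `open(path).readlines()`) lists the messages that were followed by the longest pauses.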

jlikonen · March 4, 2024, 4:28pm

It seems that issue #16360 in OH 4.2.0M1 fixed the querying problem I have had with rrd4j. This has been a huge improvement for me. Many thanks.