Performance issues with rrd4j

I’m running OH4.1.0 on a Win11 box (Intel NUC8i with 8GB RAM) and rrd4j now seems to be rather slow when I click Analyze for any Item. Sometimes the graph opens almost instantly, but in many cases it takes 10-20s to plot the values of an Item. I’m not quite sure when this problem first appeared. I upgraded OH3.4 to OH4 last October and I think rrd4j worked OK under OH4.0. I then upgraded OH4.0 to OH4.1 about 2 weeks ago, so the slowness of rrd4j may have started then, but I’m not completely sure. Any ideas?

What are the runtime stats on the machine? In particular what’s the RAM situation? If you open the Task Manager, click the “Performance” icon, then the three dots menu and Resource Monitor a new window will come up. Clicking on the Memory tab will show a nice chart showing your memory utilization. If you don’t have any standby or free memory you might have oversubscribed the machine.

Look around on the other tabs for signs of trouble.

That’s a first step.

I looked at Task Manager earlier, but I’m not sure whether memory and CPU usage are now higher than before, when I didn’t have any issues with rrd4j. Java.exe is typically using 30% (1.3GB) and influxd.exe 15% (670MB) of memory. Total memory usage is ~77%. CPU usage is <25%.

I forgot to mention that I have pages with several chart widgets which read data from InfluxDB, and I have no issues with these graphs: they open more or less immediately. One thing I’m wondering about is that graphs for some Items open immediately while for others it takes ages. If there was e.g. a memory issue, you would perhaps expect that none of the graphs would open very quickly.

Not necessarily. If you are low on memory that means parts of RAM are being saved to the swap file. Then when that memory is needed again something else needs to be moved to the swap file and the memory you need needs to be restored to RAM, resulting in two file system operations before the CPU can even start doing what it’s been asked to do.

What gets swapped out will be those parts of memory that were accessed longest ago. So if something needed to generate a chart for one Item happened to be swapped out, and for another Item it wasn’t, then you’ll see the delay on the first but not the second.
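To illustrate the idea, here is a toy sketch of least-recently-used eviction. It is only a simplified model under stated assumptions (the class name, the capacity of two pages and the item names are all made up for the example), not how Windows actually manages its page file, but it shows why one Item can come back quickly while another is slow.

```python
from collections import OrderedDict


class ToyRam:
    """Toy model of RAM with least-recently-used eviction (hypothetical, for illustration)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # least recently used entry comes first

    def access(self, page):
        """Return 'fast' if the page is resident, 'slow' if it had to be swapped in."""
        if page in self.pages:
            self.pages.move_to_end(page)  # mark as most recently used
            return "fast"
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used page
        self.pages[page] = True
        return "slow"


ram = ToyRam(capacity=2)
sequence = ["item_a", "item_b", "item_a", "item_c", "item_a", "item_b"]
results = [ram.access(page) for page in sequence]
print(list(zip(sequence, results)))
```

In this run the later access to item_a is fast because it stayed resident, while item_b is slow because it was evicted in the meantime.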

I’m not saying that’s what is happening here but what’s been shown so far doesn’t necessarily rule it out either.

Yes, I agree with you about the memory usage. I had to reboot my OH server (due to some Windows updates) and I noticed that when clicking different Items, many of them opened the graph almost immediately, so I thought that rebooting the NUC had helped. But I kept clicking on different Items and the graphs started to appear more slowly. I also kept Task Manager open while clicking on Items but I didn’t notice any increase in memory usage. As I said in my first post, the rrd4j issues most likely began when I upgraded from 4.0.3 to 4.1.0, but I’m not completely sure. Looking at Task Manager now, memory usage is ~80%, so there should be ~20% free memory left.

So, how to continue?

I have now upgraded the RAM from 8GB to 16GB but I still have the same performance issue with rrd4j, so the problem is not related to the amount of RAM. Any ideas how to continue?

Did you check the Event Viewer? Could you post relevant log records?

I have looked at the Event Viewer but I’m not sure what to look for there.

Look for errors or warnings related to rrd4j, CPU performance, etc.
Additionally I would deactivate any suspect widgets at the moment, reboot the machine and watch if the performance is stable.
Then add one widget after another and keep an eye on the event viewer.

OK, thanks. I’ll do that. I don’t use rrd4j in any of my widgets. I use InfluxDB in all of them and I don’t have any issues with these widgets. The problems with rrd4j occur only when I click Analyze for any Item.

Is InfluxDB configured as your default persistence? As far as I’m aware MainUI only supports generating charts from the configured default. That means either you have configured InfluxDB as your default and it’s being used for all charts, including the ones you are not having problems with, or it’s being used for none of them.

@Oliver2, I have now checked the Event Viewer but couldn’t find anything suspicious.

@rlkoshak, my default persistence is rrd4j but I also use InfluxDB to store some of my Items on an hourly basis. I’m not using InfluxDB for my MainUI charts at all. This was also my setup in OH3.4 and I didn’t have any issues with the MainUI charts.

Then I’m confused. What did you mean by:

I don’t use rrd4j in any of my widgets. I use InfluxDB in all of them and I don’t have any issues with these widgets.

@rlkoshak, sorry if I haven’t been clear enough. I have used InfluxDB only in my own custom widgets, see e.g. my post. These widgets work really nicely and the data is plotted almost immediately.

I’m using rrd4j only when plotting an Item’s chart by clicking Analyze in MainUI. In these cases plotting of the chart is very slow.

Well, one difference is rrd4j saves a value once per minute which is a lot more than one value per hour. Have you verified the problem isn’t in the browser processing all of those values?

Is there a difference in how long the calls to the persistence REST API endpoints take for each?
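One way to compare the two could look like the rough sketch below. It assumes openHAB is reachable at http://localhost:8080, that the persistence endpoint is GET /rest/persistence/items/{itemname} with a serviceId query parameter, that the service IDs are "rrd4j" and "influxdb", and that no authentication is required; the Item name MyItem and the helper names are hypothetical.

```python
import time
import urllib.parse
import urllib.request


def persistence_url(base_url, item_name, service_id):
    """Build the persistence query URL for one Item and one persistence service."""
    item = urllib.parse.quote(item_name, safe="")
    query = urllib.parse.urlencode({"serviceId": service_id})
    return f"{base_url.rstrip('/')}/rest/persistence/items/{item}?{query}"


def http_get(url):
    """Fetch the URL and return the raw response body."""
    with urllib.request.urlopen(url, timeout=60) as response:
        return response.read()


def timed(fetch, url):
    """Call fetch(url) and return (elapsed seconds, response size in bytes)."""
    start = time.perf_counter()
    body = fetch(url)
    return time.perf_counter() - start, len(body)


# Example usage (assumed host, Item name and service IDs; adjust to your setup):
# for service in ("rrd4j", "influxdb"):
#     seconds, size = timed(http_get, persistence_url("http://localhost:8080", "MyItem", service))
#     print(f"{service}: {seconds:.2f}s, {size} bytes")
```

If the rrd4j call itself takes 10-20s, the delay is on the server side; if it returns quickly, the time is being spent in the browser.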

@rlkoshak, yes, but this is only partly true. The MainUI Analyzer plots data for the last 24h by default, which means that there are 24 x 60 = 1440 data points to be plotted. With InfluxDB I also plot yearly values without any problems, i.e. 365 x 24 = 8760 points, so many more than with rrd4j.

I have used Firefox, Edge and Chrome on different computers and the problem is not related to the browser nor to the PC I’m using. I have the same problem on the OH server (NUC) itself.

Is there a difference in how long the calls to the persistence REST API endpoints take for each?

I don’t quite understand what you mean here.

As I said in my first post I didn’t have any issues under OH3.4 and most likely not under OH4.0.

As far as I remember, there are no changes in the query logic of the persistence services. There were adjustments to the persistence config, but that concerns how data arrives in the services, not how it is returned.
I’d first check whether there was any change in the rrd4j library version, as this might be one of the destabilizing factors.

@splatch, OK, thanks. How can I check which rrd4j version I have now?

In the Karaf/openHAB console, try la -l | grep rrd or la -s | grep rrd