High disk activity due to OH-scheduler process

  • Platform information:

    • Hardware: Synology Diskstation DS216+II / Intel Celeron N3060 / 4 GB RAM
    • OS: Docker openhab/openhab:latest
    • Java Runtime Environment: the one included in the aforementioned Docker image
    • openHAB version: 3.0.1
  • Issue of the topic:
    After migrating to openHAB 3.0 (I’m not sure whether the problem existed before, but I only noticed it now), I can see (hear) high disk activity. Before, there was of course some moderate disk activity when performing certain tasks, but constant, audible activity on the disk was new to me.

When running htop on my host system, I can see a process belonging to openHAB called OH-scheduler-xx, where xx is a number that increments after each restart of the container, for example OH-scheduler-77. This process writes to the disk almost constantly at ~1 MB/s, as indicated by htop. Interestingly, when connecting to the container via docker exec -it openhab /bin/bash, I only see Java processes and some shell scripts running on the system. None of these processes shows any noticeable disk activity.

Anyway, when stopping the container, the high disk activity is gone. I also tried disabling all bindings one by one, but without success. I also suspected that an active timer might somehow cause this behavior, but even after cancelling all timers the issue persisted.

The container has more than enough memory for OH. Swapping due to limited memory should not be an issue.

Edit: I’m using InfluxDB as a persistence service, running in a separate container. The shortest logging interval is 30 s, so this shouldn’t be a problem either.

  • Please post configurations (if applicable):

I could not identify any specific component yet.

  • If logs were generated, please post them here using code fences:

I also observe this phenomenon: the disk load clearly increased after switching to OH3. And I also single out OH-scheduler as the culprit. I monitored the disk load for a long time, and since switching to OH3, about 20 GB more are written to the disk per day.
[Screenshots: disk write load under OH2 vs. OH3]
Could this be related to the default persistence strategy that stores the states of all items? See:

If it is, it’s probably hidden from the majority of users by openHABian’s default ZRAM setup.
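If I read the defaults right, rrd4j without its own .persist file behaves roughly as if something like the following existed (just a sketch to illustrate the persist-everything default, not an actual file shipped with openHAB):

Strategies {
    everyMinute :   "0 * * * * ?"

    default = everyChange
}

Items {
    // every item, saved on every change and at least once a minute
    * : strategy = everyChange, everyMinute, restoreOnStartup
}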

I also observed a similar problem with constant write rates of approx. 1 MB/s on my openHAB container (Synology DS918+). I then set up an identical configuration on a Raspberry Pi 4B and only observed write rates of approx. 150 kB/s.

I changed the default persistence service under Settings > Persistence to the InfluxDB persistence service. So, in theory, rrd4j should not be active anymore, right?

Also, my persistence strategies look like this:

Strategies {
    // Quartz cron format: second minute hour day-of-month month day-of-week
    everyMinute :   "0 * * * * ?"
    everyHour :     "0 0 * * * ?"
    everyDay  :     "0 0 0 * * ?"

    default = everyChange
}

Items {
    gHeatingControls* : strategy = everyUpdate, restoreOnStartup
    gSockets* : strategy = everyMinute, restoreOnStartup
    gWeatherLoggingItem* : strategy = everyUpdate, restoreOnStartup

    NearestTracker : strategy = everyUpdate, restoreOnStartup
}

gHeatingControls contains a few Homematic Items that all update quite slowly, and gWeatherLoggingItem updates no faster than once per minute. And, of course, there is no default persistence for all items anymore.

No, that just changes a parameter that says “X is the default” for use by charts and by queries in rules that don’t specify which persistence service to use.
Whatever rrd4j was doing before, it’s still doing it.

You might uninstall it. That will break the charts in MainUI.

That only applies to one service, depending on the filename that you gave it.
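Roughly speaking (a hypothetical layout; only the file names matter here):

conf/persistence/influxdb.persist    // configures only the InfluxDB service
conf/persistence/rrd4j.persist       // would be needed to change what rrd4j does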


Yes, thanks! You were absolutely right!

I thought that rrd4j was not running anymore because all widgets for which I had not explicitly defined persistence strategies showed no data. But yeah, it seems the widgets use the default persistence service as their data source, while rrd4j was still running in the background, logging values for about 900 different items.

I uninstalled rrd4j and now there is something like silence in my living room. Thanks! 🙂

Woah, that is a significant burden to persist every minute.
There’s nothing wrong with rrd4j, but you’d certainly want to be picking and choosing what to persist in that environment.
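For example, a pared-down rrd4j.persist that keeps only what the charts actually need could look something like this (gChartItems is just a placeholder group, not something from your setup):

Strategies {
    everyMinute :   "0 * * * * ?"

    default = everyChange
}

Items {
    // only the items actually shown in charts, instead of all ~900
    gChartItems* : strategy = everyChange, everyMinute, restoreOnStartup
}

rrd4j needs a value at least every minute to fill its fixed-interval archives, so everyMinute stays in even for a reduced item set.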

Most of the items were added to use the Weather Widgets as described here.

I guess, to disable the default persist-everything behavior while keeping the add-on installed, I just create a new rrd4j.persist which doesn’t contain any items, right?

Strategies {
    everyMinute : "0 * * * * ?"
}

Items {

}

Thanks a lot!
I had no idea that rrd4j had been writing so much to the disk for years until I checked my SSD’s health.
I also have quite a lot of items, over 500, as I have been using OH for 3 years now with about 40 devices.
Clearing the rrd4j.persist file was the solution for me.

Before stopping rrd4j, I recorded about 700 MB of writes per hour from 5 OH-scheduler threads.
Now the OH-scheduler threads are not doing any intensive writing at all.

I had to stop the wear on my SSD (Kingston A400 120 GB).
For perspective, it was writing about 20 GB per day (roughly 700 MB/h × 24 h), which is bad for an SSD.
It is probably the main culprit for the SSD’s remaining life dropping to 58% in 3 years (the system has written 10 TB to the SSD).

I also have InfluxDB with a similar configuration, and it uses the disk much less, about 60 MB/hour.

But I still kept rrd4j because it loaded charts faster, before I upgraded the computer to an HP ProDesk Mini with an i5-4590T.
Now it can load InfluxDB charts in BasicUI just fine.

  • Below was my research journey; maybe someone finds the tools I used handy.

I noticed unreasonably high wear on the SSD via a SMART report when I recently checked its health with SMART tools in Linux.

sudo apt install smartmontools
sudo smartctl -a /dev/sda

(Important to mention: Kingston SSDs report Remaining Life in % as the RAW value instead of a wear indicator in %. This is against the SMART standard, so tools show a wrongly calculated value; the RAW value is the correct remaining life.)

I still had to find the process that was wearing out my SSD.
So I used iotop in Linux

sudo apt install iotop
sudo iotop -a

to find the culprit for the disk wear, and it was mainly the 5 OH-scheduler threads, which apparently also run the rrd4j internal database; when I emptied rrd4j.persist, the intensive writes ended.
