All persistence suddenly stopped saving data

Hello everyone,

Today at ~11:20 AM all persistence services suddenly stopped working, and I see absolutely no reason why…
They haven’t stopped working completely: I can still read the data, but nothing has been stored since ~11:20 AM.
What is strange is that I have InfluxDB for some Grafana stuff and also RRD4j for the usual Item graphs inside openHAB, and both of them stopped saving at nearly the same time. Nothing is visible in the Log Viewer, no red lines…


I also checked the free disk space, which seems fine:
[screenshot of df -h output]

A few days ago I ran an update in openhabian-config, so everything should be at the latest stable version:
openHAB 3.4.2 Release Build
InfluxDB version: 1.8.10

Does anyone have an idea what it could be? What can I check?

I’ve no idea what it could be, but have you tried restarting OH to see if that kicks it off? If not, maybe something is logged in the debug or trace level logs. At least you can see whether OH is trying to save the data and something is going wrong, or whether it isn’t even trying to save.

Some other things to check include:

  • are these Items in fact updating?
  • are the add-ons installed?
  • check to see if you have .persist files, if so check them for correctness

Can you check whether the files in /var/lib/openhab/persistence/rrd4j are updated every minute?
You have enabled zram, right? Maybe an issue with zram? Not sure…
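One way to check this without watching the directory by hand is a small script that lists files whose modification time is older than expected. This is only a sketch: the directory is the one named in the post above, while the 120-second threshold and the function name `stale_files` are assumptions made for illustration.

```python
import time
from pathlib import Path


def stale_files(directory, max_age_seconds=120, now=None):
    """Return (file name, age in seconds) for files not modified recently."""
    now = time.time() if now is None else now
    stale = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        age = now - path.stat().st_mtime
        if age > max_age_seconds:
            stale.append((path.name, int(age)))
    return stale


if __name__ == "__main__":
    # Directory taken from the post above; adjust it for your installation.
    rrd_dir = Path("/var/lib/openhab/persistence/rrd4j")
    if not rrd_dir.is_dir():
        print(f"{rrd_dir} not found")
    else:
        for name, age in stale_files(rrd_dir):
            print(f"{name}: last written {age} s ago")
```

If every file shows up as stale while Items are still changing, writes have probably stopped; if only a few do, those Items may simply not be persisted on a one-minute schedule.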

OK, thanks a lot, after restarting the openHAB service it is working again… I have been running it for over two years, and this is the first time I have had this issue…

Tonight it happened again:
[screenshot]

I think something is wrong with the latest version of openHAB :(
Restarting the openHAB service helped again, but if this happens every day, that is not acceptable…

So, I’m pretty sure this is a new bug in openHAB 3.4.2.
[screenshot]
Before that update, on v3.4.1, I didn’t have these problems; they began some days after the update.
After restarting the openHAB service it works again for some time. Nothing is shown in the Log Viewer.
Items are updated all the time; this is also shown in the Log Viewer.

I would like to look at the logs for that specific time to see if something happened. Where do I have to look for them?

Depending on how OH is installed, the logs are stored either in /var/log/openhab or in $OH_HOME/userdata/logs. Whenever the file gets too large or OH restarts, the logs roll over. The most recent 9 files are kept. The first part of each log statement is the date and time.
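Since each statement starts with the date and time, a short script can pull out just the lines around the moment persistence stopped. The sketch below assumes the timestamp has the form `YYYY-MM-DD HH:MM:SS` at the start of the line and that the file is called openhab.log; both are assumptions, as are the function name `lines_in_window` and the example window.

```python
from datetime import datetime
from pathlib import Path


def lines_in_window(lines, start, end):
    """Yield log lines whose leading timestamp lies within [start, end]."""
    keep = False
    for line in lines:
        try:
            stamp = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            # No timestamp (e.g. a stack trace line): treat it like the
            # statement it belongs to.
            if keep:
                yield line
            continue
        keep = start <= stamp <= end
        if keep:
            yield line


if __name__ == "__main__":
    # Assumed file name; the post above names only the directory.
    log_file = Path("/var/log/openhab/openhab.log")
    # Hypothetical window; replace it with the time persistence stopped.
    start = datetime(2023, 3, 6, 9, 15)
    end = datetime(2023, 3, 6, 9, 25)
    if log_file.is_file():
        with log_file.open(errors="replace") as handle:
            for line in lines_in_window(handle, start, end):
                print(line, end="")
    else:
        print(f"{log_file} not found")
```

Lines without a timestamp are kept together with the preceding statement so that multi-line stack traces stay intact.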

The only thing I have from around that time is in the syslog:

Mar  6 09:18:10 openhab influxd-systemd-start.sh[3506]: [httpd] 127.0.0.1 - openhab [06/Mar/2023:09:18:10 +0100] "POST /write?db=openhab&rp=autogen&precision=n&consistency=one HTTP/1.1 " 204 0 "-" "okhttp/3.14.9" 6e6e3bca-bbf7-11ed-8bfb-e45f010be81e 7758
Mar  6 09:18:20 openhab influxd-systemd-start.sh[3506]: [httpd] 127.0.0.1 - openhab [06/Mar/2023:09:18:20 +0100] "POST /write?db=openhab&rp=autogen&precision=n&consistency=one HTTP/1.1 " 204 0 "-" "okhttp/3.14.9" 745958ec-bbf7-11ed-8bfc-e45f010be81e 7925
Mar  6 09:18:21 openhab influxd-systemd-start.sh[3506]: [httpd] 127.0.0.1 - openhab [06/Mar/2023:09:18:21 +0100] "POST /write?db=openhab&rp=autogen&precision=n&consistency=one HTTP/1.1 " 204 0 "-" "okhttp/3.14.9" 7521b46a-bbf7-11ed-8bfd-e45f010be81e 6347
Mar  6 09:18:22 openhab karaf[1053]: Exception in thread "OH-items-4" java.io.IOError: java.io.EOFException
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.Volume$FileChannelVol.getByte(Volume.java:1000)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.Volume.getUnsignedShort(Volume.java:109)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.StoreWAL.longStackGetPage(StoreWAL.java:1046)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.StoreWAL.longStackTake(StoreWAL.java:899)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.StoreDirect.freePhysTake(StoreDirect.java:1098)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.StoreDirect.physAllocate(StoreDirect.java:666)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.StoreWAL.update(StoreWAL.java:404)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.Caches$HashTable.update(Caches.java:270)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.EngineWrapper.update(EngineWrapper.java:63)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.BTreeMap.put2(BTreeMap.java:707)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.BTreeMap.put(BTreeMap.java:643)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.openhab.persistence.mapdb.internal.MapDbPersistenceService.store(MapDbPersistenceService.java:186)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.openhab.core.persistence.internal.PersistenceManagerImpl.handleStateEvent(PersistenceManagerImpl.java:152)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.openhab.core.persistence.internal.PersistenceManagerImpl.stateChanged(PersistenceManagerImpl.java:473)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.openhab.core.items.GenericItem.lambda$1(GenericItem.java:259)
Mar  6 09:18:22 openhab karaf[1053]: #011at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
Mar  6 09:18:22 openhab karaf[1053]: #011at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
Mar  6 09:18:22 openhab karaf[1053]: #011at java.base/java.lang.Thread.run(Thread.java:829)
Mar  6 09:18:22 openhab karaf[1053]: Caused by: java.io.EOFException
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.Volume$FileChannelVol.readFully(Volume.java:947)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.Volume$FileChannelVol.getByte(Volume.java:997)
Mar  6 09:18:22 openhab karaf[1053]: #011... 17 more
Mar  6 09:18:22 openhab karaf[1053]: Exception in thread "OH-items-1" java.io.IOError: java.io.EOFException
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.Volume$FileChannelVol.getByte(Volume.java:1000)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.Volume.getUnsignedShort(Volume.java:109)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.StoreWAL.longStackGetPage(StoreWAL.java:1046)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.StoreWAL.longStackTake(StoreWAL.java:899)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.StoreDirect.freePhysTake(StoreDirect.java:1098)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.StoreDirect.physAllocate(StoreDirect.java:666)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.StoreWAL.update(StoreWAL.java:404)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.Caches$HashTable.update(Caches.java:270)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.EngineWrapper.update(EngineWrapper.java:63)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.BTreeMap.put2(BTreeMap.java:707)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.BTreeMap.put(BTreeMap.java:643)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.openhab.persistence.mapdb.internal.MapDbPersistenceService.store(MapDbPersistenceService.java:186)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.openhab.core.persistence.internal.PersistenceManagerImpl.handleStateEvent(PersistenceManagerImpl.java:152)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.openhab.core.persistence.internal.PersistenceManagerImpl.stateChanged(PersistenceManagerImpl.java:473)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.openhab.core.items.GenericItem.lambda$1(GenericItem.java:259)
Mar  6 09:18:22 openhab karaf[1053]: #011at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
Mar  6 09:18:22 openhab karaf[1053]: #011at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
Mar  6 09:18:22 openhab karaf[1053]: #011at java.base/java.lang.Thread.run(Thread.java:829)
Mar  6 09:18:22 openhab karaf[1053]: Caused by: java.io.EOFException
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.Volume$FileChannelVol.readFully(Volume.java:947)
Mar  6 09:18:22 openhab karaf[1053]: #011at org.mapdb.Volume$FileChannelVol.getByte(Volume.java:997)
Mar  6 09:18:22 openhab karaf[1053]: #011... 17 more
Mar  6 09:18:53 openhab npm[1279]: #033[32mZigbee2MQTT:info #033[39m 2023-03-06 09:18:53: MQTT publish: topic 'zigbee2mqtt/0x00158d0007f724e0', payload '{"battery":19,"humidity":82.66,"linkquality":55,"pressure":926.2,"temperature":-0.37,"voltage":2745}'
Mar  6 09:19:25 openhab npm[1279]: #033[32mZigbee2MQTT:info #033[39m 2023-03-06 09:19:25: MQTT publish: topic 'zigbee2mqtt/0x60a423fffed27294', payload '{"linkquality":55}'
Mar  6 09:20:54 openhab grafana-server[3504]: logger=cleanup t=2023-03-06T09:20:54.766640711+01:00 level=info msg="Completed cleanup jobs" duration=16.951305ms
Mar  6 09:20:54 openhab grafana-server[3504]: FileLogWriter("/var/log/grafana/grafana.log"): close /var/log/grafana/grafana.log: file already closed

Well, that points to something going wrong with the file system. Is it always MapDB that complains or is it sometimes a different persistence engine?
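To see whether it is always MapDB, one option is to count which persistence add-on appears in the stack traces across the whole syslog. The following is a rough sketch: the regular expressions are modelled on the lines posted above, and the path /var/log/syslog and the function name `summarize` are assumptions.

```python
import re
from collections import Counter

# Patterns modelled on the syslog excerpt in this thread.
HEADER = re.compile(r'Exception in thread "[^"]+" ([\w.$]+)')
SERVICE = re.compile(r"org\.openhab\.persistence\.(\w+)\.")


def summarize(lines):
    """Count exception types and the persistence add-ons named in stack frames."""
    exceptions = Counter()
    services = Counter()
    for line in lines:
        header = HEADER.search(line)
        if header:
            exceptions[header.group(1)] += 1
        service = SERVICE.search(line)
        if service:
            services[service.group(1)] += 1
    return exceptions, services


if __name__ == "__main__":
    # Assumed location of the syslog; reading it may require root.
    try:
        with open("/var/log/syslog", errors="replace") as handle:
            exceptions, services = summarize(handle)
    except OSError as err:
        print(f"cannot read syslog: {err}")
    else:
        print("exceptions:", dict(exceptions))
        print("persistence add-ons in stack traces:", dict(services))
```

If only `mapdb` shows up, the other services are probably failing as a side effect rather than throwing errors of their own.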

It would be odd for an error in MapDB to mess up charts though. MapDB only saves one value per Item, so you can’t chart with it. But if the file system is full, all your persistence (and anything else that writes to that file system) would be affected.

If you are running openHABian, then most of OH operates out of zram which, because RAM is scarce, isn’t that big.

When persistence stops, run df -h and see if any file systems are full.
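If you want to catch this automatically, for example from a cron job, roughly the same check can be scripted. This is a minimal sketch using Python's `shutil.disk_usage`; the list of paths and the 90 % threshold are assumptions, and the percentage may differ slightly from the Use% column of df because reserved blocks are counted differently.

```python
import shutil


def percent_used(used, total):
    """Return used space as a percentage of total (0.0 if total is zero)."""
    return 100.0 * used / total if total else 0.0


def nearly_full(paths, threshold=90.0):
    """Return (path, percent used) for file systems at or above threshold."""
    flagged = []
    for path in paths:
        try:
            usage = shutil.disk_usage(path)
        except OSError:
            continue  # path does not exist on this system
        pct = percent_used(usage.used, usage.total)
        if pct >= threshold:
            flagged.append((path, round(pct, 1)))
    return flagged


if __name__ == "__main__":
    # Assumed mount points; compare with the output of df -h on your system.
    candidates = ["/", "/var/log", "/var/lib/openhab/persistence"]
    full = nearly_full(candidates)
    if full:
        for path, pct in full:
            print(f"{path}: {pct}% used")
    else:
        print("no file system above the threshold")
```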

Check zramctl --output-all to see if any limit is hit.

Plus you seem to be using Grafana. That’s a memory hog. You should not have installed that on ZRAM.

Yes, zram is installed. I also ran df -h; you can see it in my first post, there is a lot of free space.

I will try this tonight.

My OH3 is installed on an RPi 4 with 8GB of RAM. I checked the free RAM before I restarted the service; there was about 6GB free…
It seems that when MapDB crashes, all OH persistence goes down, but the DB itself is still running.
Where can I see where MapDB is writing its data?

MapDB has not been used by default since OH3 (or even OH2, I do not remember exactly). Did you install/enable it?

MapDB writes to $OH_USERDATA/persistence/mapdb.

Memory settings and limits for zram are in /etc/ztab; they have nothing to do with the 6GB of free RAM.
But first check ‘zramctl --output-all’ to see if any limit is hit, as mentioned by mstormi.

It stopped again this morning at 04:00.

It seems there is no problem:
[screenshot]

Looks fine:

Memory seems also fine:
[screenshot]

It seems I have it installed, but I’m not using it by default:
[screenshot]

OH3 seems to be using a lot of CPU, but I have no idea why; I’m not doing anything there at the moment:

I haven’t restarted the openHAB service yet. I will uninstall MapDB, as I’m not using it, and then try again…
I’m not sure, but isn’t MapDB responsible for the last state of some Items, like switches?

Depends on how you have it configured. By default rrd4j will save all Items it supports and restoreOnStartup all of those Items. At this point most Item types are supported; Strings and Items with compound states like Color and Location are not saved. MapDB by default will save all Items regardless of type and restoreOnStartup. But InfluxDB will also do the same.

If you’ve customized the behavior using .persist files, then check those files to see what is saved, how often, and what uses restoreOnStartup.

Looking at the screen shots (which are almost impossible to read on a phone unfortunately) it’s weird that one of your CPU cores is pegged and the other three are basically idle. That points to a process run amok.