This is quite urgent to me because I cannot open any rollershutter anymore.
The exceptions all look like (see below). Lots of them per second.
This is openhab 3.2.0-1 on Raspbian buster.
apt update && apt ugrade (should not have done that)
all things are gone
rebooting created a new openhab.log but the old one disappeared. openhab.log.1.gz is old. I did not find openhab in /etc/logrotate.d, who rotates them?
checked all still existing logs. I cannot see anything relevant prior to the exceptions.
srv/openhab-userdata/jsondb/org.openhab.core.thing.Thing.json had length 0, I restored it. Last modification time was in 2020
reboot, now I got a java error message (see below) about missing org.eclipse.smarthome.core.thing.internal.ThingImpl
apt install --reinstall openhab openhab-addons
still the same error
I compared a working backup with current state but did not yet find another relevant difference. Next I will continue with that search.
/var/log/kern.log.1 contains just one line saying
May 21 22:02:51 openhab kernel: [477561.583573] cgroup: fork rejected by pids controller in /system.slice/openhab.service
Might be relevant?
May 21 21:57:48 openhab karaf: Exception in thread "EventAdminAsyncThread #11" Exception in thread "EventAdminAsyncThread #12" java.lang.NullPointerException
May 21 21:57:48 openhab karaf: #011at java.base/java.util.concurrent.LinkedBlockingQueue.dequeue(LinkedBlockingQueue.java:217)
May 21 21:57:48 openhab karaf: #011at java.base/java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:435)
May 21 21:57:48 openhab karaf: #011at java.base/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1054)
May 21 21:57:48 openhab karaf: #011at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1114)
May 21 21:57:48 openhab karaf: #011at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
May 21 21:57:48 openhab karaf: #011at java.base/java.lang.Thread.run(Thread.java:829)
2022-05-22 09:42:43.223 [ERROR] [re.storage.json.internal.JsonStorage] - Couldn't deserialize value 'org.openhab.core.storage.json.internal.StorageEntry@1a70599'. Root cause is: org.eclipse.smarthome.core.thing.internal.ThingImpl cannot be found by org.openhab.core.thing_3.2.0
I believe I never downloaded an addon outside of debian openhab-addons.
This system was originally 2.5 but I upgraded to OH3 when it became available. I probably did that as recommended, but I really do not remember.
I did not change anything except editing some rules in the last few month.
@Wolfgang_S: Will look at the log problem later. I find a lot of files with that string. Is my installation really that broken? It always worked mostly fine. See attached output of grep -arl org.eclipse.smarthome grep.txt (3.5 KB)
From symptoms you get, it might be a system configuration error or some kind of thread leak. Given number of addons you have its hard to determine.
To determine which is faulty - disable everything. Start with plain distro and observe behavior. My personal bet is automation as this thing is custom and most likely requires adjustment.
Meanwhile I just reloaded a 2 days old backup which fixed the problem. When it happens again I will do a backup right away before running apt upgrade, so there should be fewer differences to a working system.
Since I did not change anything about addons since a long time, I did not disable any for now. But I will install some kind of monitoring software which tells me when a log file suddenly starts growing fast. So I will hopefully be able to better find the cause.
With that in mind, I will also migrate from Raspberry to a virtual guest on my home server, so I can backup/restore the entire image. Then I can simply copy the runaway image and debug it separately.
It might be interesting to find common factors with the user I mentioned having had the same problem.
About the files with org.eclipse.smarthome: After ignoring backup and /srv, what remains is:
apt-file search for /audio.config and /voice.config finds nothing, so I deleted them and the now empty directories above.
The question why such code was executed at all remains open. I must admit I know next to nothing about Java, and I do not want to - especially since there is also kotlin.
I also deleted some empty openhab2 directories like /var/lib/openhab2, /var/log/openhab2. No idea why they were still there.
And since I regularly do full backups with restic (the very best backup solution IMHO) I see no need for openhab to create its own backups. Is there a config option to disable that (which I did not find) or can I just delete them with cron?
About the empty jsondb/org.openhab.core.thing.Thing.json:
2022-05-22 00:07:11.687 [ERROR] [re.storage.json.internal.JsonStorage] - Error writing JsonDB to /var/lib/openhab/jsondb/org.openhab.core.thing.Thing.json. Cause No space left on device.
2022-05-22 00:07:11.733 [ERROR] [re.storage.json.internal.JsonStorage] - Error writing JsonDB to /var/lib/openhab/jsondb/backup/1653170831692--org.openhab.core.thing.Thing.json. Cause No space left on device.
and I believe that this resulted in the empty file.
Maybe the saving code does not correctly handle full disk.
But as I said - this is Java, I do not even know how to look at the source code in a .jar file and I do not want to know.
My backups say
root@speicher:/mnt/snapshots# ls -lnrt 2*/openhab/var/lib/openhab/jsondb/org.openhab.core.thing.Thing.json | tail -5
-rw-rw-r-- 1 110 115 1446007 20. Mai 01:00 2022-05-20T01:00:22+02:00/openhab/var/lib/openhab/jsondb/org.openhab.core.thing.Thing.json
-rw-rw-r-- 1 110 115 1446027 21. Mai 01:00 2022-05-21T01:00:22+02:00/openhab/var/lib/openhab/jsondb/org.openhab.core.thing.Thing.json
-rw-rw-r-- 1 110 115 0 22. Mai 00:51 2022-05-22T01:00:24+02:00/openhab/var/lib/openhab/jsondb/org.openhab.core.thing.Thing.json
-rw-rw-r-- 1 110 115 0 22. Mai 07:32 2022-05-22T08:40:18+02:00/openhab/var/lib/openhab/jsondb/org.openhab.core.thing.Thing.json
-rw-rw-r-- 1 110 115 1446019 22. Mai 17:25 2022-05-22T17:24:17+02:00/openhab/var/lib/openhab/jsondb/org.openhab.core.thing.Thing.json