Back to the topic of zram usage; my zramctl shows:
# zramctl --output-all
NAME DISKSIZE DATA COMPR ALGORITHM STREAMS ZERO-PAGES TOTAL MEM-LIMIT MEM-USED MIGRATED MOUNTPOINT
/dev/zram3 450M 390.3M 44.9M zstd 4 794 200M 200M 200M 0B /opt/zram/zram3
/dev/zram2 350M 364K 5.2K zstd 4 0 184K 150M 184K 0B /opt/zram/zram2
/dev/zram1 350M 65.3M 1.5M zstd 4 58 32.6M 150M 32.6M 0B /opt/zram/zram1
/dev/zram0 450M 4K 86B lzo-rle 4 0 4K 200M 4K 0B [SWAP]
The size of /var/lib/openhab/persistence/rrd4j is 36M.
Funnily enough, rrd4j is stored on zram1, and zram3 is used for logs. The logs add up to 390M. The largest files (128 MB!) are log/daemon.log and log/syslog, and most of their content is that very error:
Feb 2 03:47:25 openhabian karaf[813]: Exception in thread "OH-eventexecutor-149284" java.lang.NullPointerException
Feb 2 03:47:25 openhabian karaf[813]: #011at java.base/java.util.concurrent.LinkedBlockingQueue.dequeue(LinkedBlockingQueue.java:214)
Feb 2 03:47:25 openhabian karaf[813]: #011at java.base/java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:435)
Feb 2 03:47:25 openhabian karaf[813]: #011at java.base/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1054)
Feb 2 03:47:25 openhabian karaf[813]: #011at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1114)
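For anyone who wants to check which log files are eating the space on their own box, here is a minimal sketch. The helper name largest_files is made up for illustration, and /var/log in the usage comment is an assumption about where the log directory is visible; adjust the path to your setup.

```shell
# List the N largest regular files under a directory, biggest first.
# Usage: largest_files DIR [N]     (N defaults to 5)
largest_files() {
    dir=$1
    count=${2:-5}
    # du -k prints sizes in KiB, so a plain numeric sort is enough
    find "$dir" -type f -exec du -k {} + | sort -rn | head -n "$count"
}

# Example (path is an assumption, not run here):
#   largest_files /var/log 5
```

The output is one "size-in-KiB path" pair per line, so daemon.log and syslog should show up at the top if they are the culprits.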
Looks like the error is cascading. A failure in one place (persistence) floods the logs with these messages, which in turn aggravates the failure until everything collapses. @rlkoshak, I wonder: can something be done about it? One way would be to catch all DB submission errors in openHAB and report only the first one until things resolve. I understand that it’s easier said than done, but IMHO this is a major issue.
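To illustrate the "report only the first error" idea, here is a rough sketch that applies it to a log stream after the fact. This is not how openHAB handles errors and not a proposed patch; first_occurrence_only is a hypothetical name, and the filter simply assumes the syslog line layout seen in the excerpts above.

```shell
# Print each distinct message once, then a count of the suppressed repeats.
# Reads syslog-style lines on stdin.
first_occurrence_only() {
    awk '
    {
        key = $0
        # Ignore the timestamp and host prefix, e.g. "Feb 2 03:47:25 openhabian "
        sub(/^[A-Z][a-z][a-z] +[0-9]+ [0-9:]+ [^ ]+ /, "", key)
        # Treat every OH-eventexecutor-<number> thread as the same source
        gsub(/OH-eventexecutor-[0-9]+/, "OH-eventexecutor-N", key)
        if (seen[key]++ == 0) print
    }
    END {
        for (k in seen) if (seen[k] > 1) suppressed += seen[k] - 1
        printf "(%d repeated lines suppressed)\n", suppressed
    }'
}

# Example (file name as used in this post, not run here):
#   grep karaf syslog | first_occurrence_only
```

On a flood like the one above this would keep one copy of each stack trace line plus a single counter, instead of a million near-identical lines.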
If you’re interested, this is how it began:
Feb 2 03:45:28 openhabian karaf[813]: Exception in thread "OH-eventexecutor-1" Exception in thread "OH-eventexecutor-2" java.lang.NullPointerException
Feb 2 03:45:28 openhabian karaf[813]: Exception in thread "OH-eventexecutor-3" #011at java.base/java.util.concurrent.LinkedBlockingQueue.dequeue(LinkedBlock
Feb 2 03:45:28 openhabian karaf[813]: #011at java.base/java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:435)
Feb 2 03:45:28 openhabian karaf[813]: #011at java.base/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1054)
Feb 2 03:45:28 openhabian karaf[813]: #011at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1114)
Feb 2 03:45:28 openhabian karaf[813]: #011at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
Feb 2 03:45:28 openhabian karaf[813]: #011at java.base/java.lang.Thread.run(Thread.java:829)
Feb 2 03:45:28 openhabian karaf[813]: Exception in thread "OH-eventexecutor-4" java.lang.NullPointerException
Feb 2 03:45:28 openhabian karaf[813]: #011at java.base/java.util.concurrent.LinkedBlockingQueue.dequeue(LinkedBlockingQueue.java:214)
Feb 2 03:45:28 openhabian karaf[813]: #011at java.base/java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:435)
Feb 2 03:45:28 openhabian karaf[813]: #011at java.base/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1054)
Feb 2 03:45:28 openhabian karaf[813]: #011at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1114)
Feb 2 03:45:28 openhabian karaf[813]: #011at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
Feb 2 03:45:28 openhabian karaf[813]: #011at java.base/java.lang.Thread.run(Thread.java:829)
So it started at 03:45, and the last such line is stamped 03:47. In two minutes the log was filled to the brim with 1044989 lines (I used "grep karaf syslog | wc -l" to count); it looks like OH just keeps trying and failing with no delay.
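To put a number on "no delay", the average rate can be worked out with plain shell arithmetic. The 120 seconds is my rounding of "two minutes", and lines_per_second is just an illustrative helper name.

```shell
# Average log rate in lines per second (integer division).
# Usage: lines_per_second LINE_COUNT SECONDS
lines_per_second() {
    echo $(( $1 / $2 ))
}

# Figures from above: 1044989 karaf lines in roughly two minutes.
# (grep -c karaf syslog gives the same count as grep karaf syslog | wc -l)
lines_per_second 1044989 120    # prints 8708
```

That is on the order of several thousand lines per second, which fits the picture of a retry loop with no back-off.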
kern.log has this:
Feb 2 03:47:23 openhabian kernel: [365506.992045] EXT4-fs warning (device zram3): ext4_end_bio:349: I/O error 10 writing to inode 56 starting block 47231)
Feb 2 03:47:23 openhabian kernel: [365506.997667] EXT4-fs warning (device zram3): ext4_end_bio:349: I/O error 10 writing to inode 56 starting block 47507)
Feb 2 03:47:23 openhabian kernel: [365506.997744] Buffer I/O error on device zram3, logical block 47507
Feb 2 03:47:23 openhabian kernel: [365506.997760] Buffer I/O error on device zram3, logical block 47508
Feb 2 03:47:23 openhabian kernel: [365506.997771] Buffer I/O error on device zram3, logical block 47509
Feb 2 03:47:23 openhabian kernel: [365506.997781] Buffer I/O error on device zram3, logical block 47510
Feb 2 03:47:23 openhabian kernel: [365506.997791] Buffer I/O error on device zram3, logical block 47511
Feb 2 03:47:23 openhabian kernel: [365506.997801] Buffer I/O error on device zram3, logical block 47512
Feb 2 03:47:23 openhabian kernel: [365506.997811] Buffer I/O error on device zram3, logical block 47513
Feb 2 03:47:23 openhabian kernel: [365506.997821] Buffer I/O error on device zram3, logical block 47514
Feb 2 03:47:23 openhabian kernel: [365506.997831] Buffer I/O error on device zram3, logical block 47515
Feb 2 03:47:23 openhabian kernel: [365506.997840] Buffer I/O error on device zram3, logical block 47516
Feb 2 03:47:23 openhabian kernel: [365507.010431] EXT4-fs warning (device zram3): ext4_end_bio:349: I/O error 10 writing to inode 56 starting block 33278)
Feb 2 03:47:23 openhabian kernel: [365507.025818] EXT4-fs warning (device zram3): ext4_end_bio:349: I/O error 10 writing to inode 56 starting block 24322)
Feb 2 03:47:23 openhabian kernel: [365507.040076] EXT4-fs warning (device zram3): ext4_end_bio:349: I/O error 10 writing to inode 56 starting block 24602)
Feb 2 03:47:23 openhabian kernel: [365507.054284] EXT4-fs warning (device zram3): ext4_end_bio:349: I/O error 10 writing to inode 56 starting block 24882)
Feb 2 03:47:23 openhabian kernel: [365507.068931] EXT4-fs warning (device zram3): ext4_end_bio:349: I/O error 10 writing to inode 56 starting block 25154)
Feb 2 03:47:23 openhabian kernel: [365507.082773] EXT4-fs warning (device zram3): ext4_end_bio:349: I/O error 10 writing to inode 56 starting block 25438)
Feb 2 03:47:23 openhabian kernel: [365507.097182] EXT4-fs warning (device zram3): ext4_end_bio:349: I/O error 10 writing to inode 56 starting block 25703)
Feb 2 03:47:23 openhabian kernel: [365507.111421] EXT4-fs warning (device zram3): ext4_end_bio:349: I/O error 10 writing to inode 56 starting block 25979)
Note the timestamp of 03:47, i.e. these I/O errors were a consequence of the log partition filling up.
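In the zramctl output at the top, zram3 shows TOTAL equal to MEM-LIMIT (200M each). A small sketch to spot that condition automatically is below. Whether hitting the limit is exactly what triggers the I/O errors is my assumption; zram_at_limit is a hypothetical helper name, and the zramctl options in the usage comment (--bytes, --noheadings, --raw, --output) are the standard util-linux ones.

```shell
# Read "NAME TOTAL MEM-LIMIT" rows (sizes in bytes) on stdin and print
# the devices whose total memory use has reached a non-zero limit.
# A limit of 0 is treated as "no limit" and skipped.
zram_at_limit() {
    awk '$3 > 0 && $2 >= $3 { print $1 }'
}

# Intended use on a live system (not run here):
#   zramctl --bytes --noheadings --raw --output NAME,TOTAL,MEM-LIMIT | zram_at_limit
```

Run from cron or a monitoring script, this would flag a device like zram3 as soon as it has no headroom left.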