Log file is not writen any more

Hi this is yet another logging stopped problem that seems to be diffrent to all the treads that I just read.

  • Platform information:
    • Hardware: RPI4
    • OS: OpenHabian
    • openHAB version: 2.5

I have this setup of OH2.5.10 that stops writing into the log for seemingly no reason.
The /log/var/ has space

[10:53:38] openhabian@morele-OH:/var/log/openhab2$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/root        29G  5.4G   23G  20% /
devtmpfs        1.8G     0  1.8G   0% /dev
tmpfs           1.9G     0  1.9G   0% /dev/shm
tmpfs           1.9G  1.5M  1.9G   1% /run
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           1.9G     0  1.9G   0% /sys/fs/cgroup
/dev/mmcblk0p1  253M   49M  204M  20% /boot
/dev/zram1      469M  237M  197M  55% /opt/zram/zram1
overlay1        469M  237M  197M  55% /var/log
/dev/zram2      469M  780K  433M   1% /opt/zram/zram2
overlay2        469M  780K  433M   1% /var/lib/openhab2/persistence
tmpfs           389M     0  389M   0% /run/user/1000

But the file is no longer written. Log stops at 7am but the file seems to be touched after that.
There is not exception logged it both events and openhab files and last item is within the same second.

[10:57:45] openhabian@morele-OH:/var/log/openhab2$ ls -la
total 252560
drwxrwxr-x+ 1 openhab openhab     4096 Oct 10 23:24 .
drwxr-xr-x  1 root    root        4096 Oct 11 00:00 ..
-rw-rw-r--  1 openhab openhab        0 Nov 19  2020 audit.log
-rw-rw-r--  1 openhab openhab 13204378 Oct 11 10:49 events.log
-rw-rw-r--  1 openhab openhab 16777233 Apr 17 11:36 events.log.1
-rw-rw-r--  1 openhab openhab 16777296 Apr 17 21:40 events.log.2
-rw-rw-r--  1 openhab openhab 16777277 Apr 18 07:49 events.log.3
-rw-rw-r--  1 openhab openhab 16777249 Apr 18 17:58 events.log.4
-rw-rw-r--  1 openhab openhab 16777221 Sep 19 16:24 events.log.5
-rw-rw-r--  1 openhab openhab 16777294 Sep 20 02:01 events.log.6
-rw-rw-r--  1 openhab openhab 16777286 Oct 10 23:24 events.log.7
-rw-rw-r--  1 openhab openhab 10416721 Oct 11 10:49 openhab.log
-rw-rw-r--  1 openhab openhab 16777306 Apr  7  2022 openhab.log.1
-rw-rw-r--  1 openhab openhab 16777303 Apr  9  2022 openhab.log.2
-rw-rw-r--  1 openhab openhab 16777277 Apr 11  2022 openhab.log.3
-rw-rw-r--  1 openhab openhab 16777284 Apr 13 18:25 openhab.log.4
-rw-rw-r--  1 openhab openhab 16777318 Apr 15 14:03 openhab.log.5
-rw-rw-r--  1 openhab openhab 16777245 Apr 17 09:43 openhab.log.6
-rw-rw-r--  1 openhab openhab 16777243 Sep 19 17:59 openhab.log.7
-rwxrwxr-x  1 openhab openhab        0 Oct 26  2020 Readme.txt

loging into Karaf log:display shows most recent log items just fine , but they are not written to the file.

restarting the openhab2.service does not fix the problem but a reboot DOES!
Since last reboot it failed again after some 16 hours.

Can anyone make some sense out of that case?

This looks a whole lot like a worn out SDD card. But that seems unlikely since you are using ZRAM and the /var/log is in ZRAM. So it’s really kind of odd behavior. I can’t think of any reason why it would stop writing when there is space available.

One thing I would try is to delete the logs and then reboot. If you don’t delete the files first, ZRAM will flush them out to the SD card and then restore them to ZRAM on the reboot (this is why your OH configs don’t disappear when you reboot). That will give you a clean slate in the logs folder.

I’m not sure what else to try though if that doesn’t do anything. It’s really odd behavior.

It is a fairly new setup with industrial SD card (one of those dedicated to DVR’s that has 10x writes longevity) so I doubt it is it. but I will shut down and mirror it just to check.

As for deleteing the logs, I cant do it.

[11:22:54] openhabian@morele-OH:/var/log/openhab2$ sudo rm events.log.1
rm: cannot remove 'events.log.1': Read-only file system

Why is it read only, it’s a log dir?

Today it failed again at 2:55am but this time I found in journal that it accually logs an exception failing to do log rotate. So why the ZRAM becomes read only at certain moment?


Edit:
I just found @mstormi post on github ZRAM tracker that could be related to the similar issue.
Could you please maybe put some light on it how to work around that?

..
Oct 11 11:21:48 morele-OH systemd[1]: Started openHAB2 instance, reachable at http://morele-OH:8080.
Oct 12 02:55:49 morele-OH karaf[1021]: org.ops4j.pax.logging.pax-logging-api [log4j2] ERROR : Unable to move file /var/log/openhab2/openhab.log.4 to /var/log/openhab2/openhab.log.3: java.nio.file.FileSystemException /var/log
Oct 12 02:55:50 morele-OH karaf[1021]: org.ops4j.pax.logging.pax-logging-api [log4j2] ERROR : Unable to move file /var/log/openhab2/openhab.log.5 to /var/log/openhab2/openhab.log.4: java.nio.file.FileSystemException /var/log
Oct 12 02:55:52 morele-OH karaf[1021]: org.ops4j.pax.logging.pax-logging-api [log4j2] ERROR : Unable to move file /var/log/openhab2/openhab.log.6 to /var/log/openhab2/openhab.log.5: java.nio.file.FileSystemException /var/log
Oct 12 02:55:53 morele-OH karaf[1021]: org.ops4j.pax.logging.pax-logging-api [log4j2] ERROR : Unable to move file /var/log/openhab2/openhab.log.7 to /var/log/openhab2/openhab.log.6: java.nio.file.FileSystemException /var/log
Oct 12 02:55:54 morele-OH karaf[1021]: org.ops4j.pax.logging.pax-logging-api [log4j2] ERROR : Unable to write to stream /var/log/openhab2/openhab.log for appender LOGFILE Ignored FQCN: org.apache.logging.log4j.spi.AbstractLo
Oct 12 02:55:54 morele-OH karaf[1021]: org.apache.logging.log4j.core.appender.AppenderLoggingException: Error writing to RandomAccessFile /var/log/openhab2/openhab.log
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.appender.rolling.RollingRandomAccessFileManager.writeToDestination(RollingRandomAccessFileManager.java:141)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.appender.OutputStreamManager.flushBuffer(OutputStreamManager.java:293)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.appender.rolling.RollingRandomAccessFileManager.flush(RollingRandomAccessFileManager.java:160)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.directEncodeEvent(AbstractOutputStreamAppender.java:199)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.tryAppend(AbstractOutputStreamAppender.java:190)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputStreamAppender.java:181)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.appender.RollingRandomAccessFileAppender.append(RollingRandomAccessFileAppender.java:252)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:156)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:129)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:120)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:543)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:502)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:485)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.config.LoggerConfig.logParent(LoggerConfig.java:534)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:504)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:485)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.config.LoggerConfig.logParent(LoggerConfig.java:534)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:504)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:485)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:412)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.config.AwaitCompletionReliabilityStrategy.log(AwaitCompletionReliabilityStrategy.java:63)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.Logger.logMessage(Logger.java:154)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.ops4j.pax.logging.log4j2.internal.PaxLoggerImpl.doLog0(PaxLoggerImpl.java:354)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.ops4j.pax.logging.log4j2.internal.PaxLoggerImpl.doLog(PaxLoggerImpl.java:337)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.ops4j.pax.logging.log4j2.internal.PaxLoggerImpl.debug(PaxLoggerImpl.java:142)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.ops4j.pax.logging.internal.TrackingLogger.debug(TrackingLogger.java:115)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.ops4j.pax.logging.slf4j.Slf4jLogger.debug(Slf4jLogger.java:358)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.eclipse.smarthome.model.script.actions.LogAction.logDebug(LogAction.java:39)
Oct 12 02:55:54 morele-OH karaf[1021]:         at sun.reflect.GeneratedMethodAccessor66.invoke(Unknown Source)
Oct 12 02:55:54 morele-OH karaf[1021]:         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Oct 12 02:55:54 morele-OH karaf[1021]:         at java.lang.reflect.Method.invoke(Method.java:498)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.invokeOperation(XbaseInterpreter.java:1175)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.invokeOperation(XbaseInterpreter.java:1150)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._invokeFeature(XbaseInterpreter.java:1136)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.invokeFeature(XbaseInterpreter.java:1081)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.invokeFeature(ScriptInterpreter.java:151)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:991)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:954)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.doEvaluate(XbaseInterpreter.java:235)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.doEvaluate(ScriptInterpreter.java:226)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.internalEvaluate(XbaseInterpreter.java:215)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:458)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.doEvaluate(XbaseInterpreter.java:239)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.doEvaluate(ScriptInterpreter.java:226)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.internalEvaluate(XbaseInterpreter.java:215)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.evaluate(XbaseInterpreter.java:201)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.eclipse.smarthome.model.script.runtime.internal.engine.ScriptImpl.execute(ScriptImpl.java:81)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.eclipse.smarthome.model.rule.runtime.internal.engine.RuleEngineImpl.lambda$2(RuleEngineImpl.java:313)
Oct 12 02:55:54 morele-OH karaf[1021]:         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
Oct 12 02:55:54 morele-OH karaf[1021]:         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
Oct 12 02:55:54 morele-OH karaf[1021]:         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
Oct 12 02:55:54 morele-OH karaf[1021]:         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
Oct 12 02:55:54 morele-OH karaf[1021]:         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
Oct 12 02:55:54 morele-OH karaf[1021]:         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
Oct 12 02:55:54 morele-OH karaf[1021]:         at java.lang.Thread.run(Thread.java:748)
Oct 12 02:55:54 morele-OH karaf[1021]: Caused by: java.io.IOException: Read-only file system
Oct 12 02:55:54 morele-OH karaf[1021]:         at java.io.RandomAccessFile.writeBytes(Native Method)
Oct 12 02:55:54 morele-OH karaf[1021]:         at java.io.RandomAccessFile.write(RandomAccessFile.java:525)
Oct 12 02:55:54 morele-OH karaf[1021]:         at org.apache.logging.log4j.core.appender.rolling.RollingRandomAccessFileManager.writeToDestination(RollingRandomAccessFileManager.java:137)
Oct 12 02:55:54 morele-OH karaf[1021]:         ... 55 more

This means a broken SD card always if it is an SD card based installation.
When an SD card wears out, it usually places itself into read-only mode, trying to save as much data as possible.

IMHO, it is just a marketing selling point. SD cards by design can’t withstand much writes.

No. The fs in question is a ZRAM fs.

Not really. You could try to enlarge /var/log in /etc/ztab or reduce size and/or number of logs in log4j2.xml.
But it’s not worth the effort to analyze for an oldish (OH2 times) openHABian install like yours, too much has changed since. Better reinstall.
Then again if those logfiles are accessed by an external logrotate process (your statement is unclear here if it’s external or OH-internal), something’s fundamentally wrong anyway.

It is a stock Openhabian setup. Internal log rotate. but from the OH exception error I see the problem is IO related cause ZRAM fs locking itself to read only.

I do log quite a lot in that setup but not excessively, up to 12 sensors/second (usually 4-8 lines /s)

My trust in OH3 for production system is very limited, do you think it will solve the problem with ZRAM fs?

I do not understand that question.
Issues with SD card, filesystem, openHABian and OS settings etc are totally unrelated to the OH version you are running.

You advised to reinstall OH, I assumed that newest version of Openhabian that comes with OH3 installs some updated version of ZRAM that might have fixed the problem I see of ZRAM switching to Read Only.

Freshly after reboot, I deleted all the old rotated logs and now wait to see when it comes back.
My OH setup is very clean. Just one binding MQTT, persistance to Influx on the separate VM on other hardware.

Well, what ever is making it read only is the reason why the logs stop being written. You can’t write to a read only file system either. It might let you continue writing to a file that you opened before the file system went to read only, but as soon as OH decides to rotate the log files it’ll fail.

So this moves the investigation some. Something is causing the file system to be remounted as read only. Often that happens when the file system fills up but not always. Other errors can cause this to happen too. You might need to do some research and looking through the syslogs and zram logs (if it has them).

Unfortunately, per my discussion above, you can’t use the timestamp of the error from OH to narrow your search in the other logs. There could be quite some lag there.

Not always and this is a ZRAM file system which isn’t on the SD card, it’s in RAM. While a worn out SD card could possibly cause a problem like this, it’s not the only thing that fits the symptoms.

I don’t think this is true at all. For the most part, neither the OS nor the SD card itself knows that it’s failing. Weird stuff just starts happening (e.g. a deleted file comes back on reboot, strange characters in weird files, random crashes, changes to files not saving, etc.).

It’s right that in 99% of cases, too much data in /var/log (note with ZRAM there’s more limits to apply than just the sum of file sizes). zramctl --output-all will tell.

Just to update you on the issue, I deleted all rotated logs from the /var/log/openhab2 and the problem went away, ex. the system did not crash after 12-18h as previously and is now running for 2 days.

I do not need as much as 7 of the old log files. 1-2 will do. Where can I limit the number of log rotates that OH2 does?

Maciej

This same thing happened to me on OH3.3 (openHABian with ZRAM). My events.log and openhab.log files stopped updating yesterday afternoon, and I didn’t notice until today. In the Karaf console, I could see that log entries were still being generated. Restarting Frontail and rebooting didn’t fix the problem.

I’ve had a ton of log entries over the past week while trying to figure out an issue with the IPP binding, and then I generated more while trying to get the LG ThinQ binding to work yesterday.

After reading this thread, I thought it likely that I overloaded the ZRAM log (if I’m interpreting this discussion correctly). I went back into the console and reset the log with log:clear, then rebooted again. That seemed to do the trick, and my log is now working.

1 Like

Following your solution post I tried to limit number of the events log logfiles to just 2.
But adding a line like:

log4j2.appender.event.strategy.max = 2

To the section that I belive is setup for the event log.

# Event log appender
log4j2.appender.event.type = RollingRandomAccessFile
log4j2.appender.event.name = EVENT
log4j2.appender.event.fileName = ${openhab.logdir}/events.log
log4j2.appender.event.filePattern = ${openhab.logdir}/events.log.%i
log4j2.appender.event.immediateFlush = true
log4j2.appender.event.append = true
log4j2.appender.event.layout.type = PatternLayout
log4j2.appender.event.layout.pattern = %d{yyyy-MM-dd HH:mm:ss.SSS} [%-26.26c] - %m%n
log4j2.appender.event.policies.type = Policies
log4j2.appender.event.policies.size.type = SizeBasedTriggeringPolicy
log4j2.appender.event.policies.size.size = 16MB

Just stops the logging for both events.log and openhab.log immediately.

Your suggestion was to add that into appender.out section but that seen only related to openhab.log file and also resulsts in logging to stop.

So that is not working as a limitation to the number of log files created.

If I wipe logs dir just after reboot, it will hold about 4-5 days before ZRAM overload.