But the file is no longer written to. The log stops at 7 am, yet the file seems to be touched after that.
There is no exception logged in either the events or the openhab log file, and the last entry in each is within the same second.
This looks a whole lot like a worn-out SD card. But that seems unlikely since you are using ZRAM and /var/log is in ZRAM. So it's really odd behavior; I can't think of any reason why it would stop writing while there is space available.
One thing I would try is to delete the logs and then reboot. If you don’t delete the files first, ZRAM will flush them out to the SD card and then restore them to ZRAM on the reboot (this is why your OH configs don’t disappear when you reboot). That will give you a clean slate in the logs folder.
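A minimal sketch of that clean-up, assuming stock openHABian paths. The block below runs against a scratch directory so it is safe to try as-is; on the real box you would set LOGDIR=/var/log/openhab2, stop openHAB first (e.g. sudo systemctl stop openhab2), and reboot afterwards:

```shell
# Demo of clearing current and rotated openHAB logs before a reboot.
# LOGDIR defaults to a scratch directory so this sketch is safe to run;
# point it at /var/log/openhab2 (with openHAB stopped) on a real system.
LOGDIR="${LOGDIR:-/tmp/oh-log-demo}"
mkdir -p "$LOGDIR"
touch "$LOGDIR/openhab.log" "$LOGDIR/events.log"
for i in 1 2 3 4 5 6 7; do touch "$LOGDIR/openhab.log.$i"; done

# Remove current and rotated logs so ZRAM has nothing to flush back:
rm -f "$LOGDIR"/*.log "$LOGDIR"/*.log.*
ls -A "$LOGDIR"   # prints nothing: the folder is now a clean slate
```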
I’m not sure what else to try though if that doesn’t do anything. It’s really odd behavior.
It is a fairly new setup with an industrial SD card (one of those dedicated to DVRs, with 10x write longevity), so I doubt that's it. But I will shut it down and mirror the card just to check.
Today it failed again, at 2:55 am, but this time I found in the journal that it actually logs an exception about failing to rotate the log. So why does the ZRAM become read-only at a certain moment?
Edit:
I just found a post by @mstormi on the GitHub ZRAM issue tracker that could be related to a similar issue.
Could you please shed some light on how to work around it?
Oct 11 11:21:48 morele-OH systemd[1]: Started openHAB2 instance, reachable at http://morele-OH:8080.
Oct 12 02:55:49 morele-OH karaf[1021]: org.ops4j.pax.logging.pax-logging-api [log4j2] ERROR : Unable to move file /var/log/openhab2/openhab.log.4 to /var/log/openhab2/openhab.log.3: java.nio.file.FileSystemException /var/log
Oct 12 02:55:50 morele-OH karaf[1021]: org.ops4j.pax.logging.pax-logging-api [log4j2] ERROR : Unable to move file /var/log/openhab2/openhab.log.5 to /var/log/openhab2/openhab.log.4: java.nio.file.FileSystemException /var/log
Oct 12 02:55:52 morele-OH karaf[1021]: org.ops4j.pax.logging.pax-logging-api [log4j2] ERROR : Unable to move file /var/log/openhab2/openhab.log.6 to /var/log/openhab2/openhab.log.5: java.nio.file.FileSystemException /var/log
Oct 12 02:55:53 morele-OH karaf[1021]: org.ops4j.pax.logging.pax-logging-api [log4j2] ERROR : Unable to move file /var/log/openhab2/openhab.log.7 to /var/log/openhab2/openhab.log.6: java.nio.file.FileSystemException /var/log
Oct 12 02:55:54 morele-OH karaf[1021]: org.ops4j.pax.logging.pax-logging-api [log4j2] ERROR : Unable to write to stream /var/log/openhab2/openhab.log for appender LOGFILE Ignored FQCN: org.apache.logging.log4j.spi.AbstractLo
Oct 12 02:55:54 morele-OH karaf[1021]: org.apache.logging.log4j.core.appender.AppenderLoggingException: Error writing to RandomAccessFile /var/log/openhab2/openhab.log
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.appender.rolling.RollingRandomAccessFileManager.writeToDestination(RollingRandomAccessFileManager.java:141)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.appender.OutputStreamManager.flushBuffer(OutputStreamManager.java:293)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.appender.rolling.RollingRandomAccessFileManager.flush(RollingRandomAccessFileManager.java:160)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.directEncodeEvent(AbstractOutputStreamAppender.java:199)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.tryAppend(AbstractOutputStreamAppender.java:190)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputStreamAppender.java:181)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.appender.RollingRandomAccessFileAppender.append(RollingRandomAccessFileAppender.java:252)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:156)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:129)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:120)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:543)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:502)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:485)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.config.LoggerConfig.logParent(LoggerConfig.java:534)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:504)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:485)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.config.LoggerConfig.logParent(LoggerConfig.java:534)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:504)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:485)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:412)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.config.AwaitCompletionReliabilityStrategy.log(AwaitCompletionReliabilityStrategy.java:63)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.Logger.logMessage(Logger.java:154)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.ops4j.pax.logging.log4j2.internal.PaxLoggerImpl.doLog0(PaxLoggerImpl.java:354)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.ops4j.pax.logging.log4j2.internal.PaxLoggerImpl.doLog(PaxLoggerImpl.java:337)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.ops4j.pax.logging.log4j2.internal.PaxLoggerImpl.debug(PaxLoggerImpl.java:142)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.ops4j.pax.logging.internal.TrackingLogger.debug(TrackingLogger.java:115)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.ops4j.pax.logging.slf4j.Slf4jLogger.debug(Slf4jLogger.java:358)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.eclipse.smarthome.model.script.actions.LogAction.logDebug(LogAction.java:39)
Oct 12 02:55:54 morele-OH karaf[1021]: at sun.reflect.GeneratedMethodAccessor66.invoke(Unknown Source)
Oct 12 02:55:54 morele-OH karaf[1021]: at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Oct 12 02:55:54 morele-OH karaf[1021]: at java.lang.reflect.Method.invoke(Method.java:498)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.invokeOperation(XbaseInterpreter.java:1175)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.invokeOperation(XbaseInterpreter.java:1150)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._invokeFeature(XbaseInterpreter.java:1136)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.invokeFeature(XbaseInterpreter.java:1081)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.invokeFeature(ScriptInterpreter.java:151)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:991)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:954)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.doEvaluate(XbaseInterpreter.java:235)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.doEvaluate(ScriptInterpreter.java:226)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.internalEvaluate(XbaseInterpreter.java:215)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:458)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.doEvaluate(XbaseInterpreter.java:239)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.doEvaluate(ScriptInterpreter.java:226)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.internalEvaluate(XbaseInterpreter.java:215)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.evaluate(XbaseInterpreter.java:201)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.eclipse.smarthome.model.script.runtime.internal.engine.ScriptImpl.execute(ScriptImpl.java:81)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.eclipse.smarthome.model.rule.runtime.internal.engine.RuleEngineImpl.lambda$2(RuleEngineImpl.java:313)
Oct 12 02:55:54 morele-OH karaf[1021]: at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
Oct 12 02:55:54 morele-OH karaf[1021]: at java.util.concurrent.FutureTask.run(FutureTask.java:266)
Oct 12 02:55:54 morele-OH karaf[1021]: at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
Oct 12 02:55:54 morele-OH karaf[1021]: at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
Oct 12 02:55:54 morele-OH karaf[1021]: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
Oct 12 02:55:54 morele-OH karaf[1021]: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
Oct 12 02:55:54 morele-OH karaf[1021]: at java.lang.Thread.run(Thread.java:748)
Oct 12 02:55:54 morele-OH karaf[1021]: Caused by: java.io.IOException: Read-only file system
Oct 12 02:55:54 morele-OH karaf[1021]: at java.io.RandomAccessFile.writeBytes(Native Method)
Oct 12 02:55:54 morele-OH karaf[1021]: at java.io.RandomAccessFile.write(RandomAccessFile.java:525)
Oct 12 02:55:54 morele-OH karaf[1021]: at org.apache.logging.log4j.core.appender.rolling.RollingRandomAccessFileManager.writeToDestination(RollingRandomAccessFileManager.java:137)
Oct 12 02:55:54 morele-OH karaf[1021]: ... 55 more
If it is an SD-card-based installation, this always means a broken SD card.
When an SD card wears out, it usually places itself into read-only mode, trying to save as much data as possible.
IMHO, that is just a marketing selling point. By design, SD cards can't withstand many writes.
Not really. You could try to enlarge /var/log in /etc/ztab, or reduce the size and/or number of logs in log4j2.xml.
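For illustration, the /var/log entry in /etc/ztab looks roughly like the fragment below (field layout from the zram-config tool openHABian uses; the algorithm and size values here are examples, so check the comment header in your own /etc/ztab and disable ZRAM via openhabian-config before editing). Raising mem_limit and disk_size gives the log filesystem more headroom before writes start failing:

```text
# log	alg	mem_limit	disk_size	target_dir	bind_dir	oldlog_dir
log	lzo	100M	300M	/var/log	/log.bind	/opt/zram/oldlog
```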
But it's not worth the effort of analyzing an oldish (OH2-era) openHABian install like yours; too much has changed since then. Better to reinstall.
Then again, if those log files are accessed by an external logrotate process (your statement is unclear on whether rotation is external or OH-internal), something is fundamentally wrong anyway.
It is a stock openHABian setup with OH-internal log rotation. But from the OH exception I see the problem is I/O-related: the ZRAM filesystem locks itself into read-only mode.
I do log quite a lot in that setup, but not excessively: up to 12 sensor readings per second (usually 4-8 lines/s).
My trust in OH3 for a production system is very limited. Do you think it will solve the problem with the ZRAM filesystem?
I do not understand that question.
Issues with the SD card, the filesystem, openHABian, OS settings etc. are totally unrelated to the OH version you are running.
You advised reinstalling OH. I assumed that the newest version of openHABian, which comes with OH3, installs an updated version of ZRAM that might have fixed the problem I see of ZRAM switching to read-only.
Freshly after a reboot, I deleted all the old rotated logs and am now waiting to see whether the problem comes back.
My OH setup is very clean: just one binding (MQTT), with persistence to InfluxDB on a separate VM on other hardware.
Well, whatever is making it read-only is the reason why the logs stop being written. You can't write to a read-only file system. It might let you continue writing to a file that you opened before the file system went read-only, but as soon as OH decides to rotate the log files it will fail.
So this moves the investigation along some. Something is causing the file system to be remounted read-only. Often that happens when the file system fills up, but not always; other errors can cause it too. You might need to do some research and look through the syslogs and the zram logs (if it has any).
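A quick way to confirm the remount, sketched for a Linux box where /var/log is its own (ZRAM-backed) mount point: check whether its entry in /proc/mounts carries the ro option. (The paths below are examples; on systems where /var/log is not a separate mount, the check simply finds nothing.) Afterwards, grepping the journal for the remount, e.g. journalctl -b | grep -i remount, can show what triggered it:

```shell
# Return success iff the filesystem mounted exactly at $1 lists "ro"
# among its mount options in /proc/mounts (fields: dev mountpoint fstype opts).
is_ro() {
  awk -v p="$1" '$2 == p { print $4 }' /proc/mounts |
    grep -qE '(^|,)ro(,|$)'
}

for mnt in /var/log /; do
  if is_ro "$mnt"; then
    echo "$mnt: mounted read-only"
  else
    echo "$mnt: writable, or not a separate mount point"
  fi
done
```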
Unfortunately, per my discussion above, you can’t use the timestamp of the error from OH to narrow your search in the other logs. There could be quite some lag there.
Not always, and this is a ZRAM file system, which isn't on the SD card; it's in RAM. While a worn-out SD card could possibly cause a problem like this, it's not the only thing that fits the symptoms.
I don’t think this is true at all. For the most part, neither the OS nor the SD card itself knows that it’s failing. Weird stuff just starts happening (e.g. a deleted file comes back on reboot, strange characters in weird files, random crashes, changes to files not saving, etc.).
Right: in 99% of cases it's too much data in /var/log (note that with ZRAM there are more limits in play than just the sum of the file sizes). zramctl --output-all will tell.
Just to update you on the issue: I deleted all the rotated logs from /var/log/openhab2 and the problem went away, i.e. the system did not crash after 12-18 h as it previously did, and has now been running for 2 days.
I do not need as many as 7 old log files; 1-2 will do. Where can I limit the number of log rotations that OH2 does?
This same thing happened to me on OH3.3 (openHABian with ZRAM). My events.log and openhab.log files stopped updating yesterday afternoon, and I didn’t notice until today. In the Karaf console, I could see that log entries were still being generated. Restarting Frontail and rebooting didn’t fix the problem.
I’ve had a ton of log entries over the past week while trying to figure out an issue with the IPP binding, and then I generated more while trying to get the LG ThinQ binding to work yesterday.
After reading this thread, I thought it likely that I overloaded the ZRAM log (if I’m interpreting this discussion correctly). I went back into the console and reset the log with log:clear, then rebooted again. That seemed to do the trick, and my log is now working.
I am having the same issue and have had it for years over multiple versions of OH. I am currently on 3.4.1. When my system quits logging, I remove all the log files and reboot. The logs are good for about a week and then lock up again.
I went into the console and ran log:clear and rebooted. It is logging again, but all the files are still there in the log directory. I will see what happens this week.
I didn't manage to solve the issue, so I moved away from the RPi SBC, took the system to a box with an NVMe drive, and installed OH from apt instead of openHABian. That was a PITA, because all the little tools you're used to then have to be set up manually, like the SMB shares and the Frontail service.
The issue is not known to happen on latest openHABian.
But in the end it’s pretty simple, simply reduce the number and/or size of logfiles in /var/lib/openhab/etc/log4j2.xml.
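For illustration only, the LOGFILE appender in that file looks something like the fragment below; the pattern is shortened and the attribute values are examples rather than guaranteed openHAB defaults, so compare against your own log4j2.xml instead of pasting this verbatim. max on DefaultRolloverStrategy sets how many rotated files are kept, and SizeBasedTriggeringPolicy caps the size of each:

```xml
<!-- Sketch of the rolling appender; values are examples, not defaults.
     max="2" keeps only openhab.log.1 and openhab.log.2. -->
<RollingRandomAccessFile name="LOGFILE"
    fileName="${sys:openhab.logdir}/openhab.log"
    filePattern="${sys:openhab.logdir}/openhab.log.%i">
  <PatternLayout pattern="%d{HH:mm:ss.SSS} [%-5.5p] [%-36.36c] - %m%n"/>
  <Policies>
    <OnStartupTriggeringPolicy/>
    <SizeBasedTriggeringPolicy size="8MB"/>
  </Policies>
  <DefaultRolloverStrategy max="2"/>
</RollingRandomAccessFile>
```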
This file is part of the openHAB distribution and as such may get replaced on upgrades, so eventually you have to repeat this after an upgrade.
As I wrote 3 posts earlier, touching the log config breaks logging in my case.
I tried cronning a log delete; that does not help either. Even if I only keep 2 files, it will die after a week.
I will check the latest version of openHABian.