My openHAB container was upgraded from 4.2.3 to 4.3.0 two days ago by watchtower. Since then, the docker container is “unhealthy”, the web interface is not reachable and no log files are written. No visible errors in the startup log - it ends with “Launching the openHAB runtime…”, afterwards no more log entries.
The update log does not contain any errors either and ends with “SUCCESS: openHAB updated from 4.2.3 to 4.3.0”
What can I do?
This usually happens when you have a bad log4j2.xml file. Double-check it, paying particular attention to whether it’s valid XML.
You can compare your current version to the default with the command:
docker exec -it openhab diff userdata/etc/log4j2.xml dist/userdata/etc/log4j2.xml
And you can replace your version with the default with the command:
docker exec -it openhab cp dist/userdata/etc/log4j2.xml userdata/etc/log4j2.xml
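For the "is it valid XML" part, a quick well-formedness check run on the host against the mounted file could look like the sketch below. It assumes either xmllint or python3 is installed on the host; check_xml is just a helper name I made up, not something that ships with openHAB.

```shell
#!/bin/sh
# Sketch: succeed (exit 0) only if the given file is well-formed XML.
# check_xml is a hypothetical helper name, not part of openHAB.
check_xml() {
    file="$1"
    if command -v xmllint >/dev/null 2>&1; then
        # --noout suppresses the document dump; the exit status is non-zero on parse errors
        xmllint --noout "$file" >/dev/null 2>&1
    else
        # Fallback: Python's standard-library XML parser raises on malformed input
        python3 -c 'import sys, xml.dom.minidom; xml.dom.minidom.parse(sys.argv[1])' "$file" >/dev/null 2>&1
    fi
}
```

Usage, with the host path adjusted to wherever your userdata volume lives: check_xml /docker/openhab/userdata/etc/log4j2.xml && echo "well-formed" || echo "NOT well-formed". This only checks XML syntax; it says nothing about whether the appenders and loggers in the file make sense.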
Thanks for the tip - I had some differences to the default version:
29,38d28
< <!-- Audit file appender -->
< <RollingRandomAccessFile fileName="${sys:openhab.logdir}/audit.log" filePattern="${sys:openhab.logdir}/audit.log.%i.gz" name="AUDIT">
< <PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss.SSS} [%-5.5p] [%-36.36c] - %m%n"/>
< <Policies>
< <OnStartupTriggeringPolicy/>
< <SizeBasedTriggeringPolicy size="8 MB"/>
< </Policies>
< <DefaultRolloverStrategy max="7"/>
< </RollingRandomAccessFile>
<
55,59d44
< <!-- Security audit logger -->
< <Logger additivity="false" level="INFO" name="org.apache.karaf.jaas.modules.audit">
< <AppenderRef ref="AUDIT"/>
< </Logger>
<
132c117
< <Logger level="ERROR" name="org.apache.sshd"/>
---
> <Logger level="WARN" name="org.apache.sshd"/>
135c120
< </Configuration>
\ No newline at end of file
---
> </Configuration>
I copied the default file, and now there are no more differences. However, this did not have any effect on the issue - it still persists. No new logs in /openhab/userdata/logs, and the last line of the container log still says “Launching the openHAB runtime…”.
Are the permissions and ownership of the files and folders in the passed-in volumes correct? I’ve never seen a complete absence of logs under any other circumstances, so I don’t have any other ideas.
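One way to check ownership in bulk is to list everything in a volume that is not owned by the user openHAB runs as. This is only a sketch; list_foreign_owned is a made-up helper name, and the user or UID to check against depends on your setup.

```shell
#!/bin/sh
# Sketch: print every entry under a directory whose owner differs from the expected user.
# list_foreign_owned is a hypothetical helper name.
# $1 = directory to scan, $2 = expected owner (user name or numeric UID)
list_foreign_owned() {
    dir="$1"
    owner="$2"
    # -user matches the owning user; "!" negates the test
    find "$dir" ! -user "$owner" -print
}
```

For example: list_foreign_owned /docker/openhab/userdata "$(id -u openhab)". Empty output means everything belongs to that user. If there is no openhab user on the host, pass the numeric UID the container uses instead (as far as I know the image defaults to 9001 unless USER_ID is set, but verify that for your setup).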
FTR I’m successfully running 4.3 in the latest Docker image without issue.
You might test to see what happens if you run the container with completely empty userdata and conf folders. If that works you definitely know it’s something to do with something in your volumes. If not you know that it’s something to do with your Docker run command or the like.
openHAB starts ok with empty conf and userdata. Then I restored the original conf and userdata and emptied the log directory only. Log files are created ok but remain empty.
/docker/openhab/userdata/logs $ ls -l
total 0
-rw-r--r-- 1 openhab openhab 0 Dec 18 21:20 events.log
-rw-r--r-- 1 openhab openhab 0 Dec 18 21:20 openhab.log
/docker/openhab/userdata/logs $ date
Wed Dec 18 21:32:47 CET 2024
Any other idea how I can analyze what’s going wrong?
The container is running, or else you wouldn’t have been able to run the diff and cp commands above. So OH isn’t fully crashing. Can you log into the karaf console?
I logged into the container and got the following error when trying to start karaf console:
root@0c48c4b8a1d2:~# /openhab/runtime/bin/client
SLF4J(W): No SLF4J providers were found.
SLF4J(W): Defaulting to no-operation (NOP) logger implementation
SLF4J(W): See https://www.slf4j.org/codes.html#noProviders for further details.
SLF4J(W): Class path contains SLF4J bindings targeting slf4j-api versions 1.7.x or earlier.
SLF4J(W): Ignoring binding found at [jar:file:/openhab/runtime/system/org/apache/karaf/org.apache.karaf.client/4.4.6/org.apache.karaf.client-4.4.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J(W): See https://www.slf4j.org/codes.html#ignoredBindings for an explanation.
Logging in as openhab
Failed to get the session.
root@0c48c4b8a1d2:~#
SLF4J points to the logger being a problem still. What about the java.util.logging.properties and org.ops4j.pax.logging.cfg in that same userdata/etc folder?
Here are both files; the modification dates are way in the past (before the update):
# set default format layout
java.util.logging.SimpleFormatter.format = %1$tF %1$tT.%1$tL [%4$s] [%3$s] - %5$s%6$s%n
# disable lock messages at startup
org.apache.karaf.main.Main.level = WARNING
org.apache.karaf.main.lock.SimpleFileLock.level = WARNING
org.ops4j.pax.logging.log4j2.config.file=${karaf.etc}/log4j2.xml
IMHO, nothing looks fishy here …
It should have happened automatically during the upgrade. But maybe try clearing the cache again.
Cleared userdata/cache, but the issue remains. Is there any way that I can redirect the log to the docker console? There’s gotta be an unlogged exception somewhere …
Yes, sure, but that too requires a working logging configuration. The fact that it logs to files is beside the point. As I read the error, you don’t have any logging config. You don’t even have anything that can log.
Have you gone to those URLs in the errors? Maybe there’s something useful/helpful there.
Since it seems to work when coming up fresh, you can try to do a minimal restore from backup.
- Create new volumes for a new container
- Start the container so it populates the new volumes
- Stop the container
- Copy only the contents of conf and the following folders from userdata:
- jsondb
- openhabcloud
- persistence
- secrets
- uuid
- any folder created by a specific binding
You will likely need to reinstall the bindings and redo any settings from the middle column in Settings and redo any specific binding settings.
This will stay far away from anything to do with the logging config.
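The copy step above could be scripted roughly as follows. This is a sketch under assumptions: restore_minimal is a made-up helper name, both arguments are host directories containing conf and userdata, the new volumes have already been populated by a first start, and the container is stopped. Binding-specific folders are not handled; copy those by hand.

```shell
#!/bin/sh
# Sketch of the selective restore; restore_minimal is a hypothetical helper name.
# $1 = old volume root (contains conf/ and userdata/)
# $2 = new volume root (already populated by a first container start)
restore_minimal() {
    old="$1"
    new="$2"
    mkdir -p "$new/conf" "$new/userdata"
    # Copy the contents of conf, preserving ownership and timestamps where possible
    cp -a "$old/conf/." "$new/conf/"
    # Copy only the listed userdata entries; skip any that do not exist.
    # etc/, cache/, tmp/ and logs/ are deliberately left untouched.
    for d in jsondb openhabcloud persistence secrets uuid; do
        if [ -e "$old/userdata/$d" ]; then
            cp -a "$old/userdata/$d" "$new/userdata/"
        fi
    done
}
```

Usage would be something like restore_minimal /docker/openhab /docker/openhab-new, where the second path is whatever you chose for the new volumes. Run it as a user that is allowed to preserve ownership (typically root), otherwise the copied files may end up owned by the wrong user.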
I recently had similar symptoms. I was trying to clean up my rpi3; OH docker updates were taking 20 minutes or more. Started with a bookworm lite image and added docker via script. Thought I had followed the steps for non-sudo docker, but had the same symptoms. Finally tried with sudo and it worked. My docker compose was the same as what was running before. I was getting odd file ownerships. Figured it was just me, but it might be worth a try.
That exact message doesn’t point to a logger problem though … see https://github.com/openhab/openhab-core/issues/4283.
Since the “L” in SLF4J stands for logging I figured it was still a logging issue. That coupled with the fact that the logs are not being written out anywhere.
Obviously, waiting for Karaf 4.4.7 is problematic for those who need a working system now. What’s the suggested mitigation?
I think you misunderstood what I meant to say.
That SLF4J stuff can be ignored; it has been present since 4.2.M4 or something like that.
The SLF4J mentioned in the linked issue does not affect logging.
That no logs are written anymore must be something different.
I did misunderstand. Thanks for the clarification.
@badda I’m stumped. If neither my nor @apella12’s suggestions help, I’m not sure where to go from here.
With this procedure, I got my openHAB installation back on track, thank you!
Not sure what went wrong during the update - watchtower does this automatically whenever a new version is available, and it has been working flawlessly for almost two years now.
I think it would be worth investigating the root cause, however, to prevent this kind of issue for others during openHAB upgrade. I’m happy to help with any data you need for the analysis!