OH3 occasionally killing my system - log file exploding to 100% of root

Thanks, so the prerequisites are met. You are using the Resol, HTTP, JeeLink and Pushover bindings, which I don't, so my rough guess would be that one of those bindings is causing the issue.
Your screenshot shows NPEs in conjunction with ThreadPoolExecutorWorker, so I would start by raising the thread pool limit; using the search will give you some posts on how to do that in the configs.
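For reference, thread pool sizes used to be tunable via `$OPENHAB_CONF/services/runtime.cfg`. The key names below come from the OH2-era documentation and may have changed in OH3, so treat them as an assumption and verify against the current docs before using them:

```
# runtime.cfg - raise selected thread pool sizes
# (key names are OH2-era; verify for OH3)
org.eclipse.smarthome.threadpool:thingHandler=10
org.eclipse.smarthome.threadpool:discovery=10
org.eclipse.smarthome.threadpool:safeCall=10
```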

And please verify you are on openHAB 3.2M3, as it has a fix for bindings closing shared thread pools.

OK thanks… will do the raising part. But it is hard to track, as it only shows up every few weeks. The worst thing is that it completely fills my SD card to zero available disk space, which results in openHAB losing all configs that were GUI based and not file based. A backup brings it all back, and since I do backups quite frequently now it's easy to recover, but it is still annoying sometimes.

Yes, I'm always on the latest version, which we know is sometimes not the best idea…


I am on M3 also and never had issues with the milestone versions…

With a proper openHABian-based setup, you wouldn't have the problem of logs filling the filesystem, and all the subsequent issues that causes.

Have a look at logrotate and its configuration. A maxsize for the files daemon.log and syslog can be set so they are rotated before growing too large. Similar things can be configured for openHAB's log files in the appender configuration files.
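As a sketch, a logrotate drop-in for those two system logs could look like this (the 50 MB threshold and retention count are just example values to adapt):

```
# /etc/logrotate.d/size-caps (example)
/var/log/syslog /var/log/daemon.log {
    maxsize 50M     # rotate as soon as the file exceeds 50 MB
    rotate 4        # keep at most 4 old files
    compress
    missingok
    notifempty
}
```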

@mstormi
Maybe openHABian would save me from these consecutive issues, but it is still a problem (from what I have read up to now) within openHAB. So the best thing would be to fix it there; of course, I should start limiting the log files to tackle it from the other side as well.

Furthermore, I have been looking at openHABian for a long time, but if you have several parallel activities on this machine, openHABian is too limiting compared to the plain Buster image.

Kind Regards

You did not mention that fact in your first post.
Even though your screenshot shows Karaf entries, the culprit could be outside openHAB.
Did you check that no other process is "eating up" your memory?
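A quick way to check this is to list the top memory consumers; these are generic Linux commands, not specific to openHAB:

```shell
# Show the ten processes using the most resident memory
ps aux --sort=-%mem | head -n 11

# Overall memory and swap situation
free -h
```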

Hm, my guess is that these parallel tasks are irrelevant here, as the logs show pages of Java-related activity, and the rest is Python/PHP-based development.

Except for zigbee2mqtt, which is Node.js. Not sure if JavaScript could lead to this as well.

It simply is a bad idea to put your home at risk by sharing your openHAB machine with any other programs whatsoever, as each and every one of them can conflict with OH over various resources (RAM, disk and swap space, package and library dependencies, networking, just to name some) and can break your system, like what openHAB is doing to your machine right now.

Then again, openHABian does not limit you anywhere; that is a plainly wrong statement, and frankly I dislike reading it from people who have just deliberately killed their own system by de-selecting openHABian because they think they know better how to set up a server.

Have a look at the output of `shell:info` from the Karaf console (see the docs on how to reach the console) and check whether the thread count is increasing over time. It will go up and down, but it should not show an upward trend.
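Reaching the console and pulling that output looks roughly like this (port 8101 and the `openhab` user are the Karaf defaults; adjust them if you changed your setup):

```shell
# Connect to the Karaf console (default password is "habopen")
ssh -p 8101 openhab@localhost

# Then, inside the console:
shell:info      # JVM, memory and thread statistics
shell:threads   # dump the current threads to eyeball the count over time
```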

As others have explained, you can set up the logs to rotate. I think I have mine set to a maximum of 10 MB per file, keeping only, say, 4 of the old files before they are thrown away: 40 MB maximum for each log.
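In openHAB 3 the appenders live in `$OPENHAB_USERDATA/etc/log4j2.xml`; a size-capped rolling appender along those lines might look like this (a sketch using standard log4j2 rolling-file options; the appender name and pattern are illustrative, so compare with the existing entries in your own file):

```
<RollingRandomAccessFile name="LOGFILE"
    fileName="${sys:openhab.logdir}/openhab.log"
    filePattern="${sys:openhab.logdir}/openhab.log.%i">
  <PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss.SSS} [%-5.5p] [%-36.36c] - %m%n"/>
  <Policies>
    <!-- roll over once the file reaches 10 MB -->
    <SizeBasedTriggeringPolicy size="10MB"/>
  </Policies>
  <!-- keep at most 4 rotated files -->
  <DefaultRolloverStrategy max="4"/>
</RollingRandomAccessFile>
```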

  1. Prevent the logs from growing out of control.
  2. Look at what is flooding the logs and work on solving the root cause.

If they fill up all your spare space within X time frame, I would guess there is something that needs looking at.

Hmm, what is Java going to do when it tries to carry out work and finds the host's resources have been taken by some other process…

He who complains about a lack of resources is not usually the thief.


I cannot emphasize this enough (and I’m gonna steal that saying :wink: ). When it comes down to a lack of resources problem, the thing complaining is just what happened to need more resources first, not necessarily the thing that is running amok consuming more resources than is necessary.

A good while back there was a problem running an extension/plugin for Grafana on an RPi which consumed more and more resources quietly in the background. Eventually the OS would give up and just kill openHAB. The problem had nothing to do with openHAB but openHAB was the one showing the problems.

When one deviates from a standard and well known setup like openHABian, they drastically limit the amount and the quality of the support we here can provide. You are running a whole lot of stuff on a relatively limited set of hardware. What’s causing the actual problem? :man_shrugging: We’re not experts in all that other stuff you are running. We don’t even know what all that other stuff is. Are they causing a resource constraint? :woman_shrugging:


openHAB is usually the one showing up because it's usually the largest process / resource hog, which is what Linux prefers to kill when in 'desperate' mode. The statements in the link given above are likewise true if you don't use openHABian but some other OS or hardware: it is not fair to ignore the recommendations in the first place and then ask here for help when you start seeing the effects of your decision.
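If the Linux OOM killer did step in, it leaves a trace in the kernel log; checking for that is a quick way to confirm or rule out this scenario (the grep patterns match the usual kernel messages; whether `dmesg` needs root and whether a journal is present depend on your distro):

```shell
# Look for out-of-memory kills in the kernel ring buffer
dmesg | grep -iE 'out of memory|killed process' || echo "no OOM kills found"

# On systemd machines, the journal keeps older kernel messages too
journalctl -k | grep -i 'oom' || echo "no OOM entries in the journal"
```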

On the other hand, openHABian is not the only officially supported way to install and run openHAB. If someone followed any one of the sets of instructions in the docs, including installing via apt, manually, Docker, etc., we should do our best to help where we can. But the more someone has running alongside openHAB, the less we will be able to help. Still, I don't want to discourage users from asking for help in the first place.


Sure, and I didn't mean to say that the recommendation is openHABian.
To be clear, the recommendation I refer to, which the OP ignored, was to run OH on a dedicated system (which by design ensures there cannot be resource conflicts, well, to the extent possible).
Asking for help while omitting important information wasn't fair, because it wasted my free time and that of other volunteers who would not want to spend it on cases like this.
I don't want to forbid such posts or discourage anyone from asking either, but the least I'd expect from any poster is to point out the shared usage broadly and up front.
The OP did not do that; instead he declares his parallel activity to be irrelevant.
That is bad style, disappointing and demotivating.
My humble opinion, YMMV.

Sorry, but "bad style, disappointing and demotivating" is a bit of an overreaction in a situation where someone is asking for a solution to a known problem related to OH3. The only thing I did not take into consideration was that maybe (but this is not confirmed at all) other processes take resources and finally bring down OH3.

Thanks especially to @rlkoshak for bringing in a differentiated view and always being a helping hand!
As there seems to be no further helpful discussion, let's please close this topic.
I will soon switch to openHABian.

Many users have the same problem with big log files appearing in a very short time due to a Java problem, ever since openHAB 2.5 (I'm an openHABian user with the same problem).

The best solution for me is:

  • set up /var/log as a separate partition (1 GB)
  • make a simple script for syslog control that runs every minute via crontab: restart openHAB and delete the whole log file
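A minimal sketch of such a watchdog, assuming a systemd-based install and the default /var/log/syslog path (the threshold, paths and service name are all assumptions to adapt):

```shell
#!/bin/sh
# Hypothetical cron watchdog: if /var/log is nearly full, truncate syslog
# and restart openHAB. Run it every minute from root's crontab, e.g.:
#   * * * * * /usr/local/bin/log-watchdog.sh
THRESHOLD=90

# Current usage of the filesystem holding /var/log, as a bare number
USAGE=$(df --output=pcent /var/log | tail -n 1 | tr -dc '0-9')

if [ "$USAGE" -ge "$THRESHOLD" ]; then
    : > /var/log/syslog                 # truncate in place instead of deleting
    systemctl restart openhab.service   # assumed service name
fi
```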

In my case, the problem occurs maybe once or twice per month.
You can find more info (a script template) here: [Syslog Errors (100 GB) - #14 by manswiss](https://community.openhab.org/t/syslog-errors-100-gb/116768/14)


OK, thanks a lot for the comment @poki123.
That makes things look different: a switch to openHABian would make someone happy, but it does not seem to be the solution. Switching the Java runtime was mentioned in the other thread, so I will give it a try.

I'm using InfluxDB, which recently brought in some known problems with the latest updates; I rolled back and it's gone. But that was a completely different kind of issue.

I understand it's annoying when you encounter that, but it clearly does not mean that "many" people have it; there have been fewer than maybe 10 in several years that I am aware of, and that includes those who caused it themselves.
Let alone that it is a general problem. So please, no wrong generalizations; that is misleading at best.

All Linux versions, including openHABian, are an operating system, i.e. a complex system with many, many parameters that logging and disk usage depend on. In situations with a resource shortage or outage, many daemons (part of the OS) and apps (that you installed) will throw error logs, core dumps and other garbage and hence fill disks. There obviously cannot be any guarantee whatsoever that all of this is properly caught in all situations, including exceptional ones.
Any professional knows this, and it will always be the operator's job (i.e. yours) to take care of it.

If you properly install openHABian from the beginning (image) with current openHABian and current openHAB, and do not mess with the system thereafter, such as changing the logging config or anything in the OS, or putting additional stuff on the system that creates extra logging, you shouldn't have this problem (with possible exceptions as per the statement above).
People who have this problem on openHABian have modified their system in one way or another, or use outdated openHABian and/or openHAB software versions and/or config (sic).

This topic was automatically closed 41 days after the last reply. New replies are no longer allowed.