Openhab 4.0.1 System unstable from "class org.openhab.core.internal.items.ItemUpdater"

alexkarageorgis · August 14, 2023, 9:49am

Hi Andreas, did you manage to figure it out? I have a similar problem: a clean install of openHAB 4.0 from the SD card image provided by openHAB,
updated to the 4.0.1 release, with my previous install restored.

Release = Raspbian GNU/Linux 11 (bullseye)
Kernel = Linux 6.1.21-v8+
Platform = Raspberry Pi 4 Model B Rev 1.4/8gb

After about a day it freezes (I will check the memory question when it crashes next).

Scriptwriter · August 14, 2023, 11:18am

Hi @alexkarageorgis ,
we have exactly the same kernel and release running, so I'm not alone with my problems.
We figured out that there seems to be a lot of load, which could be causing the problems.
What I found in the meantime is this: https://community.openhab.org/t/openhab-4-0-1-and-network-pingdevice-on-docker-install-results-in-massive-threads/148658
I'm pinging my network devices to find out whether they are online. openHAB turns each ping profile into about 20 arping threads, because it wants to ping the device from every bridge network that exists on the host. So far I have no solution for pinging only from the main network. You can see this if you open an interactive shell in the container and run “ps aux” a few times.
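
For anyone who wants to count these processes rather than eyeball the output, here is a minimal Python sketch. It assumes the container's `ps aux` prints the usual procps layout with 11 columns and the command last (a busybox `ps` looks different), and the helper names are made up for illustration.

```python
import subprocess
from collections import Counter


def count_processes(ps_output: str) -> Counter:
    """Count processes per command name in `ps aux` style output."""
    counts = Counter()
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 10)        # 11th field is the full command line
        if len(fields) < 11:
            continue
        executable = fields[10].split()[0]
        counts[executable.rsplit("/", 1)[-1]] += 1  # drop any directory prefix
    return counts


def running_arping_count() -> int:
    """Run `ps aux` once and return how many arping processes are alive."""
    result = subprocess.run(["ps", "aux"], capture_output=True, text=True, check=True)
    return count_processes(result.stdout)["arping"]
```

Calling `running_arping_count()` a few times in a row inside the container should show whether the number climbs with every ping profile.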

There may be other causes too, but I have not figured out more.

So please let me know what you find out.

alexkarageorgis · August 14, 2023, 11:25am

While I get the general idea of what you are saying, I am not using a Docker image. The only thing I can think of that causes network traffic is the IP camera binding. I will let it crash one more time (I am losing about 1% of memory every hour or so), then disable it and see if that fixes things.

Thanks Andreas, I will keep an eye on your post.

A

Scriptwriter · August 14, 2023, 11:29am

I'm using the IP camera binding too (with 2 cams).

On my system, load seems to be more of a problem than memory. So possibly we have different problems, but maybe some are the same. Time will tell what we learn…

alexkarageorgis · August 14, 2023, 1:01pm

The IP camera binding has always increased my system load a fair bit. In any case, I disabled it and will wait and see if the issue comes back.

Alex

Scriptwriter · August 15, 2023, 12:06pm

Today I got the load down:

The only thing I did was disable the roughly 10 ping-check profiles from the network binding. Now the load is the same as when openHAB is not running, but it is still running.
I have opened another thread for the network binding: Openhab 4.0.1 and Network Pingdevice on Docker-Install results in massive threads

rlkoshak · August 15, 2023, 2:51pm

Now that the load is down, have the EventHandler warnings gone away?

Scriptwriter · August 15, 2023, 3:56pm

Hi Rich,
in the last days before today's change (disabling the network binding) I also did not have such an “explosion” with thousands of these messages.
It could also be that some conversions from older ECMAScript rules to current JavaScript had errors which caused them. Possibly. I think I have now updated all rules so that they don't raise errors, and I guess no rules should run on the Nashorn engine any more, which I installed directly after the upgrade to run most of the rules.

Another person told me about the impact of the IP camera binding, which I'm also using. But disabling it gave me no hint of any impact.

That's why I logged in interactively to the container, had a look at the running threads and thought about them.

The very large number of arping threads gave me the hint to disable the network binding. That lowered the load completely, as I have shown in the graph. So what we have seen, and what someone apparently already found some years ago (I linked the other cases), is that pinging via the network binding is not useful if you run it in a Docker container. It makes absolutely no sense that it starts ping threads from every network that exists in Docker when openHAB is configured to use only network XXX.
I guess a small change would be necessary to avoid the unnecessary pings.
For me, if I need it in future, this means writing a small shell script and starting that instead of the network binding.
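
As a rough idea of what such a replacement could look like, here is a minimal sketch in Python rather than shell. It assumes a Linux `ping` that accepts `-c` (count) and `-W` (timeout in seconds); the function name and the injectable `run` parameter are only for illustration. It sends a single ping through the default route instead of one per bridge network.

```python
import subprocess
from typing import Callable, Optional, Sequence


def is_online(host: str, run: Optional[Callable[[Sequence[str]], int]] = None) -> bool:
    """Return True if a single ping to `host` succeeds within one second."""
    command = ["ping", "-c", "1", "-W", "1", host]
    if run is None:
        # Default runner: execute the command and hand back its exit code.
        def run(cmd: Sequence[str]) -> int:
            return subprocess.run(
                cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
            ).returncode
    return run(command) == 0
```

The result could then be pushed to an item by whatever means is convenient; that part is left out here.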

So, in conclusion: the load can be caused by openHAB itself, even if someone says it isn't.

rlkoshak · August 15, 2023, 4:00pm

I believe you already filed an issue but if not, make sure you do so. The network binding should not do that.

But I also recommend using the best tool for the job. OH is not a very good IT system monitoring system. If you need to monitor a bunch of devices and services via ping, you might be better off deploying a system like Zabbix, ELK stack, Prometheus, etc.

Scriptwriter · August 15, 2023, 6:00pm

…I asked a question in the community about the network binding and its behavior (no answers),
and have now also filed an issue for the massive threads.
“Best tool for the job”: I used it because it was an easy way to show which devices are running and which are not, so other people in the family can easily know which device to reboot.
I'm also already using Prometheus / LGTM and will use this tool more for that.

This morning it crashed again with WARN ] [ab.core.internal.events.EventHandler] - The queue for a subscriber of type 'class org.openhab.core.internal.items.ItemUpdater' exceeds 5000 elements. System may be unstable.
I'm looking further to see if I can find something. Maybe a system limit like open files or others…
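
To see how often the warning fires and for which subscriber, the log can be scanned with something like the following minimal sketch. The regular expression is built from the warning text quoted above; the function name is made up, and which log file to feed in depends on the installation.

```python
import re
from collections import Counter
from typing import Iterable

# Pattern derived from the warning line quoted in this thread.
QUEUE_WARNING = re.compile(
    r"The queue for a subscriber of type 'class ([\w.$]+)' exceeds (\d+) elements"
)


def count_queue_warnings(lines: Iterable[str]) -> Counter:
    """Count queue-overflow warnings per subscriber class name."""
    counts = Counter()
    for line in lines:
        match = QUEUE_WARNING.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing an open log file (`with open(path) as f: count_queue_warnings(f)`) works, since a file object iterates line by line.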

alexkarageorgis · August 16, 2023, 7:04pm

@Scriptwriter just updating: since I uninstalled the IP camera binding I haven't (yet) crashed. It might be connected, or maybe I will just get it later on.

Alex

alexkarageorgis · August 16, 2023, 11:55pm

Just to put this down as well: the Z-Wave module seems to be acting up too. I receive the same data multiple times from thermostats and energy meters, i.e. the same data 20 times, microseconds apart. This could be causing load issues. Also, sending commands doesn't seem to work. Another strange behaviour on my setup.

Scriptwriter · August 17, 2023, 6:10am

Hi @alexkarageorgis ,
did you realize that there is a problem with excessive update logging in versions 4.0.0 and 4.0.1? See https://community.openhab.org/t/successful-openhab-upgrade-v3-4-3-to-v4-0-1-potential-stumbling-blocks/148174 to check whether the necessary change in log4j2.xml has been applied on your side. You can also update to 4.0.2, which applies this change automatically.

alexkarageorgis · August 17, 2023, 12:45pm

Thanks @Scriptwriter for pointing me in the right direction. I did the upgrade to 4.0.2, but no change to the logging. I will try your two lines of code then.

Strange.

** Update: the file log4j2.xml contains one of the two lines of code,
but not this one:

   <Logger level="ERROR" name="openhab.event.GroupStateUpdatedEvent"/>

But I don't think this is the issue (see image below).

I'm trying to add it with WinSCP, as I am currently remote to the device, but permission to edit, delete or overwrite is denied. Is there a smart way to do this?

Alex

Wolfgang_S · August 17, 2023, 5:16pm

Copy the file to e.g. the /tmp folder.
Then log in via ssh and switch to user root ( sudo bash ).
From that shell you can copy the file from the /tmp folder to the target folder.
Make sure that ownership and permissions are the same afterwards.
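
The same steps can be sketched in Python for anyone who prefers a script. This is only an illustration: the function name is made up, the paths are whatever applies on your system, and it has to run as a user allowed to write the target (e.g. root).

```python
import os
import shutil
import stat


def replace_keeping_owner_and_mode(new_file: str, target: str) -> None:
    """Overwrite `target` with `new_file`, keeping the target's owner, group and mode."""
    original = os.stat(target)                 # remember owner, group and permissions
    shutil.copyfile(new_file, target)          # copies the content only
    os.chmod(target, stat.S_IMODE(original.st_mode))    # re-apply permission bits
    os.chown(target, original.st_uid, original.st_gid)  # re-apply owner and group
```

Overwriting in place normally keeps owner and mode anyway; re-applying the recorded values just makes the last step above explicit.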

alexkarageorgis · August 17, 2023, 5:31pm

Thanks @Wolfgang_S, I appreciate the sharing of cleverness. Do you think, given the missing code, it will correct the issue?
This code exists already:

 <Logger level="ERROR" name="openhab.event.ItemStateUpdatedEvent"/>

This part is the only missing part:

<Logger level="ERROR" name="openhab.event.GroupStateUpdatedEvent"/>

Scriptwriter · August 17, 2023, 5:34pm

…all “updated event” entries from your screenshot will then be suppressed if you apply the correctly changed log4j2.xml file and restart openHAB afterwards. I had the same issue and this solved it.

alexkarageorgis · August 17, 2023, 5:39pm

Sorry, I just corrected my previous posts as the code wasn't working.
After the update to 4.0.2, the part that isn't in the file is the “GroupStateUpdatedEvent” log level; the “ItemState” one is there. So do you think, given the screenshot, it could be that? (Anyway, I will try.)

Scriptwriter · August 17, 2023, 5:48pm

In the file you will find other Logger entries. The new one should be added with the same number of leading spaces as the others. If you edit the file, use a Linux text editor (not a Windows editor, which can produce wrong CR/LF line endings); if you don't know one and have never used any, take nano. If the change is applied correctly AND openHAB is restarted afterwards, it reads the file and uses it. If you have problems, I guess you can also download it from git to the correct place with wget.
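
A quick way to double-check the edited file is a small script like the sketch below. It assumes the file is well-formed XML whose logger entries are `Logger` elements with `name` and `level` attributes, as in the lines quoted earlier in this thread; the function names are made up.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def logger_levels(raw: bytes) -> Dict[str, str]:
    """Map each Logger name in a log4j2.xml document to its configured level."""
    root = ET.fromstring(raw)
    levels = {}
    for element in root.iter():
        # Compare the local tag name so an XML namespace, if any, is ignored.
        if element.tag.rsplit("}", 1)[-1] == "Logger":
            levels[element.get("name")] = element.get("level")
    return levels


def has_windows_line_endings(raw: bytes) -> bool:
    """Return True if the file content contains CR/LF line endings."""
    return b"\r\n" in raw
```

Read the file in binary mode (`open(path, "rb").read()`) and check that both `openhab.event.ItemStateUpdatedEvent` and `openhab.event.GroupStateUpdatedEvent` show up with level `ERROR`, and that no CR/LF endings slipped in.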

stas-dovgodko · August 25, 2023, 10:24am

Same error for me.
“The queue for a subscriber of type ‘class org.openhab.core.internal.items.ItemUpdater’ exceeds 5000 elements”
Intel i3 + clean Proxmox LXC openHAB-only container (Ubuntu 22, 5 GB + 2 GB swap); OH 4.0.2 just stops working after a few days with “exceeds 5000 elements” errors in the log.
The same container with 3.4 was pretty stable.