openHAB freezes for several minutes

Hello everyone,

Since this morning I have been observing a very strange issue, with no major changes or upgrades beforehand.
My openHAB installation (details below) operates normally, but all of a sudden it stops working. The CPU load actually freezes: according to my monitored metrics it stays at whatever level it was at (I have seen it at almost 0% utilization, but also at around 20%).
After around 6-8 minutes everything works normally again, AND every event that wasn't processed during that period is processed afterwards.

Symptoms:
It seems that my event bus stops working; events.log doesn't show any records during this time. The openhab.log shows that cron-triggered rules are still running, but no state changes are written to my items.
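
One way to pin down exactly when the event bus went quiet is to scan events.log for unusually long pauses between consecutive entries. This is only a rough sketch: the function name `find_log_gaps` and the 300-second default are my own choices, and it assumes every line starts with a `YYYY-MM-DD HH:MM:SS.mmm` timestamp like the log excerpts in this thread. Pauses that span midnight are ignored to keep it simple.

```shell
# Print every pause longer than a threshold (in seconds) between
# consecutive timestamped lines of a log file.
# Usage: find_log_gaps FILE [THRESHOLD_SECONDS]
find_log_gaps() {
  local file="$1" threshold="${2:-300}"
  awk -v limit="$threshold" '
    # Convert HH:MM:SS.mmm into whole seconds since midnight.
    function secs(t,   p) {
      split(t, p, ":")
      return p[1] * 3600 + p[2] * 60 + int(p[3])
    }
    # Only look at lines that start with a date.
    /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] / {
      now = secs($2)
      # Compare only lines from the same day (midnight gaps are skipped).
      if (prev_day == $1 && now - prev > limit)
        printf "%s %s -> %s (%d s)\n", $1, prev_time, $2, now - prev
      prev = now; prev_day = $1; prev_time = $2
    }
  ' "$file"
}
```

On a package installation the log is usually at /var/log/openhab2/events.log (an assumption about this setup), so `find_log_gaps /var/log/openhab2/events.log 300` would list every pause longer than five minutes.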

I do not have any high CPU or I/O load, and memory and disk space are fine too.

So far I cannot really narrow it down to a specific binding, thing, or anything else… With logging set to debug I also cannot find anything very suspicious.

  • Platform information:
    • Hardware: x86, Celeron, 8 GB memory, SSD
    • OS: Ubuntu 18.04 LTS
    • Java Runtime Environment:
      openjdk version "1.8.0_222"
      OpenJDK Runtime Environment (build 1.8.0_222-8u222-b10-1ubuntu1~18.04.1-b10)
      OpenJDK 64-Bit Server VM (build 25.222-b10, mixed mode)
    • openHAB version: 2.4 stable

openHAB: I have around 400 rules, 2000 items, and around 100 things,
plus around 20 dashboards in HABPanel, with 4 devices constantly displaying them.
But all of this had been running perfectly fine for months…

Does anyone have an idea what it could be, or at least where I can start troubleshooting?

thanks
martin

First make sure you have a backup.

Then clean the cache and restart.

sudo systemctl stop openhab2

sudo openhab-cli clean-cache

sudo systemctl start openhab2 or sudo reboot

Keep a browser open on frontail so you can watch OH as it starts back up. Post any relevant logs if the issue continues.

I had some issues when several clients (mostly PCs) were open on the sitemap.
OH started to act weird; I'm not sure if that was the cause of the issue I had.

But when I closed some of them it was fixed. Again, just a shot in the dark, but worth a try.

Hi,
thanks for your reply.
I already cleaned the cache after it happened the first time.
I stopped openHAB, cleaned the cache and rebooted. After the reboot everything worked fine and the startup logs were normal, but after around 1 hour it started again.

The issue just happened again: the CPU was frozen at around 20%, and after around 9 minutes it worked again (the gap is clearly visible in the event log).

A few milliseconds before everything worked normally again (and the event log showed event processing again), I found this in my debug logs:

2019-12-07 19:02:02.232 [DEBUG] [ommons.httpclient.HttpMethodDirector] - Closing the connection.
2019-12-07 19:02:02.232 [DEBUG] [ommons.httpclient.HttpMethodDirector] - Closing the connection.
2019-12-07 19:02:02.233 [DEBUG] [ommons.httpclient.HttpMethodDirector] - Method retry handler returned false. Automatic recovery will not be attempted
2019-12-07 19:02:02.233 [DEBUG] [ommons.httpclient.HttpMethodDirector] - Method retry handler returned false. Automatic recovery will not be attempted
2019-12-07 19:02:02.233 [DEBUG] [he.commons.httpclient.HttpConnection] - Releasing connection back to connection manager.
2019-12-07 19:02:02.233 [DEBUG] [he.commons.httpclient.HttpConnection] - Releasing connection back to connection manager.
2019-12-07 19:02:02.234 [ERROR] [org.openhab.io.net.http.HttpUtil ] - Fatal transport error: java.net.ConnectException: Connection timed out (Connection timed out)
2019-12-07 19:02:02.234 [ERROR] [org.openhab.io.net.http.HttpUtil ] - Fatal transport error: java.net.ConnectException: Connection timed out (Connection timed out)

That's interesting, because since this morning I regularly see the following line in my openhab.log. I checked my logs from the last 6 months, and it does not appear in them:

2019-12-07 07:51:12.785 [INFO ] [ommons.httpclient.HttpMethodDirector] - I/O exception (java.net.ConnectException) caught when processing request: Connection refused (Connection refused)
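
To see on which days these connection errors actually show up, the log files can be tallied per date. A minimal sketch, assuming the lines start with the date as in the excerpts above and that any rotated files passed in are uncompressed; `connect_errors_per_day` is just a name I picked.

```shell
# Count log lines that mention java.net.ConnectException, grouped by date.
# Usage: connect_errors_per_day FILE [FILE...]
connect_errors_per_day() {
  awk '
    # The first field of each matching line is the date (YYYY-MM-DD).
    /java\.net\.ConnectException/ { count[$1]++ }
    END { for (day in count) print day, count[day] }
  ' "$@" | sort
}
```

Something like `connect_errors_per_day /var/log/openhab2/openhab.log*` (path assumed for a package installation) then prints one line per day with the number of matches, which makes it easy to spot the day the messages first appeared.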

What's also interesting: when I cleaned my cache I lost my Spotify binding. I was easily able to reinstall it, but it's strange, since I do not have many HTTP-connected things and Spotify is one of them.

Hi,
a brief update for everyone who is interested.
The issue hasn't happened again since yesterday evening.

I traced the HTTP client messages mentioned above to one of my tablets (running Fully Kiosk Browser), whose HTTP interface is connected to and “monitored” by openHAB.
I cannot rule out that this actually caused the issue, since I have observed really strange behavior when one of these HTTP targets is not reachable, but I honestly do not believe it did.
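
For anyone who wants to check such an HTTP target from the openHAB host, a probe with a hard time limit shows quickly whether the device answers at all, without waiting for the much longer default connect timeout. This is a sketch and not how openHAB itself does it: `probe_port` is a made-up helper, and it needs bash (for `/dev/tcp`) plus the coreutils `timeout` command.

```shell
# Try to open a TCP connection, giving up after a fixed number of seconds,
# so an unreachable host cannot block the caller for minutes.
# Usage: probe_port HOST PORT [LIMIT_SECONDS]
probe_port() {
  local host="$1" port="$2" limit="${3:-3}"
  # bash opens /dev/tcp/HOST/PORT as a socket; timeout kills it after LIMIT.
  if timeout "$limit" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
    echo "reachable"
  else
    echo "unreachable"
  fi
}
```

For example `probe_port 192.168.1.50 2323 3`, where both the address and the port are placeholders for the tablet's HTTP interface.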

I will now observe what happens today, when there is more load on the system than during the night.

cheers

Another quick update: everything is still stable.
Still no idea about the root cause… :confused:

I see weird OH things happening from time to time, and cleaning the cache with a reboot always seems to work, but I never find out why it started acting up. :unamused: I just chalk it up to something I gotta do once in a while and move on. :wink: