System totally offline - Please help

All,

Platform information:

Hardware: Raspi 3
OS: openhabian 1.4 running 
Java Runtime openjdk version "1.8.0_152"

OpenJDK Runtime Environment (Zulu Embedded 8.25.0.76-linux-aarch32hf) (build 1.8.0_152-b76)
OpenJDK Client VM (Zulu Embedded 8.25.0.76-linux-aarch32hf) (build 25.152-b76, mixed mode, Evaluation)
openHAB version: OH 2.3 stable

Issue of the topic: out of a sudden Warning appear and ERRORS followed by unresponsive tool (habin, PaperUI, karaf)
Please post configurations (if applicable): Don’t know where to start

Items configuration related to the issue - no obvious items involved
Sitemap configuration related to the issue - no sitemaps in use (habpanel only)
Rules code related to the issue - no obvious rules involved
Services configuration related to the issue - no obvious bindings involved

once more I did a restart of the system (not learning from my mistakes obviously) and get loads of these Warnings:

2018-09-17 18:50:54.729 [WARN ] [me.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.eclipse.smarthome.core.internal.items.ItemUpdater@59890e' takes more than 5000ms.

I am actually lost to find out what I need to look for seeing “59890e” in the logs.

Habpanel is not accessible - neither is Paper UI.
Login into karaf does not work, but OH is obviously running somehow (log entries are generated and


openhab2 status

returns:

 openhab2.service - openHAB 2 - empowering the smart home
   Loaded: loaded (/usr/lib/systemd/system/openhab2.service; enabled)
   Active: active (running) since Mo 2018-09-17 15:47:34 CEST; 3h 25min ago
     Docs: https://www.openhab.org/docs/
           https://community.openhab.org
 Main PID: 746 (java)
   CGroup: /system.slice/openhab2.service
           └─746 /usr/bin/java -Dopenhab.home=/usr/share/openhab2 -Dopenhab.conf=/etc/openhab2 -Dopenhab.runtime=/usr/share/openhab2/runtime -Dopenhab.userdata=/var/lib/openhab2 -Dopenhab.logdir=/var/log/openhab2 -Dfelix.cm.d...

Sep 17 16:41:04 Homer karaf[746]: at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:367)
Sep 17 16:41:04 Homer karaf[746]: at org.glassfish.jersey.server.ChunkedOutput.flushQueue(ChunkedOutput.java:190)
Sep 17 16:41:04 Homer karaf[746]: at org.glassfish.jersey.server.ChunkedOutput.write(ChunkedOutput.java:180)
Sep 17 16:41:04 Homer karaf[746]: at org.eclipse.smarthome.io.rest.sse.internal.SseEventOutput.write(SseEventOutput.java:46)
Sep 17 16:41:04 Homer karaf[746]: at org.eclipse.smarthome.io.rest.sse.internal.SseEventOutput.write(SseEventOutput.java:1)
Sep 17 17:06:30 Homer karaf[746]: Exception in thread "pool-27-thread-3" java.lang.OutOfMemoryError: Java heap space
Sep 17 17:32:26 Homer karaf[746]: Exception in thread "pool-27-thread-4" java.lang.OutOfMemoryError: Java heap space
Sep 17 18:07:05 Homer karaf[746]: Exception in thread "ESH-OSGiEventManager" java.lang.OutOfMemoryError: Java heap space
Sep 17 19:00:42 Homer karaf[746]: Exception in thread "pool-27-thread-5" java.lang.OutOfMemoryError: Java heap space
Sep 17 19:01:08 Homer karaf[746]: Exception in thread "Timer-3" java.lang.OutOfMemoryError: Java heap space

even a shutdown, deleting cache/* and tmp/* did not do the trick…

after removing all rules, mainly zwave warnings were logged:

2018-09-17 19:29:36.183 [WARN ] [l.serialmessage.SendDataMessageClass] - NODE 9: Already processed another send data request for this callback Id, ignoring.
2018-09-17 19:29:46.147 [WARN ] [ocol.ZWaveController$ZWaveSendThread] - Receive queue TIMEOUT:
2018-09-17 19:30:19.663 [WARN ] [l.serialmessage.SendDataMessageClass] - NODE 9: Already processed another send data request for this callback Id, ignoring.
2018-09-17 19:31:05.928 [WARN ] [ocol.ZWaveController$ZWaveSendThread] - Receive queue TIMEOUT:
2018-09-17 19:31:13.047 [WARN ] [l.serialmessage.SendDataMessageClass] - NODE 9: Already processed another send data request for this callback Id, ignoring.
2018-09-17 19:31:13.078 [WARN ] [ocol.ZWaveController$ZWaveSendThread] - Receive queue TIMEOUT:

Anyone, who could give me a hint?

I deactivated zwave in the addons.cfg (was the only way to disable the binding without karaf).
After that the system lost the network connection and I was noch able to connect any more using ssh (putty) at all.

The only way was to unplug the power cord and restart the system (without the USB zwave stick to be on the safe side)

After this change I got a lot of error from amazonechocontrol (habpanel, habmin, karaf still not accessible):

2018-09-17 20:35:38.252 [WARN ] [mazonechocontrol.internal.Connection] - Parsing json failed {}
com.google.gson.JsonSyntaxException: java.lang.IllegalStateException: Expected BEGIN_OBJECT but was STRING at line 8 column 1 path $
        at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:224) [22:com.google.gson:2.7.0]
        at com.google.gson.Gson.fromJson(Gson.java:887) [22:com.google.gson:2.7.0]
        at com.google.gson.Gson.fromJson(Gson.java:852) [22:com.google.gson:2.7.0]
        at com.google.gson.Gson.fromJson(Gson.java:801) [22:com.google.gson:2.7.0]
        at com.google.gson.Gson.fromJson(Gson.java:773) [22:com.google.gson:2.7.0]
        at org.openhab.binding.amazonechocontrol.internal.Connection.parseJson(Connection.java:565) [220:org.openhab.binding.amazonechocontrol:2.3.0]
        at org.openhab.binding.amazonechocontrol.internal.Connection.getEnabledFlashBriefings(Connection.java:802) [220:org.openhab.binding.amazonechocontrol:2.3.0]
        at org.openhab.binding.amazonechocontrol.handler.AccountHandler.updateFlashBriefingProfiles(AccountHandler.java:541) [220:org.openhab.binding.amazonechocontrol:2.3.0]
        at org.openhab.binding.amazonechocontrol.handler.AccountHandler.addFlashBriefingProfileHandler(AccountHandler.java:174) [220:org.openhab.binding.amazonechocontrol:2.3.0]
        at org.openhab.binding.amazonechocontrol.handler.FlashBriefingProfileHandler.initialize(FlashBriefingProfileHandler.java:74) [220:org.openhab.binding.amazonechocontrol:2.3.0]
        at sun.reflect.GeneratedMethodAccessor54.invoke(Unknown Source) ~[?:?]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
                                                                                                                 

I then deactivated amazonechocontrol and the system became stable again.

Reboot with Z-wave Stick was ok.
Re-activating the zwave binding did not show any obvious issues and habmin, habpanel and karaf are available again as well.

I don’t know why the system did have these massive issues and Iam a little concerned that it will happen again (restart and reboot seemed to worsen the stability instead of solving the issues).

Any suggestion how to prevent this (any idea from the logs) are greatly appreciated.

I found the following in my log:

JDBC::query: unable to get item for itemName: 'Z_Watervalve_Num'. Ignore and give up!

The strange thing is, that I don’t use Z_Watervalve_Num any more and did delete …/cache/* and …/tmp/*

So ist there something else I need to delete if I would like to clean up without starting with a complete new installation from scratch?

Since Item Z_Watervalve_Num has been removed, can you remove it from JDBC to prevent the db from trying to query the item?

Actually I created a new DB, so this item is not in items files, no rules and not the DB.
That’s the weird thing

Out of a sudden I got another ERROR:

2018-09-21 19:15:27.366 [ERROR] [nal.common.AbstractInvocationHandler] - An error occurred while calling method 'EventSubscriber.receive()' on 'org.eclipse.smarthome.core.internal.items.ItemUpdater@494bf6': Java heap space
java.lang.OutOfMemoryError: Java heap space

Could anyone please point me to the direction where to look for 494bf6?
item, Thing somthing else??

Yesterday evening it happened again (OH stopped working incl. PaperUI, Habmin, karaf)

Initially I got:

2018-09-21 18:33:07.143 [WARN ] [ab.action.telegram.internal.Telegram] - Transport error: {}

This actually happened before, but did not have an impact on the overall stability.

30 minutes later I got:

2018-09-21 19:05:02.657 [ERROR] [ntime.internal.engine.ExecuteRuleJob] - Error during the execution of rule 'Send Hi / Lo': cannot invoke method public abstract org.eclipse.smarthome.core.types.State org.eclipse.smarthome.core.persistence.HistoricItem.getState() on null


another few minutes later:

2018-09-21 19:10:13.145 [ERROR] [b.binding.neato.handler.NeatoHandler] - Error when refreshing state. Error: java.util.concurrent.ExecutionException: java.io.EOFException: HttpConnectionOverHTTP@621caa(l:/192.168.178.50:34284 <-> r:nucleo.neatocloud.com/54.84.102.60:4443,closed=false)[HttpChannelOverHTTP@b2b00b(exchange=HttpExchange@12b4cc3 req=PENDING/null@null res=PENDING/null@null)[send=HttpSenderOverHTTP@53441e(req=HEADERS,snd=SENDING,failure=null)[HttpGenerator@9a8e72{s=COMPLETING}],recv=HttpReceiverOverHTTP@1875cc2(rsp=IDLE,failure=null)[HttpParser{s=CLOSED,0 of 0}]]]

Then ERRORS became dominant:

2018-09-21 19:24:12.735 [ERROR] [nal.common.AbstractInvocationHandler] - An error occurred while calling method 'EventSubscriber.receive()' on 'org.eclipse.smarthome.core.internal.items.ItemUpdater@494bf6': Java heap space
java.lang.OutOfMemoryError: Java heap space

and

2018-09-21 20:05:04.153 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception
java.lang.OutOfMemoryError: Java heap space

and many more heap space stuff:

2018-09-21 20:10:37.533 [ERROR] [me.core.internal.events.EventHandler] - Dispatching/filtering event for subscriber 'org.eclipse.smarthome.core.events.EventSubscriber' failed: Java heap space
java.lang.OutOfMemoryError: Java heap space

I guess The Neato binding might be a trigger (although it used to be working months without an issue)

but the other ERRORs especially “heap space” are new to me

Right now: openhab.log stopped logging and events.log stopped even earlier (12 hours ago).

The last messages are:

2018-09-22 03:28:39.721 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception
java.lang.OutOfMemoryError: Java heap space
2018-09-22 03:27:54.779 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception
java.lang.OutOfMemoryError: Java heap space
2018-09-22 05:13:53.999 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception
java.lang.OutOfMemoryError: Java heap space
2018-09-22 05:13:54.020 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception
java.lang.OutOfMemoryError: Java heap space
2018-09-22 07:15:26.979 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception
java.lang.OutOfMemoryError: Java heap space
2018-09-22 07:45:07.070 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception
java.lang.OutOfMemoryError: Java heap space
2018-09-22 07:51:53.363 [ERROR] [core.karaf.internal.FeatureInstaller] - Failed installing 'openhab-binding-volvooncall': Error:
        Java heap space

As you can see there is a huge gap of 2 hrs where notheing was logged (5:13 am to 7:15 am).
:frowning:

There seems to be still some activity:

2018-09-22 07:51:53.363 [ERROR] [core.karaf.internal.FeatureInstaller] - Failed installing 'openhab-binding-volvooncall': Error:
        Java heap space
2018-09-22 08:15:56.327 [ERROR] [core.karaf.internal.FeatureInstaller] - Failed installing 'openhab-binding-volvooncall': Error:
        Error downloading mvn:org.openhab.persistence/org.openhab.persistence.mapdb/1.12.0
        Error downloading mvn:org.eclipse.jetty.websocket/websocket-api/9.3.21.v20170918
        Error downloading mvn:org.eclipse.jetty/jetty-util/9.3.21.v20170918
        Java heap space

Looks like you have a memory leak issue. See How to find the cause of a OutOfMemoryError in OH2?. Did you already try to update to the latest 2.4.0-SNAPSHOT to see if it is fixed?

There is this handy template that should have made you enter this information in your opening post so it is easier to help you without asking lots of obvious questions. :slight_smile:

This is what the template normally looks like:

  • Platform information:
    • Hardware: CPUArchitecture/RAM/storage
    • OS: what OS is used and which version
    • Java Runtime Environment: which java platform is used and what version
    • openHAB version:
  • Issue of the topic: please be detailed explaining your issue
  • Please post configurations (if applicable):
    • Items configuration related to the issue
    • Sitemap configuration related to the issue
    • Rules code related to the issue
    • Services configuration related to the issue
  • If logs where generated please post these here using code fences:

Thanks I will check it out.

And I will edit my original post to be more precise.
You are right, that this thread is a little confusing, so sorry for that :slight_smile:
The main point is, that I am lost and don’t have a clue how to nail this issue down.
(I actually am afraid, that this is a corruption issue I had with an SD card a year ago).
Now I am using an SSD and thought, corruption is not an issue.

So I will be really relieved if it turns out, that it’s a misconfiguration or a bug.

1 Like