OH2.2 Crashing randomly

Tags: #<Tag:0x00007fc8f990c2b0>

Hello,

I’m running OH2.2 on my QNAP NAS. Version 2.1 was running stable, but with 2.2 the system crash randomly and sometimes is shutting down.

What I found in the latest log today - this was happening right after I started the system.
Any clue why this is happening?

Many thanks for your help
:slight_smile:
Stefan

   2017-12-27 19:42:52.663 [WARN ] [eclipse.jetty.servlet.ServletHandler] - 
javax.servlet.ServletException: A MultiException has 1 exceptions.  They are:
1. java.lang.IllegalStateException: ServiceLocatorImpl(__HK2_Generated_3,4,1871728606) has been shut down

	at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:489) [175:org.glassfish.jersey.containers.jersey-container-servlet-core:2.22.2]
	at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427) [175:org.glassfish.jersey.containers.jersey-container-servlet-core:2.22.2]
	at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388) [175:org.glassfish.jersey.containers.jersey-container-servlet-core:2.22.2]
	at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341) [175:org.glassfish.jersey.containers.jersey-container-servlet-core:2.22.2]
	at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228) [175:org.glassfish.jersey.containers.jersey-container-servlet-core:2.22.2]
	at com.eclipsesource.jaxrs.publisher.internal.ServletContainerBridge.service(ServletContainerBridge.java:76) [15:com.eclipsesource.jaxrs.publisher:5.3.1.201602281253]
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848) [88:org.eclipse.jetty.servlet:9.3.22.v20171030]
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:584) [88:org.eclipse.jetty.servlet:9.3.22.v20171030]
	at org.ops4j.pax.web.service.jetty.internal.HttpServiceServletHandler.doHandle(HttpServiceServletHandler.java:71) [191:org.ops4j.pax.web.pax-web-jetty:6.0.7]
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) [87:org.eclipse.jetty.server:9.3.22.v20171030]
	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548) [85:org.eclipse.jetty.security:9.3.22.v20171030]
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226) [87:org.eclipse.jetty.server:9.3.22.v20171030]
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) [87:org.eclipse.jetty.server:9.3.22.v20171030]
	at org.ops4j.pax.web.service.jetty.internal.HttpServiceContext.doHandle(HttpServiceContext.java:284) [191:org.ops4j.pax.web.pax-web-jetty:6.0.7]
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) [88:org.eclipse.jetty.servlet:9.3.22.v20171030]
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) [87:org.eclipse.jetty.server:9.3.22.v20171030]
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112) [87:org.eclipse.jetty.server:9.3.22.v20171030]
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [87:org.eclipse.jetty.server:9.3.22.v20171030]
	at org.ops4j.pax.web.service.jetty.internal.JettyServerHandlerCollection.handle(JettyServerHandlerCollection.java:80) [191:org.ops4j.pax.web.pax-web-jetty:6.0.7]
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) [87:org.eclipse.jetty.server:9.3.22.v20171030]
	at org.eclipse.jetty.server.Server.handle(Server.java:534) [87:org.eclipse.jetty.server:9.3.22.v20171030]
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:333) [87:org.eclipse.jetty.server:9.3.22.v20171030]
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251) [87:org.eclipse.jetty.server:9.3.22.v20171030]
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283) [79:org.eclipse.jetty.io:9.3.22.v20171030]
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108) [79:org.eclipse.jetty.io:9.3.22.v20171030]
	at org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:251) [79:org.eclipse.jetty.io:9.3.22.v20171030]
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283) [79:org.eclipse.jetty.io:9.3.22.v20171030]
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108) [79:org.eclipse.jetty.io:9.3.22.v20171030]
	at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93) [79:org.eclipse.jetty.io:9.3.22.v20171030]
	at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303) [90:org.eclipse.jetty.util:9.3.22.v20171030]
	at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148) [90:org.eclipse.jetty.util:9.3.22.v20171030]
	at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136) [90:org.eclipse.jetty.util:9.3.22.v20171030]
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671) [90:org.eclipse.jetty.util:9.3.22.v20171030]
	at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589) [90:org.eclipse.jetty.util:9.3.22.v20171030]
	at java.lang.Thread.run(Thread.java:748) [?:?]
Caused by: org.glassfish.hk2.api.MultiException: A MultiException has 1 exceptions.  They are:
1. java.lang.IllegalStateException: ServiceLocatorImpl(__HK2_Generated_3,4,1871728606) has been shut down

	at org.jvnet.hk2.internal.FactoryCreator.getFactoryHandle(FactoryCreator.java:106) ~[?:?]
	at org.jvnet.hk2.internal.FactoryCreator.dispose(FactoryCreator.java:173) ~[?:?]
	at org.jvnet.hk2.internal.SystemDescriptor.dispose(SystemDescriptor.java:526) ~[?:?]
	at org.glassfish.jersey.process.internal.RequestScope$Instance.remove(RequestScope.java:532) ~[177:org.glassfish.jersey.core.jersey-common:2.22.2]
	at org.glassfish.jersey.process.internal.RequestScope$Instance.release(RequestScope.java:549) ~[177:org.glassfish.jersey.core.jersey-common:2.22.2]
	at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:319) ~[177:org.glassfish.jersey.core.jersey-common:2.22.2]
	at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305) ~[178:org.glassfish.jersey.core.jersey-server:2.22.2]
	at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154) ~[178:org.glassfish.jersey.core.jersey-server:2.22.2]
	at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473) ~[175:org.glassfish.jersey.containers.jersey-container-servlet-core:2.22.2]
	... 34 more
Caused by: java.lang.IllegalStateException: ServiceLocatorImpl(__HK2_Generated_3,4,1871728606) has been shut down
	at org.jvnet.hk2.internal.ServiceLocatorImpl.checkState(ServiceLocatorImpl.java:2288) ~[?:?]
	at org.jvnet.hk2.internal.ServiceLocatorImpl.getServiceHandleImpl(ServiceLocatorImpl.java:629) ~[?:?]
	at org.jvnet.hk2.internal.ServiceLocatorImpl.getServiceHandle(ServiceLocatorImpl.java:622) ~[?:?]
	at org.jvnet.hk2.internal.ServiceLocatorImpl.getServiceHandle(ServiceLocatorImpl.java:640) ~[?:?]
	at org.jvnet.hk2.internal.FactoryCreator.getFactoryHandle(FactoryCreator.java:103) ~[?:?]
	at org.jvnet.hk2.internal.FactoryCreator.dispose(FactoryCreator.java:173) ~[?:?]
	at org.jvnet.hk2.internal.SystemDescriptor.dispose(SystemDescriptor.java:526) ~[?:?]
	at org.glassfish.jersey.process.internal.RequestScope$Instance.remove(RequestScope.java:532) ~[177:org.glassfish.jersey.core.jersey-common:2.22.2]
	at org.glassfish.jersey.process.internal.RequestScope$Instance.release(RequestScope.java:549) ~[177:org.glassfish.jersey.core.jersey-common:2.22.2]
	at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:319) ~[177:org.glassfish.jersey.core.jersey-common:2.22.2]
	at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305) ~[178:org.glassfish.jersey.core.jersey-server:2.22.2]
	at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154) ~[178:org.glassfish.jersey.core.jersey-server:2.22.2]
	at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473) ~[175:org.glassfish.jersey.containers.jersey-container-servlet-core:2.22.2]
	... 34 more
 .....

I have a similar problem,

After updating to 2.2 final i have:

  • a full crash every 1-2 days (restart of the service)
  • log:tail on the console is automatically closed after 10 minutes (a known bug)
    So have to go back to snapshot 996 which is running since month without any problems.

following the content of the hs_err_pid1803.log

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGILL (0x4) at pc=0xb41d4c64, pid=1803, tid=0x9addf470
#
# JRE version: Java(TM) SE Runtime Environment (8.0_151-b12) (build 1.8.0_151-b12)
# Java VM: Java HotSpot(TM) Client VM (25.151-b12 mixed mode linux-arm )
# Problematic frame:
# v  ~BufferBlob::adapters
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#

---------------  T H R E A D  ---------------

Current thread (0xa092c000):  JavaThread "pool-44-thread-5" [_thread_in_Java, id=2127, stack(0x9ad90000,0x9ade0000)]

siginfo: si_signo: 4 (SIGILL), si_code: 1 (ILL_ILLOPC), si_addr: 0xb41d4c64

Registers:
  r0  = 0xac25cc88
  r1  = 0xa493aab8
  r2  = 0xac25cc88
  r3  = 0xac25cc88
  r4  = 0xa5369678
  r5  = 0x00000019
  r6  = 0xa536fd50
  r7  = 0x00000000
  r8  = 0x9f0f7a78
  r9  = 0xa3c61fb8
  r10 = 0xa092c000
  fp  = 0x9adde680
  r12 = 0x9adda4a0
  sp  = 0x9adde450
  lr  = 0xb49bef6c
  pc  = 0xb41d4c64
  cpsr = 0x20000010

Top of Stack: (sp=0x9adde450)
0x9adde450:   9adde47c b41a71c0 a536fb80 9adde45c
0x9adde460:   9f0de53e 9adde488 ac25cc88 a95ba8a8
0x9adde470:   a5211228 00000000 9adde680 b4392ba8
0x9adde480:   a536fd00 b41a71c0 a493aab8 ac25cc88
0x9adde490:   a536fcd0 9adde494 9adde680 b49bca20
0x9adde4a0:   9f0e19a8 00000000 9f0e15c0 00000000
0x9adde4b0:   a536fce8 9adde4e8 b41a71c0 ac25cc88
0x9adde4c0:   9adde54c b430f580 9adde4c8 9f0de5c5 

Instructions: (pc=0xb41d4c64)
0xb41d4c44:   00000001 b69f2638 b69969cc 00004038
0xb41d4c54:   00000030 00000000 00000038 00000038
0xb41d4c64:   ffffffff 00004038 00000000 00000000
0xb41d4c74:   00000000 00000000 00000000 e5904004 

Register to memory mapping:

  r0  = 0xac25cc88
0xac25cc88 is an oop
org.apache.karaf.features.internal.resolver.CapabilityImpl 
 - klass: 'org/apache/karaf/features/internal/resolver/CapabilityImpl'

  r1  = 0xa493aab8

My OH2.2 is keep crashing randomly - yesterday the system crashed 5 times creating a lot of crash Dumps. However this time I cannot see any Exceptions in the log file every time.

Before one crash I logged below:

2017-12-31 17:19:31.742 [WARN ] [eclipse.jetty.servlet.ServletHandler] - 
javax.servlet.ServletException: A MultiException has 1 exceptions.  They are:
1. java.lang.IllegalStateException: ServiceLocatorImpl(__HK2_Generated_4,5,1196287467) has been shut down

Not to be seen as an OH2 bug, but my system crashed just after midnight.
Not sure what it’s been, as it was only pingable at last.

My guess is a memory leak in OH as the system is not doing anything beside.

Today the systems crashed already 6 times :frowning:

After the last crash a few minutes ago I restarted OH again using /etc/init.d/openHAB console command.
The system however only run for a few minutes and stopped again - but without any exception.
The only warning message I can see is

2018-01-01 13:35:51.413 [WARN ] [ore.internal.events.OSGiEventManager] - Dispatching event to subscriber 'org.eclipse.smarthome.config.discovery.internal.AutomaticInboxProcessor@3d2dd47e' takes more than 5000ms.

Below is the very short logfile since the OH start till it was automatically terminated again.

2018-01-01 13:35:05.013 [INFO ] [er.internal.HomeBuilderDashboardTile] - Started Home Builder at /homebuilder
2018-01-01 13:35:05.556 [INFO ] [.core.internal.i18n.I18nProviderImpl] - Locale set to de_DE, Location set to 50.122991199999994,8.322805299999999, Time zone set to Europe/Berlin
2018-01-01 13:35:07.229 [WARN ] [g.eclipse.smarthome.core.net.NetUtil] - Found multiple local interfaces - ignoring 10.0.3.1
2018-01-01 13:35:07.237 [WARN ] [g.eclipse.smarthome.core.net.NetUtil] - Found multiple local interfaces - ignoring 10.0.5.1
2018-01-01 13:35:07.273 [WARN ] [g.eclipse.smarthome.core.net.NetUtil] - Found multiple local interfaces - ignoring 10.0.3.1
2018-01-01 13:35:07.278 [WARN ] [g.eclipse.smarthome.core.net.NetUtil] - Found multiple local interfaces - ignoring 10.0.5.1
2018-01-01 13:35:07.408 [INFO ] [.dashboard.internal.DashboardService] - Started dashboard at http://192.168.5.21:8090
2018-01-01 13:35:07.411 [INFO ] [.dashboard.internal.DashboardService] - Started dashboard at https://192.168.5.21:8444
2018-01-01 13:35:16.769 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'WeWe.items'
2018-01-01 13:35:17.191 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'Netatmo.items'
2018-01-01 13:35:17.228 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'fs20.items'
2018-01-01 13:35:24.993 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'WeWe.rules'
2018-01-01 13:35:25.357 [INFO ] [thome.model.lsp.internal.ModelServer] - Started Language Server Protocol (LSP) service on port 5007
2018-01-01 13:35:26.111 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'WeWe.sitemap'
2018-01-01 13:35:26.588 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'WeWe.things'
2018-01-01 13:35:27.825 [INFO ] [thome.binding.astro.internal.job.Job] - Scheduled Astro event-jobs for thing astro:sun:home
2018-01-01 13:35:27.922 [INFO ] [ding.astro.handler.AstroThingHandler] - Scheduled Positional job astro:sun:home every 60 seconds
2018-01-01 13:35:28.054 [INFO ] [thome.binding.astro.internal.job.Job] - Scheduled Astro event-jobs for thing astro:moon:home
2018-01-01 13:35:28.063 [INFO ] [ding.astro.handler.AstroThingHandler] - Scheduled Positional job astro:moon:home every 60 seconds
2018-01-01 13:35:28.301 [INFO ] [thome.binding.astro.internal.job.Job] - Scheduled Astro event-jobs for thing astro:moon:local
2018-01-01 13:35:28.407 [INFO ] [thome.binding.astro.internal.job.Job] - Scheduled Astro event-jobs for thing astro:sun:local
2018-01-01 13:35:31.185 [INFO ] [basic.internal.servlet.WebAppServlet] - Started Basic UI at /basicui/app
2018-01-01 13:35:31.671 [INFO ] [arthome.ui.paper.internal.PaperUIApp] - Started Paper UI at /paperui
2018-01-01 13:35:32.191 [INFO ] [panel.internal.HABPanelDashboardTile] - Started HABPanel at /habpanel
2018-01-01 13:35:32.560 [WARN ] [homekit.internal.HomekitAuthInfoImpl] - Could not find existing MAC in org.eclipse.smarthome.storage.json.internal.JsonStorage. Generating new MAC. This will require re-pairing of iOS devices.
2018-01-01 13:35:34.215 [INFO ] [ulation.internal.HueEmulationServlet] - Started Hue Emulation service at /api
2018-01-01 13:35:34.679 [INFO ] [tatmo.internal.WelcomeWebHookServlet] - Started Netatmo Webhook servlet at /netatmo/camera
2018-01-01 13:35:35.523 [WARN ] [io.transport.cul.CULLifecycleManager] - CUL config is NULL, doing nothing
2018-01-01 13:35:35.631 [INFO ] [g.discovery.internal.PersistentInbox] - Added new thing 'hue:bridge:0017880ae670' to inbox.
2018-01-01 13:35:51.413 [WARN ] [ore.internal.events.OSGiEventManager] - Dispatching event to subscriber 'org.eclipse.smarthome.config.discovery.internal.AutomaticInboxProcessor@3d2dd47e' takes more than 5000ms.
2018-01-01 13:36:03.631 [INFO ] [g.discovery.internal.PersistentInbox] - Added new thing 'hue:0200:00178823c48f:2' to inbox.
2018-01-01 13:36:04.336 [INFO ] [g.discovery.internal.PersistentInbox] - Added new thing 'hue:0010:00178823c48f:6' to inbox.
2018-01-01 13:37:23.486 [WARN ] [ore.internal.events.OSGiEventManager] - Dispatching event to subscriber 'org.eclipse.smarthome.io.monitor.internal.EventLogger@16002e9c' takes more than 5000ms.
2018-01-01 13:42:29.249 [INFO ] [basic.internal.servlet.WebAppServlet] - Stopped Basic UI

Did you try clearing the Cache? Clear the Cache

1 Like

Thanks for reminding me to clear the Cache.
I just did it - will monitor situation. Hopefully the system is more stable now.

OH still keep crashing :frowning: No clue what is causing the frequent shutdown events.

I have the same prpblem. Random crashes with restarts a few Time a day.
Any hints how to read the logs in user data?

What jvm (version) are you using, What os etc. Did you see a high memory or cpu usage before it crashed?

This is interesting the SiGILL means a real error inside the jvm, maybe you could consider updating to the newest jvm?

If you could get a full crashdump I could do some investigations.

hs_err_pid32432.pdf (134.6 KB)

Attached the last hs_err_pid file with the infos about the system etc…

If you need a core dump I can activate this and wait for the next crash.

Thanks!

Hello Martin,
Many thanks for your support!

I did restart my NAS yesterday again, deleted the tmp and cache folder as well one more time.
As I see as well issues with the Homekit Binding (duplicate items reported in the log), I decided not to install this Service for testing.
Until now the system is running stable.

The system created a lot of crash dump files. I can provide you some of them if this help to investigate the issue.
How can I retrieve the Java version?

java -version

1 Like

Thanks Martin,

That was easy :wink:

[~] # java -version                                                                                                                          
java version "1.8.0_151"
Java(TM) SE Runtime Environment (build 1.8.0_151-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.151-b12, mixed mode)
[~] #

Yes please, because crashes like this are most of the time caused by problems in the JVM, maybe something in the new Karaf or some native libary triggers this, what kind of bindings do you have installed?

The newest available version is 152 so you could try whether that helps

I use the following bindings:
http, network, samsungtv, sonos and zwave.
Core dump is activated, I wait for the next crash.

i am having the same problem with the craches for 2/3 weeks now. UI keeps working but the items stop updating. The only way of getting the items to work again is restarting openhab.

i am op java version:

openjdk version "1.8.0_152"
OpenJDK Runtime Environment (Zulu Embedded 8.25.0.76-linux-aarch32hf) (build 1.8.0_152-b76)
OpenJDK Client VM (Zulu Embedded 8.25.0.76-linux-aarch32hf) (build 25.152-b76, mixed mode, Evaluation)

the error is:

2018-01-04 21:28:52.237 [WARN ] [org.apache.karaf.services.eventadmin] - EventAdmin: Blacklisting ServiceReference [{org.osgi.service.event.EventHandler, org.eclipse.smarthome.core.events.EventPublisher}={event.topics=smarthome, component.name=org.eclipse.smarthome.core.internal.events.OSGiEventManager, component.id=75, service.id=169, service.bundleid=110, service.scope=bundle} | Bundle(org.eclipse.smarthome.core_0.10.0.201712141004 [110])] due to timeout!

2018-01-04 21:28:52.535 [WARN ] [org.apache.karaf.services.eventadmin] - EventAdmin: Blacklisting ServiceReference [{org.osgi.service.event.EventHandler, org.eclipse.smarthome.core.events.EventPublisher}={event.topics=smarthome, component.name=org.eclipse.smarthome.core.internal.events.OSGiEventManager, component.id=75, service.id=169, service.bundleid=110, service.scope=bundle} | Bundle(org.eclipse.smarthome.core_0.10.0.201712141004 [110])] due to timeout!

Time it keeps working is around 12hrs

bindings i use http,network,mqtt,ntp,astro.

Hello, the link to the core dump to download: