OH broke down overnight

I’ve been having huge problems for a few days now.
First I have to say that OH as a service was never a problem; it’s been running for months without any issues.
I don’t really know where to start, since I’ve never been in a situation like this. I’ve never worked with daemons/services and I could really use some help here.
Basically, I’m running openHABian on a Raspberry Pi 3B, controlling about 15-20 devices with a lot of backend code. Over the last 3 days the openHAB2 service shut down almost daily, until this morning, when it broke down completely and didn’t restart.
I had trouble getting it back to work, but after pulling the latest repo with an apt-get upgrade to stable 2.4, it’s running again and I’m very glad.
I don’t know where to start looking into why the openHAB service is being killed; I don’t have any starting points…

[12:25:16] openhabian@smarthome:~$ systemctl status openhab2
● openhab2.service - openHAB 2 - empowering the smart home
   Loaded: loaded (/usr/lib/systemd/system/openhab2.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2018-12-23 12:24:26 CET; 1min 48s ago
     Docs: https://www.openhab.org/docs/
           https://community.openhab.org
 Main PID: 2380 (java)
   CGroup: /system.slice/openhab2.service
           └─2380 /usr/bin/java -Dopenhab.home=/usr/share/openhab2 -Dopenhab.conf=/etc/openhab2 -Dopenhab.runtime=/usr/share/openhab2/runtime -Dopenhab.userdata=/var/lib/openhab2 -Dopenhab.

Dec 23 12:24:26 smarthome systemd[1]: Started openHAB 2 - empowering the smart home.

Everything’s running fine for now, but I want to find out what keeps killing the service…

You would need to check your system logs and openhab.log to see if “something” killed Java or if it was Java stopping. For the first, that’s out of scope here. Search the internet.
But the fact that it didn’t restart at some stage rather points at the latter. I’d enable debug logging on org.apache.karaf and org.eclipse.smarthome to get (lots of) debug output on every startup attempt.
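
For the first case, here is a minimal sketch of what to look for, assuming the kernel's OOM killer is the suspect. It scans log lines for the usual OOM-killer wording and keeps those that mention java. The function names and the default log path are my own choices for illustration, and the exact kernel wording differs between kernel versions, so treat the pattern as a starting point rather than a complete check.

```python
import re

# Typical OOM-killer wording in kernel logs, e.g. "Out of memory: Kill process ..."
# or "Killed process ...". The wording varies by kernel version (assumption).
OOM_PATTERN = re.compile(r"Out of memory|Killed process|oom-killer", re.IGNORECASE)


def find_oom_lines(lines, process_name="java"):
    """Return the lines that look like OOM-killer activity and mention process_name."""
    hits = []
    for line in lines:
        if OOM_PATTERN.search(line) and process_name in line:
            hits.append(line.rstrip("\n"))
    return hits


def scan_file(path="/var/log/syslog", process_name="java"):
    """Scan a log file; the default path is an assumption for a Raspbian-style system."""
    with open(path, errors="replace") as handle:
        return find_oom_lines(handle, process_name)
```

If `scan_file()` returns nothing around the time of the outage, the kernel most likely did not kill Java for memory reasons, and the openhab.log debug output is the next place to look.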

I always have my logs open on my third monitor, but nothing in particular shows up. I get my ItemStateChangedEvents and at some point the log just stops because the service has stopped…

Is this the only way the OH service can be stopped?

So OH should be restarted automatically after a service stop?

I’ll do that and dig deeper into this, thanks marcus!

This smells like SD card corruption…
Have you got a recent backup?

Yes, I’m backing up my conf every few days. I want to set up Amanda backup, but haven’t had time to configure it yet…

I have had similar problems (though not as serious as this): openHAB on the Raspberry stopped unexpectedly, the Raspberry was not reachable from anywhere, and it stayed that way for 30 minutes to 2-3 hours. After that, openHAB restarted and initialized itself again (but from the logs it seemed that the RPi hadn’t shut down).

For me the solution seems to have been increasing the size of the swap. Memory was not full (around 90% was used), but swap was almost always full (100%).
After increasing the swap, the issue is gone. I’m running multiple things on this Raspberry, so if you are only running openHAB on it, the default swap size might be enough, but not for multiple services (for me: openHAB, TasmoAdmin, RPIMonitor, Mosquitto, NodeRED, some little Python scripts, etc…)
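
To see whether swap is the bottleneck on your Pi too, here is a small sketch that works out swap usage from the text of /proc/meminfo. The function name is made up for illustration; `SwapTotal` and `SwapFree` are the standard fields in that file, reported in kB.

```python
def swap_usage(meminfo_text):
    """Return (total_kb, used_kb, used_percent) parsed from /proc/meminfo text."""
    values = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in ("SwapTotal", "SwapFree"):
            # Lines look like "SwapTotal:  102396 kB"; keep only the number.
            values[key] = int(rest.split()[0])
    total = values.get("SwapTotal", 0)
    used = total - values.get("SwapFree", 0)
    percent = 100.0 * used / total if total else 0.0
    return total, used, percent


def current_swap_usage(path="/proc/meminfo"):
    """Read the live values on a Linux system."""
    with open(path) as handle:
        return swap_usage(handle.read())
```

If the percentage sits near 100 most of the time, the swap is probably too small for the workload. On a stock Raspbian image the swap size is, as far as I know, set via `CONF_SWAPSIZE` in /etc/dphys-swapfile; I'm assuming that setup here, so check what your image actually uses.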

In an unmodified openHABian setup, systemd supervises the running Java process and restarts it if needed.
But if you instruct it to stop openHAB (various syntaxes, e.g. systemctl stop openhab2.service) then it won’t restart until you tell it to (… start …).
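
To check what systemd is actually configured to do on a given box, a rough sketch like the following can query the unit's properties. It relies on `systemctl show <unit> --property=...`, which prints `Key=Value` lines; the helper names and the choice of properties are mine, not anything from openHABian.

```python
import subprocess


def parse_show_output(text):
    """Turn the Key=Value lines printed by `systemctl show` into a dict."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key] = value
    return props


def unit_properties(unit="openhab2.service",
                    names=("Restart", "ActiveState", "SubState", "Result")):
    """Ask systemd for a few properties of the unit (needs systemctl on the PATH)."""
    completed = subprocess.run(
        ["systemctl", "show", unit, "--property=" + ",".join(names)],
        check=True, stdout=subprocess.PIPE, universal_newlines=True,
    )
    return parse_show_output(completed.stdout)
```

The `Restart` value tells you which exits systemd will recover from, and `Result` records how the main process last ended. A manual `systemctl stop` is never followed by an automatic restart, whatever the policy says.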

Possibly so. If your debug logs don’t show anything to tackle, I’d re-install the openHABian image to a new SD card. And follow the guidelines this time.

Yeah, I know a little about how the daemon system works, but didn’t know it would restart daemons, thanks for the hint.
I didn’t modify the setup, so this is weird…
I haven’t had a shutdown since, but I enabled the debug logs, so I should get enough information if something unusual happens.

Thank you, I will!

So OH broke down again, and the web log viewer doesn’t show anything decisive…
I got some [DEBUG] logs, mainly from swagger, but that’s it:

2018-12-25 02:53:35.652 [DEBUG] [ternal.JSonPathTransformationService] - transformation resulted in '1.0'

2018-12-25 02:53:35.834 [DEBUG] [home.core.internal.items.ItemUpdater] - Received update of a not accepted type (StringType) for item YeelightcM_4

==> /var/log/openhab2/events.log <==

2018-12-25 02:53:36.330 [vent.ItemStateChangedEvent] - SI_PM_useP changed from 52.0 to 51.9

2018-12-25 02:53:36.374 [vent.ItemStateChangedEvent] - SI_st_u changed from 5675 to 5676

2018-12-25 02:53:36.378 [vent.ItemStateChangedEvent] - SI_PM_availableP changed from 48.0 to 48.1

2018-12-25 02:53:36.542 [vent.ItemStateChangedEvent] - SI_CPU_load changed from 36.4 to 70.2

==> /var/log/openhab2/openhab.log <==

2018-12-25 02:53:36.594 [DEBUG] [.AutomationResourceBundlesEventQueue] - Process bundle event 32, for automation bundle 'com.eclipsesource.jaxrs.provider.swagger' 

2018-12-25 02:53:36.600 [DEBUG] [.AutomationResourceBundlesEventQueue] - Process bundle event 32, for automation bundle 'com.eclipsesource.jaxrs.provider.swagger' 

2018-12-25 02:53:36.604 [DEBUG] [.AutomationResourceBundlesEventQueue] - Process bundle event 32, for automation bundle 'com.eclipsesource.jaxrs.provider.swagger' 

2018-12-25 02:53:36.609 [DEBUG] [.provider.RuleResourceBundleImporter] - Parse rules from bundle 'com.eclipsesource.jaxrs.provider.swagger' 

2018-12-25 02:53:36.619 [DEBUG] [ell.impl.action.osgi.CommandExtender] - com.eclipsesource.jaxrs.provider.swagger (199): Starting destruction process

2018-12-25 02:53:36.622 [DEBUG] [ell.impl.action.osgi.CommandExtender] - com.eclipsesource.jaxrs.provider.swagger (199): Not an extended bundle or destruction of extension already finished

2018-12-25 02:53:36.631 [DEBUG] [.AutomationResourceBundlesEventQueue] - Process bundle event 32, for automation bundle 'io.swagger.jaxrs' 

2018-12-25 02:53:36.635 [DEBUG] [.AutomationResourceBundlesEventQueue] - Process bundle event 32, for automation bundle 'io.swagger.jaxrs' 

2018-12-25 02:53:36.641 [DEBUG] [.AutomationResourceBundlesEventQueue] - Process bundle event 32, for automation bundle 'io.swagger.jaxrs' 

2018-12-25 02:53:36.644 [DEBUG] [.provider.RuleResourceBundleImporter] - Parse rules from bundle 'io.swagger.jaxrs' 

2018-12-25 02:53:36.655 [DEBUG] [ell.impl.action.osgi.CommandExtender] - io.swagger.jaxrs (209): Starting destruction process

2018-12-25 02:53:36.657 [DEBUG] [ell.impl.action.osgi.CommandExtender] - io.swagger.jaxrs (209): Not an extended bundle or destruction of extension already finished

2018-12-25 02:53:36.669 [DEBUG] [.AutomationResourceBundlesEventQueue] - Process bundle event 32, for automation bundle 'reflections' 

2018-12-25 02:53:36.672 [DEBUG] [.AutomationResourceBundlesEventQueue] - Process bundle event 32, for automation bundle 'reflections' 

2018-12-25 02:53:36.677 [DEBUG] [.AutomationResourceBundlesEventQueue] - Process bundle event 32, for automation bundle 'reflections' 

2018-12-25 02:53:36.680 [DEBUG] [.provider.RuleResourceBundleImporter] - Parse rules from bundle 'reflections' 

2018-12-25 02:53:36.686 [DEBUG] [ell.impl.action.osgi.CommandExtender] - reflections (270): Starting destruction process

-- ?

2018-12-25 02:53:36.688 [DEBUG] [ell.impl.action.osgi.CommandExtender] - reflections (270): Not an extended bundle or destruction of extension already finished

2018-12-25 02:53:36.702 [DEBUG] [.AutomationResourceBundlesEventQueue] - Process bundle event 32, for automation bundle 'swagger-jersey2-jaxrs' 

2018-12-25 02:53:36.708 [DEBUG] [.AutomationResourceBundlesEventQueue] - Process bundle event 32, for automation bundle 'swagger-jersey2-jaxrs' 

2018-12-25 02:53:36.710 [DEBUG] [.AutomationResourceBundlesEventQueue] - Process bundle event 32, for automation bundle 'swagger-jersey2-jaxrs' 

2018-12-25 02:53:36.713 [DEBUG] [.provider.RuleResourceBundleImporter] - Parse rules from bundle 'swagger-jersey2-jaxrs' 

2018-12-25 02:53:36.719 [DEBUG] [ell.impl.action.osgi.CommandExtender] - swagger-jersey2-jaxrs (272): Starting destruction process

2018-12-25 02:53:36.721 [DEBUG] [ell.impl.action.osgi.CommandExtender] - swagger-jersey2-jaxrs (272): Not an extended bundle or destruction of extension already finished

2018-12-25 02:53:36.829 [DEBUG] [mpl.info.InfoBundleTrackerCustomizer] - Ignore incorrect info null provided by bundle com.eclipsesource.jaxrs.provider.swagger

2018-12-25 02:53:36.843 [DEBUG] [mpl.info.InfoBundleTrackerCustomizer] - Ignore incorrect info null provided by bundle io.swagger.jaxrs

2018-12-25 02:53:36.863 [DEBUG] [mpl.info.InfoBundleTrackerCustomizer] - Ignore incorrect info null provided by bundle reflections

2018-12-25 02:53:36.886 [DEBUG] [mpl.info.InfoBundleTrackerCustomizer] - Ignore incorrect info null provided by bundle swagger-jersey2-jaxrs

I noticed that connecting to my Pi via SSH didn’t work either, so I guess the whole Pi shut down, and not only the OH service…
I’m currently backing up my SD card and buying a new power supply (I noticed mine delivers only 2 A; I’m running no peripherals, but better safe than sorry) plus a new SD card, and I’ll relocate swap and logs. I’m even considering switching my host to a Raspberry Pi 3B+, since I have one and don’t currently need it anyway.
But is a re-install the only way I could fix this issue?
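
On the power supply suspicion: before swapping hardware, the output of `vcgencmd get_throttled` on the Pi can hint at whether under-voltage has happened since boot. Below is a small sketch that decodes that value. The bit meanings are the ones I know from the Raspberry Pi firmware documentation, and the function name is my own, so double-check against the docs for your firmware version.

```python
# Bit positions reported by `vcgencmd get_throttled` (per the Raspberry Pi docs,
# to the best of my knowledge; newer firmware may define additional bits).
THROTTLE_FLAGS = {
    0: "under-voltage detected now",
    1: "ARM frequency capped now",
    2: "currently throttled",
    16: "under-voltage has occurred since boot",
    17: "ARM frequency capping has occurred since boot",
    18: "throttling has occurred since boot",
}


def decode_throttled(output):
    """Decode a string like 'throttled=0x50005' into a list of readable flags."""
    value = int(output.strip().rpartition("=")[2], 16)
    return [text for bit, text in sorted(THROTTLE_FLAGS.items())
            if value & (1 << bit)]
```

An empty list means none of these flags has been raised since the last boot, so the value has to be read before the Pi goes down again to say anything about the outage.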