Openhab silently stops working every night - debug guide?

Good morning all

I’m using openHABian 2.4 on an RPi3. Until recently openHAB ran stably without any issues. I used the astro, zwave and mqtt1 bindings and the Google Calendar service. I don’t have any OH rules.
But last weekend I made some changes: I configured rrd4j persistence and items, added the MQTT 2.4 binding, and tried the Samsung AC binding, but then removed it as I couldn’t configure it.

Since then I have had a problem - every night my openHAB silently stops working in a strange way:

  • Item states do not update. events.log stops at about 1-2 AM. The problem usually happens at night, but once it also happened during the day.
  • openhab.log shows nothing after startup.
  • But the openHAB PaperUI and BasicUI are still working. Changing items in the UI doesn’t produce any reaction in events.log.
  • The Z-Wave binding does work (e.g. when I enable Z-Wave DEBUG in the console, I see communication with nodes), but the updates are not shown in events.log.
  • I also have Node-RED and an MQTT broker on the same RPi - both are working normally.
  • Rebooting the RPi usually helps.

The strange thing is that I don’t have any clue, as the logs show basically nothing. What is the best way to start debugging this issue?

My issue seems to be similar to this one: “After some time stops updating Items values”, but I doubt that the problem is the SD card, as I have a pretty new card and, as I wrote, I started getting this issue right after I played with the OH config.

Thanks
Artyom

Do all your logs in /var/log also stop updating at the same time? Is SSH login also broken at the same time?

My RPi3 stops working about once a month. Once the problem starts I can’t log in with SSH, but the Android app is still usable for a few hours and bits of OH appear to be working. Eventually I lose access completely and need to reboot. Once rebooted, I notice that all the logs had stopped updating hours earlier and don’t contain any of the events that the Android app had shown.
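A quick way to answer the first question is to list the log files that have not been written to recently. This is only a rough sketch: the function name `stale_logs` is made up, and the default directory assumes the usual openHABian 2.x log location, /var/log/openhab2.

```shell
#!/bin/sh
# List regular files directly inside a directory that have NOT been
# modified within the last N minutes (default: 30).
# Assumption: openHABian 2.x keeps openhab.log and events.log in
# /var/log/openhab2; pass another directory if yours differs.
stale_logs() {
    dir="${1:-/var/log/openhab2}"
    minutes="${2:-30}"
    find "$dir" -maxdepth 1 -type f -mmin +"$minutes"
}
```

For example, `stale_logs /var/log 30` covers the system logs and `stale_logs` with no arguments covers the openHAB ones. If only the openHAB files show up as stale while syslog keeps moving, the problem is more likely inside openHAB than in the SD card or the OS.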

I had the same issue early in the morning, and it was because InfluxDB was grabbing all the memory and thus “everything” died. There is a memory watchdog; you can check whether it triggered and whether that correlates with the time your updates stopped.
I then proceeded to switch to a normal computer.
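One way to look for this, assuming the "memory watchdog" here means the Linux kernel's out-of-memory (OOM) killer, is to filter the kernel log for its messages. The helper name `filter_oom` is made up, and the pattern is only a guess at the typical wording of those messages.

```shell
#!/bin/sh
# Keep only kernel log lines that look like OOM-killer activity.
# Assumption: the kernel words these events roughly as
# "... invoked oom-killer", "Out of memory: ..." or "Killed process ...".
filter_oom() {
    grep -iE 'out of memory|oom-killer|killed process'
}
```

Typical use would be `dmesg -T | filter_oom` (may need sudo) or `journalctl -k | filter_oom`. The kernel ring buffer is cleared by a reboot, so this is best checked before rebooting, unless the journal is stored persistently.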

Not exactly. My logs are at INFO level by default, so there is not much to log. The only active one is events.log, which at night updates roughly once every 10 minutes, when a new temperature measurement comes in. So it stops at some point. But I can easily log in via SSH to the openHAB console hours after that, change the log level to DEBUG and see openhab.log start flooding with messages which, as far as I can understand, mostly come from HTTP access to PaperUI and BasicUI. So access is not lost.

I don’t have InfluxDB as far as I know. I have had pretty good experience with my RPis - an RPi2 with OH 1.8 had uptimes of 6 months and longer. The RPi3+ with OH 2.4 was also quite stable until now…

I could roll back to the last backup before the changes, but I wanted to diagnose the problem before doing this.

There is no general debug guide.

Increase the overall log level to DEBUG (org.openhab and org.eclipse.smarthome).
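As a rough sketch of how that could be done from the shell: the Karaf console accepts `log:set <level> <package>`, and the helper below sends that command for both packages over SSH. The function name `set_oh_loglevel` and the `DRY_RUN` switch are made up for illustration, and the login (user `openhab`, port 8101 on localhost) assumes the openHAB 2 console defaults.

```shell
#!/bin/sh
# Set the log level for the two packages via the Karaf console.
# With DRY_RUN=1 the console commands are only printed, not sent.
set_oh_loglevel() {
    level="${1:-DEBUG}"
    for pkg in org.openhab org.eclipse.smarthome; do
        cmd="log:set $level $pkg"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$cmd"
        else
            # Assumption: default console login (user "openhab", port 8101)
            ssh -p 8101 openhab@localhost "$cmd"
        fi
    done
}
```

Run `set_oh_loglevel DEBUG` before the time the problem usually appears and `set_oh_loglevel INFO` afterwards, since DEBUG on these packages is very verbose. The same `log:set` lines can also be typed by hand after `openhab-cli console`.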

Undo half of your changes to see if the problem persists. If it does, iterate.

Yes, a reboot normally does help, but you should also clean the cache after making file changes to clear out the clutter.

Try:

sudo systemctl stop openhab2

sudo openhab-cli clean-cache

sudo reboot

It looks like the number of events has some impact. I’ve disabled my “Christmas” illumination, which was generating a couple of events every minute in the evenings, and OH lasted a bit longer - one and a half days or so.

I have uninstalled rrd4j persistence and cleaned the cache as recommended by @H102. Will see if it helps.

Removing rrd4j didn’t help. Finally I had to remove the MQTT 2.4 binding, and then it started to work. Is it not possible to run MQTT1 and MQTT 2.4 in parallel? Do they conflict somehow?

You can have both bindings but only one broker.

Yes, it’s possible, but both need to be configured to use a different client ID in the broker connection configuration (mqtt.cfg for 1.x and the broker thing for 2.4).
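The reason is that an MQTT broker drops the existing connection when a second client connects with the same client ID, so two bindings sharing one ID can keep kicking each other out. Below is a rough sketch for spotting a clash in text-based config. The broker name `mosquitto` and both IDs in the comments are made-up examples, the function names are hypothetical, and a broker thing created through PaperUI is not stored in a .things file, so it would not be covered.

```shell
#!/bin/sh
# Assumed example settings (broker name and IDs are invented):
#   /etc/openhab2/services/mqtt.cfg   (MQTT 1.x)
#       mosquitto.clientId=openhab-mqtt1
#   /etc/openhab2/things/mqtt.things  (MQTT 2.4)
#       Bridge mqtt:broker:mosquitto [ host="localhost", secure=false, clientID="openhab-mqtt2" ]

# Print every client ID found in the given files, one per line.
# Matches clientId=foo (.cfg style) and clientID="foo" (.things style);
# it does not handle spaces around the "=" sign.
list_client_ids() {
    grep -ihoE 'clientid="?[^",[:space:]]+' "$@" | sed -E 's/^[^=]*="?//'
}

# Print IDs that occur more than once; empty output means no clash.
duplicate_client_ids() {
    list_client_ids "$@" | sort | uniq -d
}
```

For example, `duplicate_client_ids /etc/openhab2/services/mqtt.cfg /etc/openhab2/things/*.things` should print nothing when the IDs differ. If no client ID is set at all on one side, this check has nothing to compare.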