- Platform information:
- Hardware: Raspberry PI 4gb
- OS: Raspbian
- Java Runtime Environment: 11 JRE
- openHAB version: 3
- Issue of the topic: Monitoring of openhab using monit
I am new to openHAB, but wanted to share my monitoring setup as this might help other users.
I have had openHAB running on Raspbian for a few days now, installed as a package rather than from the image. The reason is that I use the system for other purposes and prefer not to dedicate a server to openHAB. Today the page did not load, and the service status showed an error about too many open files. The service was reported as active, but the site did not load. A restart solved this, but that is beside the point. In the past I used Domoticz and had monit check that the service was working properly.
I already installed monit; search online for details (I used Install Monit on Raspberry Pi | Lindevs):
sudo apt update
sudo apt install -y monit
After installation, add a file named openhab in the /etc/monit/conf.d/ folder. In a terminal in that folder, run “sudo nano openhab”. I added the following content:
check host openhab with address 127.0.0.1
    start program = "/usr/bin/sudo service openhab restart" with timeout 120 seconds
    stop program = "/usr/bin/sudo service openhab stop"
    if failed port 8080 protocol http
        for 2 cycles then restart
    if 5 restarts within 5 cycles then exec "/sbin/reboot"
This tests the site every cycle (two minutes on my setup) and restarts the openHAB service if the site fails to respond twice in a row. The start timeout is two minutes because the site takes some time to become available. If openHAB is restarted 5 times within 5 cycles, the entire Raspberry Pi is rebooted. This setup does not send out notifications.
The monit log can be found at /var/log/monit.log
Test: stop the openHAB service (sudo service openhab stop); monit should kick in within 5 minutes and (re)start openHAB.
If anyone has improvements or knows a better way to monitor, feel free to share.
Currently also working on a plugwise stretch binding… hope to share that one in the near future.
Great! Thanks for sharing!
Not required, see openHABian | openHAB
Especially if you are, as you mention, using this machine for more than just openHAB, resorting to an automatic reboot seems to be a drastic action to take. You could easily end up in a situation where monit renders the whole machine unusable as it ends up in an infinite reboot cycle every few minutes, depending on how openHAB is failing.
This configuration also assumes that all problems that would require a reboot are caused by, or at least detectable as, openHAB failing to start and listen for connections on port 8080. Problems in the other services running on the same machine would go unnoticed, and so would openHAB problems that do not stop it from accepting connections on port 8080.
Ultimately, addressing the root problem will be a better approach overall. You should not be seeing a too many open files error, ever. openHAB itself doesn’t open that many files anyway (unless you are doing something unusual in a rule somewhere) so the root cause of the error is almost certainly one of your other services. Maybe it’s running crazy and that needs to be addressed. Maybe it really does need to open thousands of files at the same time (some database servers do this) in which case you can change the max at the OS level.
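To see which process is actually running out of file descriptors, something like the following could help. It is only a sketch: the function names `count_open_fds` and `max_open_files` are made up for this example, and it assumes a Linux system with /proc mounted.

```shell
#!/bin/sh
# Hypothetical helpers (names invented for this example), Linux /proc only.

# Print the number of file descriptors currently open by process $1.
# Inspecting a process owned by another user (e.g. openhab) needs root.
count_open_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Print the soft "Max open files" limit that applies to process $1.
max_open_files() {
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}
```

Run as root, the PID for openHAB could be looked up with `systemctl show -p MainPID --value openhab` (assuming the unit is named openhab, as in the first post) and passed to both functions. If the count keeps climbing toward the limit, that process is leaking; if a service legitimately needs more, systemd's `LimitNOFILE=` setting in a drop-in for that unit is one way to raise the limit.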
In general, openHAB is designed and expected to run continuously with only infrequent reboots so it’s better to solve the root problem than resort to a sledgehammer and reboot at the first sign of problems.
Having said all that, monit is an excellent tool to monitor services. I use it myself on most of my critical machines where Zabbix is too heavyweight. So thank you for posting this example. I’d recommend, though, that it send alerts rather than reboot.
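As a rough sketch of what alerting instead of rebooting might look like, here is a variant of the check from the first post. The assumptions are mine, not from the original post: a mail relay listening on localhost, a placeholder recipient address, and an HTTP test on port 8080.

```text
# Assumption: a local mail relay accepts mail on localhost.
set mailserver localhost
# Placeholder address, replace with your own.
set alert admin@example.com

check host openhab with address 127.0.0.1
    start program = "/usr/bin/sudo service openhab restart" with timeout 120 seconds
    stop program = "/usr/bin/sudo service openhab stop"
    # Assumption: openHAB serves HTTP on port 8080.
    if failed port 8080 protocol http for 2 cycles then restart
    # Stop monitoring instead of rebooting the whole machine.
    if 5 restarts within 5 cycles then unmonitor
```

With a global alert recipient set, monit mails on the failure and restart events, and the `unmonitor` action leaves the rest of the machine alone once restarts stop helping. The two `set` lines would normally live in the main monitrc rather than in the conf.d file.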
Monit is a good piece of software and does what it is supposed to do: it watches its own neighborhood. I use it in all of my setups to automatically restart the VPN when its ping fails. I also restart network interfaces when VPN communication fails for a few cycles. When X restart attempts fail and the VPN is still down (after about 30 minutes), I do a full reboot of the system. That is sometimes useful, especially when the network is unstable. In a couple of setups I found that routers sometimes keep DHCP leases despite the interface being cycled, and only a reboot helps there.
Overall, networks do not fail often, probably twice a year, but the last thing you want is to learn that your openHAB is down when you are 1000 km from home. Monit helps a lot with cheap cellular routers, which are prone to hiccups (there I see one or two restarts a month). For openHAB itself I use the systemd watchdog.