Active/Passive failover setup

jimtng · October 28, 2020, 11:10am

I used to run openHAB inside Docker on CentOS 7. That server has an SSD boot disk, 2 spinning data disks, a spinning parity disk, and a spinning backup disk. The 2 data + parity disks form a RAID5-like setup using SnapRAID, and their data is copied / rsync’ed to the backup disk daily. The backup disk is regularly snapshotted for historical backups. I also occasionally do a manual copy to external disks, but this happens very rarely. I’ve been thinking about cloud backup but haven’t figured out which service to use (it will need to offer Linux CLI access similar to rsync).
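
As a rough illustration of a daily rsync copy with dated snapshots (a sketch under assumptions, not the exact setup described above): the paths and the function name `build_snapshot_command` are made up, and hard-linked snapshots via rsync's `--link-dest` option are just one common way to keep history cheaply.

```python
from datetime import date
from pathlib import PurePosixPath
from typing import List, Optional


def build_snapshot_command(source: str, backup_root: str, day: date,
                           previous: Optional[str] = None) -> List[str]:
    """Return an rsync command that writes one dated snapshot directory."""
    target = PurePosixPath(backup_root) / day.isoformat()
    cmd = ["rsync", "-a", "--delete"]
    if previous is not None:
        # Unchanged files become hard links into the previous snapshot,
        # so each extra snapshot only costs the space of what changed.
        cmd.append(f"--link-dest={PurePosixPath(backup_root) / previous}")
    # Trailing slash on the source: copy its contents, not the directory itself.
    cmd += [source.rstrip("/") + "/", str(target)]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths; review the printed command before running it,
    # e.g. with subprocess.run(cmd, check=True).
    cmd = build_snapshot_command("/mnt/data", "/mnt/backup/snapshots",
                                 date(2020, 10, 28), previous="2020-10-27")
    print(" ".join(cmd))
```

Building the command separately from running it makes it easy to inspect from a cron job's log before trusting it with real data.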

Today, my SSD boot disk died (Intel 120G). Luckily I have a spare SSD (Crucial 500MX) in the drawer prepared exactly for this event.

So I spent a good part of the day doing a clean install / setup of CentOS 8 (an upgrade from CentOS 7). Luckily I had backed up /etc to the backup disk, so I could restore most of the config easily, albeit manually (dnsmasq, samba, apache, udev rules, cron jobs, etc).

I know I should probably spend some time looking at configuration automation…

Anyhow, most of my Tasmota switches were configured to handle the button actions themselves, so they at least kept working as dumb on/off switches while MQTT was down.

Thanks to having openHAB in Docker, that part was back up and running painlessly. I also have Grafana, InfluxDB, zigbee2mqtt etc. inside Docker, and they were restored easily once I got Docker itself installed.

So I was thinking I should have a Raspberry Pi that syncs the rules / config for openHAB and is ready to take over. I could just manually move the Zigbee dongle from the CentOS server to the Raspberry Pi.

Has anyone done it like this, or is there a better (but simple) way?

mstormi · October 28, 2020, 11:37am

This or similar requests reliably come up every few weeks or months, and IT guys like ourselves in particular like to promote their clever and even-more-clever homemade more-or-less-hot failover solutions to this.
But the bottom line is that most are just parts taken from a comprehensive architecture and methods you would apply in a data center environment. As a standalone measure, they’re overly complicated, very prone to error and, most important, they don’t scale down to single [or dual Active/Standby] systems. They just aren’t worth the effort and risk for a single home automation server.

KISS! (keep it simple, stupid). I’m sayin’ that as a cloud architect that I am by profession.
Run on cheap hardware (RPi) and have cold spares ready for everything (SD, power supply, RF sticks etc).
Invest in a reliable, proven and fast backup+recovery concept instead.
Check out the openHABian auto-backup feature.

jimtng · October 28, 2020, 11:44am

Thanks, I should take a look at openhabian. Has it been able to run on rpi4?

mstormi · October 28, 2020, 11:59am

Sure. But don’t get the 8GB model. It works but has issues and it’s a waste of money.
And openHABian does not support Docker. But you don’t need it anyway.

rlkoshak · October 28, 2020, 4:36pm

I’ll second Markus’ advice.

First I highly recommend bumping up the priority on doing configuration automation. I cringed when you said it took most of a day to restore the boot drive.

And then I think your current setup is more than sufficient, good enough that it isn’t worth adding more complexity. Make your devices mostly usable when OH is offline (“You’ll never see an escalator out of order sign. Only escalator temporarily stairs. Sorry for the convenience” - Mitch Hedberg). Then when OH goes offline for some reason you won’t freeze or be unable to turn on the lights, and you can take your time restoring openHAB’s functionality. If you have good backups and/or automation you should be able to rebuild from scratch in a couple of hours of unattended activity. I’d spend my efforts making that possible rather than trying to create a warm backup instance.

denominator · October 28, 2020, 9:38pm

I third Markus’ advice.

I have openHABian on a Pi 3 running off an mSATA drive, with a backup SD card plugged in. It is powered by a USB power brick that can keep supplying power while it is itself charging, like a UPS. Unfortunately I have not been able to fully test my backup system because it has not failed yet.

I also have an SD backup card, not plugged into anything, that has a running openHAB and MQTT on it, so if all else fails I can still do everything I could do when I last backed it up.

I also have another Pi in case the original one fails, so I have a cold backup and a frozen backup in case of failure.

It would take me less than 5min onsite to restore a disaster due to openhabian’s awesome well tested backup procedure and me having the spare hardware onsite to fix it.

Use the 3-2-1 rule for backups.

The 3-2-1 backup rule is an easy-to-remember shorthand for a common approach to keeping your data safe in almost any failure scenario. The rule is: keep at least three (3) copies of your data, and store two (2) backup copies on different storage media, with one (1) of them located offsite.
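
To make the rule concrete, here is a minimal sketch that checks a list of copies against it. The `Copy` class, the function name and the example inventory are made up for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Copy:
    name: str
    medium: str    # e.g. "msata", "sd-card", "cloud"
    offsite: bool


def satisfies_3_2_1(copies: Iterable[Copy]) -> bool:
    """True if there are >= 3 copies, on >= 2 kinds of media, with >= 1 offsite."""
    copies = list(copies)
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


# Hypothetical inventory: everything is still in the same house.
onsite_only = [
    Copy("live system", "msata", False),
    Copy("plugged-in backup card", "sd-card", False),
    Copy("spare card in a drawer", "sd-card", False),
]
print(satisfies_3_2_1(onsite_only))  # False: no offsite copy
print(satisfies_3_2_1(onsite_only + [Copy("cloud upload", "cloud", True)]))  # True
```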

jimtng · October 29, 2020, 12:24am

Thanks for the tips on pi’s redundancy.

Does the mSATA drive plug into the Pi via USB, so the Pi boots off that instead of the internal SD card, which you are using as the backup?

Which power brick do you use?

I’m still struggling with the idea of configuration automation. I am wondering how it can be done when I want to upgrade, as I did from CentOS 7 to CentOS 8. There were a few differences in config, e.g. firewall, different repositories, etc.

If it was for an exact recreation (i.e. centos7 to centos7), might as well have some sort of a warm mirrored copy of the boot drive, which I think I might do from now on.

This server isn’t just for openhab, it acts as my nas, dns resolver, ntp server, plex server, etc.

However, I think I might move openHAB onto a Raspberry Pi because it’s easier and cheaper to have complete redundancy, including a spare Pi.

rlkoshak · October 29, 2020, 3:23am

Look at Ansible. There are a few posts on this forum about it.

About 90-99% of the stuff will be the same and you can edit the playbooks to handle the rest. Note that moving to a new version of the OS is not the same as restoring a machine to what it was before. But I’ve moved from Raspbian wheezy to jessie to buster with only one change required to my playbooks, and from Ubuntu 16.04 to 18.04 to 20.04 with no changes to any of my playbooks or roles. Of course I deploy almost all of my services as containers, which helps.

I manage literally all my VMs and RPis using Ansible. The only time I had trouble was when Raspbian dropped msmtp when I upgraded to Buster. But it only took an hour to write a new role for ssmtp and I was back up and running.

It’s also great for configuration changes. Let’s say I need to change my Gmail app password for ssmtp. I just need to update one variable, run the playbooks, and the change is pushed out to all four of my VMs and all five of my RPis.
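
The "one variable, many machines" idea can be sketched in plain Python. This is not Ansible and not an actual playbook, just an illustration of the principle; the host names, variable names and `render_all` are hypothetical, and the config keys are the ones commonly seen in an ssmtp.conf.

```python
from string import Template
from typing import Dict, Iterable

# One template shared by every machine; only the variables differ.
SSMTP_TEMPLATE = Template(
    "mailhub=smtp.gmail.com:587\n"
    "AuthUser=$gmail_user\n"
    "AuthPass=$gmail_app_password\n"
    "UseSTARTTLS=YES\n"
    "hostname=$host\n"
)


def render_all(hosts: Iterable[str], variables: Dict[str, str]) -> Dict[str, str]:
    """Render the config text for every host from one set of shared variables."""
    return {host: SSMTP_TEMPLATE.substitute(variables, host=host) for host in hosts}


hosts = ["vm1", "vm2", "rpi1"]  # hypothetical inventory
shared = {"gmail_user": "me@example.com", "gmail_app_password": "old-secret"}

shared["gmail_app_password"] = "new-secret"  # the single edit
configs = render_all(hosts, shared)
print(configs["rpi1"])
```

A configuration management tool adds the parts this sketch leaves out: copying the rendered file to each machine, setting permissions, and restarting services only when something changed.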

And since I store it all in a personally hosted GitLab, I pretty much only need to backup that GitLab, my databases, and a few other folders. OS and most software configs are built into the Ansible playbooks and get checked out of git and redeployed by Ansible or pulled from the most recent backup.

If I weren’t lazy, I could move my secrets to ansible vault and store my playbooks on GitHub (or some other git service) and I’d have off-site backup too. It’s on my to-do list.

I’m not just running openHAB either.

portainer
code server
openHAB 3
makemkv
HandBrake
ADE
Calibre desktop
openHAB 2
influxdb
mosquitto
grafana
grafana image renderer
Calibre server
plex
nextcloud
postgresql
elasticsearch
redis
guacamole
gitlab
tripwire deployed to all machines
fish with config deployed to all machines
ssmtp deployed to all machines
cron based backup scripts deployed to all machines

I’m sure I’m forgetting something, but I don’t have to remember. It’s all coded into the playbooks.
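
A minimal sketch of what a cron-driven backup script could look like, for illustration only; it is not one of the scripts mentioned above. The function name, archive naming scheme and `keep` parameter are assumptions.

```python
import tarfile
from datetime import datetime
from pathlib import Path


def make_backup(source: Path, dest_dir: Path, keep: int = 7) -> Path:
    """Archive `source` into `dest_dir` and keep only the newest `keep` archives."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    archive = dest_dir / f"{source.name}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # Store paths relative to the folder name, not the absolute path.
        tar.add(str(source), arcname=source.name)
    # The timestamp format sorts chronologically, so a plain sort is enough.
    archives = sorted(dest_dir.glob(f"{source.name}-*.tar.gz"))
    for old in archives[:-keep]:
        old.unlink()
    return archive
```

A cron entry would then call a small wrapper around `make_backup` once a day. The archive is only as useful as the last successful restore test, so it is worth unpacking one occasionally.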

denominator · October 29, 2020, 7:13am

On my main openHABian setup I have one of these: the mSATA drive is my boot drive, with no internal SD card. In a small external reader I have an SD card that openHABian backs up to using the auto backup.

If the mSATA drive dies I can unplug it and move the SD card to the internal slot.

If something catastrophic happens (dog eats Pi) then I can take another Pi and put the SD card in that. To make this work I have set up a static IP address. I also have another SD card that has an old backup on it.

As for the power brick, I got it as a Secret Santa present years ago, and I am sure you have one floating around. Plug it into the charger and see if it will still power a device. You can also buy one made specifically for the Pi.

I have not had many power failures here and have not set up NUT to shut down the Pi if needed. It’s on the to-do list.

For mission critical wife approval this is necessary for me. My spare pi is being used as a media player on the tv.