Advice for High Performance HA OpenHAB server hardware

Debating the technical merits of the RPi with respect to the OP’s question is relevant. That’s not the direction the conversation was going, and I think @igorp was right to point out that we’re off topic and try to get it back on track.

And to be clear, if people can say why they like RPis, then people can say why they dislike them. I’ve actually been more inclined toward saying that while many users are happy to use RPis, some users have concerns about them. And if an OP asks what those concerns are, I can point them to conversations where it’s been discussed at length. Or better yet, introduce them to people with those perspectives.

Now we’re definitely off topic…:wink:


Just to put my 2 cents into the RPi thing… I love RPis, and I'm probably at the high end for the number of Pis running anyone's house, with 14 being my number :slight_smile:
But they are not HA, and the SD card will fail sooner or later. Even a high-write-endurance DVR SD card failed in a Pi after 1.5 years of 1-second persistence writing.

That's why I was asking for opinions on HA clustering; if that could be done with 4-5 RPis, it could be a solution (things like cloverPI are being developed). I would also love it to be a more or less industrial solution for a rack or DIN rail.

I think that for research purposes I will invest in a trio of Odroid H2+ boards, as they have more reliable NVMe storage, and someone has designed a nice rack case for up to 8 blades: https://forum.odroid.com/viewtopic.php?f=172&t=36780 Still, it will be a home-made solution, and my original post's intention was to ask whether such a turnkey solution with a redundant PSU and maybe an integrated switch is already commercially available; the conclusion seems to be that it is not.


And that's why I told you to check out the ZRAM implementation in openHABian, which greatly reduces the number of writes, thereby increasing SD lifetime by about the same factor.
Pragmatic and to the point.


I use a USB-to-SATA adapter that only cost 8 GBP with an RPi 3 B+ and a PoE HAT. This has been working flawlessly for around two years. I have connected a passive four-way USB hub to the Pi, into which I have plugged a ZWave stick, an RFXcom 433 MHz stick and an 868 MHz CUL stick, all served out across the network via USB-over-IP using ser2net.

The USB to SATA adapter allowed me to connect a spare SSD to the Pi so I don’t need to worry about SD cards failing.

The PoE hat allowed me to locate the Pi on top of a cupboard where there was no mains supply for a PSU. I could install a single Cat5e cable instead of routing a mains cable from the nearest socket.
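For reference, serving serial sticks like that with classic ser2net takes one line per device in /etc/ser2net.conf; the TCP ports, device paths and baud rates below are just placeholders for illustration, not my actual setup:

```
# /etc/ser2net.conf  --  <tcp-port>:<state>:<timeout>:<device>:<options>
3333:raw:0:/dev/ttyACM0:115200 8DATABITS NONE 1STOPBIT   # ZWave stick
3334:raw:0:/dev/ttyUSB0:38400 8DATABITS NONE 1STOPBIT    # RFXcom 433MHz
3335:raw:0:/dev/ttyUSB1:38400 8DATABITS NONE 1STOPBIT    # 868MHz CUL
```

A timeout of 0 means the connection is never closed on inactivity, which is what you usually want for an always-connected binding.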

Anyway, I wanted to achieve something similar; maybe someone has done something like this?

I have several RPis in my house, most of which do nothing, but I need them because they serve other important purposes (like a print server; we don't use it that much, but it is always needed, and the printer is at a dedicated location).
It would be great if I could install openHAB on one of these so it could act as a hot-fallback device.
What I would need to achieve, and I don't really know how to do this:

  • Daily config copy (simply create an openHAB backup and copy it to the fallback device, then unzip it?)
  • Detect whether the openHAB instance on the main server is not running (not the whole computer being down, only openHAB; this would also make maintenance on the main server easier and without disruption). Maybe just check whether Basic UI loads or the REST API is accessible, or how?
  • Switch to the backup server if the main one is not available. Start openHAB and somehow redirect openhabian.local?
  • Switch back to the main server once openHAB on that machine is up again.
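The first three items could be sketched as a small script on the fallback Pi, run from cron. This is only a dry run that prints the commands; the hostnames, the backup path, and the `openhab` service name are all assumptions to adjust to your installation:

```shell
#!/bin/sh
# Dry-run sketch for the fallback Pi: set RUN= (empty) to actually execute.
# Hostnames, paths and service names below are assumptions, not openHAB defaults.
RUN="echo"
MAIN_URL="http://openhabian.local:8080/rest/"

# 1) daily config copy: pull a backup from the main server and restore it here
daily_sync() {
  $RUN scp openhabian.local:/var/lib/openhab/backups/latest.zip /tmp/oh.zip
  $RUN openhab-cli restore /tmp/oh.zip
}

# 2) health check: does the main instance answer its REST API with HTTP 200?
main_is_up() {
  code=$(curl -s -o /dev/null -m 5 -w '%{http_code}' "$MAIN_URL" 2>/dev/null)
  [ "$code" = "200" ]
}

# 3)/4) take over when the main is gone, stand down when it comes back
if main_is_up; then
  $RUN sudo systemctl stop openhab    # main is back: stand down
else
  $RUN sudo systemctl start openhab   # main is gone: take over
fi
```

Redirecting openhabian.local is the hard part; a floating IP (e.g. via keepalived) would be the usual approach, but I haven't tried it with openHAB.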

Does anyone have ideas for this?

Thanks!


I also use Proxmox on three hosts: one four-core Xeon and two small Mini-ITX mobos. The VM discs are all created as ZFS volumes, and I use Sanoid and Syncoid to send volumes between hosts: Syncoid sends ZFS volumes between hosts, and Sanoid creates policy-based snapshots. Snapshots can be specified like this:

[template_production]
        frequently = 0
        hourly = 0
        daily = 15
        monthly = 0
        yearly = 0
        autosnap = yes
        autoprune = yes

In the above example there will be 15 days of daily snapshots.

Syncoid is very useful for reducing downtime when sending a volume to another host. I run it twice: the first time while the VM is still running, which sends the bulk of the changes since the last time the volume was sent to the destination host (or the entire volume if this is the first time). I then shut down the VM and run syncoid again to send the changes that occurred between the first send and the end of shutting down the VM. This second send is very quick (it's unlikely there were many changes to the VM disc in that period of time), and I can then start the VM on the destination host.

I’m sure the above process could be scripted.
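For what it's worth, a dry-run sketch of that two-pass process might look like this; the VM id, dataset names and destination host are made up, and `RUN=echo` only prints the commands instead of executing them:

```shell
#!/bin/sh
# Two-pass syncoid migration sketch (dry run: set RUN= to actually execute).
# VMID, dataset names and the destination host are assumptions.
RUN="echo"
VMID=100
SRC="rpool/data/vm-${VMID}-disk-0"
DST="root@host2:rpool/data/vm-${VMID}-disk-0"

$RUN syncoid "$SRC" "$DST"              # pass 1: bulk send while the VM runs
$RUN qm shutdown "$VMID"                # then shut the VM down cleanly
$RUN syncoid "$SRC" "$DST"              # pass 2: only the small delta
$RUN ssh root@host2 "qm start $VMID"    # start the VM on the destination host
```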

I think I tried that, but because I moved away from an HA cluster to individual nodes, I cannot speak from experience. In the Ceph scenario, if the hosting node fails, another host will jump in; see https://www.youtube.com/watch?v=T8Wdko31JB4 . But with clustering there are downsides, and de-clustering is a mess. So I recommend not clustering and achieving redundancy in a different way.

For clustering you will need at least three nodes (one is allowed to fail), and each of the nodes has to provide a data store for VM images.

If you're willing to put in the effort, check out https://clusterlabs.org/.

If you can read German (or maybe use Google Translate), the following blog may be interesting for you as well:

https://herc.de/de/openhab2-hacluster-mit-pacemaker-und-corosync/


Here are my scripts


Wow, thank you, I will definitely look into this; I am doing a new home just now.
I have a Proxmox for testing already running:
pfSense (with its own NIC for the firewall)
FreeNAS
Ubuntu server to run all my IoT stuff

But I am still unclear about how to set up the storage right…
storage for Proxmox itself (I have a 256 GB SSD)
storage for FreeNAS (still did not buy any hardware)
I want my NVR also to go there

Just wanted to share my headache :slight_smile:

I started with pfSense as well but ended up at OPNsense (because open)…
I tried FreeNAS, but in the end I had no use for it, only overhead.
I have multiple Ubuntu servers.

As for the storage, I chose not to separate Proxmox itself from the main storage: because the main storage is redundant, using it for Proxmox as well makes the Proxmox OS failsafe too. ZFS helps to get good speed even with slower drives; you could use your 256 GB SSD as a ZFS cache.
I have not started with a real NVR yet; I hope my CPU power is enough.
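Attaching an SSD as a ZFS read cache (L2ARC) is a one-liner with `zpool add`; the pool and device names below are placeholders, shown as a dry run:

```shell
#!/bin/sh
# Add an SSD as an L2ARC read cache to an existing pool (dry run: set RUN= to execute).
# "rpool" and the device path are placeholders for your own names.
RUN="echo"
$RUN zpool add rpool cache /dev/disk/by-id/ata-SSD256-part1
$RUN zpool status rpool   # afterwards, the device should appear under "cache"
```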

Feel free to connect with me if you have open questions.


Let's talk storage a bit more…
If I have a 3-node Proxmox cluster with every node having a single SSD, does Proxmox somehow take care of creating a common spanned disk space across those 3 nodes to host the VMs, or does every node only have access to its own single (failure-prone) drive?
What would be a setup that is truly resilient to one node failure?
If the node that is running the openHAB VM dies, from where do I recover the VM to run it on the spare node that was on standby?
Should I have a separate NAS to hold a RAID-protected network share where the VMs run (would that not be slow?)

There is an excellent multi-part guide at ServeTheHome that details the construction of an HA cluster exactly in the way I was intending for an uber OH server.


Again, I think that especially in home automation there is a bunch of more or less bulletproof, stable and reliable hardware actuators, not hardware servers. You should aim to use these stable actuators; it would be less costly (in terms of money and time spent) to use these appliances instead of routines, logic and stuff within openHAB.

So, in terms of High Availability and reducing possible points of failure:

  1. get robust, stable hardware actuators
  2. use their interfaces to connect that stable hardware to openHAB
  3. use openHAB to, let's say, orchestrate or bridge inter-hardware use cases
  4. don't use openHAB as a central and single point of functionality/logic

Just to give an example:

  • don't connect valves to openHAB to regulate your central heating
  • instead: use a robust (local! not cloud-dependent!) heating solution
  • and use the interface of your heating solution to read the environment variables of your heating, to trigger some sort of actions in openHAB for "luxury" purposes
  • you avoid failures along the way, from a misconfigured openHAB to a failing SD card to a broken IP connection, …
  • if your openHAB dies, you still have a warm house, even if you can't see temperatures or have openHAB tell you to close the windows in the room upstairs…

Don't fall for the fallacy that openHAB has to organise everything, know everything, and work as a central intelligence. It's not designed for that. It's a central logic layer which extends already-in-place appliances and their logic.

Ouch. Sorry, but that's total nonsense.
You want OH to be your central controller; otherwise there's so much stuff you will not be able to implement.

What you probably wanted to say is:
have another control layer that works on lower levels and without openHAB, such as the device-to-device communication in KNX or ZWave.
But that's a mere fallback solution to cover only the most basic functionality (such as one light per room).

PS: it’s “actuator” in English. A 230VAC powered actor is … “not the brightest idea” … well, maybe for a couple of minutes it is, but mind the smoke :wink:


While I fully understand the fascination of a failover solution for OH, for the sake of learning and for it really being cool (and I have toyed with that thought myself quite a bit): Proxmox makes HA clusters look deceptively simple at first glance. However, once you think it through, it gets really complicated quickly. What these guides do not mention is the need for separate networks (Proxmox highly recommends putting the corosync traffic on a separate network), the need for a redundant storage option, and the need for fencing devices (to isolate misbehaving cluster nodes). Pair this with the need to make all switches redundant (you may need stacked switches for both networks; otherwise, if one of your switches goes out, your whole cluster is dead), etc., and this reaches a level of complexity and price that exceeds my willingness to pursue it. Just my thoughts (the above is all described in the Proxmox docs in more detail).

yes, absolutely, changed it to be more specific on that part. and also actuator. :wink:

If you work in HA environments for, let's say, enterprise e-commerce systems, then I understand the need for this in your home as well:

  • separate networks, or at least VLANs for all your appliances, including QoS rules for important traffic, also separating your internal traffic (smart TV, notebook, smartphones, …) from your smart-home traffic
  • two internet providers, including failover
  • having every piece of hardware at least twice, with a failover strategy (and I mean every piece of hardware, from a single light bulb to separate power cables or hardware hubs, two bus cables, network cables, two power supplies, a diesel engine on standby, …)
  • best case, you have two different means of sending commands to your actuators, to avoid failure of one transport medium

having openHAB on an HA cluster is only the tip of the iceberg, if you ask me. And even IF you have openHAB on some sort of HA cluster, how do you ensure it is failsafe:

  • avoid crashes of openHAB after a misconfiguration of openHAB itself
  • provide all openHAB cluster nodes with the same configuration on distributed systems
  • provide openHAB with redundant persistence for your states and item values
  • provide openHAB with a distributed I/O layer

you could do that, and I admit this would be a SHITLOAD OF FUN - but then again: it would cost a fortune - of money and spare time… :wink:


This is a bit off-topic, but indeed, the master furnace on/off switch is the single functionality in the house that I did not trust to OH. The prospect of having a 1.5 m³ water tank in the basement boil, hoping that the emergency valve will do its job at 2.5 bar, is not a scenario I want to be part of. Here I rely on the furnace thermostat probe and its controller. However, if that were to fail, I have OH to at least warn me that the tank temperature is above anything it should be. But for the low-temperature room circuits, of course, I use OH to regulate the heating valves.

As for HA costs being prohibitively expensive: I think that with 3 nodes made of Odroid H2+ boards and 2 x CRS305 switches from MikroTik for 2.5 Gbps Ethernet, I can pull that off below 1000 Euro, and that is below the price of the single-PSU QNAP I'm running the house on now.