Advice for High-Performance HA openHAB Server Hardware

You can have Proxmox running on a single node, managing your VMs.
Extra nodes can be added at a later stage, so that you can utilise the VM replication mechanism. The minimum replication period is 1 minute, so in case of a node failover the VM started on the surviving node has pretty much the same state.
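
For reference, such a replication job can also be created from the CLI with pvesr; the VM ID, node name and every-minute schedule below are placeholders, so check the Proxmox docs for your version:

    # replicate VM 100 to node pve2 every minute (job ID 100-0 is arbitrary)
    pvesr create-local-job 100-0 pve2 --schedule "*/1"
    # check the state of all replication jobs
    pvesr status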

The real problem is the availability of the USB-connected devices, i.e. in my case the Zigbee and Modbus dongles. They would have to be duplicated and connected to both Proxmox nodes.

I tried it :slight_smile:
Then I decided to shut down one server and got a quorum error.
Proxmox does not like it when the other node goes to sleep.

I’m now reading through the Proxmox documentation and that looks like a very nice solution. I have two questions I can’t find a straight answer to.

  1. If I have a 3-node cluster, is the cluster manager running on one of the nodes, or is it running on all 3 nodes, so that in case of one server failing it will keep running on the other 2?
  2. Do I really need shared networked storage for such a config? I would still need a NAS to do that, yes? Or can that be done with Ceph, holding the VM images/backups within the 3-node cluster?
  1. I am not entirely sure about the internal mechanism Proxmox uses to decide which of the cluster nodes is the current manager, or about the other fencing mechanisms, but you can connect to either node via the web UI (:8006) and control the cluster from there. I haven’t tried power-cycling one of the nodes to see how my “cluster” would behave, but it remained operational during one of the server reboots. (Although only the VMs on the node that was not rebooted remained operational, as I didn’t migrate them.)
  2. From my experience, you don’t need shared storage. Provided that you are not moving massive amounts of data, the VM migration simply moves the virtual disk from local storage on nodeA to nodeB in a couple of minutes without pausing the VM (at least when ZFS is used for local storage).
    As far as I remember, all mount points have to be available on all nodes. So in the case where you have a single physical disk, as I do for backups, connected to nodeB, you can export it using NFS and have nodeA and nodeB mount it. I would say Ceph is overkill for a home network :slight_smile:
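
For completeness, such a migration with local disks can also be triggered from the CLI; the VM ID and node name below are placeholders:

    # live-migrate VM 100 to nodeB, moving its local ZFS disks along with it
    qm migrate 100 nodeB --online --with-local-disks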

I have a step-by-step guide in the works, but before everybody jumps on the bandwagon, here is a preview. Let me start with the overview; this is my Proxmox machine.


As you can see in the middle of the screenshot, this is a very cheap CPU without much power or RAM, so the total power consumption is about 15 W, but it is not busy at all.
I have:

  • openHAB (don’t need to explain that :smiley: )
  • TIG, i.e. Telegraf, InfluxDB, Grafana
  • Shinobi, doing my IP camera monitoring
  • UniFi Controller (network manager)
  • Fileserver (just a plain Ubuntu with an SMB share)
  • Nextcloud (this is where I share my documents/pictures between my devices/people)
  • Mosquitto (MQTT broker)
  • OPNsense (firewall etc.)

They all share the same host, running on a 2 TB HDD RAID1 with a 500 GB NVMe stick. The stick acts as read/write cache, which is easily done with ZFS. (https://www.youtube.com/watch?v=Ui3VTzUjlWk&)
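
If you want to replicate the cache setup, it boils down to something like this (pool and partition names are examples, not my actual layout):

    # NVMe partition 1 as ZFS intent log (accelerates sync writes),
    # partition 2 as L2ARC read cache, both added to the HDD mirror pool "tank"
    zpool add tank log /dev/nvme0n1p1
    zpool add tank cache /dev/nvme0n1p2
    zpool status tank    # the log and cache devices should now be listed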

I haven’t had this problem, but I had other problems, so I decided NOT to use a Proxmox cluster. I just have 2 separate Proxmox hosts with 2 different IPs; each has 2 LAN ports (one goes to the router, one to the LAN). No “sharing”; that was too complicated.

So because I have two separate machines, I can run the simple command
zfs send … | ssh … | zfs receive …
on one side, and it will move a snapshot to the other machine. If that is combined with zfs-auto-snapshot (a script that manages when snapshots are made, e.g. every 15 min keep 8, hourly keep 24, daily keep 7, weekly keep 4) and zfs-backup (which calculates the difference between the newest snapshot on the target and the newest snapshot on the host and then sends this incremental data to the target), you have 2 very similar machines. In my case, if one machine breaks, I will continue to have all those services with the snapshot from last night. (https://www.youtube.com/watch?v=2ozrP8ODPTw )
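
Spelled out, the manual version of that pipeline looks roughly like this (pool, dataset and host names are made up; the incremental send only works when the older snapshot exists on both sides):

    # initial full copy of the dataset to the second machine
    zfs snapshot tank/vm-100-disk-0@monday
    zfs send tank/vm-100-disk-0@monday | ssh backuphost zfs receive tank/vm-100-disk-0

    # afterwards, only the difference between two snapshots is sent
    zfs snapshot tank/vm-100-disk-0@tuesday
    zfs send -i @monday tank/vm-100-disk-0@tuesday | ssh backuphost zfs receive tank/vm-100-disk-0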

And for the failover part, I have a couple of simple scripts that run as cronjobs: they ping a service, and if it is down, they WOL the spare machine; on the spare host, another cronjob pings the services and starts the VMs (on that spare host).
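
Nothing fancy; a sketch of what such a cronjob can look like (IPs, MAC and VM ID are placeholders, and the wakeonlan package has to be installed):

    #!/bin/bash
    # cronjob on the always-on box: wake the spare host when the service dies
    SERVICE_IP=192.168.1.10          # IP of the openHAB VM (placeholder)
    SPARE_MAC=aa:bb:cc:dd:ee:ff      # MAC of the spare Proxmox host (placeholder)

    if ! ping -c 3 -W 2 "$SERVICE_IP" > /dev/null; then
        wakeonlan "$SPARE_MAC"
    fi

    # a second cronjob on the spare host itself then starts the replicated VM:
    # if ! ping -c 3 -W 2 "$SERVICE_IP" > /dev/null; then qm start 100; fi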

Indeed, one of the next steps is to move the USB sticks (Z-Wave and power meter) to a little Raspberry Pi that either channels them as virtual USB devices to the hosts… or I just install openHAB on that Raspberry Pi and use some MQTT or HTTP scripting to proxy the commands.
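
The “virtual USB devices” part would be USB/IP, which ships with the standard Linux tools; a rough sketch, with the bus ID and hostname made up:

    # on the Raspberry Pi that holds the dongles
    sudo modprobe usbip_host
    sudo usbipd -D              # start the USB/IP daemon
    usbip list -l               # find the bus ID of the dongle, e.g. 1-1.2
    sudo usbip bind -b 1-1.2    # export that device

    # on the Proxmox host (or inside the VM) that should see the stick
    sudo modprobe vhci-hcd
    sudo usbip attach -r raspberrypi -b 1-1.2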

Hope that helps to get you started - or you can wait a couple of days for my more detailed step-by-step guide.


Here is the solution that works flawlessly. I have never had a problem with the Z-Wave since, as it also helped solve a problem with the range (I have my server room in the basement).

Getting back to Proxmox, thank you for the guide. I still have one more question. Does a Proxmox VM in a cluster work in a way that it is hard-assigned to one of the node machines, so that when that node fails it needs to be recovered or migrated from a backup to another node?
Or can it be “distributed” among more than a single node for HA fail protection, so that when a node dies there is no downtime, just maybe reduced performance?

By the time I got to the end of the thread, it had taken a different turn towards virtualization being the answer. I don’t really agree with that, but it seems to be the way the OP is leaning currently.

Even going that route, the Z-Wave (or whatever other) USB stick(s) seem to be a sticking point. Just one of many problems I would imagine going that way. I am not sure what the answer to that is, and I guess I won’t bother throwing out any other options, as the point seems moot now.

But as I had read all the way from the beginning, I still wanted to include some replies to earlier posts.

The only ARM boards I am aware of with ECC (not talking about data-centre stuff here, but things readily available to us mere mortals) are the Kobol Helios4 (no longer available) and its new version, the Helios64, which will ship soon (shipping may have already started). But even the first batches of the Helios64 will not have ECC; that will come in a later revision.

Guess which board I am anxiously awaiting as my next purchase? :wink:

And yet, it gets recommended, over and over, on this very forum.

Sorry, Russ. I realize you may have heard all the arguments before in other threads, and I have new appreciation for your distaste for strife. However, the debate is extremely relevant to the topic; not only for the OP, but for anyone else coming along later as well.

No, not “SBCs in general”; only RPis are lacking this. I guess you must have completely missed this post?

(a moot point by now, perhaps)

My point was that the Pi is not broken. It was not designed primarily for the purpose that person wants. That does not mean the Pi is flawed, just that it is not the optimal tool for the task.

Debating the technical merits of the RPi with respect to the OP’s question is relevant. That’s not the direction the conversation was going, and I think @igorp was right to point out that we’re off topic and try to get it back on track.

And to be clear, if people can say why they like RPis, then people can say why they dislike them. I’ve actually been more inclined toward saying that while many users are happy to use RPis, some users have concerns about them. And if an OP asks what those concerns are, I can point them to conversations where it’s been discussed at length. Or better yet, introduce them to people with those perspectives.

Now we’re definitely off topic…:wink:


Just to put my 2 cents into the RPi thing… I love RPis, and I’m probably at the high end of the number of Pis running in anyone’s house, 14 being the number :slight_smile:
But they are not HA, and the SD card will fail sooner or later. Even a high-write-endurance DVR SD card failed in a Pi after 1.5 years of 1-second persistence writes.

That’s why I was asking for opinions on HA clustering; if that could be done with 4-5 RPis, that could be a solution (things like cloverPI are being developed). I would also love it to be a more or less industrial solution for a rack or DIN rail.

I think that for research purposes I will invest in a trio of Odroid H2+ boards, as they have more reliable NVMe storage, and someone designed a nice rack case for up to 8 blades: https://forum.odroid.com/viewtopic.php?f=172&t=36780. Still, it will be a home-made solution, and my original post’s intention was to ask whether such a turnkey solution with redundant PSUs and maybe an integrated switch is already commercially available; it looks like the conclusion is that it is not.


And that’s why I told you to check the ZRAM implementation in openHABian, which reduces the number of writes by a large factor, hence increasing SD lifetime by about the same factor.
Pragmatic and to the point.


I use a USB-to-SATA adapter that cost only 8 GBP with an RPi 3 B+ and a PoE HAT. This has been working flawlessly for around two years. I have connected a passive four-way USB hub to the Pi, into which I have plugged a Z-Wave stick, an RFXCOM 433 MHz stick and an 868 MHz CUL stick, which are all served out across the network (serial over IP) using ser2net.
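
For anyone who wants to copy this, the serial-over-network part can look roughly like this; the device path, port and baud rate below are examples rather than my exact config:

    # /etc/ser2net.conf on the Pi (classic ser2net 3.x format:
    # port:state:timeout:device:options) - expose the stick on TCP port 3333
    3333:raw:0:/dev/ttyACM0:115200 8DATABITS NONE 1STOPBIT

    # on the consuming machine, socat can turn that into a local serial device
    socat pty,link=/dev/ttyZWAVE,raw tcp:raspberrypi:3333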

The USB-to-SATA adapter allowed me to connect a spare SSD to the Pi, so I don’t need to worry about SD cards failing.

The PoE HAT allowed me to locate the Pi on top of a cupboard where there was no mains supply for a PSU. I could install a single Cat5e cable instead of routing a mains cable from the nearest socket.

Anyway, I wanted to achieve something similar; has anyone done something like this?

I have several RPis in my house, most of which do nothing, but I need them because they serve other important purposes (like a print server: we don’t use it that much, but it is always needed, and the printer is at a dedicated location).
It would be great if I could install openHAB on one of these and have it act as a hot-fallback device.
This is what I would need to achieve, and I don’t really know how to do it:

  • Daily config copy (simply create an openHAB backup, copy it to the fallback device and unzip it?)
  • Detect whether the openHAB instance on the main server is not running (not the whole computer being down, only openHAB; this would also make maintenance on the main server easier and without disruption). Maybe just check if the Basic UI loads or the REST API is accessible, or how? (A rough sketch of what I have in mind is below.)
  • Switch to the backup server if the main one is not available: start openHAB and somehow redirect openhabian.local?
  • Switch back to the main server once openHAB on that machine is up again.
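
Something like this is what I have in mind for the detection/switch-over part (an untested sketch; the hostname and service name are guesses):

    #!/bin/bash
    # cronjob on the fallback Pi: probe the main instance's REST API and
    # only run the local openHAB while the main one is unreachable
    MAIN=http://openhabian-main:8080      # hostname is a placeholder

    if curl -sf -m 5 "$MAIN/rest/" > /dev/null; then
        systemctl stop openhab2           # main is healthy, stay passive
    else
        systemctl start openhab2          # main is down, take over
    fi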

Does someone have some ideas for this?

Thanks!


I also use Proxmox, on three hosts: one four-core Xeon and two small Mini-ITX mobos. The VMs’ discs are all created as ZFS volumes, and I use Sanoid and Syncoid to send volumes between hosts. Syncoid sends ZFS volumes between hosts, and Sanoid creates policy-based snapshots. Snapshots can be specified like this:

[template_production]
        frequently = 0
        hourly = 0
        daily = 15
        monthly = 0
        yearly = 0
        autosnap = yes
        autoprune = yes

In the above example there will be 15 days of daily snapshots.

Syncoid is very useful for reducing downtime when sending a volume to another host. I run it twice: the first time whilst the VM is still running; this sends the bulk of the changes since the last time the volume was sent to the destination host (or the entire volume, if this is the first time). I then shut down the VM and run Syncoid again to send the changes that occurred between the first send and the end of shutting down the VM. This second send is very quick (it’s unlikely there were many changes to the VM disc in that period of time), and I can then start the VM on the destination host.

I’m sure the above process could be scripted.
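
A sketch of how that script might look, with the VM ID, dataset and host names as placeholders (and assuming the VM’s config already exists on the destination node):

    #!/bin/bash
    # two-pass move of VM 100 to host 'nodeb'
    syncoid tank/vm-100-disk-0 root@nodeb:tank/vm-100-disk-0  # bulk copy, VM still running
    qm shutdown 100 && qm wait 100                            # clean shutdown
    syncoid tank/vm-100-disk-0 root@nodeb:tank/vm-100-disk-0  # tiny final delta
    ssh root@nodeb qm start 100                               # bring it up over there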

I think I tried that, but because I moved away from an HA cluster to individual nodes, I cannot speak from experience. In the Ceph scenario, if the hosting node fails, another host will jump in, see https://www.youtube.com/watch?v=T8Wdko31JB4 . But with clustering there are downsides, and de-clustering is a mess. So I recommend not clustering, and achieving redundancy in a different way.

For clustering you will need at least three nodes (one is allowed to fail), and each of the nodes has to provide a data store for VM images.
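
Setting up the cluster itself is only a few commands (the cluster name and IP below are examples):

    pvecm create homecluster    # on the first node
    pvecm add 192.168.1.10      # on each further node: join via an existing member's IP
    pvecm status                # verify quorum from any node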

If you’re willing to put in the effort, check out https://clusterlabs.org/.

If you can read German (or maybe use Google Translate), the following blog may be interesting for you as well:

https://herc.de/de/openhab2-hacluster-mit-pacemaker-und-corosync/


Here are my scripts:
https://github.com/openPhiL/openHouse/tree/master/How%20to%20build%20it/4%20Solutions/Server%20Redundancy


Wow, thank you, I will definitely look into this; I am doing a new home just now.
I have a Proxmox for testing already running:

  • pfSense (with its own NIC for the firewall)
  • FreeNAS
  • an Ubuntu server to run all my IoT stuff

But I am still unclear about how to set up the storage right…

  • storage for Proxmox itself (I have a 256 GB SSD)
  • storage for FreeNAS (still did not buy any hardware)
  • I want my NVR to go there as well

Just wanted to share my headache :slight_smile:

I started with pfSense as well but ended up at OPNsense (because open)…
I tried FreeNAS, but in the end I had no use for it, only overhead.
I have multiple Ubuntu servers.

As for the storage, I chose not to separate Proxmox itself from the main storage: because the main storage is redundant, by using it for Proxmox as well I make the Proxmox OS failsafe too. ZFS helps to get good speed even with slower drives; you could use your 256 GB SSD as a ZFS cache.
I have not started with a real NVR yet; I hope my CPU power is enough.

Feel free to connect with me if you have open questions.
