What's your reliable HA architecture? Mesh + central controller?

In the Zigbee world you would have two solutions for this:

  • use a dimming actuator that is connected to your existing switches and can dim conventional light bulbs/LEDs without a gateway
  • use a Control Unit controlled by your existing switches and create a direct binding to the smart light (Zigbee Light Link), omitting the need for a gateway.

The challenge is that if you want to add intelligence on top but still keep the simpler wired controls, the whole system becomes very complicated. Cars have the same challenge: for example, building a drive-by-wire system while still keeping the ability to control the wheels mechanically. This creates all those complex coupling/decoupling mechanisms and their associated problems.

So in an amateur home installation, if you do it wrong you will most likely get more problems than benefits. Architecture selection is the most critical thing here.

So that’s why I’m a supporter of the completely smart home approach - i.e. you control your home only through the controller and do not override it or back it up with wires. So yes - if your controller goes down, the light will not work.
In this case you should concentrate on increasing the reliability and availability of the controller. I recommend a centralized system and not a mesh. Wireless actuators and sensors are not a problem - they fail independently, so an unreliable link means you will lose just one lamp - which is redundant anyway. But with a central controller it is much easier to manage failure modes and recoveries.

As for a backup + restore process vs. redundant hardware - the first will not work with the above approach - the controller could fail when nobody skilled is at home (which is most often the case), so nobody will be able to switch on the light or power up the backup controller. Thus you need redundancy with automatic changeover.

That is not a complex thing at all, regardless of what others say: you just need a clone of your main controller, which can be as simple as an RPi + a cloned USB Z-Wave stick. Then you buy a couple of cheap Wi-Fi or Z-Wave relays - I use Shellys for example. You power your main and cloned controllers via these relays and set them up as simple watchdog relays - one normally off and the other normally on - i.e. the main controller, when operating, periodically sends an ON command to one relay and an OFF command to the other to reset them. This keeps them in these states as long as the main controller is alive.
When the main controller fails for any reason, it stops sending the reset commands. The relays switch, the main controller is powered down and at the same time the backup controller is powered up. This process takes just a few minutes and your smart home functionality is recovered. Yes, you will lose the last states and persistence, but this won’t be a big issue - your house will just restart. And your backup controller doesn’t need to be as sophisticated as the main one. It can be less powerful and perform just the basic functions of the main one - e.g. no sophisticated scenarios, Google calendars or whatever. So you can save on hardware cost.
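
A minimal sketch of the watchdog changeover described above, written as a plain Python simulation rather than a real Shelly or Node-RED configuration. The class name `WatchdogRelay`, the function `simulate`, the 60-second heartbeat and the 300-second timeout are all assumptions for illustration; they are not values from the actual setup.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WatchdogRelay:
    """Relay that falls back to a fixed state when no command arrives in time."""
    fallback_on: bool        # state taken when the watchdog timer expires
    timeout_s: int           # seconds without a command before falling back
    is_on: bool              # current contact state
    last_command_at: int = 0

    def command(self, on: bool, now: int) -> None:
        # Every command from the main controller also resets the timer.
        self.is_on = on
        self.last_command_at = now

    def tick(self, now: int) -> None:
        if now - self.last_command_at >= self.timeout_s:
            self.is_on = self.fallback_on


def simulate(main_fails_at: int, end: int,
             heartbeat_every: int = 60, timeout_s: int = 300) -> Optional[int]:
    """Return the second at which power moves to the backup, or None."""
    # Relay feeding the main controller: held ON by heartbeats, falls back to OFF.
    main_power = WatchdogRelay(fallback_on=False, timeout_s=timeout_s, is_on=True)
    # Relay feeding the backup controller: held OFF by heartbeats, falls back to ON.
    backup_power = WatchdogRelay(fallback_on=True, timeout_s=timeout_s, is_on=False)

    for now in range(end + 1):
        main_alive = now < main_fails_at
        if main_alive and now % heartbeat_every == 0:
            main_power.command(True, now)
            backup_power.command(False, now)
        main_power.tick(now)
        backup_power.tick(now)
        if backup_power.is_on and not main_power.is_on:
            return now
    return None


print(simulate(main_fails_at=1000, end=2000))  # 1260
```

In this model the last heartbeat before the failure arrives at second 960, so both relays fall back 300 seconds later. The boot time of the backup controller comes on top of that.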

Simple, isn’t it? If it doesn’t sound like it, just ask - maybe I didn’t explain it clearly enough. IMHO this is much less effort than investing in a wired installation and solving the conflicts of local/remote control. I have a flow in Node-RED which manages this process, which I can share.

That’s pretty much wrong.
Sorry to step in but you’re recommending to go for a system that has serious flaws by design.
It’s fine to use that in your own home but you should not recommend it to others, let alone beginners.

First, it applies only to all-via-controller setups, which only you advocate - none of us does. For some of their flaws, see our posts above (and many others on the forum; all of us could give lengthy talks on that).

Second, there are many risks and pitfalls in building failover systems. You always need to keep configurations in sync - and hot standby or active-active versions would even require keeping the OH state consistent. That’s a highly complex and difficult thing and the opposite of K(eep) I(t) S(imple) S(tupid).
Being a cloud & data center architect by profession I know what I’m talking about and why I don’t want to replicate that in a home owner’s context.

Third, with proper preparations, you can have others replace failing hardware.
That’s exactly what the auto backup function in openHABian was built for.
Besides, getting the replacement isn’t urgent anyway, as lights will still work (not to mention the hassle you will have with any DIY controller-only solution if you want to sell your home one day).

My day job as a consultant involves HA (high-availability) firmware on equipment you might use in your data center: broadband routers. At least the device I support is, from the customer's configuration perspective, identical to a single device. When you have an HA pair, you just get some operational status and an alarm in case HA breaks for some reason.

Of course this is a complicated problem, especially with community-only development. You would also need dedicated hardware to have a controlled environment, so complexity escalates pretty quickly … but I think it’s worth pursuing this goal. Having dedicated hardware just to save support resources should already be high on the list.

Markus, we always keep discussing this topic over and over.

I’m a professional designer too, but of embedded real-time control systems for critical applications, which are far closer to home automation than cloud and data centers. And in my systems redundancy is often built in. You don’t have to explain to me what sync and failover are.

If you want to talk about flaws, risks and pitfalls - please, let’s do it here. I’m sorry I’m not on this forum too often; that is because my redundant OH installation just works for months and years without any attention. BTW, I donate 1 euro to the OH foundation for every month of trouble-free OH operation, even though I don’t use the OH cloud or the latest version.

Take keeping configurations in sync or syncing states, for example. Yes, this is required if you want a so-called bumpless changeover - so that the user doesn’t see any difference when the changeover occurs. And this is a complex thing, I know. But we don’t need it in home automation. Really - if your main controller fails and the light goes off, you just touch the panel and switch it on again - what is the problem? Also, almost all home automation features are self-recoverable - i.e. they don’t need state synchronization. If your home controller just reboots during the day or night - will you need to perform any manual steps? I don’t need to.

Also, in your backup+restore proposal you don’t have any sync between the hardware either, so in this respect my method doesn’t differ at all from yours, except that my system will do the changeover automatically, without any user interaction, and within 10 minutes at the latest.

So please, let’s stop talking about authority or wrong/right. Just tell me about the design flaws in my setup - two identical cloned controllers with cloned Z-Wave sticks, two watchdog relays. First is on, second is off. Fail - first goes off, second goes on. Keep it simple, stupid.

Really? My light switches report to OH when they are changed. So openHAB always knows the states of the switches. I don’t have to deal with anything complicated. Maybe I just don’t understand what people are talking about, but if I install a Shelly 1 behind a wall switch, for example, I just have to hook it up to the existing wires and the existing wall switch, and now I know the state of the lights at all times, and I can control them from the switch or from openHAB. And because openHAB can control it, the light can be automatically controlled with anything from something as simple as “turn on the light at 15:00” to Christoph’s Design Pattern: Bayesian Sensor Aggregation and beyond.

Why is having the physical wall switch so much more complicated? There are no “redundant” wires or alternate control paths or anything like that. Just a smart relay and the wires that are already there.

As I’ve already said, except for controlling color (and that’s even starting to change) there is nothing automation wise that I can’t do with smart switches that is possible with a wireless configuration. And then the redundancy of manual control is built in.

I honestly don’t care what people ultimately do with their own system. That’s their business. But I do push back when people assert “X is really complicated” or “Y is really hard” when in my experience that is not the case.

Sorry, but as far as I know a Shelly doesn’t report the status of its switch input. So you basically don’t know which position your switch is in now. You can only see the status of the relay. Many Z-Wave relays behave the same way, and this is the local/remote problem - OH will always override the switch action. Imagine the lamp is also controlled by a motion sensor via OH. What can the switch do in this case?

Mine must be doing magic then, because I get the status back in OH within milliseconds when I physically flip the switch. The same goes for my Z-Wave and Zigbee switches, outlets and smart plugs.

I always know what state my smart switches are in. I’ve never had a case where the light was ON but OH thought it was OFF or the other way around.

I’ve implemented this exact case, and more.

If it’s the right time of day, when motion is detected and the light isn’t already ON, openHAB turns it ON by flipping the switch to ON. If someone physically flips it OFF, openHAB gets that the light is now OFF unexpectedly (i.e. manually flipped) and understands the person doesn’t want the light to respond to the motion sensor any more so it overrides the motion sensor events and stops turning the light back on, for a time of course.

Whether openHAB turns on the light or a person flips the switch, the Item representing the light changes to ON, corresponding with the state of the light the switch controls.
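
The motion/override behaviour described above can be sketched roughly as follows. This is plain Python, not an openHAB rule; the class `MotionLightController`, its method names and the 30-minute default override are hypothetical. The sketch also assumes the controller itself never commands OFF, so every reported ON to OFF change is treated as a manual flip.

```python
class MotionLightController:
    """Motion-triggered light that backs off after a manual switch-off."""

    def __init__(self, override_s: float = 1800.0):
        self.override_s = override_s   # how long to ignore motion after a manual OFF
        self.light_on = False          # mirrors the Item that represents the light
        self.override_until = 0.0      # motion is ignored until this time

    def on_motion(self, now: float, right_time_of_day: bool) -> bool:
        """Return True if the controller switches the light ON."""
        if not right_time_of_day or self.light_on or now < self.override_until:
            return False
        self.light_on = True           # a real setup would send ON to the switch here
        return True

    def on_switch_report(self, is_on: bool, now: float) -> None:
        """Called whenever the smart switch reports its state."""
        if self.light_on and not is_on:
            # Nobody but a person turns the light OFF in this sketch,
            # so suppress motion events for a while.
            self.override_until = now + self.override_s
        self.light_on = is_on


ctrl = MotionLightController(override_s=600)
ctrl.on_motion(now=0, right_time_of_day=True)    # light goes ON
ctrl.on_switch_report(is_on=False, now=10)       # someone flips it OFF
ctrl.on_motion(now=20, right_time_of_day=True)   # ignored, override active
```

Either way, the state held by the controller follows what the switch reports, which is why the Item and the physical light stay in step.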

I did not say that I want to keep wires.
For a dimming feature, or other features that imply having a control element located directly on the light element, it seems simpler not to use wires to control the light element. That is what I called an “advanced installation” in a previous post.
In a setup with only on/off controlled lights, it seems simple to have a smart wall switch that powers the light element on or off over the existing wire.
When I said keep wires, it was only for this specific case.

From what I understand, @mstormi and @Artyom_Syomushkin are talking about the same thing. There was a misunderstanding about states backup.

The automatic backup solution seems to be a good feature in a smart home.
The way @Artyom_Syomushkin implements it seems very interesting to me and simple to implement. I will take some time to implement it. Thanks for this idea.
As Artyom said, the central controller could fail when nobody skilled is at home, so if there is no way to use the lights without the central controller, nobody will be able to switch on the lights. Lights and other important elements must be able to work with or without the central controller. The central controller should just add more features via rules and scenarios, but should not add failure risks or reduce the availability of the lights.

My thinking at this point is as follows:

  1. For simple elements like wall switch + light bulb:
    Add a controller behind the wall switch, to make it a smart wall switch. This local controller actuates a relay that powers the light on or off, using the existing wires. This local controller can be triggered by the wall switch wired to an input of the controller, or wirelessly via a central controller like OH. Every time it is triggered (and why not at a specific interval too), it sends its state to the central controller. That is what @rlkoshak uses. This way, lights can be controlled even if the central controller fails.

  2. For advanced elements, like dimmed lights with a dimmer switch:
    Add a controller behind the wall switch to make it a smart wall switch, and add a controller on the light element (like Hue).
    As in 1), the smart wall switch can be triggered locally by the wall switch or wirelessly by a central controller like OH.
    The dimming value can be transmitted to the light element from the central controller or from the smart wall switch. It should be possible to transmit it even if the central controller is down. In this case, avoiding a single point of failure is highly advisable. Using a star network with a central gateway or MQTT broker is not advisable. A mesh network like Zigbee, Z-Wave or similar seems to be the way to go. The power on and power off feature can be realized as in 1), or via this robust wireless communication protocol.
    States of elements are transmitted on every change, as in 1), so the central controller always knows which ones are on and which are off.

  3. For convenience, a failover solution with a second controller could be implemented as Artyom suggested. To keep it simple, this second controller does not synchronize states, but is only synchronized when upgrades or modifications are made to the setup, like adding or deleting an element (wall switches, lights, etc.).

  4. Bonus point:
    I have started to think about graceful degradation. As I will mainly go DIY on all the elements, it could be interesting to add a forced-power button or a forced-dimming button on advanced light elements (those with a controller located directly on the light and controlled via the mesh network). This way, even if the central controller + smart wall switches + mesh network all fail, lights can always be powered. It is less convenient to use, but it avoids a complete failure.

Those 4 points allow:

  • Not having a single point of failure.
  • Keeping nodes and elements functional if the OH central controller fails.
  • Having a degraded operating state that is easily usable by non-qualified people (point 4).
  • Keeping the whole installation simple. It just needs 1 or 2 Pis with OH and some Zigbee/Z-Wave elements (and some DIY skills for the advanced elements…)

What do you think about this strategy?

All good points. It’s nice to see people discussing reliability and redundancy.

I use Tasmota on as many of my devices as I can. It allows you to control things at the device, so I have a distributed control system. If anything breaks, only that thing will not work.

MQTT is used to connect all devices, and if a device goes offline it will try to use HTTP. openHAB is the puppet master.

As advanced users we can do anything we want. For the home enthusiast I recommend an RPi running openHABian with SD card mirroring, and if you have a UPS or power brick, run it off that.

The next step would be a second Pi with a mirrored SD card, static IPs in the config and a changeover switch so you can only power one Pi at a time. Yes, the SD card will need to have a working config on it. Maybe pull a backup from GitHub.

Anything after this is a waste of TIME, as it will take you longer to set up and test than the expected downtime.

There are “smart” dimmer switches too. I highly recommend looking into them. Only if you need to control the color of the bulb would you need a setup like the one you describe here. As with ON/OFF, these report the current dimming level. There is some added complexity, though, for certain devices around whether or not they return to their previous dimmed state when OH turns them ON. You’ll have to read up before you buy something to find out what its behavior is and whether it’s acceptable for your use case.

But for all intents and purposes, the Dimmer case is exactly the same as your 1 case.

Why if you still have the central point of failure in openHAB? You’ve not actually solved your single point of failure problem. You either need to make the switch talk directly to the light (which is sometimes possible) or you still have the single point of failure and you’ve limited your choices with no gain. And if you come up with some redundant way to have a hot swappable instance of openHAB, why can’t you do the same for the MQTT broker and your WiFi AP hardware?

I’ve found in a lot of cases, both professionally and personally, that if your system degrades gracefully you often don’t need to avoid the single point of failure. The system can limp along in a degraded state long enough for the failed part of the system to be restarted or replaced. Conversely, if you are going to go full-on redundant everything, then why worry about degrading gracefully? Striving for both can in some cases add complexity.

Gotta have two Pis for redundancy with some sort of swap over approach and that comes with added complexity in keeping the two in sync as you tinker, adjust, and build your home automation.

Yes, no and yes … I don’t question your motivation or competence but frankly I’m tired of discussing it.
Tired of the past year, of Corona, of the last days to get OH3 ready.
The central controller is a complex computer that has many states, communication relationships and dependencies. It’s not a simple relay that can only be on or off as you have put it; it’s a server rather than an embedded controller. Think networking (MAC and IP addresses). Both servers need to be network connected to sync and to take the active role at any time.
And this is not a car you can drive to your repair shop once a year to have its controllers updated.
This is about continuous change of the home going on while you (and others, usually) live in and use it.

But you can use retractive switches? Even if openHAB is down, the likes of Fibaro dimmers can still turn on/off and dim; you just lose colour without openHAB and scenes. And you never need to know the position of the switch as they’re retractive. Keeps the missus happy. And if you were to use, say, a Hue bulb, then you just set the default power-on behaviour to come on at full power warm white (in case it was blue when it failed).

Great. So you get back the status of the relay and not of the switch. Agreed, this works. But if your openHAB controller fails, your system returns to strictly manual mode - i.e. your motion sensor will just stop working. But what if you need to have it working? Most people expect direct interaction between the motion sensor and the lamp (as in a mesh) when OH doesn’t work. Obviously this is not possible in your scenario, right? Same with heating, for example. Your heating should keep running and controlling your house temperature even when OH has died. You can’t bring it to fully manual mode; you still need some kind of sensor-based temperature control, and this will require a smart thermostat. But when OH is alive you want more sophisticated control, and direct “mesh”-like interaction will just interfere with your main controls. How would you resolve this conflict?

As a side note about switches - yes, you can add them anywhere to have the means of backup control. But I would need dozens of those if I want to have backup for all actuators - for example in my livingroom I have 7 Roller-shutters. Switch for each of them? Ha-ha!

Most certainly not, they do not. The system is then in (gracefully) degraded mode, where all they expect is that flipping the switch will turn on the light.

It still is, if you want it to be. If you use e.g. Z-Wave sensors as @delid4ve suggested, you can set them up to send commands to both an actuator and the controller, so that’ll still work when the controller is down. KNX would allow for this too, I believe.

So this means they are using OH only for hobby-like functions. My approach allows putting something more serious on OH, expanding its niche.

This is again the conflict which I asked @rlkoshak to resolve. What if you don’t want the actuator to accept sensor input directly while OH is running and controlling your installation? Same example as above - a motion sensor and a light relay (actuator). You can configure the motion sensor to send commands directly to the relay and activate the light whenever movement is detected - yes, the Z-Wave mesh feature allows this easily, as does KNX, and this will work without any central controller. But when you have a controller you might want to change this algorithm and make it more user friendly. You might want to block the light activation sometimes, e.g. when you are watching a movie, or activate movement detection only at certain times when it’s dark, or reduce the light brightness after 21:00. So you will need to modify the command in the controller before it reaches the actuator. But none of this will work if you have a direct sensor-actuator link. Therefore I say - mesh is useless. Or explain to me how you can keep it working like this.
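
To make the point concrete, here is a minimal sketch of the kind of decision a controller can make between the motion event and the actuator, and which a direct sensor-to-relay link cannot. The function name `light_command` and the brightness values 100 and 30 are assumptions for illustration only; the conditions are the ones listed above.

```python
from typing import Optional


def light_command(is_dark: bool, movie_mode: bool, hour: int,
                  full: int = 100, evening: int = 30) -> Optional[int]:
    """Decide what to send to the light when motion is detected.

    Returns a brightness in percent, or None if the light should stay untouched.
    """
    if movie_mode:
        return None          # blocked while a movie is running
    if not is_dark:
        return None          # motion only matters when it is dark
    if hour >= 21:
        return evening       # reduced brightness from 21:00 onwards
    return full


print(light_command(is_dark=True, movie_mode=False, hour=19))  # 100
print(light_command(is_dark=True, movie_mode=False, hour=22))  # 30
print(light_command(is_dark=True, movie_mode=True, hour=22))   # None
```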

That sort of provocative wording is yours (only). That’s what I’m tired of.

It depends on the actuator’s capabilities if that would work.
But you’re missing the point. This is a design decision you have to take in one way or another.
You would only do that on lights that you believe need to work even in degraded mode and that don’t have a wired switch. Given that we have optimized MTTR, you will run in this degraded mode for just a couple of minutes.
You could even add emergency-only switches. I for instance have a ZWave remote to cover that case.
Placed right next to my bulb light that I might need in case of power outages.
Hardly ever used it although my controller is relatively often down as I use my home as a test bed.
To sum up, if you properly designed your home that scenario won’t exist.
So it’s a purely hypothetical question hence irrelevant to the architecture discussion.

I can’t say but I don’t see why not. I’m in the US so those are not that common. I have mine set up with a Decora rocker switch where flipping it in either direction toggles the light on or off. My Z-Wave switch looks like a Decora switch but it’s just a push button in practice.

Who cares? Based on past experience OH is down for about 2 hours over the course of a year. And that’s for upgrades and power outages. In five years I’ve experienced exactly one case where I was away, came home, and I couldn’t open the Garage Door because something went down. But I had the remote that came with the garage handy and was able to get into the garage just fine. It was just a little less convenient.

That’s why I keep asking, is adding all this redundancy and fail over and such worth it?
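
For scale, the roughly 2 hours of downtime per year mentioned above can be turned into an availability figure. This is only back-of-the-envelope arithmetic; the helper name `availability` is made up for the example and a 365-day year is assumed.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def availability(downtime_hours_per_year: float) -> float:
    """Fraction of the year during which the system is up."""
    return 1.0 - downtime_hours_per_year / HOURS_PER_YEAR


print(f"{availability(2) * 100:.3f}%")  # 99.977%
```

Any redundancy scheme has to improve on that number by enough to justify its own setup and maintenance time.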

But, if you are using Zigbee or Zwave that scenario is possible. You can configure the devices so that the motion sensor controls the light or the switch directly without going through openHAB.

Of course. That’s all part of failing gracefully. When OH is down, the devices need to perform at least at a basic level independently. When OH is down, you can control the lights with the wall switches. When OH is down, you can control the HVAC with the thermostat on the wall. That’s what failing gracefully means. When OH goes down, your stuff behaves more like an escalator than an elevator. It may not be as convenient when OH is down, but it’s not completely unusable.

You have to choose. Is it more important to you that your home automation performs 100% of all automation functions at all times regardless of failure, or do you allow some of the automation functions to not work during those brief periods of downtime? If you want the former, you probably don’t want to use openHAB as a central controller anyway and should instead push all the behaviors and interactions out to the end devices, letting them talk to each other without central control. If the latter, modern hardware is reliable enough when only minor precautions are taken.

Do you have to be able to control those roller shutters during those brief periods when openHAB or your network is down? I think the crux of the problem is that your home automation does not have to be fully functional at all times and in all ways. Stuff goes wrong. When stuff goes wrong, you need some sort of backup for the important stuff. Stuff like lighting, door entry, HVAC: you know, health and safety issues. Does it really matter if you can’t open or close the blinds for an hour because openHAB crashed for some reason?

Honestly, there are aspects of the openHAB architecture and overall approach that make it unsuitable for almost all of those more serious applications. There is no real-time processing, there are no transactions, and a lot of the time there aren’t even confirmations and acks. There is no determinism. You can’t even guarantee the order of processing for events that occur too close together.

If you need that kind of reliability and deterministic behavior, you need to go with a system designed from the ground up to provide it. openHAB is not it.

Yes, it’s all about failing gracefully, going to degraded modes, installing backup switches, MTTRs and selecting the right actuators - or you invest a couple of hours and two hundred bucks into a redundant central controller with automatic changeover and forget about all this.

I selected the second approach and recommend it to others. Others can check the arguments for both approaches above; the discussion can be closed.

Don’t I know that one 😝 but my house still functions as it did before I found openHAB, so the wifey is happy 💪