Stabilize a production system by working on a test system

Hello there,

Actually, openHAB is something of a hobby for me and I experiment with it a lot. The downside is that my openHAB installation is sometimes not available, or has default settings instead of the right ones, and so on; I think you know what I mean.

How do you solve that? … If you think that’s a problem…

Thinking about a test system to play with, I faced the problem that I don’t have the same data to work with as on the production system.
I came up with the following idea:

Generally, use two openHAB installations: one (production) has all bindings and items and publishes all item updates via MQTT. A second installation (dev/test) subscribes to those topics and has items of the same names bound to the corresponding MQTT topics.

Advantages:
- You get all the data live on the test system, so you test under realistic conditions.
- You don’t push stuff to your production system that you later need to uninstall (which sometimes doesn’t seem all that complete anyway), so you keep your system clean.

Disadvantages:
- Even with a clever mechanism on the production side to push all item states to MQTT, you need every item duplicated with an additional MQTT binding on the test/dev system, so that doubles your work.

My questions: What do you think about this idea? Is it worth trying, or is it a bad idea because of something I’m missing?
What do you think is the best way to propagate item states via MQTT?
I see two possibilities: rules or persistence. My favourite would be persistence, because it gives you some automation for free, but I don’t have much experience there.
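For the rules route I imagine something like this per Item (just a sketch, untested; it assumes the MQTT action add-on, a broker named “mosquitto” configured in mqtt.cfg, and a made-up Item name):

rule "Publish LivingroomTemp to MQTT"
when
    Item LivingroomTemp changed
then
    publish("mosquitto", "openhab/out/LivingroomTemp/state", LivingroomTemp.state.toString)
end

But one rule or trigger per Item is exactly the doubled work I mentioned above.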

Btw: this way you don’t need a test/dev system permanently. A VM somewhere reachable that only runs while you’re working on it might be enough.

What do you think?

Thanks and best regards

Frank

It is indeed a problem.

Personally, I solve it using periods of processing.

I run OH in Docker and I maintain two sets of folders, a production set and a test set. When I want to do some massive changes I will stop the OH production container and start the OH test container. Then I’ll make my changes to the test config folders. If for some reason I don’t finish in one sitting or run into problems, then I simply stop my test container and restart the production container.

Once I’ve finished testing in my test configuration I’ll check in my changes to my git server and pull them to my production system and those changes now become part of production.

One of the key drivers of this is that you can’t have more than one OH (or any program really) access the USB controllers (e.g. Zwave dongle) at the same time.

Though to be honest, I almost never do this and just make changes to production because most of the changes I need to make are tiny tweaks.

I don’t think it is a bad idea at all. Though for me it seems like a whole lot of extra work and I probably wouldn’t do it personally. It does solve the USB controller problem though.

Use the MQTT Eventbus configuration on your Production system. This will let your Test environment subscribe to all events on all Items in your production. You don’t have to replicate all your Items in the test environment, though I would because there can be interactions, particularly with Rules. In your Test environment, you replicate the Items you need with an MQTT binding that subscribes to updates from the event bus. If you want to isolate what your test environment is doing from production, you simply don’t publish the updates back onto the event bus.
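Roughly it looks like this (a sketch from memory, so double-check against the MQTT 1.x binding docs; the broker name, topic layout, and Item name are made up). On Production, services/mqtt-eventbus.cfg:

broker=mosquitto
statePublishTopic=openhab/out/${item}/state
commandPublishTopic=openhab/out/${item}/command

And on Test, an Item of the same name bound to that topic with the MQTT 1.x binding:

Number LivingroomTemp "Livingroom Temperature [%.1f °C]" { mqtt="<[mosquitto:openhab/out/LivingroomTemp/state:state:default]" }

As long as Test only subscribes and never publishes back to those topics, production never sees what the test instance is doing.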

Thanks a lot, that gives me some ideas to think about.

Docker seems like a cool idea to get that stuff running. Do you manage the containers manually, or do you work with a Helm chart?

But that means that while you are developing, the system is unmanaged, and data and rule decisions from that period of time are lost?

Thanks and best regards

Frank

My basic problem with a test system is always: I don’t seem to be able to get a spare house including all installations handy… :wink:

So the best thing would be to clone my house and use the copies as a dev house, a test house, an integration house and a production house! Unfortunately that won’t work…

Of course, if I use “virtual” items [1] that’s not a problem - but I always run into trouble if I want to test changes to, let’s say, my heating configuration. I only have one heating system - and only one RS232 connection to it… So?
The same goes for my KNX: I’d like to try out the KNX2 binding, but I only have one KNX installation and I’m sure the two instances would interfere with each other…

The idea with the MQTT event bus is cool and I use it for my test instance as well. But I move the whole binding I’m about to test over to the test instance. So in my heating case I’d have to deactivate the heating integration in my prod OH2 and activate it in my test OH2, and after changing everything I’d have to merge it back… a pain in the ass - but I don’t see another way.

[1] Not proxy items, but real items that can be accessed from both installations (via API calls or whatever).

Mmmh, okay, that would work, and it gives you real items. That’s a plus.
The disadvantage is… right now, for example, I’m working on my window shutter items, tied to temperature: if the temperature is too low and a window has been open for too long, a notice goes to Sonos.

So working on that is not much different from copying the whole configuration.

And aren’t we then back at Rich’s idea of shutting down prod and working on a clone of prod as the test system?

Interesting point! Until now I have no API calls going to the items in openHAB directly; I just “control” them in rules.
If, for example, I call an API on the Homematic server, wouldn’t that work against Homematic, not against the item in openHAB?
Maybe I don’t get the point.

Thanks and best regards

Frank

To deploy and manage the containers I use Ansible. My setup is not too complex and I like having just one system I need to use to build up and update my VMs. Not everything on the VM is docker (e.g. setting up file shares and users).

I’d like to learn Kubernetes at some point but so far it doesn’t provide any significant benefits for me to make learning it a higher priority. Especially since I can deploy and start an Image with one task in an Ansible playbook.

I guess it all depends on what your HA system is doing and why you need to run a test environment. For me, the test environment is so I can quickly and easily roll back my changes if I break something or can’t finish in one sitting. As such, the test system is an exact duplicate of my production system, so all Rules, communications, and persistence that the production system would do are still happening. So nothing is really missed.

But I will also mention the following:

  1. This is a home automation (HA) system, not an industrial control system or the control software of a space probe. What is the real impact if a few minutes or hours of data are lost or a rule fails to run? Is the impact large enough to justify the effort of building up a completely parallel system, knowing that this is physically impossible in some cases (e.g. Zwave, see Thomas’s reply)? For me the answer is “no” because…

  2. I purposefully build my HA to fail gracefully. If my OH is down it is no big deal, it just reverts back to its non-automated mode. So I’ll have to press the button on the remote to open the garage door. I’ll have to flip the wall switch to turn on the light. The Nest will continue to run the HVAC by its own algorithms, etc. Nothing meaningful gets lost and nothing dangerous happens.

I think the point is that rather than using MQTT Event Bus for everything or proxy Items, he configures Items to communicate with his one homematic server from both production and test at the same time.

I would also like to emphasize part of Thomas’s point. You only have one physical device. For example, you don’t have a separate rollershutter controller for production and another one for test. So either your test environment is going to be working with virtual simulated devices or it will be interacting with the production devices. To interact with the production devices would kind of defeat the purpose of having a separate test environment. But to interact with a simulated device means that you are not working with the actual binding and that greatly lessens the benefit you will get from having the test environment in the first place.

For example, let’s say you want to test out a new zwave device and some rules to go with it. You can’t do that in your test environment without taking down zwave in your production environment because only one instance of OH can communicate with the Zwave controller at a time and a device can only be paired to a single zwave controller at a time. So you either have to take down zwave on your production or you have to test the device in production.

OK, so we have set up the device in production and will test out our rules in the test environment. Now you either have to provide a simulation of the zwave device in your test environment or you have to allow your test environment to reach across to production and interact with the device itself. If you do that then your test environment isn’t really separate from production anymore.

So what benefit is the test environment really providing now? It isn’t zero benefit because you can test your rules separately with simulated devices. But is that enough to warrant the significant amount of work required to set up your test environment in the first place?

Everyone will have their own answer to this question, but for me a test environment would have too many compromises and holes between it and production to make the effort remotely worthwhile.

Yes, you are right :blush:

Interesting points…
My API example was about importing information into openHAB from external appliances, e.g. the weather from Wunderground or my Nuki key turner. Both provide APIs I could use in both environments. But if I send commands to the Nuki from both installations I’m sure to run into side effects. And we’re talking about three commands to one device. My KNX installation has 70+ actuators and 50+ sensors sending telegrams. I’m not sure I could avoid interference if I connected to KNX from both of my instances (KNX1 and KNX2)…

My automation is also “automation only”! Like Rich, I don’t rely on OH2 running, except for some low-level things like checking the wind and temperature and sending a block to my blinds so they go up. So if it’s hot and sunny outside and my production isn’t running, they go up… No big deal… but still…

After thinking about it a bit more, I believe @rlkoshak and @binderth have been very polite with me.

My idea is a stupid one. It only works with an affordable amount of work for “read-only” devices, i.e. using MQTT one-way. I mostly use those and generate messages from them.
I hadn’t thought about the other direction, for actuators. That is way too much work.

But the idea of a Docker image doesn’t get out of my head.

@rlkoshak, if I got you right, you have these two containers on the same system, both using the underlying services like persistence etc.? Are you using Docker volumes to keep two separate OH configurations on the same machine without them clobbering each other?

Forgive me, I think very visually.

That looks like an intense amount of configuration work, but it is by far the better answer to my question :slight_smile:

Have I got you right?


Mostly right.

I just noticed an error in the drawing. The cylinder that InfluxDB points to should be labeled “InfluxDB Database Files”.

I have one Mosquitto instance running in its own container. I have one InfluxDB running in its own container. I have one Grafana instance running in its own container. I have a git server running on another server in its own container.

So all of that remains the same.

I keep my conf and parts of userdata under configuration control on the git server.

There is a separate folder on my host for each container, prod and test. I don’t mess with Samba but I see no reason why you couldn’t use it if you wanted to. To initially populate test I check out everything saved in git. If I wanted to, now is when I would modify influxdb.cfg to use a different db name and user. Prod would already have the latest and greatest checked out.
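For example, the test copy of services/influxdb.cfg might differ only in lines like these (a sketch; the names are made up and the keys are as I recall them from the InfluxDB persistence add-on):

url=http://localhost:8086
user=openhab_test
password=changeme
db=openhab_test_db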

I use the following Ansible task to download the openHAB Docker image and create a container named “openhab”.

- name: Start openHAB
  docker_container:
    detach: True
    devices:
      - "/dev/ttyUSB0:/dev/ttyUSB0:rwm"
    hostname: argus.koshak.net
    image: openhab/openhab:2.2.0-amd64-debian
    log_driver: syslog
    name: openhab
    network_mode: host
    pull: True
    recreate: True
    restart: True
    restart_policy: always
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
      - "{{ openhab_data }}/conf:/openhab/conf"
      - "{{ openhab_data }}/userdata:/openhab/userdata"
      - "{{ openhab_data }}/addons:/openhab/addons"
      - "{{ openhab_data }}/.java:/openhab/.java"

{{openhab_data}} is a variable pointing to the root folder of my production openhab config files.

Then I docker stop openhab and run the same task again with any necessary modifications (e.g. change to use 2.3.0-amd64-debian-snapshot), use {{openhab_test_data}} which points to the root of the test folder, and use the name openhabtest. This is all done in a single Ansible playbook.
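For reference, those two variables are nothing fancy, just paths defined in my group vars (the paths here are made up):

# group_vars/all.yml (illustrative)
openhab_data: /home/argus/openhab/prod
openhab_test_data: /home/argus/openhab/test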

That gets me set up initially. Then when I want to switch between prod and test to make some big changes I just run docker stop openhab and docker start openhabtest. I make my changes in test, test them, and when I’m done check in the changes. Stop openhabtest, git pull in the prod folder to get the changes I made in test, then start openhab again. I might have to recreate the openhab container if the change is an upgrade of OH. I use addons.cfg to manage my bindings so those get checked in and the new addons will be installed in production when it reads that config file.
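End to end, one round trip looks something like this (paths are illustrative):

docker stop openhab
docker start openhabtest

# ... edit and test in the test folders ...
cd /home/argus/openhab/test/conf
git add -A && git commit -m "describe the change" && git push

docker stop openhabtest
cd /home/argus/openhab/prod/conf
git pull
docker start openhab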

The way I run it, there is literally no difference between test and prod except:

  • they may be running a different version of the OH container
  • they mount different volumes
  • until I check in my changes and pull them, any changes I make in the test environment do not appear in prod.

Docker makes this pretty easy and except for a couple of extra tasks in my Ansible playbook I’ve not had to do any additional config beyond what I already would have done if all I had was Prod. The magic is in mounting different volumes to the two containers and coming up with a way to synchronize between those two volumes.

Oh, and never run both containers at the same time.

At the end of the day, I’m incredibly lazy and have not nearly as much time to work on my home automation as I would like. Anything that would take an intense amount of configuration work just to set up a test environment would never get done. I’d do without a test environment if it were much more work than this. And even still, as easy as this was to set up and use, I rarely use it. I mostly just make changes on production. :smiling_imp:


Brilliant @rlkoshak!!

That’s a great piece of work.
And I guess it also solves my still-unsatisfying backup and restore problem along the way…

Puuuuh, it will take some time until I get that running. :astonished:

Thank you so much!!

Frank

Hi Rich,

it’s an old topic but it describes exactly what I want to do. I’m pretty familiar with openHAB now; I created a Docker instance of my openHAB a few days ago, migrating from openHABian to Docker. Everything is working fine after a little tuning.
I want to be able to test before deploying, as in the past I have messed up my stable installation while testing configurations.
I’m not familiar with Ansible, nor with git, but I’m keen to learn. I’m on a two-year journey now, started from nowhere with no competencies, learning by testing, and I want to continue experimenting and improving.
If I understand correctly, Ansible helps you duplicate containers with specific environment variables, and git helps you carry changes from a test environment over to a production environment. Is that it?
Would you mind sharing your conf files (Docker? Compose? Git? Ansible?) so that I can learn from them and try to reproduce this for myself?
Thanks anyway, don’t feel obligated.
I’ve already learnt a lot from you!!
Dom

Not quite, or I should say not a complete description.

Imagine you have half a dozen computers (or virtual machines) each with some configurations the same and others different. Setting up each of these machines means installing services, configuring them how you want them to be. On Linux that’s going to be issuing a bunch of commands and editing config files.

OK, it’s been two years and all of a sudden all of your machines have crashed. :scream: Or maybe you picked up some ransomware and it got your backups too. Or maybe you just don’t want to have to remember how you set up all of those services you are using. Tools like Ansible are designed to address these sorts of problems (and more). They give you something called “infrastructure as code”. Essentially, everything (or almost everything at least) required to set up a machine with a certain configuration is captured in a script. Everything from stuff as simple as setting up your shell variables and hosts table to something as complex as installing and configuring NUT gets “coded” into Ansible roles. Want to install Nextcloud? There’s a role for that. Want to configure a machine to run openHAB, Mosquitto, Zabbix, and a Wyze bridge? Write a playbook that uses those roles. Want to move Zabbix to a different machine? Uninstall it from the old one and run the role on the new one.

Want to upgrade all your machines in one go? Run the scripts again. Need to change some configuration parameter on all of your machines? Run the scripts again, with the change.
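To make that concrete, a playbook is really just a list of roles applied to a host. A sketch (host and role names are made up):

- hosts: homeauto
  become: yes
  roles:
    - common        # shell setup, users, hosts table
    - mosquitto
    - openhab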

All your configuration is done as code. Code can be source controlled, which brings us to Git.

Git is a source control service. It keeps track of all of your changes as you check them in. If you are doing it right, you provide messages for what you changed on each “commit” so you can follow your history. You can tag or branch the code to “freeze” the code at a certain point in time. All sorts of stuff.
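Day to day that boils down to a handful of commands:

git add -A                        # stage your edits
git commit -m "why I changed it"  # record a snapshot with a message
git push                          # send it to the git server
git log --oneline                 # browse the history
git checkout <tag-or-branch>      # jump to any earlier or parallel version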

And here’s the thing: you can check stuff out from git in an Ansible script. So I’m diligent about checking in my changes to my OH config as I go, and if I ever need to rebuild OH on a different machine, my Ansible script only needs to check out the latest from my git server and I’m back up and running.

I’ve almost 50 roles in my Ansible configs that cover everything from setting up my shell environment to building a custom Nextcloud Docker image that includes some extra stuff I need that doesn’t come with the official container by default, to scripts that set up and create the database tables needed by various services.

I can’t possibly post them all. But I can post my current openHAB one and maybe the Mosquitto one. But you’ll probably want to go through a tutorial because it’s a whole complicated system unto itself. Some of the roles call each other (e.g. when creating a new user I have a separate role that my other roles call) and lots of stuff is stored in variables.

openHAB Role:

---
# tasks file for roles/openhab

- name: Debug
  debug:
    msg: |
      openhab_home = {{ openhab_home }}
      repo = {{ openhab_conf_repo }}
      version = {{ openhab_version }}

- name: Create the openhab user and group
  include_role:
    name: create-user
  vars:
    uid: "{{ openhab_uid }}"
    gid: "{{ openhab_uid }}"
    user_name: openhab
    create_home: False
    service: openHAB

- block:

  - name: Add openhab user to the dialout group
    user:
      append: True
      groups: dialout
      name: openhab

  - name: Create if necessary and set the permissions on the openHAB data folder
    file:
      path: "{{ openhab_home }}"
      state: directory
      owner: openhab
      group: openhab
      mode: u=rwx,g=rwx,o=rx

  become: True

- name: See if config is already present
  stat:
    path: "{{ openhab_home }}/userdata/etc/version.properties"
  register: conf_present

- name: Check to see if the git server is up
  command: "nc -vz {{ git_host }} {{ git_port }}"
  register: git_running
  changed_when: False
  failed_when: False

- name: Checkout openHAB configs if this is a new install
  git:
    repo: "{{ openhab_conf_repo }}"
    dest: "{{ openhab_home }}"
    accept_hostkey: True
    version: main
  when: (git_running['stderr'] is match(".* succeeded!")) and
        (not conf_present.stat.exists)

- name: Create missing folders
  file:
    path: "{{ item }}"
    state: directory
    owner: openhab
    group: openhab
    mode: u=rwx,g=rwx,o=rx
  loop:
    - "{{ openhab_home }}/userdata/cache"
    - "{{ openhab_home }}/userdata/logs"
    - "{{ openhab_home }}/userdata/persistence"
    - "{{ openhab_home }}/userdata/tmp"
  become: True

- name: Change ownership of openHAB configs
  file:
    path: "{{ openhab_home }}"
    owner: openhab
    group: openhab
    recurse: yes
  become: True
  when: (git_running['stderr'] is match(".* succeeded!")) and
        (not conf_present.stat.exists)

# Kept for reference but in OH 3 I've moved to rrd4j and built in charting
#- name: Create the InfluxDB database
#  influxdb_database:
#    hostname: "{{ influxdb_ip_address }}"
#    database_name: "{{ openhab_influxdb_database_name }}"
#    state: present
#    username: "{{ influxdb_admin_user }}"
#    password: "{{ influxdb_admin_password }}"
#
#- name: Create the InfluxDB openHAB user and grant permissions
#  influxdb_user:
#    hostname: "{{ influxdb_ip_address }}"
#    user_name: "{{ influxdb_openhab_user }}"
#    user_password: "{{ influxdb_openhab_password }}"
#    login_username: "{{ influxdb_admin_user }}"
#    login_password: "{{ influxdb_admin_password }}"
#    grants:
#      - database: "{{ openhab_influxdb_database_name }}"
#        privilege: 'ALL'
#
#- name: Create InfluxDB Grafana user and grant read permissions
#  influxdb_user:
#    hostname: "{{ influxdb_ip_address }}"
#    user_name: "{{ influxdb_grafana_user }}"
#    user_password: "{{ influxdb_grafana_password }}"
#    login_username: "{{ influxdb_admin_user }}"
#    login_password: "{{ influxdb_admin_password }}"
#    grants:
#      - database: "{{ openhab_influxdb_database_name }}"
#        privilege: 'READ'

- name: Check the current version of openHAB # noqa 306
  shell: grep openhab-distro {{ openhab_home }}/userdata/etc/version.properties | cut -d ' ' -f 4
  register: old_version
  when: conf_present.stat.exists
  changed_when: False

- name: Pull/update the openHAB docker image
  docker_container:
    detach: True
    devices:
      - "/dev/ttyUSB0:/dev/ttyUSB0:rwm"
      - "/dev/ttyUSB1:/dev/ttyUSB1:rwm"
    env:
      CRYPTO_POLICY: unlimited
    hostname: "{{ ansible_fqdn }}"
    image: openhab/openhab:{{ openhab_version }}
    log_driver: "{{ docker_log_driver }}"
    name: openhab
    network_mode: host
    pull: True
    restart: True
    restart_policy: always
    tty: True
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
      - "{{ openhab_home }}/conf:/openhab/conf"
      - "{{ openhab_home }}/userdata:/openhab/userdata"
      - "{{ openhab_home }}/addons:/openhab/addons"
      - "{{ openhab_home }}/cont-init.d:/etc/cont-init.d"
  register: openhab_pulled

Mosquitto is a little more involved because it has to create the logins.

---
# tasks file for roles/mosquitto

- name: Create the mosquitto user
  include_role:
    name: create-user
  vars:
    uid: "{{ mosquitto_uid }}"
    gid: "{{ mosquitto_uid }}"
    user_name: mosquitto
    create_home: False
    service: Mosquitto

- name: Create mosquitto data folders
  file:
    path: "{{ item }}"
    state: directory
    owner: mosquitto
    group: mosquitto
    mode: u=rwx,g=rwx,o=rx
  loop:
    - "{{ mosquitto_home }}"
    - "{{ mosquitto_home }}/config"
    - "{{ mosquitto_home }}/data"
    - "{{ mosquitto_home }}/log"
  become: True

- name: Copy the prepared mosquitto.conf file
  copy:
    src: mosquitto.conf
    dest: "{{ mosquitto_home }}/config/mosquitto.conf"
    mode: u=rw,g=rw
  become: True
  become_user: mosquitto

- name: check to see if the passwd file exists
  stat:
    path: "{{ mosquitto_home }}/config/passwd"
  changed_when: False
  register: passwd_file

- name: Create an empty passwd file
  file:
    path: "{{ mosquitto_home }}/config/passwd"
    state: touch
    owner: mosquitto
    group: mosquitto
    mode: u=rw,g=r,o=r
  when: not passwd_file.stat.exists
  become: True

- name: Pull/update and start the Mosquitto service
  docker_container:
    detach: True
    exposed_ports:
      - "1883"
      - "9001"
      - "8883"
    image: eclipse-mosquitto
    log_driver: "{{ docker_log_driver }}"
    name: mosquitto
    published_ports:
      - "1883:1883"
      - "9001:9001"
      - "8883:8883"
    pull: True
    restart: False
    restart_policy: always
    state: started
    user: "{{ mosquitto_uid }}"
    volumes:
      - /etc/passwd:/etc/passwd:ro
      - /etc/localtime:/etc/localtime:ro
      - /usr/share/zoneinfo:/usr/share/zoneinfo:ro
      - "{{ mosquitto_home }}/config:/mosquitto/config"
      - "{{ mosquitto_home }}/log:/mosquitto/log"
      - "{{ mosquitto_home }}/data:/mosquitto/data"
  register: mosquitto_pulled

- name: Wait if container restarted # noqa 503
  pause:
    minutes: 1
  when: mosquitto_pulled.changed

- name: Check to see if we can log in
  command: docker exec mosquitto  mosquitto_sub -h localhost -u {{ mosquitto_user }} -P {{ mosquitto_password }} -t test -E
  register: mosquitto_sub
  changed_when: False
  failed_when: False

- name: Update the user and password if logging in failed or the passwd file doesn't exist
  command: docker exec mosquitto mosquitto_passwd -c -b /mosquitto/config/passwd {{ mosquitto_user }} {{ mosquitto_password }}
  when: (not passwd_file.stat.exists) or
        ('not authorized' in mosquitto_sub['stdout'])

- name: Restart mosquitto if password changed
  command: docker restart mosquitto
  when: (not passwd_file.stat.exists) or
        ('not authorized' in mosquitto_sub['stdout'])

As for the test environment, I’ve long since abandoned it. The only way a wholly separate test environment really works is if all of your hardware (e.g. Zwave controllers) is duplicated first. Since that’s not feasible most of the time, I do periods of processing instead, when I do anything at all. Most of the time I just edit production. It’s so easy to back stuff out if I need to, or to disable the rule that’s not working, that even with the help of Ansible and VMs it’s too much of a pain to set up a separate test environment.

In a pinch, if I had to incrementally test something, I’d create a stable branch in my git repo and a test branch. When I’m working on test, I’ll shut down OH, switch the configs to the test branch, restart OH, make my edits and tests, and if I don’t finish, commit the changes and switch back to the stable branch. But I haven’t even done that in a very long time. These days I’ll create a new rule and disable the old one, work on the new one and if it doesn’t work, disable it and re-enable the old one, repeating until the new one works.
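Concretely that was just something like this (the conf path is illustrative):

docker stop openhab
cd /home/argus/openhab/conf
git checkout test        # or: git checkout -b test the first time
docker start openhab
# ... edit, test, commit as I go ...

docker stop openhab
git commit -am "work in progress"
git checkout stable
docker start openhab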

I should note that all of my config is done through the UI. The only text configs I use are .persist files (because there are no UI equivalents) and my personal JS Scripting npm module (for which there also is not a UI equivalent).

Thanks a lot, Rich, for sharing… your examples are very helpful and nicely illustrate how to do it.
And they bring new questions :slight_smile:

  • Ansible is certainly the way to go, as I want to be able to deploy a new machine in case of a crash without having to remember each config file I modified over time. But if I understand correctly, each time you change a config parameter you have to do it in Ansible and then erase and rerun the role on your existing machine? Recreating everything from scratch each time?

  • As for git, I’m reading about it. It mentions a staging area where you modify your code, and when you’re satisfied you make a commit to keep the last good version. Each time you modify something, you commit, keeping the previous “good” versions. I can understand that when you have your code in files (my case) you take “snapshots” of those files. But how do you do it when everything is configured from the UI?

Thanks for sharing your experience with a test environment. Maybe it is in fact too much of a hassle. I have two Docker containers for openHAB, one prod and one test. Then I have InfluxDB and Zigbee2MQTT, which can be accessed from the two containers. For z2m, nothing special. For InfluxDB, I have two separate databases, and in the two versions of openHAB the only difference is the database name. I back up and restore data from the prod database to the test database. Maybe I should stop there.

I have everything in files, written in the DSL I learnt when I started: services, things (140), items (1400), rules (200), sitemaps… nothing done through the UI. So I can back up my files, and once something is tested on the test OH I just have to copy the files into the prod directory. This way I can, for example, create a new container and only have to copy the config files into its conf directory.

I edit my config files in VS Code, which is “plugged into” openHAB; this way it tells me as soon as I have a syntax error, without having to save and discover it in the log.

Some rules are a bit complicated, with many proxy items and controls. I don’t really see the advantage of going full UI, and I’m not at ease with how to reproduce all those controls through the UI. I completely avoid mixing UI config and file config, to keep complexity down.

Do you think I am at risk staying with external config files? And if I am, should I rewrite everything in JS? The house is fully automated, and I spent countless hours (weeks) writing and stabilizing everything. I can’t imagine spending the same amount of time just to rewrite it all, but if I have no choice…

It makes me think that I should buy spare parts in case something fails (RPi, Zigbee stick, sensors, etc.). I think I’m not bad on the logical side (backups, conf files, etc.) but very bad on the physical side.

Thanks again for sharing your experience, and for the time you dedicate to the Openhab community.

Dom

Yes, this means that all config changes made to the machine need to be done through Ansible. But when using tools like Ansible, Chef, Puppet, or Vagrant (and I’m sure there are others), you need to write your scripts to be “idempotent”: when there is nothing to do, they don’t do anything.

So the first time you run the role, it will make all the changes. The second time you run the role, it will likely make no changes because there’s nothing to do.

So no, you don’t wipe out the machine every time. You just update the role and when you run it, it will only apply that one change since everything else is already configured.
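For example, a task like this reports “changed” the first time it runs and creates the folder; on every later run the folder already exists with those permissions, so Ansible does nothing:

- name: Ensure the openHAB data folder exists
  file:
    path: /home/argus/openhab/prod   # illustrative path
    state: directory
    owner: openhab
    group: openhab
    mode: u=rwx,g=rwx,o=rx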

Removal is a bit harder with Ansible. It doesn’t happen automatically so if you want to uninstall something or remove the configuration for something, you’ll need to do it by hand or write that into an Ansible role as well. I rarely need to do this so I do it manually. Usually it’s as simple as removing a folder and removing a docker container or running apt purge.

The same. It’s text files too, just in userdata instead of conf.

But you may be overthinking the “staging area” concept here and it largely depends on whether you will host your own git server or just work locally. If you use a git server, then your OH folders are the staging area and the server holds the “good” versions. If not, it’s all in your OH folders.

What git lets you do is tag your code and move backwards, forwards, and sideways at a command. So you never lose anything and you can switch between versions easily.

For now, perhaps, but if you ever start using MainUI, all of that config (custom widgets, for example) is stored in the JSONDB only. There is no separate text file config for that stuff. There is other stuff you’ll want to capture from the userdata folder too.

You can do this with UI configs too. Again, it’s all text, just in a different format in a different folder.

Without seeing the rules I couldn’t say. But in general, I find the introduction of the concept of rule conditions and the ability to enable/disable rules greatly simplifies a lot of these sorts of rules.

Note that the advice is to not mix UI and text for individual concepts but the recommendation has always been to take advantage of auto discovery of Things. Not only is it easier and much less prone to error, there are some bindings that offer capabilities that are only possible when the Things are discovered and not possible when using .things files. So in short, it’s OK to, for example, have all your Items in .items files and all your Things managed (i.e. done through the UI).

It’s not all or nothing.

As for the UI, keep in mind that there is a marketplace for widgets. You likely can just install and configure most of the widgets you need which are installed just like an add-on, no copy/paste/edit required.

It’s hard to answer that because “risk” is a squishy word. I don’t think you are at risk that support for text based configs, for those things that are currently configurable through text based configs, is going away. However, you do risk missing out on lots of new features and capabilities that have been and continue to be added to OH over time. You might run the risk that some bindings may no longer support text configs in the same way in the future but I suspect that’s pretty low.

Some of the things you are missing out on:

  • easy way to create the semantic model (which in turn will autogenerate the overview tabs which could completely replace your sitemaps in many cases)
  • custom UI widgets
  • rule templates: like custom widgets, there is a marketplace for rule templates which you can install and configure like an add-on. Perhaps a bunch of your rules can be replaced with someone else’s code and you don’t even need to look at it. It’s just configuration.
  • not strictly a UI thing, but even Blockly has features that are simply not possible to reproduce in rules DSL such as
    • dynamically create/remove Items
    • call rules from other rules
    • rule conditions (UI rules only)
    • ability to enable/disable rules
    • creation and use of libraries

If you were to rewrite your rules, and there is nothing saying you have to, yes, I’d recommend either Blockly if you are not much of a programmer, or JS Scripting ECMAScript 11. But you don’t have to change anything if you are happy with where your config is now or if you don’t want to change your workflow and don’t care about new features you might be missing out on.

Just realize that even if you put everything you can into text files in the conf folder, there will still be some things that only exist in userdata, so any backup/source control strategy needs to include most of that folder too (exclude userdata/tmp and userdata/cache).
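For example, if the whole openHAB volume is one git repository, a .gitignore along these lines keeps the churn out (a sketch, adjust to taste):

# .gitignore at the root of the openHAB volume
userdata/cache/
userdata/tmp/
# logs are usually not worth versioning either
userdata/logs/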