Backing up configuration using Git in OH3

Nadahar · June 29, 2022, 5:03pm

I’m in the middle of migrating from OH2 to OH3. My OH2 installation is running on a Windows “server” (actually WinXP), while my OH3 installation is openHABian on a RPi.

I’ve used Git to back up my configuration. By that I mean the configuration files that I put time into, like items, rules, maps, sitemaps etc. I use .gitignore to exclude everything that is configured automatically or during a “normal” installation. Please don’t start the “backup” discussion: this isn’t my only form of backup, and it isn’t intended to be something I can restore to regain a running system. This is to preserve the work I put into building “my system”. The history, being able to go back and see what I changed and why/when, is as important as the “backup” itself. It’s a way to keep control of what I’m doing. I also check the diff before I commit to see that I haven’t done something I didn’t intend to. That has turned out to be quite useful if I’ve used the GUI for anything, since “strange things” seem to happen there (like stubs suddenly appearing in different places).

I’m not sure if it’s because of the move from Windows to Linux or from OH2 to OH3 (although I suspect the latter), but the file structure of the files I need to include in my git repo has changed. On my OH2 installation, everything is very nicely organized in one place, where conf and userdata live in the same folder and the git repo “root” is the parent folder for all this. Using .gitignore I then simply exclude all the folders that aren’t of interest.

On the openHABian installation however, this is completely different. Here I have $OPENHAB_CONF pointing to /etc/openhab and $OPENHAB_USERDATA pointing to /var/lib/openhab. This makes it impossible to utilize the same strategy unless I want to make / the git repo “root” (and that isn’t really a viable solution for a number of reasons).

At first I configured my git repo “root” in $OPENHAB_CONF, since that’s where most of the data I want to preserve is located. But, it turns out that there are so many things in OH3 which simply can’t be done with configuration files (widgets++), so I don’t seem to have a choice but to include the JSON database. While it would be possible to create a second git repo in $OPENHAB_USERDATA and create a corresponding second remote repository, it wouldn’t be a very good solution and I haven’t really considered it viable. It would take away the ability to compare changes in a practical way, and depend on strict discipline to make sure everything was done “in sync” between the two repos.

My question is how to solve this. What I have done now is the following (which seems like a “failed” attempt): I stopped openhab and zram-config, created /etc/openhab/userdata and moved all the content from /var/lib/openhab to /etc/openhab/userdata. I then deleted /var/lib/openhab and replaced it with a symlink of the same name pointing to /etc/openhab/userdata. I also reconfigured /etc/ztab to cache persistence in its new location (as I don’t know if zram will work properly via a symlink). I then started everything up again.

While I haven’t seen any errors in the log, it doesn’t feel good and I have widgets that now don’t seem to update themselves. Symlinks can break a lot of stuff, so I just don’t like it. An alternative might be to just change $OPENHAB_USERDATA to point to the “new” location and remove the symlink. I don’t know the consequences of this either.

I’m worried that things like backup scripts, openHABian scripts etc won’t play nice with this configuration, even if I manage to figure out why the widgets no longer read the item sources.

Is there a “proper” and safe way to get these folders organized such that they can belong to one git repo? If so, what is that?

rlkoshak · June 29, 2022, 6:40pm

Before I moved to Docker, I would make my root folder for the git repo something like /srv/openhab and then use symbolic links to link /srv/openhab/conf to /etc/openhab and /srv/openhab/userdata to /var/lib/openhab.
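For anyone who wants to try that layout without touching a live system, here is a minimal sketch that builds it inside a throwaway directory. It assumes the real conf and userdata folders live under the repo root and that the system paths are the symlinks, which is only one reading of the description above. The sandbox variable and the demo.items file are made up for the illustration.

```shell
#!/usr/bin/env bash
# Sandbox demo: nothing outside a temporary directory is touched.
set -euo pipefail

sandbox="$(mktemp -d)"           # hypothetical stand-in for "/"
trap 'rm -rf "$sandbox"' EXIT

# Assumption: the real folders live inside the future git root.
mkdir -p "$sandbox/srv/openhab/conf" "$sandbox/srv/openhab/userdata"
mkdir -p "$sandbox/etc" "$sandbox/var/lib"

# The package-style locations become symlinks into the git root.
ln -s "$sandbox/srv/openhab/conf"     "$sandbox/etc/openhab"
ln -s "$sandbox/srv/openhab/userdata" "$sandbox/var/lib/openhab"

# A file written through the system path lands inside the git root.
echo 'Switch Demo_Switch' > "$sandbox/etc/openhab/demo.items"
ls -l "$sandbox/etc/openhab" "$sandbox/var/lib/openhab"
cat "$sandbox/srv/openhab/conf/demo.items"
```

On a real system the same ln -s calls would need sudo, and openHAB (and zram, if used) should be stopped while the folders are moved.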

On Windows, you had what I’d call a “portable” installation of OH. Everything is in one folder and “installation” is pretty much just unzipping an archive. On openHABian, openHAB is installed following the conventions of the underlying Debian Linux operating system. That means putting configs you edit into /etc/, putting stuff the program writes into /var/lib, and putting logs into /var/log (in case you are wondering where those went).

This has been the case since OH 2.0 and has remained the same ever since the introduction of the ability to configure anything through a UI. OH 1 didn’t have that so everything was in /etc/openhab.

I currently run in Docker but my file structure remains the same. I just don’t have to create the symbolic links.

Here is my .gitignore.

userdata/*
!userdata/secrets
!userdata/etc
!userdata/jsondb
!userdata/config
!userdata/habot
!userdata/openhabcloud
!userdata/uuid
!userdata/zigbee
!userdata/zwave
userdata/jsondb/backup
userdata/backup
.Trash-1000
conf/html/hum.jpg
conf/html/light.jpg
conf/html/power.jpg
conf/html/temp.jpg
conf/automation/lib/javascript/personal/node_modules
conf/automation/js/node_modules/*
!conf/automation/js/node_modules/rlk_personal
*py.class
*.swp
*~
*.old
conf/automation/js/rlk_personal.tar

and my /srv/openhab folder looks like

rich@argus:~ 👁️  ls /srv/openhab
addons  conf  cont-init.d  multitail-openhab.cfg  notes.txt  README.md  service-statuse-widget.yml  userdata

As you can see, you’ll want more than just userdata to get everything.
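The .gitignore above follows an “ignore everything under userdata, then re-include selected folders” pattern. A minimal sketch for trying such rules out in a throwaway repo with git check-ignore follows. The .gitignore here is a shortened copy, and the example file names and the ignore_state helper are hypothetical, chosen only to exercise each rule.

```shell
#!/usr/bin/env bash
# Sandbox demo of the "ignore everything, then re-include" pattern.
set -euo pipefail
command -v git >/dev/null || { echo "git is not installed"; exit 0; }

repo="$(mktemp -d)"              # throwaway repo, not /srv/openhab
trap 'rm -rf "$repo"' EXIT
git init -q "$repo"

# A shortened version of the .gitignore above.
cat > "$repo/.gitignore" <<'EOF'
userdata/*
!userdata/jsondb
userdata/jsondb/backup
EOF

# Prints "ignored" or "kept" for a path relative to the repo root.
# git check-ignore exits with 0 when the path is ignored, 1 when it is not.
ignore_state() {
  if git -C "$repo" check-ignore -q "$1"; then echo ignored; else echo kept; fi
}

# Hypothetical file names, one per rule.
for path in userdata/cache/example.bin \
            userdata/jsondb/example.json \
            userdata/jsondb/backup/example.json; do
  printf '%-40s %s\n' "$path" "$(ignore_state "$path")"
done
```

Running git check-ignore -v with a path in the real repo additionally shows which line of .gitignore made the decision, which helps when a re-include does not behave as expected.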

You probably have to create the symlinks while zram is active, or disable zram entirely. If you do want to get the advantage of zram, you’ll need to put /srv/openhab into the zram config too, or at least /srv/openhab/userdata since there can be a lot of writes there depending on your configuration.

If you don’t want to use symlinks, you can’t get there from here. You’ll have to not use openHABian and do a manual install of OH like is done on Windows so everything is in the same folder.

No, $OPENHAB_USERDATA is only there for your shell’s benefit, so you can do things like cd $OPENHAB_USERDATA. openHAB itself doesn’t use that environment variable at all.

There might be a way to change the location of userdata in one of the configs in userdata/etc, but it’s going to be a core Karaf setting (Karaf is the OSGi platform upon which OH runs), and it might be referenced elsewhere too.

Use symlinks or don’t “install” openHAB, which means not using openHABian.

Nadahar · June 29, 2022, 7:19pm

Maybe I expressed myself somewhat inaccurately. I don’t have anything against symlinks, I’m just worried that not everything “respects” them. I know this from my own coding, where I have to decide whether a certain file operation should “follow symlinks” or not. I often don’t really know how to make that decision, because I don’t know how users are going to organize their systems. Yet I have to make the choice when coding, either to just hardcode it or to somehow make it configurable (which could be a nightmare by itself). I imagine that many others find themselves in the same situation, and that most probably don’t think about it but go with whatever the default is. Therefore I’m scared that some things will follow symlinks while others won’t.

I tried to check online when trying to figure out how to do this, and I was under the impression that git didn’t follow them, at least not by default (that might be configurable somewhere too). The setup you illustrate indicates the opposite. Using symlinks this way is no worry for me at all: these symlinks wouldn’t interfere with openHAB itself, and would just be there to make a “virtual folder tree” for git to see. If this works, I think that’s definitely the way to go.

Ok, good to have that straightened out. So, I could have the same file structure here if I wanted to, but I’d have to not use the deb packages which would pretty much render openHABian pointless.

As I stated earlier, I’m not looking to get everything. I’m trying to separate what I “make” from the stuff that is “autogenerated” to the best of my abilities. I’m not sure what everything is of course, so I might make mistakes, but if I see something causing large diffs that don’t seem related to whatever I have changed, it’s seen as a strong contender for exclusion. I don’t include credentials, keys, secrets etc either, as it’s not intended to be a “backup” but a way to keep the history of my “configuration work”. I’m pushing to my own server that requires authentication also for reading, so it’s not that I’m worried about putting such things in there, it’s just that it’s not that kind of information I intend to preserve with the git repo. But, if the way you describe using symlinks above works, I would be able to include basically anything I want there, if/when I discover other things that I think should be part of it.

I had to shut down zram to be able to move persistence under OPENHAB_USERDATA. That said, I would imagine that it would be safe to make changes while zram isn’t running (it takes a long time to stop, so I assume that time is used to flush everything).

I haven’t “interfered” with openHABian’s decisions regarding zram (yet), by default it only caches:

swap    lzo-rle         400M            1G              75              0               150

# dir   alg             mem_limit       disk_size       target_dir                      bind_dir
dir    zstd            300M            750M            /var/lib/openhab/persistence    /persistence.bind

# log   alg             mem_limit       disk_size       target_dir              bind_dir                oldlog_dir
log     zstd            400M            1G              /var/log                /log.bind

I haven’t really done the risk/reward evaluation as to what should be included, so I just updated persistence.bind to point to the new location.

That’s useful to know. When I saw the environment variables I assumed they were there to make something “configurable”, but I couldn’t really find much documentation pointing out how. If they are just “helpers”, that pretty much explains it.

That’s just the kind of thing I’m trying to avoid really, and why I’m not happy with my current “solution”. I don’t want to break things, I don’t want to move away from too many assumptions others make when they write scripts etc. So, even if I figured out how to do it, it would most likely be a bad idea for future updates and whatnot.

I’ll start to try to mimic the setup you described using symlinks, if that works it should be the closest I can come to the “ideal” solution. Thanks.

Edit: When going back and looking at some of the many posts complaining that git doesn’t follow symlinks, it suddenly hit me: Maybe we’re talking about different things. I’m thinking of symlinking folders, but they might be talking about symlinking files. That would make sense, since git doesn’t really care about folders (they are just a means to an end). You can’t commit a folder, it must have a file in it, and the folder is thus just a part of the path for said file.
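Whether git treats a symlinked folder differently from a symlinked file is easy to check in a sandbox. The sketch below is only an experiment under assumed names (the repo, outside and demo.items names are invented), run against a throwaway repo. On a typical Linux setup it should report index mode 120000, which is git’s mode for a symlink, and refuse to add the file behind the link.

```shell
#!/usr/bin/env bash
# Sandbox check: how does git record a symlink that points at a folder?
set -euo pipefail
command -v git >/dev/null || { echo "git is not installed"; exit 0; }

repo="$(mktemp -d)"       # throwaway repo root
outside="$(mktemp -d)"    # stands in for a folder such as /etc/openhab
trap 'rm -rf "$repo" "$outside"' EXIT

git init -q "$repo"
echo 'Switch Demo_Switch' > "$outside/demo.items"
ln -s "$outside" "$repo/conf"     # folder symlink inside the repo

# Stage the link and read its mode from the index.
git -C "$repo" add conf
link_mode="$(git -C "$repo" ls-files -s conf | cut -d' ' -f1)"
echo "index mode of conf: $link_mode"

# Try to stage a file that sits behind the link.
if git -C "$repo" add conf/demo.items 2>/dev/null; then
  behind_link="staged"
else
  behind_link="refused"
fi
echo "adding conf/demo.items: $behind_link"
```

If that is what you see, the files only end up in the repo when the real folder (or a mount of it) sits inside the repo root; a symlink that points out of the repo is stored as just a link.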

e36Alex · June 29, 2022, 7:39pm

Hi,

I do my config backups in two ways:

Running a daily cron job for the built-in openhab-cli backup
Versioning my config in a self-hosted GitLab instance, with CI/CD to automatically deploy my config to openHAB

For both I use Ansible as my helper:

Daily Backup:
I have a playbook that takes an openHAB backup (stored on my mounted NAS, which is itself backed up, encrypted, to cloud storage). It is executed by a cron job on my Ansible host:

---
- hosts: openhab
  gather_facts: yes

  tasks:
    - name: Daily Backup
      shell: openhab-cli backup /srv/backups/openhab/openhab-$(date +\%Y.\%m.\%d-\%H:\%M:\%S).zip
      register: result

Versioning & CI/CD with Gitlab:
I use this .gitlab-ci.yml to deploy my config to my openHAB server:

image: ruby:2.6

pages:
  stage: deploy
  before_script:
  ##
  ## Install ssh-agent if not already installed, it is required by Docker.
  ## (change apt-get to yum if you use an RPM-based image)
  ##
  - 'command -v ssh-agent >/dev/null || ( apt-get update -y && apt-get install openssh-client -y )'

  ##
  ## Run ssh-agent (inside the build environment)
  ##
  - eval $(ssh-agent -s)

  ##
  ## Add the SSH key stored in SSH_PRIVATE_KEY variable to the agent store
  ## We're using tr to fix line endings which makes ed25519 keys work
  ## without extra base64 encoding.
  ## https://gitlab.com/gitlab-examples/ssh-private-key/issues/1#note_48526556
  ##
  - echo "$SSH_PRIVATE_KEY" | tr -d '\r' | ssh-add -

  ##
  ## Create the SSH directory and give it the right permissions
  ##
  - mkdir -p ~/.ssh
  - chmod 700 ~/.ssh

  ##
  ## Optionally, if you will be using any Git commands, set the user name
  ## and email.
  ##
  # - git config --global user.email "user@example.com"
  # - git config --global user.name "User name"
  script:
  - apt-get update
  - apt-get install openssh-client git-core -y
  - eval $(ssh-agent -s)
  - mkdir -p /root/.ssh && touch /root/.ssh/known_hosts
  - echo "$SSH_PRIVATE_KEY" | tr -d '\r' | ssh-add -
  - ssh-keyscan ansible.home.domain.de >> /root/.ssh/known_hosts
  - ssh $SSH_USER@$VM_IPADDRESS "ansible-playbook /etc/ansible/playbooks/openhab_deploy_config.yml"

At the end another ansible playbook is triggered, that syncs the files to the config folders of openhab:

---
 - hosts: openhab.home.domain.de
   gather_facts: no
   tasks:
   - name: Clone openhab config from gitlab repo
     git:
       repo: git@gitlab.home.domain.de:homelab/openhab.git
       dest: /tmp/openhab-config
       clone: yes
       update: yes
     delegate_to: localhost
     run_once: True

   - name: Synchronize and delete files in dest on the remote host that are not found in src of localhost.
     ansible.posix.synchronize:
       src: /tmp/openhab-config/
       dest: /etc/openhab
#       rsync_opts:
#         - "--chmod=F755"
       delete: yes
       recursive: yes

   - name: Recursively remove directory
     ansible.builtin.file:
        path: /tmp/openhab-config
        state: absent
     delegate_to: localhost
     run_once: True

I don’t know if it’s a “best practice”, but for me it’s working very well.

rlkoshak · June 29, 2022, 7:44pm

The only thing that matters here is whether openHAB will follow them and the answer there is yes it will.

But you have the added complexity of zram. I see no reason why symlinks wouldn’t work in zram but don’t know that for sure. I can’t remember if I ever tried it.

The distinction gets a little fuzzy though. Does a Z-Wave Thing that openHAB discovered and you’ve accepted from the Inbox count as something you’ve made or something autogenerated? What about the logging config which lives in userdata/etc/log4j2.xml? What about the locale, timezone, and other system settings?

If you set the timezone in MainUI, is that something automatically generated or something you made? If you’ve accepted a Thing from the Inbox is that something autogened or something you made? What if you’ve changed the Thing’s properties after accepting it, does that turn it into something you’ve made?

It’s not as cut and dry as you might think.

I don’t know anything really about how git handles symlinks. But on Linux, a symlink is just a file (everything on Linux is a file) with a certain bit set and the path to the folder it links to. I think I have checked in symlinks in the past and they both checked in and worked, but all that git sees is the file, not what the file is pointing to.

Hard links are another way to make the same file appear in two places on the same file system, but Linux doesn’t allow them for directories. For folders, the closest equivalent is a bind mount: rather than having a pointer to another location, it literally mounts that folder in both places.

Nadahar · June 29, 2022, 8:11pm

The majority of java.nio file methods take an optional argument for how to handle symlinks. Are you really sure that every part of openHAB, every binding, plugin and whatnot, has done this the same way?

While I can’t claim to know much about how zram works, I would be very surprised if it didn’t handle symlinks. So much is symlinks in Linux that I’d say it would make it virtually useless.

I know that there are “grey areas”, but they aren’t generally that difficult to decide on (I have done it this way for years in OH2 already). Timezones, locale and similar are excluded, they are configured during a “normal installation”, the same goes for creating users etc. What I’m trying to separate out is the actual “work” I do. The z-wave XML files are also excluded. The “tricky” part is the JSON database, as it has a bit of both. But, this doesn’t have to be 100% “pure” anyway, I can and do make judgement calls. If something is “generated” but it doesn’t keep changing, it doesn’t ruin my diffs and thus oversight over what I actually changed, so it can stay. If it does “disturb” and is generated, it is excluded. I generally don’t consider “discovered” things something I need to keep. Manually created things are different though, although I have very few of those. In a way you could say that I’m trying to distinguish between things that are and aren’t related to any particular openHAB installation. What I want to keep is what I’d want to “bring with me” to a new installation.

I know there are challenges when checking out code. Git doesn’t really “see” the symlinks and checks out the file content, and will then create a file instead of “restoring a symlink” when checking out. This applies when switching branches etc. There is a setting for defining some of this behavior:
https://git-scm.com/docs/git-config#Documentation/git-config.txt-coresymlinks
I’m hoping that as long as these folders always exist in all branches and whatnot, they will never be deleted and thus never need to be recreated, and all should be fine.

rlkoshak · June 29, 2022, 8:25pm

All of the config file reading/writing is centralized in core. The bindings don’t implement this themselves.

Nadahar · June 29, 2022, 8:51pm

Yes, I’m not saying that openHAB doesn’t support it fully, I’m just saying that this is “fragile” and it’s difficult to be sure that every part of the code does the same thing. One thing that makes it more difficult is that the default varies by method in Java. While most methods do follow symlinks by default, there are things like Files.walkFileTree() that do the opposite. I guess you could argue that there’s a certain logic, since a recursive operation might get caught in a circular loop ending in a stack overflow if not “protected” against it, but regardless of the reasoning - if you aren’t very aware of this it could quickly bite you.

A quick search in openHAB core found an example of just this, although this code seems to be part of a test so it probably doesn’t matter:

https://github.com/openhab/openhab-core/blob/ad936cd83f70d9ac9732f8e1b94acc04939fe9ed/itests/org.openhab.core.config.dispatch.tests/src/main/java/org/openhab/core/config/dispatch/internal/ConfigDispatcherOSGiTest.java#L77

          @BeforeAll
          public static void setUpClass() {
              // Store the default values in order to restore them after all the tests are finished.
              defaultConfigFile = System.getProperty(ConfigDispatcher.SERVICECFG_PROG_ARGUMENT);
          }
          
          @BeforeEach
          public void setUp() throws IOException, InvalidSyntaxException {
              configBaseDirectory = tmpBaseFolder.getAbsolutePath();
              final Path source = Paths.get(CONFIGURATION_BASE_DIR);
              Files.walkFileTree(source, new CopyDirectoryRecursive(source, Paths.get(configBaseDirectory)));
          
              configAdmin = getService(ConfigurationAdmin.class);
              assertThat(configAdmin, is(notNullValue()));
          
              // Delete configurations created by previous tests
              Configuration[] configurations = configAdmin.listConfigurations(null);
              if (configurations != null) {
                  for (Configuration configuration : configurations) {
                      configuration.delete();
                  }

You find the opposite in another part of the code, which seems to be a part of the system itself, but it is then “vulnerable” to getting caught by circular references:

https://github.com/openhab/openhab-core/blob/6b224b70026972e825d234c897fb8d8fa159e309/bundles/org.openhab.core/src/main/java/org/openhab/core/service/WatchQueueReader.java#L134

                      registerDirectoryInternal(watchService, watchService.getWatchEventKinds(toWatch), toWatch);
                  }
              } catch (NoSuchFileException e) {
                  logger.debug("Not watching folder '{}' as it does not exist.", toWatch);
              } catch (IOException e) {
                  logger.warn("Cannot customize folder watcher for folder '{}'", toWatch, e);
              }
          }
          
          private void registerWithSubDirectories(AbstractWatchService watchService, Path toWatch) throws IOException {
              Files.walkFileTree(toWatch, EnumSet.of(FileVisitOption.FOLLOW_LINKS), Integer.MAX_VALUE,
                      new SimpleFileVisitor<>() {
                          @Override
                          public FileVisitResult preVisitDirectory(@Nullable Path subDir,
                                  @Nullable BasicFileAttributes attrs) {
                              if (subDir != null) {
                                  Kind<?>[] kinds = watchService.getWatchEventKinds(subDir);
                                  registerDirectoryInternal(watchService, kinds, subDir);
                              }
                              return FileVisitResult.CONTINUE;
                          }

The only point with this is to try to show the basis for my “careful attitude” towards symlinks, I’m not saying that this is something that needs to be discussed. I’m not making any claims that openHAB doesn’t handle it properly, but I do experience some things not working properly after having done my OPENHAB_USERDATA symlink. I am in the middle of reversing it now anyway, it’s far from a desirable solution and following your example seems like a much better “design”.
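The “follow links or not” default discussed above is not specific to Java, which makes it easy to get a feel for it from the command line. As a loose analogy only (this is find, not java.nio), the sketch below counts what a recursive walk sees with and without following links. All names are throwaway ones inside a temp directory.

```shell
#!/usr/bin/env bash
# Analogy: a recursive walk with and without following symlinks.
set -euo pipefail

root="$(mktemp -d)"
trap 'rm -rf "$root"' EXIT

mkdir -p "$root/real" "$root/tree"
touch "$root/real/demo.cfg"
ln -s "$root/real" "$root/tree/linked"   # folder symlink inside the walked tree

# find does not descend into symlinked folders unless -L is given.
default_hits="$(find "$root/tree" -name '*.cfg' | wc -l | tr -d ' ')"
follow_hits="$(find -L "$root/tree" -name '*.cfg' | wc -l | tr -d ' ')"

echo "without -L: $default_hits file(s) found"
echo "with    -L: $follow_hits file(s) found"
```

The same folder gives two different answers depending on one option, which is the kind of silent difference that makes me careful with symlinks.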

Nadahar · June 30, 2022, 3:09am

I’m finally done undoing the stuff I had done, making “links” to the folders and modifying the samba shares. It seems to work perfectly, thanks for putting me on the right track.

For others that might find themselves in the same situation (openHABian), I studied this a bit and found that I had already let openhabian-config apply “System tweaks” (menu item 13) which had already created some mount points in /srv. I could have established the git repo in /srv directly, but I felt better having a dedicated folder for the git repo. So, I ended up creating /srv/openhab-git/conf and /srv/openhab-git/userdata where /srv/openhab-git is the git “root folder”.

I noticed that the other mount points under /srv weren’t symlinks but were mounts instead. So, I chose to just replicate the way openhabian-config had done it. It is a bit “involved”, so I’ll post the details here.

I created two files in /etc/systemd/system (after cd’ing there):

sudo nano 'srv-openhab\x2dgit-conf.mount'

…with this content:

[Unit]
Description=openhab-git/conf mount
DefaultDependencies=no
Before=smbd.service
After=network.target zram-config.service

[Mount]
What=/etc/openhab
Where=/srv/openhab-git/conf
Type=none
Options=bind,rw

[Install]
WantedBy=multi-user.target

sudo nano 'srv-openhab\x2dgit-userdata.mount'

…with this content:

[Unit]
Description=openhab-git/userdata mount
DefaultDependencies=no
Before=smbd.service
After=network.target zram-config.service

[Mount]
What=/var/lib/openhab
Where=/srv/openhab-git/userdata
Type=none
Options=bind,rw

[Install]
WantedBy=multi-user.target

Just ignore the “strangeness” of the names: systemd requires that a mount unit’s file name “reflects” the path of its mount point, according to some strange rules. With those two files in place, they can be enabled to be mounted automatically during startup with:
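For the curious, the naming rule for the two paths used here boils down to: drop the leading slash, turn every literal dash into \x2d, and turn every remaining slash into a dash. The sketch below is a simplified illustration that handles only those two characters (mount_unit_name is a made-up helper); the systemd-escape tool implements the full rules and is the safer choice for other paths.

```shell
#!/usr/bin/env bash
# Simplified sketch: handles only "/" and "-" in the path.
# Other special characters need the full rules (see systemd-escape).
mount_unit_name() {
  local path="${1#/}"          # drop the leading slash
  path="${path%/}"             # drop a trailing slash, if any
  path="${path//-/\\x2d}"      # literal dashes become \x2d
  path="${path//\//-}"         # path separators become dashes
  printf '%s.mount\n' "$path"
}

mount_unit_name /srv/openhab-git/conf
mount_unit_name /srv/openhab-git/userdata

# Where available, systemd's own tool gives the authoritative answer.
if command -v systemd-escape >/dev/null; then
  systemd-escape -p --suffix=mount /srv/openhab-git/conf
fi
```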

sudo systemctl enable 'srv-openhab\x2dgit-conf.mount'
sudo systemctl enable 'srv-openhab\x2dgit-userdata.mount'

…and mounted with:

sudo systemctl start 'srv-openhab\x2dgit-conf.mount'
sudo systemctl start 'srv-openhab\x2dgit-userdata.mount'

Their status can be checked like any other systemd service with systemctl status, and they can be “stopped” to be unmounted. That is quite handy if you have to do some git trickery, like I had to, to move an existing repo to a new place for example. As long as they are unmounted it’s “safe” to “check out” or “clone” there without risking overwriting anything in the “real” folders. I then manually deleted all the contents of the conf and userdata folders before “starting” the mounts and verifying that all was OK with git status.
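Since an unmounted conf or userdata folder just looks empty, it can be worth confirming that both bind mounts are active before running git commands in /srv/openhab-git. A minimal sketch using mountpoint from util-linux follows; check_mounts is a made-up helper name, and the paths are the ones from this post.

```shell
#!/usr/bin/env bash
# Report whether each given directory is currently a mount point.
# Returns non-zero if at least one of them is not mounted.
check_mounts() {
  local dir missing=0
  for dir in "$@"; do
    if mountpoint -q "$dir" 2>/dev/null; then
      echo "mounted:     $dir"
    else
      echo "NOT mounted: $dir"
      missing=1
    fi
  done
  return "$missing"
}

if check_mounts /srv/openhab-git/conf /srv/openhab-git/userdata; then
  echo "Both bind mounts are active."
else
  echo "At least one bind mount is missing; check systemctl status for the mount units."
fi
```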

To make the new folder shared via samba as well, I added the following section to /etc/samba/smb.conf:

[openHAB-git]
  comment=openHAB config Git root
  path=/srv/openhab-git
  writeable=yes
  public=no
  create mask=0664
  directory mask=0775
  veto files = /Thumbs.db/.DS_Store/._.DS_Store/.apdisk/._*/
  delete veto files = yes

mstormi · June 30, 2022, 6:56am

FWIW the idea of a feature in openHABian to store your config in a git repo has been around for a long time: Configuration management in Github · Issue #1220 · openhab/openhabian · GitHub

I’d be happy if anyone reading this joined the discussion over there and maybe helped with development.

Nadahar · June 30, 2022, 3:20pm

I for one had already seen Configuration management in Github · Issue #1220 · openhab/openhabian · GitHub before I started this thread. I didn’t get involved in that, because as I understand it that’s about something slightly different than what I was after. The GitHub “connection” and the level of automation that’s discussed makes it “out of scope” for me. I’m not saying it’s necessarily a bad idea, I’m just not sure I quite understand how it is supposed to work.

I see the use of git like this: Git has a somewhat high threshold to learn because of its counter intuitiveness and frankly sometimes confusing documentation. When you get the fundamentals, it’s a brilliant tool, but I don’t expect non-developers to invest the time needed to get to that point. Personally I use it for many different things, I even store for example 3D models or SVGs I make using git. I just love the ability to see how things evolved and pinpoint “when something went wrong” - and the ability to “review” my own work (using diff) before committing as long as the files are text files.

If you don’t know how to use diffs or history, I’m not sure how much value git adds over just doing a regular file backup. Therefore I’m not sure I understand how automating it so that users unfamiliar with git can use it is useful.

Personally I don’t want to use GitHub or any other online service to store anything that is “personal” in any way (like the overview of my home a configuration would give) either, private repo or not. I’m running my own “basic” git repo on a Linux VM without any fancy GUI that I control and backup.

Because of the differences between my goals and those of #1220, I’m not sure I have anything to contribute to #1220. If I feel I have something to contribute at one point, I will chime in.

I will add that I think it might be helpful if you edit the first post of #1220 to make it easier to understand what the idea is, to potentially increase contributions.

rubens · July 2, 2022, 9:21pm

Thanks for this very helpful and inspiring post!

I followed your example and later discovered gitea. After some hours of setup and exploring I now have a self-hosted git service running and have started adapting my workflows to it. One cool aspect is using the integrated issue tracker to keep track of my ideas for developing my home automation.