Storing configuration (userdata) in git

Hi all,

I am one of those who store and back up the userdata config folder in git, both for disaster recovery and for provisioning new raspis from a clean slate.

This approach has worked well for me, regardless of whether openHAB was configured through the UI or through text files.

Now, with openHAB 3.0, I have migrated to fully UI-based configuration, so a lot more of the config is stored in jsondb. I have noticed that small changes to rules/things/items can result in large diffs in git, which makes auditing the configuration history very hard. It seems that the ordering of entries in the “json objects” is very sensitive to changes in content (e.g. in jsondb/automation_rules.json).

How have others solved the issue? Are there any improvements planned in the way openHAB serializes the jsondb config?

I just keep a week’s worth of backups and use cron:

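#!/bin/bash
# Nightly openHAB backup; the weekday in the file name makes each run overwrite last week's file.

FILELOCATION="/root/sql-backups"

/usr/bin/openhab-cli backup "$FILELOCATION/openhab-$(date +%A).zip"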

It keeps 7 days of backups and simply overwrites the old data because of the naming:
-rw-r--r-- 1 root root 28176097 Jul 16 23:16 openhab-Friday.zip
-rw-r--r-- 1 root root 27521646 Jul 19 23:16 openhab-Monday.zip
-rw-r--r-- 1 root root 30771482 Jul 17 23:16 openhab-Saturday.zip
-rw-r--r-- 1 root root 30630379 Jul 18 23:17 openhab-Sunday.zip
-rw-r--r-- 1 root root 27193829 Jul 15 23:16 openhab-Thursday.zip
-rw-r--r-- 1 root root 26580118 Jul 13 23:16 openhab-Tuesday.zip
-rw-r--r-- 1 root root 27299064 Jul 14 23:16 openhab-Wednesday.zip

How do you plan to deal with secrets etc.?

Also, if you’ve got all your things in things files, do you need the json db?

I am looking to build that as a feature into openHABian, so I am interested in any existing code and best practices.
Is it really a problem to simply git-store all the files in /var/lib/openhab/jsondb? (maybe exclude backup/*)
I don’t understand why you think the order inside .json files is important: once in a file, the order won’t change unless you delete/re-add components. Note that you must not replace these files while OH is running.

Thanks for the answers everyone.

@ubeaut the OP described what I am after: the capability to audit the history of changes. Git works really well for this purpose, as I can enter commit messages describing each change.

However, it is currently super-hard to audit the actual diffs due to the issue described.


@psyciknz I’m currently trusting the git server with my secrets, without encrypting them at rest.

Regarding thing files: you misunderstood; I have configured everything fully in the UI. Anyway, there is a lot of other stuff in jsondb as well, such as MainUI page definitions.


@mstormi yes, I’m simply git storing all, ignoring certain transient directories (backup, tmp, cache etc).
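
For anyone wanting to replicate this, here is a minimal sketch of such an ignore list. It assumes the git repository root is the userdata folder and that the transient directories sit at the paths shown. The helper name is made up for illustration, so adjust everything to your own layout.

```bash
# Hypothetical helper: write a .gitignore into the userdata folder given as $1.
# The paths below are assumptions about a typical userdata layout.
write_userdata_gitignore() {
    cat > "$1/.gitignore" <<'EOF'
# transient data
/cache/
/tmp/
# backup copies of older jsondb files
/jsondb/backup/
EOF
}
```

Calling `write_userdata_gitignore /var/lib/openhab` would create the file in the userdata folder mentioned above.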

The order is not important for openHAB, that is clear. But it plays quite a big role in getting clean diffs when reviewing changes with git.

This comes back to the problem statement in the OP: the capability to audit and review changes in history, and also to clearly see non-committed (“non-finalized” / wip) changes.

“once in a file, order won’t change unless you delete/re-add components”

This does not match what I observe. Even without adding or deleting components, the top-level ordering inside the json can and will change, in my experience. (EDIT: this seems to stem from the use of ConcurrentHashMap in core, which does not guarantee any “sensible” ordering?)

Here’s a snapshot of the ordering between some trivial changes to rules (the diff here omits changes inside the rules themselves, for a cleaner diff). You can see that there are no new rules added, nor any rules removed.

I have used grep '^ "' jsondb/automation_rules.json to filter top level keys of the root automation_rules.json json object.
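
To reproduce this kind of comparison between two commits, something along these lines should work. It is only a sketch: the helper name is made up, and the pattern assumes the pretty-printed file indents top-level keys by one or two spaces and nested keys by more.

```bash
# Hypothetical helper: read a jsondb file on stdin and print only its top-level key lines.
top_level_keys() {
    grep '^ \{1,2\}"'
}

# Usage (bash, inside the repository; HEAD~1 and HEAD are just example revisions):
#   diff <(git show HEAD~1:jsondb/automation_rules.json | top_level_keys) \
#        <(git show HEAD:jsondb/automation_rules.json | top_level_keys)
```

If the diff shows lines moving around while the set of keys stays the same, the change is pure reordering.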

I’m also happy to see if I could contribute more suitable serialization logic, such that the ordering of entries is more rigid and better suited to diffing. But I wanted to see first if there are other ways to solve the problem.
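
One way to get readable diffs without touching openHAB might be to let git normalise the JSON before comparing. The sketch below assumes python3 is installed and that the repository root is the userdata folder. The function names and the driver name "jsonsorted" are arbitrary. It only changes how diffs are displayed, not the files that are stored.

```bash
# Print a JSON file with object keys sorted, so that reordered entries compare equal.
json_sorted() {
    python3 -m json.tool --sort-keys "$1"
}

# Register the same normalisation as a git "textconv" diff driver.
# Run once inside the repository; git appends the file name to the command.
setup_sorted_json_diff() {
    git config diff.jsonsorted.textconv "python3 -m json.tool --sort-keys"
    echo 'jsondb/*.json diff=jsonsorted' >> .gitattributes
}
```

After running `setup_sorted_json_diff`, `git diff` and `git log -p` should show the key-sorted view of the jsondb files, so pure reordering no longer shows up as a change. Note that arrays are left in their original order.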

I haven’t. I’ve found that it’s mostly an issue with adding and removing new stuff but not when editing existing stuff. But I’ve fully expected that the order might change so I mostly rely on good commit messages to understand what changed for that commit. I’ve not found myself needing to look at the diffs so I’ve not pursued it more.

I suspect that openHAB uses http://jsondb.io/ to do all this. I looked through their webpage and quickly through their docs and I don’t see any way to change it so it saves the entries in a specific order. I suspect they use a Map of some sort to store the Objects in memory and just dump them to file in the order they are stored in the Map.

I suspect there isn’t much OH can do about that short of abandoning it and using something else.

This is a problem all the way around whether you are using JSONDB or .things files. But using git does not necessarily mean storing the configs in the cloud somewhere. It’s very easy to set up a personal git server or, if one wants the pretty web UI it’s easy to run Gogs or GitLab.

I’ve seen some activity on OH’s GitHub for how to deal with secrets but I’ve not followed the progress on that discussion. It’s a known need though.

No, but there are limitations with .things files too. One has to choose: either let Things be discovered and lack control over the order in which they are saved to file, or preserve the order but fight the .things file syntax and the lack of certain features. For example, you cannot set parameters on a device (such as the cool-down time of a PIR) for Zwave Things defined in a .things file.

It can be very useful to look at the differences between two versions of the file to see exactly what changed. This becomes unusable though if OH decides to reorder the records in the JSONDB file, as the whole file will appear to have changed even if all you really did was add one logging statement to one rule.

Because of this reordering the utility of the git history overall is greatly reduced. Of course, that may not matter to many or even most OH users who store their configs in git. But for those to whom it does matter (mostly people whose day job is programming, I suspect) this reordering of the records essentially breaks git.

Personally, I’ve not experienced too many reorderings, but I expect them to happen so I simply do not use the diff and history features. I mostly use git as a backup, and as a way to develop on a separate branch for a while when I’m doing something involved that breaks things, with an easy way to switch back to “production”. I don’t use it as a way to analyze what I’ve done in the past.

That would be awesome! Sort by UID is what I’d choose as that is the one value that is not going to change for an entity. I didn’t look at how OH is storing the Objects. I just assumed that JSONDB managed all that. If we have control over that though it shouldn’t be too bad to sort the Map during output to file.


I have opened a PR against openhab-core to resolve the issue: “Consistent ordering of entries in jsondb” (openhab/openhab-core Pull Request #2437, by ssalonen).

Let’s see how the maintainers feel about the change.

With some internal testing I have found a case where the order can change in an unpredictable way. By default, a Java HashMap increases its capacity when it reaches 75% of the current capacity. With the default settings this happens once the map grows past 12 entries (75% load factor of the default capacity of 16). When the capacity changes, the map entries are internally re-hashed, and therefore the order might change.

So if you disable and re-enable an entity (say, a rule), it might trigger an increase of the map capacity, and jsondb will contain the same items in a different order. This would be a completely unnecessary change in file contents, since the meaning of the jsondb is exactly the same.

Perhaps this is what @mstormi meant with the comment “[might change the order when] delete/re-add components”.
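
The effect can be illustrated with a simplified model of how a bucket is chosen. This is a sketch only: the real HashMap additionally mixes the hash bits, and the hash values 17 and 2 are made up. Iteration follows bucket order, so when doubling the capacity moves an entry to a different bucket, its position in the output can change.

```bash
# Simplified model: bucket index = hash AND (capacity - 1), capacity being a power of two.
bucket_index() {
    echo $(( $1 & ($2 - 1) ))
}

# Number of entries a map of the given capacity holds before it resizes (75% load factor).
resize_threshold() {
    echo $(( $1 * 3 / 4 ))
}

resize_threshold 16    # 12
bucket_index 17 16     # 1  -> iterated before the entry in bucket 2
bucket_index 2 16      # 2
bucket_index 17 32     # 17 -> after doubling, iterated after the entry in bucket 2
bucket_index 2 32      # 2
```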

I’m marking this discussion as “solved”; further discussion is welcome in the GitHub PR or issue.
