If I may dare to throw another thought into the room that came to me last night: a lot of people here have stated that they tend to stick with older versions for a long time before they finally upgrade. Long-range upgrade steps therefore seem to be common, especially among experienced users with larger setups.
Testing and supporting upgrades from any arbitrary version x to any higher version y is impossible in practice. Nobody will want to take that effort upon themselves, and rightly so. But this creates a lot of diversity in the actual upgrade paths people take: some will upgrade from x to y, some from x+2 to y-3, and so on, with lots of potential for things to go wrong in specific combinations.
If there were specific support and testing for some (i.e., a very select few) long-range upgrades, many people would apparently benefit. For instance, the last bugfix release before each new major feature release could be designated as a “long-range upgrade” release (e.g., v4.3.9, v3.4.3). Upgrades from the previous long-range upgrade release could then be specifically tested whenever the next of these rare releases is prepared, in addition to testing upgrades from the immediate predecessor release. The additional effort seems rather limited - it would only mean that perhaps once every 1-2 years one additional upgrade step would need to be tested.
For all the users here who prefer infrequent, long-range upgrades, it would mean that they could “sit” on their long-range upgrade release and wait for the next one - unless they specifically need or want the new features in the more fine-grained releases.
An “extended” version of this concept would be long-term service releases, where the designated older versions also receive bugfixes for a longer time. While that would surely also be good for some people here, it comes with very significant additional work for backporting patches. “Just” supporting the upgrade path, in contrast, is very little additional (testing) effort.
IMO we have to add a very real issue with the “wishlist” topics: reading hundreds of comments is quite time-consuming … it would be better if such wishlists were organized as wiki topics where everyone could edit a single post and add their ideas if they aren’t already there, and where someone could organize the ideas.
I would support both of these! Though I would not underestimate the implementation complexity: the upgradeTool is relatively “simple” and just modifies the JSONDB while openHAB is not running. Doing this while the system is running would be more challenging.
As a side note: Even listing breaking changes for a release in the UI would be a big visibility improvement!
And an additional thought: since openhab-distro ships the backup functionality IIRC, it should be available on all systems. What if we integrated an automatic backup into the upgrade process? Or offered a backup with “Yes” as the preselected option? I think this should be relatively easy to implement as a first step, as the above will likely take more time.
As I’ve written before, there is nothing to gain from the version number.
If the user’s file is compatible with an old version of openHAB, there is no reason for it not to be loaded.
If the file is incompatible (e.g. because of a new structure), parsing with an old version should always result in a corresponding error message.
If the file is compatible with the new version, it should be loaded.
If the file is incompatible with the new version, parsing with the new version should always result in a corresponding error message.
Even backwards-compatible parsing is not an issue:
Parse with the current parser; on error, parse with the older parser; on error, parse with the next older parser; … .
If all parsers are exhausted, show the error message of the current parser.
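The fallback chain described above could be sketched like this (the parser functions are hypothetical stand-ins, just to illustrate the control flow of falling back and reporting the newest parser’s error):

```python
class ParseError(Exception):
    pass

def parse_with_fallback(text, parsers):
    """Try each parser newest-first; if every one fails,
    re-raise the error from the newest (first) parser."""
    first_error = None
    for parse in parsers:
        try:
            return parse(text)
        except ParseError as exc:
            if first_error is None:
                first_error = exc
    raise first_error

# Hypothetical stand-ins for version-specific config parsers:
def parse_v2(text):
    raise ParseError("v2: unknown top-level key")

def parse_v1(text):
    return {"parsed_with": "v1"}

result = parse_with_fallback("items: ...", [parse_v2, parse_v1])
# result == {"parsed_with": "v1"}
```

The one design decision worth noting is which error to surface when everything fails: here it is the newest parser’s message, matching the suggestion above.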
In the end, people will just mass search-and-replace the version tag and check the openHAB log for errors anyway.
It would already help tremendously if we had a proper separation between openHAB cache and openHAB data - e.g. move cache, tmp, config, logs, etc. (…) out of userdata.
It’s always a pain to pick out the files that are needed and delete the others when doing an upgrade.
An automatic backup restore tool would benefit from this, too.
Automatically clearing the openHAB cache on a version upgrade would be nice, too.
Do you think it would be an option to insert a check in the start script that compares the version stored in the jsondb to the version being started, and if not the same, starts a management UI instead of OH? That management UI could offer to run the upgrade tool (or integrate it in its code), create a backup, restore a backup, show breaking changes…? It avoids doing all of this while OH is running. That management UI could start simple (just offer to run the upgrade tool) and grow over time.
It sounds to me like you’re saying that versioning in general is a useless and wasted effort, and that people should just “try and see what works”.
I disagree; you’re skipping a lot of details here. Not all parsing “fails”, for example - sometimes it might just parse things wrong. Another is the huge mess of (often meaningless) error messages that would be logged.
I think versioning is the only right thing to do, but the version number should be bumped sparingly, only when really needed.
None of those threads I can remember were started by maintainers, and usually few if any maintainers participate in them. They are almost always a bunch of users arguing over what they want OH to be, and they usually go nowhere because few issues or PRs get generated out of them.
But we can’t police posts based on the quality of people’s opinions.
I do that. You do that. How do we make everyone else do that?
Not with duplicate devices. But it’s probably only a minority of devices that can be controlled from only one OH instance. And there are tools like Remote openHAB that can handle that.
Mostly one does “periods of processing”: bring down the old instance, bring up the new one for testing and validation, and if something goes wrong that cannot be fixed right away, all that needs to be done is to bring down the new instance and start the old one again.
If one is careful this can be done all on the same machine.
Docker is particularly good for this sort of process.
But running in parallel is good for big jumps where the number of breaking changes is huge. In that case, what one is doing is running the new instance using Remote openHAB or the MQTT EventBus to replicate and synchronize the Items from the old instance to the new. Then the config is gradually transferred from the old to the new, a little at a time.
Not necessarily. The second instance can be a manual install. Once the configs are proved out, then you can upgrade the “installed” version with the upgraded configs from the manual install.
Or, like I said, Docker makes this really easy.
I can describe how I do major upgrades and normal upgrades if it will help.
For a major upgrade, e.g. 2.5 to 3.0, or 3.4 to 4.0, I did the “run in parallel” approach. I run in Docker, so it was no problem whatsoever to just spin up another container on the same machine and have it use different ports. Then I used the MQTT Event Bus (2.5 to 3.0) or the Remote openHAB add-on (3.4 to 4.0) to replicate all the Items in the old instance on the new instance.
At this point the new instance had all my Items, and they were receiving command and updates in sync with the old instance. But the new OH didn’t actually do anything yet.
Then I choose a single binding. I disable the Things for that binding on the old instance, install the binding on the new instance, discover/create the Things, and link them to the already existing Items. I also move the rules that use these Items over at this time (disabling them on the old instance). Once everything is working, I remove the disabled stuff and no-longer-used Items from the old instance.
Repeat this until everything has moved from the old instance to the new, and we’re done. Then just launch the container with the new instance and updated config using the default ports.
I save persistence for last to avoid duplicate entries.
However, it’s only that involved for a big jump with lots of breaking changes or if one wants to redo their config for some reason (e.g. rename the Items).
For a normal upgrade (4.3 to 5.0, 5.0 to 5.1) I just pull the new image and run the new container with the old config. A backup is created automatically and upgradeTool will be run automatically.
In the event that a breaking change happened, and I don’t have time to deal with it, I just restore the backup and start the old container until I do have time to deal with it.
But I didn’t experience any major problems with either of these upgrades. Certainly there were some manual changes I had to make here and there (recreating some Things, modifying some rules), but I was able to do that in less than half an hour.
Upgrades have become so reliable for me for OH that I even automate it now. Every few days I run Ansible scripts that upgrade all my machines, pull the new Docker images, etc. and if there’s an upgrade for OH (or any other of the many services I run) they get pulled and deployed automatically. I check the OH logs often enough that I catch any problems pretty quickly.
It won’t always be the case that OH can come up far enough for MainUI to start in the first place before upgradeTool has run. That’d be my primary concern.
We don’t have anything like that built into upgradeTool so it’s a theoretical concern right now though.
It should be part of the health check I would think. That already points out problems and offers to automatically fix them where it’s able to.
That only works because those services can come up to a functional state with the old config in place so you have the opportunity to make that choice and do the migration. OH doesn’t currently support that. It would be cool if it did but I think making it so would require significantly more work than just making the upgradeTool runnable from MainUI after an upgrade.
Several of the tasks modify the JSONDB, so there will most definitely be problems there, but not insurmountable ones, particularly if the upgrade web page/service prevents OH from coming up to the point where it has already read those in.
That versioning only applies to YAML text-based configs though, not DSL or managed configs. YAML and DSL text configs should never be touched by upgradeTool, I think. The users using those have signaled a desire to have complete control over their configs, and mucking around with them with automatic edits from upgradeTool would violate that.
On the other hand, this new upgradeTool page that gets displayed can present to users the fact that there is now a v2 of the YAML configs, with a brief description of what it adds. And because of the way the YAML versioning works, OH will still work with the v1 files without problems.
Tried that. People won’t edit the original post and just reply. It’s very frustrating. The only thing I think can work is if whoever creates the thread takes it upon themselves to update the OP with summaries. That doesn’t mean we cannot make the post a wiki post and that others can’t also help keep it up to date. But someone needs to commit to doing that, or else it just becomes a regular post.
It’s already there in Docker. entrypoint.sh notices that the version of the configs (based on /openhab/userdata/etc/versions.properties) is different from the version installed in the image, zips up the conf and userdata folders (with some excluded folders, of course), and puts that into /openhab/userdata/backup with a timestamp. Or rather, it now only tars up userdata - I guess it doesn’t touch conf anymore; the behavior seems to have changed at some point. I’ve not had to restore a backup in so long that I didn’t notice.
I don’t have a package-installed OH instance at the moment, but I always assumed apt/yum upgrades did this backup too. If not, it certainly is a good idea to add it.
Anyway, there is precedent for this in the Docker image. It would be great if whatever is done were mostly consistent across all OH deployment options.
In addition, it adds useful information for the end user and those of us helping on the forum. It becomes very easy to see, for example: “OK, this file is version v1 - what’s changed between v1 and v2 that could be causing this problem?”
Not everybody runs OH that way - and it won’t solve many of the problems mentioned here anyway, with lost persistence data if you have to downgrade etc.
No SD card here, the system’s way too big for that. But that’s essentially what I was referring to with the cloned SSD, yes. If I swap back after two weeks, though, then everything that happened in these two weeks will be lost.
There are many gems in @rlkoshak’s long post, so thanks once again. It seems that’s feasible only if you don’t rely much on persisted state (e.g., my rules often depend on state records from a week back or longer to decide what they shall do), and if one doesn’t mind re-linking 1000+ Items successively (plus finding and fixing what got accidentally mixed up in that process). Nevertheless, this gives me a few ideas for how I can streamline the process a bit more in the future.
One thing you can do is host the database on a different server/VM. With the persisted data on a separate machine you can change the host running OH all you want and the only data lost will be data that was emitted while OH was offline, which you will lose anyway in any scenario.
In the 10+ years OH has been around, there has never been a breaking change to how persistence data is stored so the database will work with any version of OH. Obviously I can’t promise this will always be the case, but the developers use OH too, and they don’t want to lose their data either. So there is a strong incentive to not break the data.
I do too. There are many ways to handle that. In the “migrate little by little” approach, probably the easiest is to not migrate those rules from the old instance to the new until after you’ve migrated your persistence.
Who said anything about doing it one by one?
If you use .items files, it’s a find and replace.
If you have a managed config for Items, it’s “add points to model” or “add equipment to model”
Or you can migrate the Items to the new instance after you migrate the binding and the Things. If you have .items files and are careful to have the Thing keep the same UUID as it had on the old instance you don’t need to change the links at all.
If you are linking one Item at a time, you are not taking advantage of the tools available to you to make this job much easier.
When I migrated to 3.0, I made heavy use of “add equipment to model” and just recreated my Items from scratch. With that you can create, configure, and link all the Items connected to all the Channels of one Thing in one go. And it ends up in the semantic model where you want it.
If you wanted to use the MQTT EventBus, it could be a little easier. This rule template works off of Group membership, and there are just two Groups. So you could copy your Items over, keeping the link configs (assuming a file-based config). Do a find-and-replace to add them all to the two Groups (or do it in the UI). If you make sure your Things on the new instance keep their UUIDs, once migrated you just need to remove the Items from the Groups - or not, if it’s not causing problems for the old instance.
If this required tediously messing with each individual Item, I wouldn’t recommend it in the first place.
There’s also a lot you can do from rules or through the REST API if you prefer a more programmatic approach.
If one parser sees the file as valid you obviously wouldn’t see any error message.
In the worst case that would be the parser of the version you created the config with.
But that information could also be transported by either a comment or a created_with node.
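For illustration, such a marker might look like this in a YAML config (the `created_with` key is hypothetical and not an existing openHAB schema; it merely records which version wrote the file without dictating which parser must read it):

```yaml
# Hypothetical example: the node records the producing version,
# alongside or instead of a format version number.
version: 2
created_with: "5.0.1"
```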
Let’s assume I am a new user. I unpack openHAB into a directory. It runs very well.
Now a new version comes out. I unpack it into a different directory, so I can quickly switch back in case things go wrong.
Then I copy userdata and conf from the old version to the new version, because I obviously want my user data and configuration in the new version.
I am now stuck with cryptic error messages, and the new version does not start at all.
That’s very unexpected behavior.
I am not denying any of that. It’s also why I keep stating that people running into issues often did not use the upgrade scripts. The proper procedure is to first make a backup, then run the upgrade using the proper script. If it doesn’t work out, restore your backup. Just copying over was never expected to always work, but people got spoiled because it works most of the time. And of course, when just copying over, cache and tmp have not been cleared, which often leads to issues as well and leaves the impression that you need to clear them manually every time something goes wrong. That is a last resort; in normal situations where it is needed, it is done for you when upgrading the way it was intended.
I am open to improving this. That’s why I like the idea of making it more transparent and exposing the user to the upgrade when a new version starts for the first time, in a UI, instead of doing it in a script. But as long as we have data upgrades of managed configuration data, you can always run into the trap. We can make it more transparent though.
For file based configurations, there never is an automatic upgrade. So if you make changes when upgrading, you should know you may have to revert them when going back. But again, we can improve and make it more visible.
That’s what I understood of it, which is why I said “It sounds to me like”. So, since I didn’t understand, perhaps you should explain more, or perhaps it doesn’t matter whether I understand.
If you’re just throwing the parsers at the file in succession, you will get completely misleading errors from the parsers that fail - plus you can get parsers that don’t “error” but merely parse it wrong, thus preventing the “right” parser from ever parsing it. So, I’m afraid I don’t understand what you mean at all, probably because the scenario looks very different to me.
This still makes it sound to me like you’re trying to find a way “around” versioning. What is different with a created_with node? The version in the document is the “created with” node; nobody is preventing a parser “of another version” from parsing that file if it can do it correctly.
Here I do agree
It might not be the “supported” process, but I sure don’t read all the documentation for all the software I handle. You try to deduce what is logical, and if something is called e.g. “conf”, I find it reasonable to believe that if I’ve backed up that folder, I’ve backed up all my configuration/settings. I honestly don’t quite understand why parts of the JSONDB aren’t in “conf” (the content of the “tables” varies, but some of them are clearly “configuration”).
Userdata is more diffuse; it’s hard to know exactly what that means. I would perhaps assume that it means any “data” (like persistence) and all configuration, or it could be data that belongs only to a certain user - but that’s not really relevant to how OH works, as I’m not aware of any per-user data or settings.
And if it doesn’t work the way you guess it should work, and you don’t then go and look at the docs, I’m not sure what to say. It’s hard to see how that’s our fault if one continues to struggle.
It has been a very long time since I read the FHS, but my memory is that dynamically created data generated at runtime belongs in /var/lib, not /etc. Only static config files belong in /etc. Because the JSONDB is dynamically created and updated by OH as it runs (in response to interactions from the user of course), it belongs in /var/lib.
I don’t know if that was the original reasoning. It’s been this way since OH 2.0. But that was what I understood to be the why.
Of course, it gets confusing with the whole $OH_USERDATA/etc folder. But there too karaf can and does update these files dynamically (e.g. log4j2.xml).
So in short, it’s the FHS that pushes us in this direction. Looking back at the spec again, it does say this about /etc:
The /etc hierarchy contains configuration files. A “configuration file” is a local file used to control the operation of a program; it must be static and cannot be an executable binary. [2]
It’s not about who’s “fault” it is, you could say that in such a case, it is my “fault”. I’m not doing this uncritically, it’s a simplified example, but I’m trying to say that there are so many details in this world that people have to deduce/make assumptions, or they would never get anywhere. As a consequence, everything works smoother if things are “what they appear to be”, because assumptions line up better with the reality.
That is some Linux-specific “invention”; I must say that I find that many of those things don’t make that much sense. But it wouldn’t have to be “the same” configuration folder as the manual configuration files - it could be another dedicated folder. Something like this:
conf
manual (mapped to etc on Linux)
managed (mapped to var/lib/...something.../conf on Linux)
Yes, I’ve always found those “Linux rules” to be overly bureaucratic and not always very suitable. The concern is clearly about those making/managing the OS, not the applications, because these rules make many things simpler for them. That said, an application probably shouldn’t have write access to /etc at all.
But, as shown above, with some “creativity” one can make internal structures that allow mapping things around so that Linux is “happy with it”. There’s not really anything that mandates that you must put things in /etc anyway, is there? You can keep your configuration somewhere else, so that you’re not bound by those rules.
Regardless, we’re not starting from scratch here, designing this anew, so options are much more limited now. It still “bothers me” that some of the configuration isn’t under conf; it makes making backups (with Git, so only configuration data) more cumbersome. To get it all in one Git repo, I’ve had to make my own “/srv mount” on openHABian that encompasses the two folders. I assume that many others have the exact same problem. You don’t want to make your Git repo at the root folder just to get them into the same file structure.
No, this goes all the way back to Unix and POSIX. It’s the same on the BSDs, all the commercial Unixes (e.g. AIX), and Mac OS X. FHS itself is a Linux standard, but it’s basically a copy of how Version 7 Unix was laid out, with some updates over the decades as Linux evolved.
There are several Linux distros that do not follow FHS. But Debian (and its descendants) does, and the vast majority of our users run on some form of Debian.
Nothing is mandated on any Linux distro. But /etc is where static configuration files are expected to go. You can put anything anywhere if it makes you happy.
However, it is reasonable to follow the standards of the main OS OH is run on. It is reasonable to expect a package installer (apt/yum) to install software following the conventions of the operating system it’s being installed on.
If there are packages of OH for a Linux distro that doesn’t follow FHS, I’d expect OH to be installed based on what ever convention/standard that distro follows.
A manual install on Linux would have you unzip everything into /opt/openhab and conf and userdata will both be under that. But I don’t think the package installers are supposed to install to /opt like that.
If it really bothers you, you can do a manual install and have everything in one folder. Many do, even on Linux, for this reason.
I think the point was lost somewhere along the way. I have solved it with my “mount” that collects the two folders under one umbrella. It’s not about me, it’s about it being logical for people to manage, and the fact that configuration, data and cache/temp/log files aren’t properly separated.
What I was trying to say was that even if configuration was split into two folders, one could make a “virtual root folder” for those, that made the structure clear, and it would be obvious/intuitive where the configuration was found, what you have to backup/restore etc.