openHAB reliability and upgrade experience

Ok, so we seriously take this thread off topic, then. :stuck_out_tongue_winking_eye: (Of course, we can continue this elsewhere if advisable.)

I’ll give you my full “rant” then, knowing full well that some of this will be justified, and other parts will just feel justified to me. I trust that you’re able to take it with the appropriate barrel of salt. And while you may, of course, disagree with me in as many regards as you wish, I hope that you’ll nevertheless get the general points.

I’d also like to say upfront that I love what OH accomplishes. I wouldn’t have gotten as far as I am without it. So it’s love and hate at the same time. :sweat_smile:

The key problem is that whenever something changes (an update, a new binding that I install, an overhaul in some parts of my rule set,…), I must hold my breath and wait to see whether several other things suddenly (or not-so-suddenly, noticeable only after days or weeks) fall apart. Usually they will. To some extent, this is clearly due to the complexity of my setup: we’re not talking about a Raspi with an SD card, we’re talking about a full-grown server system where OH is one service among several, and an OH setup that manages 1000+ items, 350+ automation rules, 500+ events per minute. 10+ years ago I started with things like switching the Christmas tree lights on iff the lights in the living room or the TV are on. By now my setup controls virtually everything in and around my home. Heating, rollershutters, PV incl. battery, car charging, smoke detectors, gates, lights, water supply, ventilation, home appliances, phones, doorbell, home entertainment in multiple rooms, alarm clocks, and quite a bit more. Yes, I enjoy that as a hobby. No, I am not expecting open source software made by enthusiasts to come with military-grade stability. Yes, I appreciate the incredible amount of hard work put into this, by so many skilled people. Yes, I know how to build and administer complex IT systems, and how hard it is to build them “right”. And yes, of course I have a plan B if things go very seriously wrong. So even in the worst case I will still be able to get through my front gate, I will still be able to switch on the lights, and with some manual pressing of buttons I won’t sit in the cold without heating. But over the years it has still become really, really mission critical for my entire household.

And yes, you will say that it’s natural that upon an upgrade things may go wrong. That’s true. I have that with other services, too, occasionally. But OH is different in one significant way: there’s no clean separation between code, configuration, and data. And this makes it a) super hard to go back to an older version if it turns out that things did go wrong, and b) equally hard to seriously skip updates (as you rightly suggest as a natural strategy for stuff you depend on).

When I do a version upgrade, I want to do it with my distribution’s package manager. I want to do it without having to read up in forums, in release notes, and perhaps even in the code first, hoping not to miss something that comes back to haunt me later. I want to stay in the user role. I spend more than enough time in the guts of other systems, so for this one my desire is that it just does its job, with all the flexibility that it allows me to leverage on the user side.

But if I upgrade OH, things like the infamous “upgradetool” start shoveling around stuff - because OH has more and more started to treat configuration like data since the advent of PaperUI, with different, incompatible formats every now and then and stuff getting stirred together in oftentimes surprising ways and places. An upgrade may also result in an unexpected restructuring of my persistence storage, and so on. Of course I will make sure I have a good backup before I upgrade. But if three weeks after the upgrade I figure that something in the new version is not really doing what it’s supposed to do, I can either roll back with the old backup, losing all my changes and likely also most persistence data from the time since the upgrade (even if it’s in a separate database, as the schema may have changed since the upgrade, or because of meanwhile added items, or…), or, if I don’t want the full roll-back, I’m bound to stick with whatever version I got myself into. Carefully, piece-by-piece transferring configuration data? Perhaps even version-controlling configuration data like I do for most other services? No way, at least not dependably without the constant risk of missing something. As we’ve seen two days ago, it even sucks new binary code from a directory into a place somewhere in the intestines of the system. No way back but by downloading the old version again, hoping that it gets integrated correctly, and still expecting some things to be different (like IDs), with possible consequences that are hard to predict. (Yes, shit happens, I am optimistic, but at the same time appalled by the general systems design approach as such.)

So I have learned to live with, for instance

  • unexplainable missed triggers, sometimes more often, sometimes less often (has happened from time to time, happened often after upgrading to 5.1.0, seems to have gotten better again for unknown reasons; system load related? uptime related?)
  • OH not starting occasionally, instead flooding my log with tons of errors - and if I kill it and try a second time right away, it suddenly boots as if nothing had happened (still a phenomenon that I experience from time to time, especially often when it first starts after upgrades, but nondeterministically)
  • error messages by the Shelly binding that “Channel types or config descriptions for thing ‘…’ are missing in the respective registry for more than 120s.”, and that this “should be fixed in the binding”
  • sometimes my EnOcean binding becomes deaf, and only restarting OH helps that it starts receiving again; I haven’t looked into this one deeply, but it seems that in the recent version(s) scanning for new EnOcean devices also stopped working (I worked around this by manually editing addresses, after trying a few times without success)
  • when I open some sitemaps in the Android app, everything works as expected, but something called “IconServlet” occasionally complains in the logs that it “Failed sending the icon byte stream as a response: Reset cancel_stream_error”, with tons of errors afterwards about “IllegalStateExceptions: ABORTED” (as I said: nothing noticeable that anything doesn’t work, despite all the log messages; no indication which specific items/states/situations might cause this and when)
  • the configuration of some MQTT channels got silently rewritten and became non-functional sometime recently, so that the channels did not process updates anymore; this messed up my heating control without me noticing for the first few days, and burned lots of gas unnecessarily (I implemented a workaround, reported the bug, and after some discussions a fix seems on the way)
  • very frequent error messages (every few minutes) in my logs that “Handler HomeConnectHoodHandler of thing homeconnect:Hood:*** tried accessing its bridge although the handler was already disposed.”, also without anything apparently becoming non-functional - the hood works as it should. But I lost trust that it will continue to work, and am half-expecting this to fail at some point.

This is not a complete list. It’s also for sure not a to-do list. For most of these, there are (often quite ancient) issues or forum threads with people seeing the same problems, but without solutions that would have helped me nail it down or fix it. And I just don’t have the nerves and the time to pursue reporting every single such thing, especially since the past showed that this seems not unrestrictedly welcome from “outsiders” like me, and the typical first reaction for other people’s reports as well as for mine is along the lines of “I can’t reproduce it, that’s fishy, that can’t be true, works for me, it must be your fault”. So, I’ve just come to accept this as a sign for an overcomplex system, where my scale of using it is apparently exceeding the scope within which it can still be used without encountering overly many “strange” effects.

If I upgrade to a new version of virtually any other service that I am operating, then I can more or less count on being able to at least find a safe way back, usually without losing critical data. Yes, there are exceptions (and I try to avoid them). In particular, it’s highly unusual that a service automatically rewrites and changes configuration information in uncontrolled ways. Most backend software systems are simply doing a much, much better job at keeping stuff (especially configuration, most often also data) forward and backward compatible without resorting to “conversions” or other automatic editing.

It may be that I am particularly sensitive in that regard, because what I do for a living is working with IT systems and data formats in an area where there must be room for unforeseen extensions over long time spans, without compromising on compatibility with (long) past and (far into the) future versions. So I am getting a cold sweat when I install a new version of OH, and notice that yet again it’s telling me that something in my configuration has just been automatically re-written. It just feels like: “This would very likely be totally unnecessary had there been a clean design right from the start, and a few well thought through design rules.”

Given these experiences, I am so much with you when it comes to “not running after every new version”. A clear YES. That is also my general philosophy, especially for things on which I really depend every day. It is a philosophy that I had to give up for OH, though. It’s clearly not foreseen and not appreciated to do upgrades to anything but the next release, especially not in a system with a complex existing configuration where “just reconstructing it” in a higher release if necessary is not an option. The upgradetool (or whatever else) may mess up badly when you jump over releases, I was told and I learned. And spending a weekend catching up on release after release after release, step by step is a) hard with typical package management systems, because it’s not what these package managers expect the managed software to expect, and b) it doesn’t feel like it’s any more likely to result in a really functional system. A while ago I got badly scolded in this forum for daring not to go through every single minor upgrade separately, and instead waiting for things to become stable and then asking for help when something - as far as I can tell from all that I know and saw, completely unrelated - stopped working after I finally did a somewhat longer ranging upgrade. (That, by the way, was also related to rules that didn’t trigger - but that’s coincidental.) So I typically delay my updates for a few days, but then I do them when my distribution offers them, so that I don’t risk missing any apparently so important steps along the mandatory upgrade ladder.

Essentially, what I would expect is: I can freeze a version of OH in my Linux distribution’s package manager, to prevent automatic upgrades even if they appear in the repositories. When I decide that it’s time for an upgrade, I do one, again using the package manager of my distribution, without digging through all the footnotes in all the release notes of all the intermediate releases that I might be jumping over. I’d expect that nevertheless (a) my configuration will stay intact, (b) I get a message if something in my specific configuration is at risk of breaking, before anything is getting modified, (c) I don’t lose the option to go back later without losing configuration changes and persistence data that happened between the upgrade and the decision to go back (unless, obviously, they depend on features that haven’t been there in the old release). I can do that for virtually any other mission-critical software that I am using, but not really with OH. At least I have lost any confidence that I can, and I got quite explicit statements from people here in this forum that it’s entirely my fault if I assume that I should be able to.

So, my way out is that I started developing something tailored, lightweight, which handles persistence and automation rules based on MQTT, and I started porting my automation rules out of OH and into this much more overseeable, less feature-(over)loaded and hopefully also in the longer term more easily debuggable environment. My intention is to separate device/“thing” interfaces from automation logic, and both from user interfaces. I might keep OH as a “bridge” to some types of devices where there’s a binding available, but where I can’t find a more lightweight alternative to interface with the device. I might also keep it for user interfaces, like panels on tablets (though I will evaluate alternatives; for instance, I like the fact that HAss has a working Android Auto integration, something I have really been missing).

As per @Nadahar’s request I’ve moved this post to a new topic.

All I can say is this is not my experience with OH. Things can always get better and OH has gotten much better with upgrades over the years, and upgradeTool is part of what has made it better. But everyone has their own experiences, and it’s not always great.

But I do want to clear some things up regarding the upgradeTool because this post makes it sound like it does way more than it does.

It does not touch file based configs except in one case (and I’m frankly uncomfortable with that one case but the impact was very low, so I didn’t fight it).

For those using managed configs, it does very little.

  1. Between OH 3.4 and OH 4.0, instead of implying an Item’s unit from its label/state description pattern, a new unit metadata was added. upgradeTool copied that implied unit to the new unit metadata. This ensured that Items didn’t suddenly change their unit to the system default and retained the unit it was previously using.

  2. When OH moved to Java 17, Nashorn JS went away as something that just comes with OH. It was replaced with a new SCRIPT transformation capable of letting users write transformations in any installed rules language, not just Nashorn JS. Consequently, the separate JavaScript transformation add-on went away. upgradeTool makes some minor changes to the formatting of the transformation profiles to use the new SCRIPT transformation syntax to prevent users’ old JavaScript transformations from failing to work.

  3. The SCRIPT transformation profile was improved to support different transformations depending on the direction of the event (i.e. from channel to item and from item to channel). upgradeTool moves the current configured SCRIPT transformation to the proper field so these transformation profiles continue to work like they did previously.

  4. When the new YAML file formats were created for configs, there was a small incompatibility with the YAML file previously used to define custom semantic tags (previously it used a list and the new format requires a map). Almost no one used the old YAML file for this because it was never documented. But for the few who did, the change ensured their custom semantic tags continued to work.

  5. The Homie and HomeAssistant sub-bindings were split off from the main MQTT binding in OH 5.1. If you had MQTT installed prior to the upgrade and you had at least one Homie Thing, and you are not using addons.cfg to install and manage your add-ons, upgradeTool installs the new Homie add-on. Same goes for HomeAssistant. This ensures that the Homie and HomeAssistant Things continue to work after the upgrade without a manual step to install the add-ons.

  6. In OH 5.1 the default strategy was removed because it caused many problems. Users who have a managed persistence config (i.e. configured through the UI, not .persist files) had the default strategy removed and, if it was being used, an equivalent configuration applied to ensure persistence behaved the same after the upgrade.
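To picture the list-to-map change described in item 4, here is a minimal Python sketch. It is not upgradeTool's code, and the keys `uid` and `label` are assumed field names picked for illustration, not taken from the real file format.

```python
def tags_list_to_map(tags):
    """Turn a list of tag entries into a map keyed by each entry's 'uid'.

    'uid' and 'label' are hypothetical keys used only for this illustration.
    """
    result = {}
    for entry in tags:
        entry = dict(entry)      # copy, so the caller's data is not modified
        uid = entry.pop("uid")   # the key moves out of the entry ...
        if uid in result:
            raise ValueError(f"duplicate tag uid: {uid}")
        result[uid] = entry      # ... and becomes the map key
    return result


# Old shape: a list of entries. New shape: a map from uid to the rest.
old_format = [
    {"uid": "Location_Workshop", "label": "Workshop"},
    {"uid": "Equipment_Charger", "label": "Charger"},
]
new_format = tags_list_to_map(old_format)
```

The point is only that such a change is mechanical: no information is added or dropped, it is just stored under a different structure.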

This is all upgradeTool does. I list this out because not all of the complaints above could have been caused by upgradeTool. It seems like some of the problems mentioned might be caused by upgradeTool not having run (e.g. using text file configs where upgradeTool does not make these changes).

My only goal with this post is to make sure if someone looks to address any of these problems, they do not look in the wrong places.

Beyond that, I’ll monitor this thread to make sure it doesn’t go off the rails.

I had it moved, as I could already see that this was going to be quite large.

I think you’ll find that we are much more aligned here than you think. I can only speak for myself though - others might disagree.

I completely agree with you that this is a challenge with OH. I’ve been trying to raise this as a point in the past, and not really gotten anywhere. I did for example suggest that there should be a combined changelog where you could relatively easily figure out what has changed between version x and version y. It’s hopeless that you’re supposed to look up release notes and blogs all over the place to try to get an overview over what will break (which is the most important part of “changes” for me at least).

But, nobody seemed to be willing to put in the effort to do anything about it. Some of this goes to being a volunteer effort. As always, there are boring tasks that need to be done, and when nobody’s getting paid to do those, they tend to have very few takers. But, I agree with you that more focus should be given to how this looks from a user’s POV. You can’t expect most people to keep up with what happens with OH on a weekly basis.

The sheer number of contributors and the complexity of many of the things here also add to this. There are people contributing that have different experience and knowledge levels, and different motivation. Getting all this orchestrated is never going to be “perfect”.

I’m still running 4.2.3 for much the same reasons. I just feel that upgrading is such a big task with tracking down everything that probably goes wrong, that I have trouble convincing myself that it’s worth it. I will have to do it at some point though, to benefit from my own contributions :wink:

I would never say that. In fact, I think “the world” is expecting way too little in this area, I feel that people have been gradually conditioned into accepting that as some kind of “fact of life”. But, this also goes into the larger philosophy: I don’t see any value in having “time based” versioning. I think versioning should be based on features and breaking changes, like it used to be. Major releases should be rare. But, this makes getting “new stuff” out there slower, and “the world” has decided, much to my disagreement, that rapidly deploying changes is more important than stability and quality. All we can do is resist, and pick those solutions that are least “careless” in this regard.

Some things will have to break, but it should be deliberate, planned and announced. Other things will break unintentionally, they should be fixed promptly. OH is trying to do both, but not always succeeding in my view.

I only use file-based configuration, and I store it all in Git. I’ve had many discussions with people about this in the past, and many seem to think that doing so is “backwards” and almost deprecated. But that’s not everybody, and I’m happy to see that file based configuration has also gotten a “boost” lately - with the YAML format for Things and Items, and hopefully soon to be rules as well (I have a PR waiting with this). Work is also being done on “templating” in YAML to make this quite powerful, at the same time as you can easily export UI created elements to YAML code. This allows experimenting using the UI, and then making it permanent by exporting the result to files when you’re happy with it.

I have little involvement with the upgrade tool, and I don’t really know how things are done there. I think it should be obvious that it should be made to work across versions, but that might not be how it is done. I won’t let that force me into following every upgrade though, doing that simply isn’t an option for me - I need things to work, and if they have to break, they must do so at a time of my choosing, when I have the time to deal with it.

There is supposed to be separation between code, configuration and data, but there might be areas where this isn’t as clear as it should be. I don’t have the overview. With my file-based Git system, I have full control over the configuration at least, and I can easily go back and forth. I don’t store data (I assume you mean persistence?) that way though, so if that is converted/changed, I will have to either do a backup or live with the potential loss.

I get it. Sadly, I find that this works with almost no system these days; even OSes break frequently on update, and some have even started to do “snapshots” so that people can easily revert an upgrade. To me, this is just an admission of utter failure. OH, with its complexity and breadth, is very hard to ever get to be this “painless”. But I do agree that it should be a goal, and I think it can be made less painful with some effort toward that goal.

If you’re thinking of the bundle update, it’s not actually “OH’s fault”, but Karaf, the “framework” OH runs within. Not that it helps you much. It’s relatively easy to restore though, I just said to reinstall the binding because I thought that was the simplest way to get it restored. If you know the “correct address” to the original bundle (which can be found from the console before you make changes), you can just do bundle:update <address> and the original will be restored.

You can do a lot from the Karaf console, perhaps too much some will say. It’s always a difficult balance to allow people to do what they want to do, but not allow them to do things they don’t want to happen :wink:

That should not be, and I’m willing to try to help pin down and address whatever causes for this I can.

This is unknown to me except after upgrades/cache cleaning. I get that it might seem strange, and maybe things should be restructured to avoid this to the extent possible, but basically, OH consists of a lot of smaller components (Karaf/OSGi calls them bundles). Bundles are cached by Karaf, so that when you upgrade them, that isn’t necessarily reflected. Because of this, the cache is cleared during an upgrade. But, this also means that a lot of bundles are missing when the system starts, leading to a flurry of errors and strange conditions. The system “fixes” this by reacquiring whatever is missing, and after a few minutes, everything is usually in place again. Doing another restart then will make things behave and start up as intended.

Avoiding this “chaos” would probably be possible, but would require some very major restructuring. Basically, the system would have to do all the bundle stuff first, and not try to start the rest until this was all resolved. In theory, it should be possible, but I don’t know how many practical challenges would materialize should one attempt it. Some things, like bindings, won’t be “handed” until some other parts of the system are up and running, so it could be a complicated dance to get right. The “accepted solution” for now seems to be: Just ignore the errors during first startup after upgrade.

Give it a few minutes to sort things out, and then restart “for real” this time. It’s certainly not ideal, but it might be less worrying to users if they were better informed of why this happens.

I’m not familiar with the situation, but there are various bindings that have “challenges” in several areas. There are different people handling most of them, with different priorities, standards, available time, knowledge, etc., so it’s very difficult to get everything to hold one standard. I’ve been trying for some time to address issues that seem to cause problems for many users in various bindings. I just wrote this last night, for example: Make cache thread-safe and disposable by Nadahar · Pull Request #21 · markus7017/openhab-addons · GitHub

When things just stop working, and only start again after a restart, it’s often a threading issue. It can be a concurrency bug somewhere that causes deadlocks, or there can be bugs in other places that starve “innocent parts” of the necessary resources. It’s almost certainly a bug somewhere that’s causing this, and it should be found and fixed.

This I know virtually nothing about, but I agree that it sounds like something somebody should do something about. If the failures are “harmless”, their logging should probably be hidden unless in DEBUG. If not, they should be fixed.

Another thing I know nothing about, but there’s sadly no way around it: Somebody must fix it, and to do that, somebody must find it. It’s why I prefer to wait with upgrades, I’m hoping that somebody else does this so I don’t have to :wink:

Yet another bug, it sounds like to me. Has it been reported? Not every issue manifests for everybody, so sadly, things must be reported. I do think the threshold should be relatively low though, and it shouldn’t be demanded that the reporter must “prove” the bug by doing all kinds of exercises. If I’m met with such requirements, I often just ignore them. I have still reported the problem; they can do what they will with the information. I’m not willing to sacrifice a lot to convince them that there’s a problem. They either fix it, or I will have to look for other solutions.

I think this is very serious if there are many such situations. It will only achieve one thing: Drive people away. I agree with you that when I find a several years old issue with no solution, I too conclude that this is still broken and that there’s no will to fix it. That might not actually be the situation, but how are you supposed to know if there’s no information about it?

Again, this “hostile attitude” when something is reported is a big problem, but it’s not specific to OH. I know exactly what you mean, I’ve experienced it a lot myself. I wish people would rather admit that they don’t know than get “defensive”. It’s important to remember that not everybody is the same, so even if one person expresses such a “hostile attitude”, it might not mean that everybody else is on the same page.

I completely agree with you that this is what you should be able to expect, I don’t so much agree that this is my actual experience. When I have found a working version of something, I will go to great lengths to stay with that version, exactly because I feel that upgrading is a big risk of something breaking, perhaps something that you won’t notice until it’s “too late”.

I find that a lot of software these days does non-reversible configuration conversion during upgrades, and I really hate it. Much of this could have been avoided with a little effort put in to achieve that, but my experience is that this simply isn’t deemed “important” by most people, so it’s largely ignored. I don’t see OH as “special” in this regard, but I completely agree that more could and should be done to prevent these kinds of situations. I often write code to interpret “backwards compatible” versions of configuration etc., and people often just dismiss it as “why do you do that? Why bother?” etc. But again, I don’t perceive OH as being particularly bad here, it’s just a very bad trend that has been going on for quite a long time in my experience.

Time and time again I get into “discussions” where I try to argue that things must be designed to be more “robust”, more generic, meant to handle yet unforeseen challenges. But, it can be hard to do even if you try, and I usually feel that my points don’t even get across. Instead, people end up discussing some minute detail and push back on some point that can’t be seen if you only look at that situation in isolation. It’s usually a lost cause in my experience, but I completely agree with you that so many things could be better if this was done to a greater extent. OH is quite varied here in my experience. Some things are very well designed in my opinion, while others seem like they are done just to do the bare minimum to achieve some task, completely ignoring all the barriers it will cause in the future.

You must resist :winking_face_with_tongue: I certainly won’t submit to those that try to force me to just “accept endless volatility”. I don’t care what others condone or bless, I do what I think is right. I still think that following every OH update is asking for a lot of challenges. At the same time, we need a certain number of users to do it, to detect whatever problems arise…

Again, ignore those voices. I certainly do. You do what you think is right, and if they don’t agree, that doesn’t really matter. Not everybody is the same, and some will try to help regardless. Those who most loudly try to “crack down” on those not following every upgrade are probably those least able to help anyway :wink:

With file based configs you wouldn’t have any experience with it. See my reply above for what it does. And it does work across versions.

There has never been a change to how persistence stores data since OH 1.x.

However, if one was using .items files and did not manually add unit metadata to those Items, the unit of the data being stored in persistence might have changed. For example, without the unit metadata, Number:Dimensionless Items without unit=% would have become unit ONE (a simple ratio). That means previously 75 would have been saved and now 0.75 would be saved.
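The 75 versus 0.75 example is plain unit arithmetic. The Python sketch below only illustrates that arithmetic; the function name and the two unit strings are assumptions for the illustration and have nothing to do with how OH implements units internally.

```python
def stored_value(percent_reading: float, unit: str) -> float:
    """Return the number that ends up stored for a reading given in percent.

    unit '%'   -> the reading is stored as-is (75 stays 75)
    unit 'ONE' -> the reading is stored as a plain ratio (75 becomes 0.75)
    """
    if unit == "%":
        return percent_reading
    if unit == "ONE":
        return percent_reading / 100.0
    raise ValueError(f"unsupported unit in this sketch: {unit}")


before_upgrade = stored_value(75, "%")    # what used to be persisted
after_upgrade = stored_value(75, "ONE")   # what is persisted without unit metadata
```

So a chart or rule that reads old and new persistence data side by side would see the same physical value at two scales that differ by a factor of 100.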

That’s the only thing I can think of that OP might be talking about. Or maybe there was some change to a persistence add-on I don’t use.

I would say that it is a goal and over the years OH has made great improvements towards this goal. But it still has a ways to go. The whole versioning of the YAML file format I think will open many new opportunities for maintaining backwards compatibility without forcing users to have to change things on an upgrade. That’s just one area where features are being put in place to take OH further towards this goal.

This was recently added to the docs.

I would call this the current work-around and mitigation. I don’t think anyone is happy with this behavior and I, for one, would welcome a more deterministic start process after clearing the cache.

That’s really useful, but there’s one thing lacking that I wish was there:

bundle:list -l 156

When using the -l argument, it will list the “location”, which is the “source address” of the current installed bundle, e.g.:

> 156 │ Active │  80 │ 5.1.0.202512050249    │ mvn:org.openhab.core.bundles/org.openhab.core/5.1.0-SNAPSHOT

By copy/pasting this before making a change, it’s very easy to restore the original bundle, because that “address” is all that’s needed as an argument for bundle:install/bundle:update.
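If you would rather keep that “address” in a notes file than in your clipboard, a small Python sketch like the following could pull it out of a pasted line. It assumes the five-column layout of the sample line above (ID, state, level, version, location, separated by “│”); other Karaf versions or options may print different columns.

```python
def parse_bundle_line(line: str) -> dict:
    """Split one pasted 'bundle:list -l' row into its columns.

    Assumes exactly five '│'-separated cells, as in the sample line above.
    A leading '> ' quote marker from a forum paste is tolerated.
    """
    cells = [cell.strip() for cell in line.lstrip("> ").split("│")]
    if len(cells) != 5:
        raise ValueError(f"expected 5 columns, got {len(cells)}")
    bundle_id, state, level, version, location = cells
    return {
        "id": int(bundle_id),
        "state": state,
        "level": int(level),
        "version": version,
        "location": location,  # the 'address' to keep for a later restore
    }


sample = ("156 │ Active │  80 │ 5.1.0.202512050249    │ "
          "mvn:org.openhab.core.bundles/org.openhab.core/5.1.0-SNAPSHOT")
info = parse_bundle_line(sample)
```

Storing `info["location"]` together with the bundle ID before touching a bundle gives you everything the restore step described above needs.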

This one I can relate to. This is part of the reason why I have been rather inactive for some time. I can’t keep up with all the changes/upgrades all the time. Every time I have to upgrade, it’s most often because new features, or things which have not been working before, require an upgrade. And I know, when I do upgrade, I need to spend several hours changing/fixing/dealing with issues. It’s not a flawless situation at all. And maybe that’s just how it is, but I think things have gotten a lot worse since I started using OH way back in OH 2.xx. Things have gotten more complicated, though I thought it was supposed to get easier.

Things go wrong, even with commercial software. The main issue I have with OH is that when something goes wrong, it can be very hard to figure out what’s wrong and how to fix it.

I just had an issue today with my PV inverter no longer being able to control the battery. Just out of nowhere. The log didn’t help much. It simply said “it couldn’t control the battery and to check the thing configuration” - Well, guess what. I haven’t changed anything since I upgraded to OH 5.1 a couple of weeks ago. And it ran just fine, until today.

When the logfile isnt usefull, there is only one thing to do - Sit down fiddle around untill you hopefully find the reason, or it dissapear by itself (usually it dont).
It turned out nothing was wrong - All I had to do was to restart OH. But thats normally the last I do when troubleshooting. Cause if a binding needs OH to restart, there is probably something else going wrong somewhere. Unfortunatly, its almost impossible to find such cause.

I can live with errors an issues once in a while. But I hate the feeling of beeing totally lost, due to not getting a usefull help from the system log. It usually just tell me, “something went wrong”. And then I have to figure out, whats “something” is. That can be very time killing. And often I end up asking the community, which require even more time spending, and sometimes getting nowhere.

I know - Part of this is the deal with open source software. I have to accept this, if I want to use this software. But that doesnt mean things couldnt be better. Infact it means, that every time there is a new upgrade availble, I think twice (often alot more), before doing the upgrade. Simply because
I know - If I do the upgrade, I will be sitting hours trying to fix this “something”, which I often have no idea how to fix.

This is not a complaint - This is the real world situation from a user (me) with a limited knowlegde though years of experience with OH.

Many thanks for your feedback, guys.

Reading through some of this, I am wondering: Would someone be willing to write a howto on using recent OH versions in a way that keeps the complete configuration in a clearly outlined set of text-based config files? Target group: power user, familiar with using config files of various sorts, how to handle and how to edit them, able to read and understand the technical documentation of a config file format, able to read and understand error messages in logs, familiar with the concepts of OH on the user side.

I am sure all that information is somewhere out there, but scattered, sometimes outdated and therefore misleading, and not in a form that would allow me to really figure out where exactly all the required configuration is hidden. In the spirit of this thread (and preceding discussions), it would ideally explain to me how to manage my configuration in a way that allows me to get my working, complete configuration back into a fresh OH installation, e.g. after transitioning to a new machine and possibly a newer OH version. (Yes, I know about the “backup” function. That’s not what I am talking about. It’s more about understanding what exactly is / needs to be in a backup, and what not, and why.)

If I got you right, that’s possible. I wasn’t aware that it’s supposed to be (still/again) possible while making use of all features in all bindings. Having this option would certainly be a bit of relief for some of my headaches.

Edit: I’d sure be willing to help in the sense of testing to follow through and providing feedback.

Answering @Nadahar‘s specific questions/points:

I actually wasn’t aware that this is expected behavior. Indeed when I saw this it may very well have been related (sometimes? always?) to situations after the cache was cleared automatically (or even by me?). A consoling console message would have been helpful to prevent near heart attacks now and then. :wink:

Could well be. If I encounter this again (with the EnOcean binding or elsewhere), I’ll swear less and instead watch out more closely for log messages, also a bit further back in time. The problem with this one is that I don’t have very many EnOcean devices, and they don’t have a heartbeat or so, so it usually takes a while until I even notice that they have fallen silent.

It was discussed before in the forum, in 2022:

And again in 2024:

Seems to have happened to quite a few people. The last post in the older thread says something about overriding icons in the sitemap file (which indeed I do occasionally), but I don’t really understand why doing so should be a problem, because being able to do that is an intended feature. So after reading that thread back then, I just shrugged it off as yet another case of “no, certainly that’s not what it’s meant to be, but if nobody cares about it being buggy, it doesn’t keep me from doing what I want to do, so I won’t waste energy either”, and accepting that this stuff will likely fill my logs forever.

Yes, there have been forum posts, and also an issue: [homeconnect] Handler was already disposed · Issue #17037 · openhab/openhab-addons · GitHub . It was closed, though, after someone else reported that some other problem that was discussed on the side had vanished.

Again, important disclaimer: I know why I didn’t get help with that - namely because I didn’t ask for help. These are just examples of why I grew tired of even asking for help, and why in the initial post in this thread I said (not literally, but basically) that OH oftentimes feels like a squeaking, rattling old car, which mostly does its job, but it doesn’t really feel like it’ll do that for much longer (subjective, colorfully described impression, not objective truth).

As with anything, one only needs to find a volunteer.

I would, but I purposely do not use any text file based configs (though I do keep my configs under git version control all the same). Early on I wanted to make sure I was best able to help the less technical users. Over time, I’ve found management and upgrades to be much smoother with a managed config. And, as I like to say, I’d rather solve home automation problems, not syntax errors. Managed configs make it very hard to create syntax errors.

There are many paths so such a tutorial gets complicated. The first but not only fork in the road is what file format, DSL or YAML? Then you have the choice between never touch the UI or use the UI and then export. It gets pretty complicated fast. But not unsolvable.

The docs are pretty complete and up to date. That should always be the first place to look. But rarely can you get by without looking in more than one place. For example, YAML Configuration | openHAB shows you how to create Things in a .yaml file. But to know what properties to put in that structure you need to go to the add-on’s docs.

Or you can configure the Thing in the UI and export a file based version in either YAML or DSL. This latter approach, which has only really been possible since 5.0 and is much better now in 5.1, is probably the way I would recommend. All the fields are presented to you there in the UI with a description of what they do. You can fill out the form and, if you want file based configs, export them to a file. You can even immediately export discovered Things in the Inbox now instead of needing to accept them first.

If you have a managed config (e.g. done through the UI), everything is saved in /var/lib/openhab (the userdata folder).
All configurations made this way are stored in plain text.

  • Karaf configs (e.g. console user, logging config, etc) can be found in /var/lib/openhab/etc
  • openHAB core configs (e.g. regional settings) and add-on configs can be found in /var/lib/openhab/config.
  • /var/lib/openhab/secrets. I can’t remember what the RSA key in here is for. I think it’s the Karaf console SSH host key; it probably does not need to be saved in a backup, but I do.
  • the cloud connector secret and id are stored in /var/lib/openhab/openhabcloud and /var/lib/openhab/uuid
  • everything else is stored in /var/lib/openhab/jsondb

Other features and bindings will create other folders with data in them in that folder too (e.g. persistence, zwave, matter, etc). But that’s data, not configuration. All the configuration for a 100% managed config is in the locations listed above. All the configuration is stored in plain text, either in XML (e.g. log4j2.xml), properties files, or JSON.

Note, the JSONDB files preserve order, making them suitable for software source control (this was added back in OH 3.x sometime; prior to that the entries in the files would become reordered).

All file based configs go in /etc/openhab. I believe MainUI widgets and Block libraries are the only things that cannot be configured using files in this folder.

  • automation: this is where all but Rules DSL rules code goes. See the readme for the specific rules add-on you are using for details.
  • automation/templates: this is where you would place rule templates. This really hasn’t been documented yet because there’s a couple of PRs awaiting review which will change things. If you want to explore rule templates, let me know.
  • html: this is where you place static web pages and images and anything else you want to have the openHAB web server to make available in addition to the UIs (e.g. snapshots from cameras)
  • icons: this is where you place custom OH icons. See Items | openHAB for the rules that OH icons must follow
  • items: this is where .items files go. I think you can put .yaml files here too, I can’t remember where the developers ended up on that decision. Items | openHAB is the documentation for .items files. See below for YAML.
  • misc: this is where configs that don’t fit in the other categories go including the whitelist for the Exec binding and custom ephemeris daysets and holidays. See the Exec binding readme for the former and Actions | openHAB for the latter.
  • persistence: this is where .persist files go. See Persistence | openHAB.
  • rules: this is where Rules DSL .rules files go. See Textual Rules | openHAB
  • scripts: this is where Rules DSL .script files go. This is Rules DSL’s half-assed attempt at something akin to libraries and reusable functions. See Actions | openHAB for their limitations and how to call one from a rule.
  • services: this folder is where the configs for openHAB core and individual add-ons are stored. There really are no separate docs for these files but they are simple property files. There are a few examples already in this folder with inline documentation in the comments. For stuff that isn’t there with an example, I recommend making the configuration in MainUI, then finding that file in /var/lib/openhab/config and copying that to this services folder. For example, the JS Scripting add-on’s config file is /var/lib/openhab/config/org/openhab/jsscripting.config
  • sitemaps: this folder is where sitemaps for BasicUI or the phone apps are stored. See Sitemaps | openHAB
  • sounds: this is where you place audio files for playback to an audio sink.
  • tags: this is where you would place custom defined semantic tags configs. Only YAML is supported. See YAML Configuration | openHAB
  • things: this is where you would place .things files. See Things | openHAB and the individual add-on’s readme.
  • transform: see the individual transformation add-on’s readme. For script transformations see the automation add-on’s readme. See the Rules DSL docs (link above) for Rules DSL transformations.
  • yaml: The new YAML format allows defining Things, Items and Tags all in the same file. See YAML Configuration | openHAB.
  • Inside rules, there are various ways to interact with OH and the add-ons called actions. The built in actions are documented at Actions | openHAB which have Rules DSL examples. See the readme for the automation add-on you are using if not using Rules DSL. Blockly’s docs can be found at Rules Blockly | openHAB

Note, the YAML format is new and there are many new features upcoming. I hope to see everything configurable with YAML including rules, rule templates, persistence, etc. There are also new features like templating, variables, etc being added to the YAML format to make the configs easier to write by adding more reusable elements.

That covers it. Nothing is really hidden anywhere and it’s all plain text. Almost all of it is documented. MainUI now has the ability to export an entity to a file format, either YAML or DSL so it’s possible to skip the docs and build it in MainUI and then export to file.

I don’t know if this helps or not, but the links above should be the first place to look. If you don’t find what you need there the forum can be useful. And we are always happy to answer specific questions.

Answering this depends. Do you need persistence backed up too? What about data that individual add-ons write out to file that is not config? Or just the configs?

If you want just the configs:

  • /etc/openhab
  • /var/lib/openhab/etc
  • /var/lib/openhab/config
  • /var/lib/openhab/jsondb (excluding /var/lib/openhab/jsondb/backup)

That will capture all the configuration. Everything else can be rebuilt by OH from just this. This will include everything you’ve touched manually, either by editing a file or through MainUI. All the files in all of these folders are plain text so you could check them into git or your source control of choice.
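
If you’d rather have a single archive than a git repo, the four folders above can be bundled with a few lines of Python. Take this as a rough sketch, not an official tool: the function names `archive_configs` and `skip_excluded` are made up for illustration, the paths are the ones from a Linux package install listed above, and the script needs read access to those folders.

```python
import tarfile
from pathlib import Path

# The config-only folders from the list above (package install paths).
CONFIG_DIRS = [
    Path("/etc/openhab"),
    Path("/var/lib/openhab/etc"),
    Path("/var/lib/openhab/config"),
    Path("/var/lib/openhab/jsondb"),
]
# JSONDB keeps its own rolling backups here; they are not needed.
EXCLUDED = Path("/var/lib/openhab/jsondb/backup")


def skip_excluded(info, excluded=EXCLUDED):
    """Tar filter: drop the excluded folder and everything below it."""
    # tarfile stores member names without the leading "/", so add it back.
    member = Path("/" + info.name)
    if member == excluded or excluded in member.parents:
        return None  # returning None tells tarfile to skip this entry
    return info


def archive_configs(target, dirs=CONFIG_DIRS, excluded=EXCLUDED):
    """Write a gzip-compressed tar of all existing config folders."""
    with tarfile.open(str(target), "w:gz") as tar:
        for directory in dirs:
            if directory.exists():
                tar.add(
                    str(directory),
                    filter=lambda info: skip_excluded(info, excluded),
                )
```

Calling `archive_configs(Path("openhab-config.tar.gz"))` writes the archive to the current folder. Folders that don’t exist on your machine are silently skipped.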

And I guess this is the “why” as well. These are all the files that you touched. These are all the configs. What is in each of these folders and their formats are described above.

I run OH in Docker so the paths are a little bit different (I have a root folder with the conf and userdata folders under it) but here is the .gitignore file I currently use to source control my config.

userdata/*
!userdata/secrets
!userdata/etc
!userdata/jsondb
!userdata/config
!userdata/habot
!userdata/openhabcloud
!userdata/uuid
!userdata/zigbee
!userdata/zwave
userdata/jsondb/backup
userdata/backup
.Trash-1000
conf/html/hum.jpg
conf/html/light.jpg
conf/html/power.jpg
conf/html/temp.jpg
conf/automation/lib/javascript/personal/node_modules
conf/automation/js/node_modules/*
!conf/automation/js/node_modules/rlk_personal
*py.class
*.swp
*~
*.old

Note, this file represents ignores going back to OH 2.0 so not everything here is currently relevant.

This excludes the embedded persistence, any stuff generated by add-ons, logs (which for a Docker install are in userdata) or anything like that. It’s just the stuff I’ve touched or which would be mildly inconvenient.

Note, I also host my own git server, I wouldn’t post uuid and secrets and such to a public server.

The cache is only cleared automatically on an upgrade/downgrade of OH. You can clear it manually but the set of problems this solves is small so it’s rarely something one should do. But it has to be done when the version of OH changes due to an upgrade/downgrade.

There are some ways you can get an alert for this. The simplest is to set an expire on the Items so they update to UNDEF or NULL if they don’t receive an update for too long.

I’ve developed a full approach to get alerts when something stops responding. I’ve published everything to the marketplace so if you are interested in something like that I can help get it set up for you.

Many thanks, @rlkoshak, that’s already quite helpful. Especially the parts that reassure me that if I look at the places that you mention, then I am not missing something. I will need to look into this in more detail, and I might start moving parts of the configuration to see how that goes. Realistically speaking, it’s not going to happen in the next few days though.

The part of persistence that’s important for me is in a PostgreSQL database. What I was worried about - and implicitly referring to in my original post - was the following scenario: assume that I make an OH upgrade. After the upgrade I add persisted items. Tables get added to the database, which come with some sort of numerical ID. Later I decide that I need to go back to an older OH version (I was thinking that this might potentially cost me the new items - by now I have learned that this can be avoided, because if I got you right I can expect that the configuration can be backported without too much headache). I wasn’t quite sure whether after such a step the mapping between the items and the persistence database tables would a) get lost, or perhaps, worse, b) make things fail in funky ways.

I also use this pattern for many items. It’s not an option for my EnOcean devices though. The few that I have are mostly window contacts. They are batteryless, and they really only send an update when the window is opened or closed. I use them for windows that I don’t open super often. So when I finally do, and then notice that the open window is not reflected in the Item state, then I have no idea whether it stopped receiving updates two minutes ago or two weeks ago.

The mappings are based on the Item name which is the immutable Item UUID. As long as the Items are of the same name and of the same type there should be no problems. If you change the names, you can use aliases to map the new Item name to the old one. If you radically change the type (e.g. from a String to a Number) you might need to manually clear out that Item’s table first.

I don’t remember when aliases were introduced so you can’t go back too far in OH versions and expect that to work. But they are there at least in OH 5.0, maybe before.

The actual data storage in the database hasn’t changed though. It’s still pretty much one table per Item (or what ever concept the database has for tables) and each row is a timestamp and the state of the Item.

Does EnOcean support the REFRESH command? Most bindings let you send a REFRESH command to an Item and it will go out and poll the device for its current state. That should at least force an update to the Item which should keep Expire happy. If it doesn’t respond to a poll the binding may even set the Item to UNDEF or NULL for you.

Do you mind adding this overview to the docs?

A lot of it is already in the installation docs but I can try to come up with something that isn’t too duplicative for the overview docs.

I think Configuration | openHAB would be the best place for it.

Actually, the first part here is not correct. The JDBC Persistence Service uses a mapping table (itemsManageTable) when using the “legacy” naming scheme (tableUseRealItemNames = false, tableCaseSensitiveItemNames = false). The mapping is based on sequential numbering. However, since the mapping is persisted in the database rather than in a configuration file, it won’t get lost during upgrades or downgrades. So it’s pretty safe to do both without any data loss.

Off topic - but this sounds fairly non-optimal performance wise. Doesn’t that mean that simple lookups end up being joins?

The “index” is read into memory (into a Map) and maintained there as well. No joins.
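
To illustrate the idea (this is a toy model in Python, not the add-on’s actual code): the mapping rows are loaded once into a dictionary, and every later lookup is a plain in-memory access. The class name `TableIndex` and the `item0001`-style table names are assumptions made up for this sketch.

```python
class TableIndex:
    """Toy model of an Item-name -> table-name index kept in memory."""

    def __init__(self, rows):
        # rows: (item_id, item_name) pairs, as they might be read once
        # from a mapping table at startup.
        rows = list(rows)
        self._tables = {name: self._table_name(item_id) for item_id, name in rows}
        self._next_id = max((item_id for item_id, _ in rows), default=0) + 1

    @staticmethod
    def _table_name(item_id):
        # Hypothetical naming: fixed prefix plus zero-padded sequence number.
        return f"item{item_id:04d}"

    def table_for(self, item_name):
        """Return the table for an Item, assigning the next number if it is new."""
        if item_name not in self._tables:
            self._tables[item_name] = self._table_name(self._next_id)
            self._next_id += 1
        return self._tables[item_name]


index = TableIndex([(1, "Kitchen_Temp"), (2, "Hall_Light")])
print(index.table_for("Hall_Light"))   # known Item, looked up in memory
print(index.table_for("Garage_Door"))  # unknown Item gets the next number
```

A query for an Item’s history then only touches that one table, which is why no join is involved.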

But if I change the name of an Item, no alias, it appears as a new Item, right? I’m not referencing technical details in what I’m saying. I’m just pointing out that it’s the Item name that identifies the Item. If you move OH versions and don’t change the Item names, all the old data is still associated with that Item.

All that is really different is that JDBC does the mapping in a certain way with an extra step, but it’s still a mapping between Items identified by their name to a table in the database.

Or does JDBC have some other way to uniquely identify an Item outside of its designated UUID?

I read what you wrote one more time, and you are right. The mapping is entirely based on Item name, I think I got confused by “Item UUID”. So yes, when renaming an Item, it will be considered as a new Item, and another table will be created for it.

I have been through that exercise a couple of times when I wanted to rename some badly named Items and keep persistence. IIRC last time I did that by stopping OH meanwhile, so I could rename the Item in my .items file and rename the table as well. But since JDBC Persistence is not tracking ItemRegistry changes, it might be possible to rename the Item and table/ItemsManageTable and then reload the index using the console command jdbc reload. If I had to do a lot of renaming, I would at least look into that option.