Why are persistence records not deleted when the item is deleted?

As the title says: when persisting with RRD4J or JDBC (MariaDB/MySQL), all records remain if the associated Item is deleted.

I cannot see any reason for this behaviour. Over time a lot of garbage accumulates in the persistence store, and removing it is very cumbersome.

IMHO the persistence records should already be deleted when the Item is taken out of the persistence configuration.

A home automation system is a living system with lots of data points coming and going.

Within that context: when first creating persistence, the default configuration is “persist all items”. There is no way to get rid of this default configuration, so you end up with lots of unwanted persistence records.

Thx for any comment / suggestion.

Gerry

If you are using JDBC persistence, then these maintenance commands could help you.

Just open the console

openhab-cli console

and there list the problematic tables by

jdbc tables list

and then, after you review the problematic tables, you can clean the tables by

jdbc tables clean

OH has no idea whether a deleted Item is going to come back or not.

Every time you edit a .items file, every Item in that file gets deleted as the file is unloaded, and then they get created anew when the file is reloaded. You don’t want all your persistence, links, etc to be deleted just because you edited a file (it’s not just the one Item you edited either, it’s all Items in that file).

And when OH closes down, it deletes all the Items as part of the unloading process. You don’t want to delete all your persistence just because OH restarted.

There is simply no way for OH to reliably know that an Item is deleted forever or only momentarily because it’s being edited. So it’s up to you to clean up leftovers as necessary (or just leave them, they do not take up much space and OH will ignore them if there is no Item or the persistence config excludes them) because only you know when you’ve actually permanently deleted an Item.

Unless and until someone volunteers to completely rework how Items (and probably everything else too) are managed by OH, or file based Items are eliminated as a possibility (which is never going to happen), leaving the data in case the Item is coming back is better than losing all persisted data just because you edited some other Item in the same file.

If you are using file based configs, define the config prior to installing the add-on.

If you are using a managed config, you can:

  1. install the add-on
  2. configure the persistence
  3. stop OH
  4. remove anything that OH created (what to remove depends on the persistence used; for rrd4j, just remove the files in $OH_USERDATA/rrd4j)
  5. start OH again
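For step 4 with rrd4j, here is a minimal Python sketch of the cleanup, to be run only while OH is stopped. It assumes the files carry an .rrd extension and that you pass in the directory yourself; the function name remove_rrd_files is hypothetical and not part of openHAB. By default it only reports what it would delete.

```python
from pathlib import Path


def remove_rrd_files(directory, dry_run=True):
    """Return the names of the .rrd files in `directory`; delete them unless dry_run.

    Assumption: the directory is the rrd4j data folder of a stopped OH instance.
    """
    directory = Path(directory)
    # Only plain files ending in .rrd are considered; everything else is left alone.
    targets = sorted(p for p in directory.glob("*.rrd") if p.is_file())
    if not dry_run:
        for path in targets:
            path.unlink()
    return [p.name for p in targets]
```

Call it once with the default dry_run=True to review the list, then again with dry_run=False once you are sure.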

I am well aware of this, and there is a series of PRs, not yet reviewed and accepted, that would solve this. Strategies would no longer be applied automatically; there would have to be an explicit configuration. See “Persistence no default strategies and persistence configuration health check” (mherwege, Pull Request #4682, openhab/openhab-core), “Persistence strategies not automatically applied” (mherwege, Pull Request #3123, openhab/openhab-webui) and “Breaking change alert: persistence no default” (mherwege, Pull Request #1737, openhab/openhab-distro). The recent PR “Do not require `Strategies{}` in `persistence/` files” (dilyanpalauzov, Pull Request #5094, openhab/openhab-core) makes similar arguments for proposing a change.

That being said, overall I think it would be a bad idea to automatically delete persisted data if an item is removed, changed, or its persistence configuration is changed or removed.

Hi all,

and thanks to Rick & Mark for comments and explanations, although I could not follow all details.

I’m still not convinced that persistence of removed items is vital for the system. It’s a simple dependency graph from item to persistence, not the other way round. Yes, the item must be removed from the persistence configuration automatically.

If a rule stumbles over a missing item of a persistence, it should just throw an exception. (An issue is already open for that)

Regarding the default persistence configuration: obviously others have stumbled upon it too. Defaulting to “persist all” leads to senseless persistence records like timestamps of ‘Device Last Seen’.

There is certainly a trade-off between keeping it simple for users by automatically activating persistence (e.g., for sensors) and making it as efficient as possible.

While disk space and performance are not limitations nowadays, “defaulting to all” comes at the price of storing senseless data.

As this is complaining at a high level, I still want to express my appreciation to all the volunteers who give us their spare time and dedication.

Regards

Gerry

Simple for you to say. Not so simple to implement, because Items get deleted routinely in OH all the time.

Again, I ask, do you want to lose all the persistence for all the Items in foo.items just because you edited one of the Items defined in that file? Because that is what you are advocating for.

OH has no way to know why an Item was deleted from the registry. Was it deleted forever? Was it deleted because a .items file was reloaded? Was it deleted because OH is restarting? Only in one of these cases should the persistence be removed.

Unless and until OH completely changes how it manages Items from files, it is impossible to autodelete persistence because OH never knows if an Item is being deleted temporarily (e.g. the .items file is being reloaded because of a change) or forever.

To help here are the steps that OH takes when you edit a .items file. I’ll skip some details for clarity.

  1. foo.items is saved which generates a file system event
  2. OH sees the file system event and starts to process foo.items
  3. First, OH deletes all Items that were defined in foo.items (all your persistence would be deleted here if persistence were automatically removed with the Item)
  4. Next OH starts to parse foo.items, loading each Item one-by-one and adding it to the registry.
  5. If restoreOnStartup is defined for any of these Items, the Item’s state is updated to the most recently saved value in persistence (if persistence were auto deleted there would be no persistence)
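To make the steps above concrete, here is a toy Python simulation. It is only a sketch: the Registry class and its methods are made up for illustration and are not OH's actual implementation. It shows what restoreOnStartup would find after a reload, with and without a hypothetical "delete persistence with the Item" behaviour.

```python
class Registry:
    """Toy stand-in for an Item registry plus a persistence store (hypothetical)."""

    def __init__(self, auto_delete_persistence):
        self.items = {}        # Item name -> file that defined it
        self.persistence = {}  # Item name -> list of persisted states
        self.auto_delete_persistence = auto_delete_persistence

    def load_file(self, filename, item_names):
        # Step 4: parse the file and add each Item to the registry.
        for name in item_names:
            self.items[name] = filename

    def unload_file(self, filename):
        # Step 3: every Item from the file is removed, changed or not.
        for name in [n for n, f in self.items.items() if f == filename]:
            del self.items[name]
            if self.auto_delete_persistence:
                self.persistence.pop(name, None)

    def reload_file(self, filename, item_names):
        self.unload_file(filename)
        self.load_file(filename, item_names)

    def restore_on_startup(self, name):
        # Step 5: return the most recently persisted state, if any.
        history = self.persistence.get(name)
        return history[-1] if history else None


def simulate_edit(auto_delete_persistence):
    """Reload foo.items after an edit and return what Temperature is restored to."""
    registry = Registry(auto_delete_persistence)
    registry.load_file("foo.items", ["Temperature", "Humidity"])
    registry.persistence["Temperature"] = [20.5, 21.0]
    registry.persistence["Humidity"] = [40, 42]
    # Only Humidity was edited, but the whole file is unloaded and reloaded.
    registry.reload_file("foo.items", ["Temperature", "Humidity"])
    return registry.restore_on_startup("Temperature")
```

With auto_delete_persistence=False the untouched Temperature Item is restored to 21.0; with True there is nothing left to restore and the function returns None.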

OH does not keep track of the Items that were defined by foo.items before it was unloaded compared to after it was reloaded. It doesn’t keep track of what changed for each and every Item. It simply deletes them all and recreates them based on the new contents of the file.

But let’s say OH could keep track of the Items before the file was saved and after the file was saved and only delete persistence for those Items that were removed from the file. What happens if you decide to move an Item from foo.items to bar.items? You lose the persistence for that Item. If you try to save bar.items first you’ll get an error saying the Item already exists. If you save foo.items first, OH will think your Item was deleted and remove the persistence.
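The move scenario can be sketched in Python as well. The save_file function below is hypothetical: it models a diff-aware save that OH does not have, purely to show that either save order goes wrong.

```python
def save_file(files, persistence, filename, new_names):
    """Hypothetical diff-aware save.

    files:       dict mapping file name -> set of Item names it defines
    persistence: dict mapping Item name -> list of persisted states
    Rejects Items already defined in another file, and deletes persistence
    for Items that disappeared from this file.
    """
    new_names = set(new_names)
    for other, names in files.items():
        if other != filename:
            clash = names & new_names
            if clash:
                raise ValueError(f"Item already exists: {sorted(clash)}")
    # Items that were in the old version of the file but not in the new one.
    for name in files.get(filename, set()) - new_names:
        persistence.pop(name, None)
    files[filename] = new_names
```

Starting with Lamp in foo.items, saving bar.items first (now containing Lamp) raises the "already exists" error. Saving foo.items first (without Lamp) deletes Lamp's history before bar.items ever gets a chance to define it.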

This is all much, much more complicated than you think it is for file based Items. And it’s far better to not delete users’ persistence unexpectedly than it is to keep around some old data which is no longer used.

It would be nice though if there were a tool similar to the orphaned links tool to find and clean up these on demand and on an Item-by-Item basis. But as long as OH supports defining Items in .items or .yaml files, autodeleting persistence is not feasible without completely reworking the architecture or causing tons of users to unexpectedly lose their persisted data during routine operations.
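The core of such an on-demand tool would be little more than a set difference. Here is a rough Python sketch, assuming you can obtain the list of Item names that have persisted data and the list of Item names currently in the registry; both function names are hypothetical and nothing here is an existing OH API.

```python
def find_orphaned_persistence(persisted_names, registry_names):
    """Return, sorted, the names that have persisted data but no current Item."""
    return sorted(set(persisted_names) - set(registry_names))


def clean_selected(store, registry_names, selected):
    """Delete persisted data only for names that are both selected and orphaned.

    store: dict mapping Item name -> persisted states (stand-in for a database).
    Returns the names that were actually removed.
    """
    orphans = set(find_orphaned_persistence(store, registry_names))
    removed = []
    for name in selected:
        if name in orphans:
            del store[name]
            removed.append(name)
    return removed
```

The important design point is that the user picks the names in selected after reviewing the orphan list, so data for an Item that is only temporarily missing is never removed automatically.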