Sorry for the delay, I really didn’t have any free time to write this earlier.
I’ve done some research on InfluxDB database shrinking, and, if I understood the stuff I’ve read correctly, InfluxDB has used a number of storage engines over time (LevelDB, RocksDB, BoltDB, depending on the version), but the latest version (1.3 at the time of writing this) uses the Time Structured Merge Tree (their own storage engine). This storage engine shrinks data automatically, but it only shrinks shard groups that have already expired and haven’t had any new data written to or deleted from them.
Now, let’s get to the basics. An InfluxDB database is a collection of the following things:
- data points
- retention policies
Besides that, there are shards and shard groups. You never actually deal with them directly; they are just the way data is stored within a database. A shard is one set of data (with all the stuff that defines it - data points, series, measurements…), and a shard group is a collection of separate shards whose expiration, redundancy and distribution are defined by a retention policy.
From what I’ve read in the documentation, a database can have multiple retention policies defined (besides the default one, which applies to the database as a whole), but those retention policies have to be specified while writing or reading data, so I guess this cannot be applied to OpenHAB InfluxDB persistence.
In the OpenHAB use case, the retention policy should be defined when creating the database. An example of creating a database with a custom retention policy:
CREATE DATABASE "OpenHABDatabase" WITH DURATION 180d SHARD DURATION 7d NAME "OpenHABRetentionPolicy"
In this example, we create OpenHABDatabase, which stores data in shard groups that each cover 7 days, and retains data for 180 days. The shard duration means that, after the database is created, a new shard group is created, the database stores data (shards) in it for 7 days, and once that period (SHARD DURATION) is over, it creates a new shard group and starts storing data in that one. The DURATION parameter (180d) means that all data from the last 180 days is kept in the database. After that period is over, the database starts dropping (deleting) the oldest shard groups (the first 7-day shard group, then the second 7-day shard group, etc.). If the DURATION parameter is not supplied, the duration is set to infinity (all data is kept).
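To make the timing concrete, here is a rough Python sketch of the arithmetic described above. It is a simplified model, not how InfluxDB is implemented: it assumes shard groups start at the database creation time (as far as I know, the real engine aligns them to fixed time boundaries instead), and that a shard group is dropped only once every point in it is older than DURATION. The function names and the creation date are made up for illustration.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=180)      # DURATION 180d
SHARD_DURATION = timedelta(days=7)   # SHARD DURATION 7d


def shard_group_index(created_at, point_time):
    """0-based index of the shard group a point lands in (simplified model)."""
    return (point_time - created_at) // SHARD_DURATION


def shard_group_drop_time(created_at, index):
    """Earliest time at which every point in the group is older than RETENTION."""
    group_end = created_at + (index + 1) * SHARD_DURATION
    return group_end + RETENTION


created = datetime(2017, 1, 1)  # hypothetical database creation date
print(shard_group_index(created, datetime(2017, 1, 10)))
print(shard_group_drop_time(created, 0))
```

Under these assumptions, a point written on 10 January 2017 lands in the second shard group (index 1), and the first shard group becomes droppable on 7 July 2017, that is, 7 + 180 days after creation.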
The important thing to mention here is the influxdb.cfg parameter retentionPolicy, which should be set to the name of the retention policy you created along with the database, instead of autogen.
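For context on why the name matters: the InfluxDB 1.x HTTP /write endpoint takes the target database and retention policy as the db and rp query parameters. The sketch below only builds such a URL. I’m assuming, without having checked the persistence add-on’s source, that the retentionPolicy setting ends up being used this way; the helper name write_url and the base URL (localhost with InfluxDB’s default port 8086) are placeholders.

```python
from urllib.parse import urlencode


def write_url(base_url, database, retention_policy):
    """Build an InfluxDB 1.x /write URL targeting a specific retention policy."""
    query = urlencode({"db": database, "rp": retention_policy})
    return f"{base_url}/write?{query}"


# Placeholder host; database and policy names are the ones from the example above.
print(write_url("http://localhost:8086", "OpenHABDatabase", "OpenHABRetentionPolicy"))
```

If rp is left out of a write request, InfluxDB writes to the database’s default retention policy, which is why a mismatched name would send the data somewhere other than the policy you created.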
@ThomDietrich I hope this post makes sense. Of course, you can change it in any way you like if you wish to adapt it to the tutorial in the first post. If you have any additional questions, or some parts of this post don’t make sense, feel free to ask, and I will try to explain them more thoroughly. There are also redundancy (for data safety - in case a hard disk or something else fails) and distribution (for speeding up reads and writes - in case your hardware can’t handle them without slowing everything else down) options when creating a retention policy, but right now I don’t think I need them, so I haven’t put much effort into researching them.