We all change item names over the lifetime of our openhab installation.
After some time, we start to acumulate ‘cruft’ in influxdb due to that.
Question: what would be influxdb query, to return (better: delete) all timeseries which where not updated in X time (let’s say 30days shall be enough) ? Preferable ready to be deleted/dropped, but list shall be also enough.
To be clear: I do not want to delete just data older than 30d. What I’m looking for is to drop all data WHILE ONLY in items (timeseries I suppose frominfluxdb perspective) which were NOT updated in last 30d.
Replying to follow for updates also, when my zigbee sensors run out of battery, the stupid things outputs a “last update” with 100 degrees temp and 0% humidity, which throws off my measurements completely. Need to figure how to delete those….
Your answer is probably best going to be found in the InfluxDB docs or forum.
Probably worth a different thread. But you can either first those out using a proxy Item or a role to only persist when the values make sense, or, as with the answer to OP’s question, look at the InfluxDB docs.
Yep, for sure. I might just brush up on my sql queries instead and hopefully it should be enough to get those values out of there.
But your idea would be great to prevent further crap from being sent to the database, thank you
I did this (think this is not on the very latest influx). Change as needed:
$ influx
> use pinelake_db.autogen
> select * from outdoorMinMaxTemp where outdoorMaxTemp > 100
this gives the range of values, you need the timestamps
> delete from outdoorMinMaxTemp where time > 1665187200000000000 and time < 1666915200000000000
Back to the original post, here’s a rough scheme for it. I’m sure there are errors, so you’ll have to double check. The influx forums will provide more accurate information.
$ influx
> use openhab_db
> show measurements (to get the list of what is in the database)
> select * from "measurement" limit 5 <-- use the name of the measurement from above
If the the time for the measurement that comes back is older than 30 days then you can do this:
> drop series from "measurement" <-- this is permanent, have a backup, and double check before running
By the way, I use a default retention policy of 1 month by default so I don’t accumulate junk in the database. Then I use a continuous query in influx to down sample into something I do keep forever. For example, I capture the max and min temperature once per day via a continuous query and keep that forever. That’s a very small amount of data so I keep all of it.
Thank you kindly @jswim788 for this queries. Due to the fact that I have a few hundreds items in this db, one by one is not an option
Looking for sth similar to JOIN sql queries to be able to feed to the drop statement measurements which do not fall in the criteria.
So still looking for solution here.
BTW, where and how exactly did you define retention 1 month policy? This might be alternate solution, as I will set there 2yrs, and after those time, retention will do it for me