Downsampling data in InfluxDB v2.X

I use InfluxDB 2.X to store the data from openHAB. I'm interested in maximum storage time, so the retention policy is set to Never. That was fine for a year or two. :slight_smile:
Now, when opening charts from the sitemap for a year or more, everything hangs because of the huge amount of data. All InfluxDB manuals recommend creating periodic tasks for downsampling and using different buckets for different periods and granularities. For example, I'd like to downsample my data from the current every-minute resolution to 1 h for years and 15 min for months, but I have no idea how to get this working with openHAB.
It would be really great if someone could help me figure it out.
InfluxDB downsampling how-to: Downsampling with InfluxDB v2.0 | InfluxData

Do you want to do the downsampling within OH? The straightforward way to downsample is with InfluxDB directly; that's what I do with Influx 1.7. To do this in Influx 2.0, it appears you write a script and then load it.

Downsample data in an InfluxDB task | InfluxDB OSS 2.7 Documentation shows how to create the script, and Create a task for processing data in InfluxDB | InfluxDB OSS 2.7 Documentation shows how to get it to run within Influx. Looks like you can use the GUI or do it from the command line.
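For reference, the shape such a task takes (per those docs) is a Flux script that starts with a `task` option and ends by writing into a second bucket. A minimal sketch, with placeholder bucket names, could look like:

```flux
// Run every hour; bucket names are illustrative placeholders
option task = {name: "downsample-1h", every: 1h}

from(bucket: "openhab_raw")
    |> range(start: -task.every)            // only the window since the last run
    |> aggregateWindow(every: 1h, fn: mean) // per-minute points -> hourly means
    |> to(bucket: "openhab_1h")             // write into the downsampled bucket
```

Once saved, the script can be loaded through the GUI as described in the docs, or from the command line with something like `influx task create --file downsample.flux`.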

I haven’t tried Influx 2.0, so this is only from what I read there.

No, I'm fine with doing this in InfluxDB directly. The only thing I don't understand is how to get OH working with several buckets. As I understood from the InfluxDB documentation, the best way would be to store all incoming measurements in one bucket with a relatively short retention time (e.g. 2 weeks or 1 month), then create several tasks which run periodically and downsample the data into different buckets (e.g. months and years, with retention policies of 1 month and never, respectively). It's clear how to work with this in e.g. Grafana, but I have no idea about OH.
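If it helps, the layout described above could be sketched as two Flux tasks reading from the single short-retention bucket (bucket names here are invented for illustration):

```flux
// Task writing 15-min averages into the "months" bucket.
// A second, identical task with every: 1h and to(bucket: "years")
// would cover the coarser tier. All names are placeholders.
option task = {name: "downsample-15m", every: 15m}

from(bucket: "raw")
    |> range(start: -task.every)
    |> aggregateWindow(every: 15m, fn: mean)
    |> to(bucket: "months")
```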
As I also understand it, it's not a good idea to combine everything back into one bucket, because there is no good way to delete data in InfluxDB.

Usually you wouldn't try to get OH working with several buckets. You have OH write to a single bucket (as defined in the configuration: InfluxDB (0.9 and newer) - Persistence Services | openHAB), then have the downsampling within InfluxDB write into another bucket. What I do is make the default retention policy 30 days, and that is what OH writes to. Then I have a downsampling task in InfluxDB that writes to a different bucket. As an example, for temperature I record the minimum and maximum temperature for the day into a bucket with infinite retention. For a few temperature readings, that will never amount to much data. You can choose other options, maybe store for a week and then downsample into monthly or yearly buckets. And you can downsample more than once (e.g., day to week to month to year). For me, 30 days and infinity work fine.
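A daily min/max task along those lines might look like this in Flux (measurement and bucket names are made up for the sketch, not taken from the post above):

```flux
// Once a day, record the day's min and max temperature
// into a bucket with infinite retention. Names are placeholders.
option task = {name: "daily-temp-minmax", every: 1d}

data = from(bucket: "openhab")
    |> range(start: -task.every)
    |> filter(fn: (r) => r._measurement == "OutdoorTemperature")

data
    |> aggregateWindow(every: 1d, fn: min)
    |> set(key: "_measurement", value: "OutdoorTemperature_min")
    |> to(bucket: "temperature_forever")

data
    |> aggregateWindow(every: 1d, fn: max)
    |> set(key: "_measurement", value: "OutdoorTemperature_max")
    |> to(bucket: "temperature_forever")
```

Renaming the measurement with `set()` keeps the min and max series distinguishable once they land in the same target bucket.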

It is possible to have OH write to InfluxDB directly, without using persistence, and then you can write to various buckets, but that's usually not necessary. A single bucket with downsampling into other buckets is easier and covers most uses.

Yes, that's clear to me. But how can I use data from these buckets in the OH graphs (only one bucket can be set in the configuration)?

Generally I use Grafana directly to see data from multiple buckets. I don't know how you can see data from multiple buckets directly in OH, other than by embedding a Grafana chart (see Grafana chart with time ranges).

Maybe someone else has ideas on how this can be done directly in OH?

By the way, this is what some of my temperature readings look like in Grafana. You can see the short-term storage followed by the once-per-day min/max that goes on forever.

I still use built-in graphs defined in sitemap…