InfluxDB query error

For the past couple of days, I’ve been getting the following warning in my openhab.log file when I try to use the persistence API function “previousState” in my JS rule on one specific item (“previousState” works without errors/warnings on the other items):

2024-11-15 09:53:40.844 [WARN ] [rnal.influx2.InfluxDB2RepositoryImpl] - Failed to execute query 'FilterCriteria [itemName=vHOUS_SAInverterBusVoltageTotal_Volt, beginDate=null, endDate=2024-11-15T09:53:30.840776089-07:00[America/Phoenix], pageNumber=0, pageSize=1000, operator=EQ, ordering=DESCENDING, state=null]': timeout

At the same time the following warning is showing in the InfluxDB logfile:

ts=2024-11-15T17:08:11.674126Z lvl=warn msg="internal error not returned to client" log_id=0stB6~u0000 handler=error_logger error="context canceled"

The line in the JS rule which produces this error message is:

const isBusVoltageChangedPrevState = items.vHOUS_SAInverterBusVoltageTotal_Volt.persistence.previousState(true);

The item definition looks like this:

Number:ElectricPotential vHOUS_SAInverterBusVoltageTotal_Volt "SA Inverter Bus Voltage Total" <sun> (gSolar) {channel="mqtt:topic:mymosquittobroker:SolarAssistant:InverterBusVoltageTotal_Volt"}

I’m running openHAB 4.2.0 with InfluxDB 2.7.10 (both via Docker).
There are no other issues with the installation. It has been running for many months without problems (I just updated InfluxDB from 2.7.3 to 2.7.10 to see if the error would go away, but it’s the same under both versions).

I don’t know why this error occurs.
Any ideas on how I can dig deeper into analyzing this problem?

Update:
I just updated from OH 4.2.0 to 4.2.2, but the problem is still there.

I’ve activated the TRACE log level for the InfluxDB persistence service to be able to see ALL queries made to the DB. Unfortunately, there is no query logged for this call (I can see a lot of other queries in TRACE mode).
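In case someone wants to reproduce this: something along these lines in the Karaf console should enable the trace output (the logger name is the org.openhab.persistence.influxdb package, which also shows up truncated in the log entries above; log:set DEFAULT resets it again):

log:set TRACE org.openhab.persistence.influxdb
log:set DEFAULT org.openhab.persistence.influxdb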

Maybe the query is not logged because the warning says “timeout”?

I found the following additional warnings in the OH logfile:

2024-11-17 09:25:10.556 [WARN ] [nce.extensions.PersistenceExtensions] - There is no queryable persistence service registered with the id 'influxdb'
2024-11-17 09:25:10.557 [WARN ] [nce.extensions.PersistenceExtensions] - There is no queryable persistence service registered with the id 'influxdb'
2024-11-17 09:25:10.560 [WARN ] [nce.extensions.PersistenceExtensions] - There is no queryable persistence service registered with the id 'influxdb'

But persistence is generally working without problems, so I have no idea whether this warning is related to the problem described above.

The “./services/addons.cfg” file contains:

persistence = influxdb

The “./services/influxdb.cfg” file seems to be correct (because if I make changes in this file, they are directly reflected in OH).

The “./persistence/influxdb.persist” file also seems to be correct, with Strategies and specific Items defined (because if I make changes in this file, they are directly reflected in OH).
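For context, the file follows the usual .persist structure; a minimal sketch along these lines (the concrete strategy and the gSolar group selection here are only an illustration, not my exact file):

Strategies {
    everyMinute : "0 * * * * ?"
    default = everyChange
}

Items {
    gSolar* : strategy = everyChange, everyMinute
}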

Where else could I look? Any ideas?

I’ve tried many things to dig deeper into this problem, but without success.
I especially don’t understand why this only happens with one item while the same “previousState(true)” function works perfectly for other items.

Does nobody have any idea how I can further analyze the problem?

I’ve at least found a workaround, although I can’t explain why it works.

If I specify “false” as the parameter to the “previousState” method, then it works. This is the only item where the problem exists; all other items work with “true” as the skipEqual parameter.
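In the rule, the only difference is the boolean argument:

// Works for this item: returns the most recent persisted value, even if it equals the current state
const prev = items.vHOUS_SAInverterBusVoltageTotal_Volt.persistence.previousState(false);

// Times out for this one item only; skipEqual = true should skip values equal to the current state
// const prev = items.vHOUS_SAInverterBusVoltageTotal_Volt.persistence.previousState(true);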

I’ve done two tests after I was able to activate TRACE logging:

The OH log output with TRACE loglevel for InfluxDB while using “previousState(false)” - this works:

2024-11-19 16:49:23.852 [TRACE] [.influxdb.InfluxDBPersistenceService] - Query-Filter: itemname: vHOUS_SAInverterBusVoltageTotal_Volt, ordering: DESCENDING, state: null,  operator: EQ, getBeginDate: null, getEndDate: 2024-11-19T16:49:23.851592120-07:00[America/Phoenix], getPageSize: 1000, getPageNumber: 0
2024-11-19 16:49:23.852 [TRACE] [rnal.influx2.InfluxDB2RepositoryImpl] - Query from(bucket:"openhabdb")
        |> range(start:-100y, stop:2024-11-19T23:49:23.851592120Z)
        |> filter(fn: (r) => r["_measurement"] == "vHOUS_SAInverterBusVoltageTotal_Volt")
        |> keep(columns:["_measurement", "_time", "_value"])
        |> sort(desc:true, columns:["_time"])
        |> limit(n:1000, offset:0)

The OH log output with TRACE loglevel for InfluxDB while using “previousState(true)” - this does NOT work:

2024-11-19 16:49:30.698 [TRACE] [.influxdb.InfluxDBPersistenceService] - Query-Filter: itemname: vHOUS_SAInverterBusVoltageTotal_Volt, ordering: DESCENDING, state: null,  operator: EQ, getBeginDate: null, getEndDate: 2024-11-19T16:49:30.696763713-07:00[America/Phoenix], getPageSize: 1, getPageNumber: 0
2024-11-19 16:49:30.699 [TRACE] [rnal.influx2.InfluxDB2RepositoryImpl] - Query from(bucket:"openhabdb")
        |> range(start:-100y, stop:2024-11-19T23:49:30.696763713Z)
        |> filter(fn: (r) => r["_measurement"] == "vHOUS_SAInverterBusVoltageTotal_Volt")
        |> keep(columns:["_measurement", "_time", "_value"])
        |> last()
2024-11-19 16:49:30.743 [TRACE] [.influxdb.InfluxDBPersistenceService] - Queued InfluxPoint{measurementName='vHOUS_SAInverterBusVoltageTotal_Volt', time=2024-11-19T23:49:30.743636463Z, value=370, tags={item=vHOUS_SAInverterBusVoltageTotal_Volt}} for item vHOUS_SAInverterBusVoltageTotal_Volt (Type=NumberItem, State=370 V, Label=SA Inverter Bus Voltage Total, Category=sun, Groups=[gSolar])
2024-11-19 16:49:33.853 [WARN ] [rnal.influx2.InfluxDB2RepositoryImpl] - Failed to execute query 'FilterCriteria [itemName=vHOUS_SAInverterBusVoltageTotal_Volt, beginDate=null, endDate=2024-11-19T16:49:23.851592120-07:00[America/Phoenix], pageNumber=0, pageSize=1000, operator=EQ, ordering=DESCENDING, state=null]': timeout

If I run both queries in the InfluxDB query builder, both work as they should:

The query from “previousState(false)”:
It takes really long because it retrieves up to 1000 values, but it works in OH as expected.

The query from “previousState(true)”:
It just returns the last persisted value, which is also what I would expect in OH, but it’s not working there.

Now I wonder why the query does not work when made from OH (it times out), even though the exact same query works in the InfluxDB query builder (and is really fast).

So it looks like it’s an OH issue and not an InfluxDB issue.

What is “really fast” in this context?
Is it less than 1 ms, less than 1 s, less than 10 s, less than 1 minute?
Could it be that openHAB and the InfluxDB query builder use different timeout settings?

You are absolutely right, my wording “very fast” and “slow” was not very precise…

Here are the results of both queries in the InfluxDB “Data Explorer” script editor:
Both queries work in the “Data Explorer”, but Query 2 does not work when called from OH.

Query 1: This one is generated by OH when using previousState(false):

from(bucket:"openhabdb")
        |> range(start:-100y, stop:2024-11-19T23:49:23.851592120Z)
        |> filter(fn: (r) => r["_measurement"] == "vHOUS_SAInverterBusVoltageTotal_Volt")
        |> keep(columns:["_measurement", "_time", "_value"])
        |> sort(desc:true, columns:["_time"])
        |> limit(n:1000, offset:0)

This query takes “really long”, with an execution time of 11.5 s (compared to Query 2).

Query 2: This one is generated by OH when using previousState(true):

from(bucket:"openhabdb")
        |> range(start:-100y, stop:2024-11-19T23:49:30.696763713Z)
        |> filter(fn: (r) => r["_measurement"] == "vHOUS_SAInverterBusVoltageTotal_Volt")
        |> keep(columns:["_measurement", "_time", "_value"])
        |> last()

This query is “really fast”, with an execution time of 0.97 s (compared to Query 1).

It seems that the OH method previousState(true) with Query 2 (which I would like to use) generates a “better” (much faster) query for my use case, but unfortunately it does not work in OH, even though the generated query string is correct. The error message when using previousState(true) in OH hints at a “timeout” (see my first post), which doesn’t make sense in terms of execution time: this query executes much faster than Query 1, which is more than 11 times slower yet works and does not produce a timeout.

Query 1 works in OH, but it costs much more CPU and execution time.

My workaround for now is to use previousState(false) for this specific OH item. But it would be great to know why this problem occurs with just one of my items while the others work fine with previousState(true).


Does anyone have any ideas on how to further narrow down this problem? I can reproduce the problem, change logging levels, run tests, etc. at any time.

I’m still pretty sure that there is no glitch in the InfluxDB data for the specific item showing the problem. Maybe there is a general issue with the openHAB InfluxDB persistence when using the persistence method “previousState(true)” under some circumstances.

The only thing I know for sure is that the query OH creates for this request (seen with TRACE in the OH logfile) is valid in general and can be run in the InfluxDB query editor without issues, but it fails when called from OH.

I am surprised to see the queries generated for the previousState(skipEqual) parameters. For skipEqual false, I would expect to see Query 2. For skipEqual true, I would expect to see Query 1.
It does make sense that Query 1 takes longer as it retrieves much more data. Still, the difference seems huge.

The reason is that false would just take the last value from persistence, as in Query 2 with last(). For true, it would get 1000 values sorted from most recent to oldest, iterate over them, and compare each with the current item state until it finds one with a state different from the current item state. It will fetch another 1000 if none is found… So the query gets a sorted 1000 at a time, and the method logic loops through these.
I don’t see anything in the persistence method code that would explain switching around these parameters. And as this method creates a filter that it sends to the specific persistence service, I don’t know why that filter could be inverted.
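To illustrate, here is a rough JavaScript sketch of that logic (not the actual core implementation, which is Java; queryPage(pageNumber, pageSize) is a hypothetical stand-in for the persistence service query, returning values sorted most recent first):

function previousStateSketch(item, skipEqual, queryPage) {
    if (!skipEqual) {
        // skipEqual = false: only the last persisted value is needed (Query 2, last())
        return queryPage(0, 1)[0];
    }
    // skipEqual = true: page through history, 1000 values at a time (Query 1),
    // until a value with a state different from the current item state is found
    let page = 0;
    while (true) {
        const values = queryPage(page, 1000);
        if (values.length === 0) {
            return null; // no older value with a different state exists
        }
        const different = values.find(v => v.state !== item.state);
        if (different !== undefined) {
            return different;
        }
        page++; // fetch the next 1000 and keep looking
    }
}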

From what I can see, relevant logging, if any, should come from org.openhab.persistence.influxdb. The whole implementation of the QueryablePersistenceService.query() method is there. I can only guess there is a timeout defined in the InfluxDB persistence add-on that kills the query when it is slow to return, and that is why the database also writes an error log. I don’t know the specific InfluxDB code, so I cannot solve that. The duration of the query, though, seems excessive, and that is not an issue with OH but with InfluxDB. Running the query from the InfluxDB command line may not make that visible, as the timeout it uses may well be much longer.