HTTP binding gets stuck

This has been driving me crazy. I have many rules that rely on the HTTP binding. It works 90% of the time, but every now and then it gets stuck and does not poll my devices. I have looked at TRACE logs and I see requests and responses, but then it just stops with NOTHING in the logs. I have them polling every 30 seconds, but they will stop for several minutes. If I restart the Thing, that gets it going again, but it will stop again after minutes or hours.

Thing http:url:rainwise "Rainwise" [
  baseURL="http://10.88.64.97/weather.json", refresh=30] {
    Channels:
      Type string : temp "Temperature" [ stateTransformation="JSONPATH:$.us.atmp.tic" ]
}

Thing http:url:usgs "USGS" [
  baseURL="https://waterservices.usgs.gov/nwis/iv/?sites=01616500&parameterCd=00065", timeout=30000, refresh=3600] {
    Channels:
      Type string : level "Level" [ stateTransformation="XSLT:usgs_level.xsl" ]
}

Thing http:url:outback "Outback" [
  baseURL="http://10.88.64.2/Dev_status.cgi?&Port=0", refresh=60] {
    Channels:
      Type string : dev_status "Dev_status"
}

Thing http:url:pokeys46 "HVAC Temps" [
  baseURL="http://10.88.64.46/devStat.xml", refresh=30] {
    Channels:
      Type string : hvac_supply "HVAC Supply Temp" [ stateTransformation="XSLT:hvac_supply.xsl" ]
      Type string : hvac_return "HVAC Return Temp" [ stateTransformation="XSLT:hvac_return.xsl" ]
}

Thing http:url:pokeys47 "HVAC Temps" [
  baseURL="http://10.88.64.47/devStat.xml", refresh=30] {
    Channels:
      Type string : hvac_supply "HVAC Supply Temp" [ stateTransformation="XSLT:hvac_supply.xsl" ]
      Type string : hvac_return "HVAC Return Temp" [ stateTransformation="XSLT:hvac_return.xsl" ]
}

Thing http:url:pokeys48 "HVAC Temps" [
  baseURL="http://10.88.64.48/devStat.xml", refresh=30] {
    Channels:
      Type string : hvac_supply "HVAC Supply Temp" [ stateTransformation="XSLT:hvac_supply.xsl" ]
      Type string : hvac_return "HVAC Return Temp" [ stateTransformation="XSLT:hvac_return.xsl" ]
}

Thing http:url:pokeys49 "HVAC Temps" [
  baseURL="http://10.88.64.49/devStat.xml", timeout=1000, refresh=30] {
    Channels:
      Type string : hvac_supply "HVAC Supply Temp" [ stateTransformation="XSLT:hvac_supply.xsl" ]
      Type string : hvac_return "HVAC Return Temp" [ stateTransformation="XSLT:hvac_return.xsl" ]
      Type string : hvac1_supply "HVAC1 Supply Temp" [ stateTransformation="XSLT:hvac1_supply.xsl" ]
      Type string : hvac1_return "HVAC1 Return Temp" [ stateTransformation="XSLT:hvac1_return.xsl" ]
      Type string : hvac2_supply "HVAC2 Supply Temp" [ stateTransformation="XSLT:hvac2_supply.xsl" ]
      Type string : hvac2_return "HVAC2 Return Temp" [ stateTransformation="XSLT:hvac2_return.xsl" ]
}

When they stop, do the Things show as online still?

Ultimately I think this is going to need an issue and a maintainer to solve, given there’s nothing in the logs that helps.

As a temporary workaround you can use something like Thing Status Reporting [4.0.0.0;4.9.9.9] (if the Thing goes offline) or Threshold Alert and Open Reminder [4.0.0.0;4.9.9.9] to detect when a Thing stops reporting and reset it from a rule. But that’s just to get you by, not to solve the problem.

Thanks Rich, yes they still show online, I will try Threshold Alert.

I have the Threshold Alert template installed. I see how I can set it to trigger when a group is outside a range, but how do I trigger it when a value stops changing?

As for the HTTP binding, it says it is using the secure client. Is there any way to have it use normal HTTP and not secure, since there is no HTTPS?

openhab.log:2024-10-26 15:16:58.167 [INFO ] [nding.http.internal.HttpThingHandler] - Using the secure client for thing 'http:url:rainwise'.
openhab.log:2024-10-26 15:16:58.199 [INFO ] [nding.http.internal.HttpThingHandler] - Using the secure client for thing 'http:url:usgs'.
openhab.log:2024-10-26 15:16:58.211 [INFO ] [nding.http.internal.HttpThingHandler] - Using the secure client for thing 'http:url:outback'.
openhab.log:2024-10-26 15:16:58.223 [INFO ] [nding.http.internal.HttpThingHandler] - Using the secure client for thing 'http:url:pokeys46'.
openhab.log:2024-10-26 15:16:58.236 [INFO ] [nding.http.internal.HttpThingHandler] - Using the secure client for thing 'http:url:pokeys47'.
openhab.log:2024-10-26 15:16:58.249 [INFO ] [nding.http.internal.HttpThingHandler] - Using the secure client for thing 'http:url:pokeys48'.
openhab.log:2024-10-26 15:16:58.264 [INFO ] [nding.http.internal.HttpThingHandler] - Using the secure client for thing 'http:url:pokeys49'.

Here are the settings I use for my sensors to alert me when something stops reporting.

Properties:

  dndEnd: 08:00
  rateLimit: ""
  invert: true
  initAlertRule: new-sensor-status-proc
  dndStart: 22:00
  alertRule: new-sensor-status-proc
  endAlertRule: ""
  thresholdState: UnDefType
  defaultRemPeriod: PT12H
  operator: ==
  hysteresis: ""
  reschedule: true
  namespace: sensorStatus
  gkDelay: 1000
  defaultAlertDelay: PT15M
  group: Sensors

The important parts are thresholdState, invert, reschedule, and the defaultAlertDelay.

With this configuration the alerting state is any state that isn’t NULL or UNDEF. This gets the rule started whenever the Item gets a valid state.

There’s a 15 minute alerting delay with reschedule though. That means that every time the Item changes state the alert timer gets rescheduled.

As a result, the initAlertRule gets called whenever the Item first changes from NULL or UNDEF to a valid state, with isInitialAlert set to true. Only if the Item doesn’t update for 15 minutes (given the defaultAlertDelay) will the alerting rule get called with isAlerting set to true. This will give you one call when the Item stops reporting and another when it comes back.
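To make the reschedule behaviour concrete, here is a minimal sketch in plain JavaScript. It is a simplified model and not the template's actual code: times are plain numbers in minutes, and alertTimes is a made-up helper name.

```javascript
// Simplified model of "alert delay with reschedule": each update pushes the
// alert deadline out again, so the alert only fires after a quiet gap.
// Times are plain numbers in minutes; this is not the template's real code.
function alertTimes(updateTimes, delayMinutes) {
  const alerts = [];
  for (let i = 0; i < updateTimes.length; i++) {
    const deadline = updateTimes[i] + delayMinutes;
    const next = updateTimes[i + 1];
    // The timer fires only if no later update arrived before the deadline.
    if (next === undefined || next > deadline) {
      alerts.push(deadline);
    }
  }
  return alerts;
}

// Updates at minutes 0, 5 and 10, then silence until minute 40.
console.log(alertTimes([0, 5, 10, 40], 15)); // [ 25, 55 ]
```

With a 15 minute delay the first alert lands at minute 25, which is 15 minutes after the last update before the gap. The updates at minutes 5 and 10 each cancelled the earlier deadline.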

I use the same rule for both which is as follows:

configuration: {}
triggers: []
conditions:
  - inputs: {}
    id: "2"
    configuration:
      type: application/javascript
      script: >
        var equipment = actions.Semantics.getEquipment(items[alertItem]);

        var statusItem = items[equipment.name+'_Status'];

        if(equipment === null || statusItem === null) {
          console.warn(alertItem + ' does not belong to an equipment or equipment doesn\'t have a Status Item!');
          false;
        }

        else {
          var statusItem = items[equipment.name+'_Status'];
          console.info('Sensor status reporting called for ' + alertItem + ', equipment ' + equipment.label + ', is alerting ' + isAlerting + ', and is initial alert ' + isInitialAlert 
                       + ', current equipment state is ' + statusItem.state);
          // Sensor is offline                         Sensor back online
          (isAlerting && statusItem.state != 'OFF') || (!isAlerting && statusItem.state != 'ON');
        }
    type: script.ScriptCondition
actions:
  - inputs: {}
    id: "1"
    configuration:
      type: application/javascript
      script: >
        var {alerting} = require('rlk_personal');

        var logger = log('Sensor Alert');

        var equipment = actions.Semantics.getEquipment(items[alertItem]);

        var statusItem = items[equipment.name+'_Status'];


        if(isAlerting && statusItem.state != 'OFF') {
          statusItem.postUpdate('OFF');
          alerting.sendAlert('Offline: ' + equipment.label + ' has stopped reporting', logger);
        }

        else if(!isAlerting && statusItem.state != 'ON') {
          statusItem.postUpdate('ON');
          alerting.sendAlert('Online: ' + equipment.label + ' is back', logger);
        }

        else {
          console.info('Sensor status update alerting ' + isAlerting + ' initial ' + isInitialAlert + ' equipment ' + equipment.label + ' status ' + statusItem.state);
        }
    type: script.ScriptAction

The condition checks to see if the alerting Item is a member of an Equipment and that the Equipment has a Status Item. If not, the rest of the rule won’t work. Then the condition checks whether the Equipment’s Status Item still needs to change: when isAlerting is true the Item has stopped reporting, so the rule runs only if the status is not already OFF; when isAlerting is false the Item is reporting again, so the rule runs only if the status is not already ON.

The action does some logging and sets the status Item as appropriate. I have another rule that sends me an alert when a sensor goes OFF or ON.

You would restart the HTTP binding when isAlerting is true.

You wouldn’t need the initAlert rule, just the alerting rule. Set the initial alert to a time longer than the usual reporting rate. I like to choose 2x (e.g. if the Item updates every 30 seconds I’d set the initial alert to 1 minute).

Having issues with the rule:

2024-10-28 12:50:07.286 [ERROR] [pt.javascript.new-sensor-status-proc] - Failed to execute script: TypeError: Cannot read property "name" from null
        at <js>.:program(<eval>:2)
        at org.graalvm.polyglot.Context.eval(Context.java:399)
        at com.oracle.truffle.js.scriptengine.GraalJSScriptEngine.eval(GraalJSScriptEngine.java:458)
        at com.oracle.truffle.js.scriptengine.GraalJSScriptEngine.eval(GraalJSScriptEngine.java:426)
        at java.scripting/javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:262)
        ... 31 more
2024-10-28 12:50:07.286 [ERROR] [ernal.handler.ScriptConditionHandler] - Script execution of rule with UID 'new-sensor-status-proc' failed: org.graalvm.polyglot.PolyglotException: TypeError: Cannot read property "name" from null
2024-10-28 12:50:08.287 [ERROR] [pt.javascript.new-sensor-status-proc] - Failed to execute script: TypeError: Cannot read property "name" from null
        at <js>.:program(<eval>:2)
        at org.graalvm.polyglot.Context.eval(Context.java:399)
        at com.oracle.truffle.js.scriptengine.GraalJSScriptEngine.eval(GraalJSScriptEngine.java:458)
        at com.oracle.truffle.js.scriptengine.GraalJSScriptEngine.eval(GraalJSScriptEngine.java:426)
        at java.scripting/javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:262)
        ... 27 more
2024-10-28 12:50:08.287 [ERROR] [ernal.handler.ScriptConditionHandler] - Script execution of rule with UID 'new-sensor-status-proc' failed: org.graalvm.polyglot.PolyglotException: TypeError: Cannot read property "name" from null

And:

2024-10-28 12:55:39.822 [ERROR] [pt.javascript.new-sensor-status-proc] - Failed to execute script: TypeError: Cannot load CommonJS module: 'rlk_personal'
        at com.oracle.truffle.polyglot.PolyglotMapAndFunction.apply(PolyglotMapAndFunction.java:46)
        at org.openhab.automation.jsscripting.internal.OpenhabGraalJSScriptEngine.lambda$13(OpenhabGraalJSScriptEngine.java:268)
        at java.base/java.util.Optional.orElseGet(Optional.java:364)
        at org.openhab.automation.jsscripting.internal.OpenhabGraalJSScriptEngine.lambda$11(OpenhabGraalJSScriptEngine.java:268)
        at <js>.:program(<eval>:1)
        ... 83 more
2024-10-28 12:55:39.823 [ERROR] [internal.handler.ScriptActionHandler] - Script execution of rule with UID 'new-sensor-status-proc' failed: org.graalvm.polyglot.PolyglotException: TypeError: Cannot load CommonJS module: 'rlk_personal'

Which rule? These errors appear to be coming from my processing rule I pasted in above. Your rule is going to be significantly different so I didn’t expect you to try to make it work in your system.

The first error is probably caused by your system not using the semantic model, or at least not using it for the alerting Item. My proc rule assumes all the Items sent to it are members of an Equipment. Though I should handle that error better.

The second error comes from it trying to import a personal library you don’t have. I use this library to centralize a bunch of methods that get used across multiple rules but which are not suitable for inclusion in openhab_rules_tools.

In short, you can’t use that example above directly. I meant it mainly as an illustration for what you could do.

If you just want to see the rule template working, I recommend the following:

Create a Rule (it could be a file based rule):

Add a script action with the following code for a UI rule:

console.info("Item " + alertItem + " alerting " + isAlerting + " initial alert " + isInitialAlert);

In the execute block for the rule in a file based rule:

console.info("Item " + event.raw.alertItem + " alerting " + event.raw.isAlerting + " initial alert " + event.raw.isInitialAlert);

That will log the relevant variables that Threshold Alert passes to the called rule.

That gets rid of all the extra stuff my rule does and the dependency on my personal library. From there you can start figuring out how to restart the binding to get past where it gets stuck (probably executeCommandLine with a call to karaf console).
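As one possible next step, here is a rough sketch of restarting a Thing from the called rule. It swaps in a different technique from the karaf console idea: it disables and re-enables the Thing, on the assumption that this is equivalent to the manual Thing restart that was reported to get polling going again. It relies on openhab-js's things.getThing(uid) and the Thing's setEnabled(), as I understand that API. The Item name, the lookup table, the helper names and the pause length are all hypothetical.

```javascript
// Hypothetical lookup from the alerting Item's name to the Thing that feeds it.
// The Item name is made up; the Thing UID is one from the config above.
const THING_FOR_ITEM = {
  Rainwise_Temp: 'http:url:rainwise',
};

// Disable a Thing and hand the re-enable step to `later`.
// Inside openHAB, `thingsApi` would be openhab-js's `things` namespace and
// `later` could be (fn) => setTimeout(fn, 2000); the 2 s pause is a guess.
function restartThing(thingsApi, thingUid, later) {
  const thing = thingsApi.getThing(thingUid);
  if (thing === null || thing === undefined) {
    return false; // unknown Thing, nothing to restart
  }
  thing.setEnabled(false);
  later(() => thing.setEnabled(true));
  return true;
}

// What the called rule's script action would do with the passed-in values.
function handleAlert(thingsApi, alertItem, isAlerting, later) {
  const thingUid = THING_FOR_ITEM[alertItem];
  // Only restart when the Item has stopped reporting and we know its Thing.
  if (!isAlerting || thingUid === undefined) {
    return false;
  }
  return restartThing(thingsApi, thingUid, later);
}
```

In a UI rule's script action the call would then look something like handleAlert(things, alertItem, isAlerting, (fn) => setTimeout(fn, 2000)). Passing the things namespace and the scheduler in as arguments is only there so the logic can be tried outside openHAB.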

So if I create a file-based rule, what is my trigger? All my current file-based rules have triggers like “when {item} received update” and so on. Also, do I need to create Items for alertItem, isAlerting and isInitialAlert? If so, are they all strings?

Nothing. Just pass [] for triggers. This rule will have no triggers. It will be called by the threshold alert rule directly.

No, these are variables that will be passed into your rule when Threshold Alert calls it. There is a table with everything that gets passed into the called rule and their types in the Threshold Alert docs. alertItem is the name of the Item alerting; isAlerting and isInitialAlert are both booleans.

That does not work with file-based rules as far as I can tell.

2024-10-29 16:42:21.143 [WARN ] [el.core.internal.ModelRepositoryImpl] - Configuration model 'foo.rules' has errors, therefore ignoring it: [3,3]: no viable alternative at input '['

You’ll have to show your code. I’m pretty sure there was a PR that made it possible to define a JSRule with an empty array of triggers. And even when it didn’t support that, you’d get a different error. This is a syntax error.

Oh wait, my code above is a JS rule, not Rules DSL.

You cannot receive variables when a rule calls another in Rules DSL. You can’t call another rule from a Rules DSL rule either.

I strongly recommend against Rules DSL for new rules development these days. This is only one tiny thing that Rules DSL no longer supports, and Rules DSL is no longer enough easier to work in to justify its asinine type system and other limitations. Even Blockly is more complete than Rules DSL these days.

This is the syntax for a JSRule. It goes in a .js file under automation/js, not a .rules file in the rules folder.
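For reference, a minimal sketch of what such a .js file could look like. The rule name and id are placeholders I made up. The typeof guard is only there so the snippet can also be loaded outside openHAB, where the rules global does not exist.

```javascript
// Build the log line from the values Threshold Alert passes to a called rule.
// In a file-based rule they arrive on event.raw.
function describeAlert(event) {
  return 'Item ' + event.raw.alertItem +
         ' alerting ' + event.raw.isAlerting +
         ' initial alert ' + event.raw.isInitialAlert;
}

// `rules` only exists inside openHAB's JS Scripting environment.
if (typeof rules !== 'undefined') {
  rules.JSRule({
    name: 'Sensor alert logger', // placeholder name
    id: 'sensor-alert-logger',   // placeholder id, the one Threshold Alert would call
    triggers: [],                // no triggers: the rule is only ever called directly
    execute: (event) => console.info(describeAlert(event))
  });
}
```

The id is the part that matters, since that is what goes into the alertRule property of the Threshold Alert rule.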

This is so cool, I got it to work with the UI rule (my first :-)) I think I remember you saying before you used text files for sensors? You mind sharing a few with the metadata you use?

Thank you so much for this, I love it!

I’m not certain I know what that is referring to. Frankly, I don’t use text file configs for anything in OH any more except for a few functions I have in a personal JS library.

But I do use some metadata on a few of my sensor Items so they can have a unique timeout.

So you see my config above for the Threshold Alert rule (post #4 I think).

The Group Item I use to trigger this rule is called “Sensors”.

Note I’m showing screen shots here so you can see what I see. Normally it’s better to click the code tab and paste the YAML for a forum post.

Each of those Items gets updated periodically but they are not all on the same cadence. So let’s look at one of them. Hmmm, I need to replace the batteries in the basement Airthings WavePlus so let’s look at my wife’s office temp.

The important part is under Metadata. You can see above I’ve defined the namespace for my instance of the Threshold Alert rule to be “sensorStatus” and you can see an entry under metadata for that. Opening that up shows:

value: " "
config:
  alertDelay: PT10M

For this one Item therefore, instead of waiting for no updates for fifteen minutes (defaultAlertDelay: PT15M), this one Item will only take ten minutes (PT10M). Any of the other properties can be overridden in this way for each individual Item. More details are in the post for the Threshold Alert rule template.
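The override logic amounts to "use the Item's value if it has one, otherwise the rule default". A tiny sketch of that idea in plain JavaScript follows. It is not the template's own code, and effectiveAlertDelay is a made-up name.

```javascript
// Rule-level defaults, copied from the properties shown earlier in the thread.
const ruleDefaults = { defaultAlertDelay: 'PT15M' };

// Pick the Item's own alertDelay when its metadata config has one,
// otherwise fall back to the rule-wide default.
function effectiveAlertDelay(itemConfig, defaults) {
  return itemConfig?.alertDelay ?? defaults.defaultAlertDelay;
}

console.log(effectiveAlertDelay({ alertDelay: 'PT10M' }, ruleDefaults)); // PT10M
console.log(effectiveAlertDelay(undefined, ruleDefaults));               // PT15M
```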

That’s pretty much all I ever override at the Item level. I have a different time for my open door reminders, sensor offline detection, humidity alerts, etc (I have lots of instances of this rule in my setup, it makes up roughly a third of all my rules).

Item metadata is optional though. You don’t need it. It’s only there to handle the case where you have one rule where there’s just one or two properties that need to be handled differently for certain Items. It’s better than requiring creating a whole new instance of the rule. But if all your Items work the same, no metadata is required.

I’m not sure if this answers your question or not.

Seeing a lot of these:

2024-10-29 20:30:08.960 [WARN ] [ore.internal.scheduler.SchedulerImpl] - Scheduled job 'ui.monitor_sensors.loopingTimer' failed and stopped
java.lang.IllegalStateException: Multi threaded access requested by thread Thread[OH-scheduler-15,5,main] but is not allowed for language(s) js.
	at com.oracle.truffle.polyglot.PolyglotEngineException.illegalState(PolyglotEngineException.java:129) ~[?:?]
	at com.oracle.truffle.polyglot.PolyglotContextImpl.throwDeniedThreadAccess(PolyglotContextImpl.java:1034) ~[?:?]
	at com.oracle.truffle.polyglot.PolyglotContextImpl.checkAllThreadAccesses(PolyglotContextImpl.java:893) ~[?:?]
	at com.oracle.truffle.polyglot.PolyglotContextImpl.enterThreadChanged(PolyglotContextImpl.java:723) ~[?:?]
	at com.oracle.truffle.polyglot.PolyglotEngineImpl.enterCached(PolyglotEngineImpl.java:1991) ~[?:?]
	at com.oracle.truffle.polyglot.HostToGuestRootNode.execute(HostToGuestRootNode.java:110) ~[?:?]
	at com.oracle.truffle.api.impl.DefaultCallTarget.callDirectOrIndirect(DefaultCallTarget.java:85) ~[?:?]
	at com.oracle.truffle.api.impl.DefaultCallTarget.call(DefaultCallTarget.java:102) ~[?:?]
	at com.oracle.truffle.polyglot.PolyglotFunctionProxyHandler.invoke(PolyglotFunctionProxyHandler.java:154) ~[?:?]
	at jdk.proxy1.$Proxy42990.run(Unknown Source) ~[?:?]
	at org.openhab.automation.jsscripting.internal.threading.ThreadsafeTimers.lambda$0(ThreadsafeTimers.java:85) ~[?:?]
	at org.openhab.core.internal.scheduler.SchedulerImpl.lambda$12(SchedulerImpl.java:189) ~[?:?]
	at org.openhab.core.internal.scheduler.SchedulerImpl.lambda$1(SchedulerImpl.java:88) ~[?:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:840) [?:?]
Caused by: com.oracle.truffle.api.TruffleStackTrace$LazyStackTrace
2024-10-29 20:31:00.529 [INFO ] [openhab.core.model.script.Datacenter] - Inside 69.94400000 Outside 61.4 Setpoint 70.0
2024-10-29 20:32:00.525 [INFO ] [openhab.core.model.script.Datacenter] - Inside 70.35800000 Outside 61.3 Setpoint 70.0
2024-10-29 20:32:00.527 [INFO ] [openhab.core.model.script.Datacenter] - Outside colder then setpoint turn on fan
2024-10-29 20:33:00.525 [INFO ] [openhab.core.model.script.Datacenter] - Inside 69.64400000 Outside 61.2 Setpoint 70.0
2024-10-29 20:34:00.525 [INFO ] [openhab.core.model.script.Datacenter] - Inside 67.65800000 Outside 61.2 Setpoint 70.0
2024-10-29 20:34:00.531 [INFO ] [openhab.core.model.script.Datacenter] - Inside colder than setpoint turn off fan
2024-10-29 20:34:38.725 [WARN ] [ore.internal.scheduler.SchedulerImpl] - Scheduled job 'ui.monitor_sensors.loopingTimer' failed and stopped
java.lang.IllegalStateException: Multi threaded access requested by thread Thread[OH-scheduler-21,5,main] but is not allowed for language(s) js.
	at com.oracle.truffle.polyglot.PolyglotEngineException.illegalState(PolyglotEngineException.java:129) ~[?:?]
	at com.oracle.truffle.polyglot.PolyglotContextImpl.throwDeniedThreadAccess(PolyglotContextImpl.java:1034) ~[?:?]
	at com.oracle.truffle.polyglot.PolyglotContextImpl.checkAllThreadAccesses(PolyglotContextImpl.java:893) ~[?:?]
	at com.oracle.truffle.polyglot.PolyglotContextImpl.enterThreadChanged(PolyglotContextImpl.java:723) ~[?:?]
	at com.oracle.truffle.polyglot.PolyglotEngineImpl.enterCached(PolyglotEngineImpl.java:1991) ~[?:?]
	at com.oracle.truffle.polyglot.HostToGuestRootNode.execute(HostToGuestRootNode.java:110) ~[?:?]
	at com.oracle.truffle.api.impl.DefaultCallTarget.callDirectOrIndirect(DefaultCallTarget.java:85) ~[?:?]
	at com.oracle.truffle.api.impl.DefaultCallTarget.call(DefaultCallTarget.java:102) ~[?:?]
	at com.oracle.truffle.polyglot.PolyglotFunctionProxyHandler.invoke(PolyglotFunctionProxyHandler.java:154) ~[?:?]
	at jdk.proxy1.$Proxy42990.run(Unknown Source) ~[?:?]
	at org.openhab.automation.jsscripting.internal.threading.ThreadsafeTimers.lambda$0(ThreadsafeTimers.java:85) ~[?:?]
	at org.openhab.core.internal.scheduler.SchedulerImpl.lambda$12(SchedulerImpl.java:189) ~[?:?]
	at org.openhab.core.internal.scheduler.SchedulerImpl.lambda$1(SchedulerImpl.java:88) ~[?:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:840) [?:?]
Caused by: com.oracle.truffle.api.TruffleStackTrace$LazyStackTrace

Which version of OH are you running?

GraalVM, the engine upon which JS runs, mimics a Node.js environment and Node.js doesn’t support multithreaded programming. And because multithreading is generally hard to do, GraalVM doesn’t allow it either.

There are locks in place at the JS Scripting add-on level to prevent multithreaded access to the context which is supposed to prevent these exceptions. However there was a regression that just recently got fixed (I think it’s only in the snapshots right now unless 4.3 M3 was released yesterday).

In the meantime, you can add a few hundred milliseconds to the gatekeeper property of the rule. Gatekeeper is a class I wrote that lets you schedule a timeout between commands. In this rule it’s used to prevent the threshold alert rule from calling the alert rules too fast. So if you set the property to 300, even if Threshold Alert has three alerts it needs to make within a few milliseconds of each other, it will space them out by 300 msec.
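To illustrate the spacing, here is a simplified stand-in for the gatekeeper idea in plain JavaScript. It is not the actual Gatekeeper class. It only works out when each call would run, given the request times in ascending order and a minimum gap.

```javascript
// Simplified stand-in for the gatekeeper idea, not the real class:
// given when each call was requested (ms, ascending) and a minimum gap,
// work out when each call would actually run.
function spaceOut(requestTimesMs, gapMs) {
  const runTimes = [];
  let earliestNext = -Infinity;
  for (const requested of requestTimesMs) {
    // Run as soon as requested, unless the previous call's gap is still open.
    const runAt = Math.max(requested, earliestNext);
    runTimes.push(runAt);
    earliestNext = runAt + gapMs;
  }
  return runTimes;
}

// Three alerts requested within 5 ms of each other, 300 ms gap.
console.log(spaceOut([0, 2, 5], 300)); // [ 0, 300, 600 ]
```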

I mainly implemented this to support rate limiting issues like calling Telegram or similar services too fast but it can help mitigate this multithreaded exception issue too. Or you can upgrade to the latest snapshots and get the regression fix which should prevent this too now.

I am running the latest stable, 4.2.2. I will check out 4.3 M3.

Does this rule template work with floating point numbers or just integers? It looks like I have some sensors triggering that are changing, but only right of the decimal point.

It works with floating point as well as integers and numbers with compatible units.

Internally it treats everything as a String. Before trying to compare the state to the threshold or applying hysteresis, it first tries to create a Quantity out of the operands (i.e. a number with a unit). If that fails it tries to parse them into a plain number. If that fails it keeps them as Strings.
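A rough sketch of that fallback order in plain JavaScript. It is only an approximation: the real rule uses openHAB's Quantity, which isn't available outside openHAB, so here a simple "number, space, unit" pattern stands in for it. The interpret helper is a made-up name.

```javascript
// Pattern for a decimal number, optionally in scientific notation.
const NUMBER = '[-+]?\\d*\\.?\\d+(?:[eE][-+]?\\d+)?';
// Stand-in for Quantity parsing: a number, whitespace, then a unit token.
const QUANTITY_RE = new RegExp('^(' + NUMBER + ')\\s+(\\S+)$');
const NUMBER_RE = new RegExp('^' + NUMBER + '$');

// Try quantity first, then plain number, then leave it as a string.
function interpret(state) {
  const text = String(state).trim();
  const q = QUANTITY_RE.exec(text);
  if (q !== null) {
    return { kind: 'quantity', value: parseFloat(q[1]), unit: q[2] };
  }
  if (NUMBER_RE.test(text)) {
    return { kind: 'number', value: parseFloat(text) };
  }
  return { kind: 'string', value: text };
}

console.log(interpret('69.944 °F').kind); // quantity
console.log(interpret('6.14e1').value);   // 61.4
console.log(interpret('UNDEF').kind);     // string
```

Either way, a change that is only to the right of the decimal point (69.944 versus 70.358, say) still comes out as two different values.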

In both the Quantity and number cases, I think any notation supported by JS’s parseFloat can be used including scientific notation.

If you manually run the rule (click the play icon), the rule will run through its configuration and all the Items and generate error messages and warnings if there’s a problem. One of those checks is that if the threshold has a unit the Items have units too, and vice versa. These are printed to the log.

If there is any change at all, no matter how small, the rule will pick it up, but that raises a different option.

If you did have an item that updates frequently but doesn’t change that often, you can change the trigger of the threshold alert rule to a received update trigger instead of a changed trigger.

I tried to make that an option but there’s a limitation with rule templates. I can’t configure the type of trigger from a property.