-
Platform information:
- Hardware: Raspberry Pi 3B
- OS: Openhabian
- Java Runtime Environment: Oracle build 1.8.0_191-b12
- openHAB version: 2.4
-
Issue of the topic: please be detailed explaining your issue
Recently I have had problems with cron triggered rules that stop working after a while (a few days after restart). When this last happened I opened and ran the threads
command, and noticed this:
"openHAB-job-scheduler_Worker-1" Id=87 in TIMED_WAITING
at java.lang.Thread.sleep(Native Method)
at org.apache.commons.exec.DefaultExecuteResultHandler.waitFor(DefaultExecuteResultHandler.java:119)
at org.eclipse.smarthome.io.net.exec.ExecUtil.executeCommandLineAndWaitResponse(ExecUtil.java:125)
at org.eclipse.smarthome.model.script.actions.Exec.executeCommandLine(Exec.java:63)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.invokeOperation(XbaseInterpreter.java:1086)
at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.invokeOperation(XbaseInterpreter.java:1061)
at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._invokeFeature(XbaseInterpreter.java:1047)
at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.invokeFeature(XbaseInterpreter.java:992)
at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.invokeFeature(ScriptInterpreter.java:151)
at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:902)
at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:865)
at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.doEvaluate(XbaseInterpreter.java:224)
at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.doEvaluate(ScriptInterpreter.java:226)
at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.internalEvaluate(XbaseInterpreter.java:204)
at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:827)
at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.doEvaluate(XbaseInterpreter.java:264)
at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.doEvaluate(ScriptInterpreter.java:226)
at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.internalEvaluate(XbaseInterpreter.java:204)
at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:447)
at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.doEvaluate(XbaseInterpreter.java:228)
at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.doEvaluate(ScriptInterpreter.java:226)
at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.internalEvaluate(XbaseInterpreter.java:204)
at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.evaluate(XbaseInterpreter.java:190)
at org.eclipse.smarthome.model.script.runtime.internal.engine.ScriptImpl.execute(ScriptImpl.java:82)
at org.eclipse.smarthome.model.rule.runtime.internal.engine.ExecuteRuleJob.execute(ExecuteRuleJob.java:67)
at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
- locked java.lang.Object@173adaa
"openHAB-job-scheduler_Worker-2" Id=88 in TIMED_WAITING on lock=java.lang.Object@1b7ad49
at java.lang.Object.wait(Native Method)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:568)
One of the execution threads is blocked, and when the other one becomes as well, cron rules stop running.
Since this points to the executeCommandLine being the culprit I enabled debug logging for org.eclipse.smarthome.io.net.exec
And added logging statements to my rules that use it.
Today I noticed that one thread was locked again and checked the logs. This is what it normally looks like:
2019-01-09 16:20:38.031 [INFO ] [marthome.model.script.security.rules] - Checking locks
2019-01-09 16:20:38.042 [DEBUG] [lipse.smarthome.io.net.exec.ExecUtil] - executed commandLine 'verisure@@state'
2019-01-09 16:20:42.826 [DEBUG] [lipse.smarthome.io.net.exec.ExecUtil] - exit code '0', result '{"2AQU S2N3":"LOCKED","2AYG XH5E":"LOCKED"}'
2019-01-09 16:20:42.902 [INFO ] [marthome.model.script.security.rules] - Checking locks - Done
And this is when it failed:
2019-01-09 16:25:38.028 [INFO ] [marthome.model.script.security.rules] - Checking locks
2019-01-09 16:25:38.037 [DEBUG] [lipse.smarthome.io.net.exec.ExecUtil] - executed commandLine 'verisure@@state'
No error, but the method just never returns, and thus blocking the thread indefinetely.
This is the rule that faulted:
rule "Doorlocks periodic check"
when
Time cron "38 0/5 * * * ?" //or
//Item Test received command
then
logInfo("security.rules", "Checking locks")
val states = executeCommandLine("verisure@@state", 5000)
if (states === null || states == "") return;
lockLabels.forEach[label |
val rawState = transform('JSONPATH', "$.['" + label + "']", states)
val item = ScriptServiceUtil.getItemRegistry.getItem(transform("MAP", "verisure.map", label))
val state = switch (rawState) {
case 'LOCKED': ON
case 'UNLOCKED': OFF
}
item.postUpdate(state)
]
logInfo("security.rules", "Checking locks - Done")
end
I have a timeout at 5000ms, but this script never takes that long. I have another that sometimes takes too long, but then I still get a return value:
2019-01-09 16:28:19.096 [DEBUG] [lipse.smarthome.io.net.exec.ExecUtil] - exit code '143', result ''
Meaning that openhab kills the process if it takes too long, so that shouldn’t be the cause.
Any idea how I can continue investigating? I know that other users have had problems with cron rules suddenly stops triggering, perhaps this can be a possible cause?