Design Pattern: Gate Keeper

The flag should indicate if the timer is running. I had other issues with the null check, which is why I changed to using the flag. There’s no other thing close that would make another timer to start, but even if that was the case, it would restart a different timer, so when it ends it would turn off the flag.

In general, this technique works very reliably. That may have been an indication of other problems. But it’s free advice, and worth what you pay for it :wink:

EDIT - there is potential for finger trouble when using a generic name like timer with a global variable. Confident you have no other rules using timer?

Add logging statements to the beginning of the timer, in each clause and at the very end so you can see for sure what code is executing.

I can’t imagine how any issues that you would have had with the null check would not be exacerbated by moving to use an Item. Assuming the problem was on of timing, it take a few milliseconds to set a variable to null. It takes up to a few hundred milliseconds for an Item to change state in response to a command. There is a lot of space in 100 msecs to end up in a situation where the Item has received the command ON but is not yet reflecting that state. This is why one usually sees a check for null instead.

Basically, with the null check method (start the timer when it is null) caused a race that two instances both saw it null and it started two timers. With the item, it’ll only start one because the flag changes from OFF to ON only once.

I decided to look at the openhab.log (not sure why I didn’t think about it before). It got a error on the restart timer that it was null, so that explains why the flag didn’t get turned off, so now it’s a different question. How could that be null? Is it actually possible for the timer to start, and get the the restart, before the variable actually gets set? Bizarre. How do timers fit in with the 5 threads for rules (more of a side/curiosity question). I had wondered if running the timer “now” would be a problem, but since the flag seemed to be working I left it that way.

You mentioned editing. “Orphaned” timers from a rules file edit are a known issue - the handle variable (your timer here) gets reinitialized and so no longer connects to the timer. Any existing timer job however continues to exist.

But …

That jogs a memory. I’m pretty sure that in past we have seen, perhaps on a suitably potent platform, a “now” timer beat the creation of its own reference.

fred = createTimer(now, [ |
     if (fred blah) // blowing up as fred not populated yet

Bear in mind fred is not “the” timer, just a handle or reference to it. The timer has to be fully created (and so ticking, ready to run) before the reference is returned to the variable.

Making it now + milliseconds is the obvious recourse.

Probably connected with …

They don’t.
The scheduler runs timer execution blocks from a pool of timer threads, quite separate from regular rules.
Hence there is mechanism for a timer code block to run ahead of the rule that is still creating it.

I would call this a bug myself. The timer shouldn’t start until the variable is set. I’ll change all my now to a later time. What would you call the safe time to use? 10ms, 100ms?
Thanks.

It’s a limitation. Were I the author, I would argue it’s dimwitted usage to schedule a timer for ‘now’, since it will always run late.
Doesn’t matter as no-one is going to fix it, uh, now.
You have to be in stressful circumstances to even come across this issue.

I’d split the difference 50mS, it’s all guesswork. How desperately do you need to run the code “immediately”? That’s your decision, setting the upper limit.

This is going to Alexa, so even 1 second wouldn’t be an issue. I just don’t want to error to happen again. Thanks.

It’s a bit hard to see how your “now” ish timer could get through sending two commands to arrive it its own reference, before it is set, though. Might be worth looking closer at the ‘null’ report details.

Well, here is the error:

2020-05-21 11:36:34.885 [ERROR] [org.quartz.core.JobRunShell         ] - Job DEFAULT.Timer 3545 2020-05-21T11:36:34.881-05:00: Proxy for org.eclipse.xtext.xbase.lib.Procedures$Procedure0: [ | {
  org.eclipse.xtext.xbase.impl.XIfExpressionImpl@714b43ee (conditionalExpression: false)
} ] threw an unhandled Exception:
java.lang.NullPointerException: cannot invoke method public abstract boolean org.eclipse.smarthome.model.script.actions.Timer.reschedule(org.joda.time.base.AbstractInstant) on null
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.invokeOperation(XbaseInterpreter.java:1160) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.invokeOperation(XbaseInterpreter.java:1150) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._invokeFeature(XbaseInterpreter.java:1136) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.invokeFeature(XbaseInterpreter.java:1081) ~[?:?]
    at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.invokeFeature(ScriptInterpreter.java:151) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:861) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.doEvaluate(XbaseInterpreter.java:231) ~[?:?]
    at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.doEvaluate(ScriptInterpreter.java:226) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.internalEvaluate(XbaseInterpreter.java:215) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:458) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.doEvaluate(XbaseInterpreter.java:239) ~[?:?]
    at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.doEvaluate(ScriptInterpreter.java:226) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.internalEvaluate(XbaseInterpreter.java:215) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:471) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.doEvaluate(XbaseInterpreter.java:255) ~[?:?]
    at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.doEvaluate(ScriptInterpreter.java:226) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.internalEvaluate(XbaseInterpreter.java:215) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:458) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.doEvaluate(XbaseInterpreter.java:239) ~[?:?]
    at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.doEvaluate(ScriptInterpreter.java:226) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.internalEvaluate(XbaseInterpreter.java:215) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.evaluate(XbaseInterpreter.java:201) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.ClosureInvocationHandler.doInvoke(ClosureInvocationHandler.java:46) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.AbstractClosureInvocationHandler.invoke(AbstractClosureInvocationHandler.java:29) ~[?:?]
    at com.sun.proxy.$Proxy276.apply(Unknown Source) ~[?:?]
    at org.eclipse.smarthome.model.script.internal.actions.TimerExecutionJob.execute(TimerExecutionJob.java:48) ~[?:?]
    at org.quartz.core.JobRunShell.run(JobRunShell.java:202) [bundleFile:?]
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573) [bundleFile:?]
2020-05-21 11:36:34.892 [ERROR] [org.quartz.core.ErrorLogger         ] - Job (DEFAULT.Timer 3545 2020-05-21T11:36:34.881-05:00: Proxy for org.eclipse.xtext.xbase.lib.Procedures$Procedure0: [ | {
  org.eclipse.xtext.xbase.impl.XIfExpressionImpl@714b43ee (conditionalExpression: false)
} ] threw an exception.
org.quartz.SchedulerException: Job threw an unhandled exception.
    at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [bundleFile:?]
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573) [bundleFile:?]
Caused by: java.lang.NullPointerException: cannot invoke method public abstract boolean org.eclipse.smarthome.model.script.actions.Timer.reschedule(org.joda.time.base.AbstractInstant) on null
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.invokeOperation(XbaseInterpreter.java:1160) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.invokeOperation(XbaseInterpreter.java:1150) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._invokeFeature(XbaseInterpreter.java:1136) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.invokeFeature(XbaseInterpreter.java:1081) ~[?:?]
    at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.invokeFeature(ScriptInterpreter.java:151) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:861) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.doEvaluate(XbaseInterpreter.java:231) ~[?:?]
    at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.doEvaluate(ScriptInterpreter.java:226) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.internalEvaluate(XbaseInterpreter.java:215) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:458) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.doEvaluate(XbaseInterpreter.java:239) ~[?:?]
    at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.doEvaluate(ScriptInterpreter.java:226) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.internalEvaluate(XbaseInterpreter.java:215) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:471) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.doEvaluate(XbaseInterpreter.java:255) ~[?:?]
    at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.doEvaluate(ScriptInterpreter.java:226) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.internalEvaluate(XbaseInterpreter.java:215) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:458) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.doEvaluate(XbaseInterpreter.java:239) ~[?:?]
    at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.doEvaluate(ScriptInterpreter.java:226) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.internalEvaluate(XbaseInterpreter.java:215) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.evaluate(XbaseInterpreter.java:201) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.ClosureInvocationHandler.doInvoke(ClosureInvocationHandler.java:46) ~[?:?]
    at org.eclipse.xtext.xbase.interpreter.impl.AbstractClosureInvocationHandler.invoke(AbstractClosureInvocationHandler.java:29) ~[?:?]
    at com.sun.proxy.$Proxy276.apply(Unknown Source) ~[?:?]
    at org.eclipse.smarthome.model.script.internal.actions.TimerExecutionJob.execute(TimerExecutionJob.java:48) ~[?:?]
    at org.quartz.core.JobRunShell.run(JobRunShell.java:202) ~[?:?]
    ... 1 more

Frankly, it’s a misuse of timers to create a Timer to run now with absolutely no delay. And whether or not you call it a bug, there’s no way to prevent it. createTimer creates a Timer Object and passes it to the Quartz timer library. In a separate thread the Quartz timer creates a schedule for when to execute the passed in lambda. The execution of the lambda occurs in a separate thread. Thus, Quartz starts executing the lambda even before the call to createTimer returns.

Because we don’t really have access to creating our own threads in Rules DSL, and we shouldn’t be creating our own threads in Rules anyway, we are misusing Timers to create a separate tread in this case.

:laughing:

I’ve mentioned it before and I’ll mention it again. A whole lot of time is being spent to work around limitations of the Rules DSL here. Personally, I think this time would be better spent working with Scripted Automation where:

  • the timing problems go away because only one instance of a Rule can run at the same time eliminating the race condtion
  • this DP is already written and tested and can be used as a simple library call
  • as time passes, Rules DSL is going to slowly become less and less popular as more users come on using NGRE with Python or JavaScript or JSONDB stored Rules first

A couple of things stand out.

  • the error is indeed coming from the Timer’s lambda, I can see that with the Procedures$Procedure0 in the first line of the error
  • it’s definitely coming from the else clause which I can see by the XIfExpressionImpl...(conditionalExpression: false) (I only mention these two parts for general interest in learning to interpret exceptions from Rules)
  • null errors are often misleading; I don’t think that’s the case here but in general, a null error often indicates a type error and the variable isn’t actually null, the engine just couldn’t coerce the Object to a type that makes sense for the operation. The key takeaway from this is that it’s possible that the problem is with the argument passed to reschedule().

I agree with rossko57 that it seems unlikely that you would get through that many lines of code before the call to createTimer returned. But so far that seems the be the most likely explanation despite the bullet points above. You could confirm this by adding a log statement immediately before and immediately after the call to createTimer. If you see the logs from inside the Timer’s lambda before the statement after the call to createTimer you will know with some degree of certainty that the lambda is successfully executing before the call to createTimer returns.

Ultimately the solution will be the same, schedule the timer to at least a little bit into the future to ensure that the call to createTimer returns before the lambda runs.

It’s in your example at the start of this. :slight_smile:

I don’t understand what you mean? Timers are necessary.

Yes, I looked a little more after the last time. A couple things make me hesitant to do this now. While you say this has been around a while, well tested, etc., it seems the documentation is lacking and extra hoops are needed to use it. I think there is more documentation available, but it should be added to the main documentation (so it can be found easily). The other thing is learning a new language. I was disappointed to see that Python 2.7 and 3.x are not compatible, and Jython is still at 2.7. If I convert the rules I want to do it once, not have to convert from 2.7 to 3 later. I am starting to play with Python 3 for other things.

I added these logs, so can see if this can happen again.

Yes, and it’s a misuse of Timers to do it this way. It works just fine on RPis and slower machines. For some reason it’s not working on your machine.

Timer are absolutely necessary to schedule some code to execute in the future. We aren’t doing that here. We are using a timer to execute some code now in another thread. That’s a misuse of Timers. Maybe “using Timers in a way they were never intended to be used” would be a better way to put it.

As opposed to the extra hoops you are going through getting these Rules DSL Rules working?

https://openhab-scripters.github.io/openhab-helper-libraries/index.html

It will be for OH 3.

Unless you are using external Python libraries or using some very esoteric Python 3 or Python 2 language features, your Python 2.7 Rules will work just fine in Python 3. Just be sure to be consistent in your use of white space verses tabs for your indents. That’s the only place I’ve encountered any differences between the two for openHAB Rules (Python 3 is much more strict in making sure they are consistent).

Honestly, as I see it, you are spending a whole bunch of time trying to work around Rules DSL when you could spend that time actually writing Rules instead of debugging and working around limitations of Rules DSL.

And also, it’s not all or nothing. You can install Jython and the helper libraries just to get the Gatekeeper Python library code and write the rest of your Rules in Rules DSL.

image

That’s the kind of hoops I meant. It’s a lot more work than getting a side version of a binding for example. I’m one that uses stable releases.
Thanks for the help.

My point is it’s a whole lot less work than trying to get these Rules DSL Rules to work.

I’m not surprised your Celeron box, modest as it seems, exposes timing issues never seen by Pi users. Not for the first nor last time.
I developed a lot of (OH1) rules on a creaking Pentium laptop. When transplanted to a dual core desktop, a LOT of race conditions emerged that I had not thought about.

Is there an example using this with jython (helper libraries)? I’m slowly migrating my rules dsl to jython but have this pattern for some rules.

Regards S

There’s a link in the original post to a fully implemented and documented Jython version you can download and use unmodified.

1 Like

Here is the documentation and code.

Would it be possible to give an update on this design using openHAB 3 ?

Thanks.