Multiply Triggered Rule

openHAB SNAPSHOT 2.4.0~20181022133806-1 (Build #1400)

I have a Group, gReachable, comprising 7 Items. These are Items that receive an online/offline MQTT LWT from an ESP8266 Tasmota-based switch. One or more devices can rapidly go online/offline, causing the rule to trigger multiple times concurrently… particularly when a device is on the “edge” of connectivity due to Wi-Fi signal strength variations. When this happens, a “java.lang.IllegalStateException” gets thrown. I have not included the rule logic, which will be the specific cause of that exception. Rather, I wanted to ask the community for advice on options to handle the race condition properly. I can provide more of the rule logic if that is required for a proper recommendation. I just wanted to “simplify” this post initially.

I can’t find @rlkoshak’s post on avoiding the use of re-entrant locks now. But I read it and felt my situation “qualified” as a proper use case for a lock. I am now definitely taking his advice to heart. The lock ended up locking up OH :(

var java.util.concurrent.locks.ReentrantLock accessPointRuleLock  = new java.util.concurrent.locks.ReentrantLock()

rule "Access Point Alerts"
    when
        System started or
        Time cron "0 */5 * ? * *" or
        Member of gReachable changed
    then
        accessPointRuleLock.lock()

        try
        {
            // Rule Logic
        }
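        catch(Exception e)
        {
            // No unlock() here: the finally block below always runs and releases the lock.
            // Unlocking in both places makes the second unlock() throw IllegalMonitorStateException.
            logError("my_log", "Failure in Access Point Alerts: {}", e.toString)
        }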
        finally
        {
            accessPointRuleLock.unlock()
        }
end

Thanks.

Mike

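If more than five of these events arrived at the same time, I’m not surprised OH locked up: by default only five Rules can run at once, and every one of them would be sitting there waiting on the lock.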

Unfortunately, I think the answer may reside in the “Rule Logic”.

The first thing is to doubly make sure that you really do need the lock. There is more than one way to deal with the IllegalStateException when events are coming in on a Group while the Rule is actively running.

Without the code it is hard to provide concrete advice but in general:

  • Why does the Rule take long enough to run that this is routinely a problem? Is there a way you can farm some of this long-running code out to a Timer?

  • Are you iterating over the members of gReachable? Most likely the answer is “yes” and that is what is generating the error. Can you rework your approach to avoid that and just work with triggeringItem?

  • Create a Timer and don’t process the members of gReachable or whatever you are doing to generate the exception until the Rule has been quiet for a second. This may or may not help depending on how regularly the events come in.

  • Get rid of the cron trigger for now. That seems like it will just increase the likelihood of collisions.

  • If you do need to lock, only lock the bare minimum lines of code, not the whole Rule. Make sure that minimum set of code cannot generate any Exceptions, particularly type exceptions.

  • Use something like Design Pattern: Gate Keeper to queue up the events and let a Timer work them off in a separate thread. This would work best if you only need information from triggeringItem.
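
To make that last idea a bit more concrete, here is a rough sketch of the queue-and-single-worker approach in plain Java. It is not a drop-in Rule and not the Design Pattern's actual code: the class name GateKeeperSketch, the method onItemChanged and the Item names are all made up for illustration, and a single-thread executor stands in for the queue plus Timer. The point is only that the triggering side does nothing but enqueue, and the events get worked off one at a time, so the processing never overlaps.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: events are queued and handled strictly one at a time.
class GateKeeperSketch {
    // A single-thread executor is a FIFO queue with exactly one worker thread.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final AtomicInteger active = new AtomicInteger();    // handlers running right now
    private final AtomicInteger maxActive = new AtomicInteger(); // highest value 'active' reached
    private final AtomicInteger handled = new AtomicInteger();   // events fully processed

    // What the rule body would do: enqueue the event and return immediately.
    void onItemChanged(String itemName, String newState) {
        worker.execute(() -> handle(itemName, newState));
    }

    // Runs on the worker thread only, so two events are never processed at once.
    private void handle(String itemName, String newState) {
        int running = active.incrementAndGet();
        maxActive.accumulateAndGet(running, Math::max);
        // ... the alert logic for this one Item would go here ...
        handled.incrementAndGet();
        active.decrementAndGet();
    }

    // Demo: eight threads fire an event at the same moment.
    // Returns {events handled, maximum number of handlers that ran concurrently}.
    static int[] demo() {
        GateKeeperSketch gate = new GateKeeperSketch();
        Thread[] senders = new Thread[8];
        for (int i = 0; i < senders.length; i++) {
            final String name = "Switch" + i + "_Reachable"; // made-up Item name
            senders[i] = new Thread(() -> gate.onItemChanged(name, "OFF"));
            senders[i].start();
        }
        try {
            for (Thread t : senders) {
                t.join();
            }
            gate.worker.shutdown(); // already queued events are still processed
            gate.worker.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { gate.handled.get(), gate.maxActive.get() };
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println("handled=" + result[0] + ", maxConcurrent=" + result[1]);
    }
}
```

Because nothing ever waits on a lock here, a burst of events cannot tie up the threads that run the Rules; the cost is that events are handled slightly later than they arrive.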

Rich,

Thanks for the list of areas to consider. It definitely let me look at the means to achieve the desired outcome (i.e., an alert monitor) differently… which is what I needed.

I got rid of the cron but that did not resolve the exception… so one possibility easily eliminated. Unfortunately not a positive outcome.

I do iterate through the Group members… and coding this differently will be a bit more complex… but using triggeringItem will definitely make execution simpler. And, after all, since this ought to be write once, execute many, many more… it could be worth doing. However, I think I’ll “debounce” the Group state with a timer. Then looping through the members should reduce the likelihood of a race condition sufficiently.
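
For anyone following along, this is roughly what I mean by “debounce”, sketched in plain Java rather than as the actual rule. The class DebounceSketch, its trigger method and the 200 ms quiet period are just placeholders I picked for the sketch. In the rule itself I’d expect to do the same thing with createTimer and reschedule on the Timer. Every event pushes the pending run further out, so the loop over the Group members only happens once things have been quiet for a while.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: run an action once, only after events have stopped for quietMillis.
class DebounceSketch {
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final Runnable action;
    private final long quietMillis;
    private ScheduledFuture<?> pending; // guarded by 'this'

    DebounceSketch(Runnable action, long quietMillis) {
        this.action = action;
        this.quietMillis = quietMillis;
    }

    // Call this on every event (the equivalent of "Member of gReachable changed").
    synchronized void trigger() {
        if (pending != null) {
            pending.cancel(false); // drop the earlier pending run if it has not started yet
        }
        pending = scheduler.schedule(action, quietMillis, TimeUnit.MILLISECONDS);
    }

    void shutdown() {
        scheduler.shutdown();
    }

    // Demo: five events 20 ms apart, then silence. Returns how often the action ran.
    static int demo() {
        AtomicInteger runs = new AtomicInteger();
        DebounceSketch debouncer = new DebounceSketch(runs::incrementAndGet, 200);
        try {
            for (int i = 0; i < 5; i++) {
                debouncer.trigger();
                Thread.sleep(20);
            }
            Thread.sleep(600); // comfortably longer than the quiet period
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        debouncer.shutdown();
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("action ran " + demo() + " time(s)");
    }
}
```

The burst of five events collapses into a single run of the action, which is the behaviour I’m after for a device flapping at the edge of Wi-Fi range.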

Mike