[SOLVED] How do I debounce or antiflap notifications for many devices in a group?

Hi Everyone,

I have a number of MQTT and network devices being monitored. On occasion they disconnect and reconnect, triggering a notification each time. I’d like to delay the notification for a minute so that, if the device comes back within that time, I can cancel it and filter out the unnecessary notifications.

Looking through the design patterns, I can use timers or the Expire binding to achieve this, but it means I have to create a matching timer or switch with expiry attached for every MQTT/network device. Can anyone suggest a way where I can simply add a new device to a group to achieve the same outcome?

Below is what I’m using right now.

rule "Network Device Notification"
when
    Member of gNetworkAlerts changed
then
	val networkDevice = triggeringItem

	if(networkDevice.state == ON) {
		logWarn("networkdevice", networkDevice.label + " has just come online")
		if (swAlerts_Network.state == ON) {
			sendPushoverMessage(pushoverBuilder(networkDevice.label + " has just come online").withDevice("MyPhone").withPriority(0));
		}
	} else {
		logWarn("networkdevice", networkDevice.label + " has just gone offline")
		if (swAlerts_Network.state == ON) {
			sendPushoverMessage(pushoverBuilder(networkDevice.label + " has just gone offline").withDevice("MyPhone").withPriority(0));
		}
	}
end

Thanks in advance!

There are several ways you could accomplish this. You could create a list indexed by the triggering item and manage timing and timers that way. My experience with the rules DSL is that there may be problems with thread safety and lists. I eventually moved to Jython to solve those problems.

This is called “antiflapping” or “debounce.” You can see an example in Generic Presence Detection and in [Deprecated] Design Pattern: Motion Sensor Timer. [Deprecated] Design Patterns: Generic Is Alive is actually exactly what you are looking for.

To answer your question, yes, you can have one Rule that processes all of these events. But there is no way to get around the fact that you will need a separate Timer for each Item if you want to alert on each Item individually. Since you will potentially want to cancel that Timer later, you need to keep a handle on it. That means either making the Timer an Item, using Design Pattern: Expire Binding Based Timers and Design Pattern: Associated Items, or creating Timers in the Rule and storing them in a HashMap (ConcurrentHashMap if you are worried about thread safety).

The advantage of using Expire binding is the Rule itself becomes very simple. You also have the ability to reset the Timer pretty easily on an OH restart. Personally I will always accept more Items in exchange for a simpler Rule.

import org.eclipse.smarthome.model.script.ScriptServiceUtil

rule "Network Device went offline"
when
    Member of gNetworkAlerts changed
then
    val timer = ScriptServiceUtil.getItemRegistry.getItem(triggeringItem.name+"_OfflineTimer") as SwitchItem

    // Device is now ON and we've already sent an offline alert
    if(triggeringItem.state == ON && timer.state == OFF) {
        // send "back online" alert
    }
    // We haven't alerted yet: start the timer Item (ON) if the device went offline,
    // or cancel it (update to OFF, which does not trigger the alert rule) if it came back
    else {
        timer.postUpdate(if(triggeringItem.state == ON) OFF else ON)
    }
end

rule "Network device has been offline for too long"
when
    Member of gNetworkAlertsTimers received command OFF
then
    if(swAlerts_Network.state == ON){
        // send offline alert
    }
end

The code above is actually a bit simpler than your existing code yet it does more.
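
A minimal sketch of the Items those two rules assume. The device names (`Printer`, `NAS`) and the Network binding channel UIDs are made up for illustration; adjust them to your own setup. Each monitored Item gets a companion Switch named `<ItemName>_OfflineTimer`, following the Associated Items naming, and the Expire binding commands it OFF one minute after it was last set to ON, which is what triggers the second rule.

```
Group gNetworkAlerts
Group gNetworkAlertsTimers

// Monitored devices; the channel UIDs are placeholders
Switch Printer_Online "Printer" (gNetworkAlerts) { channel="network:pingdevice:printer:online" }
Switch NAS_Online "NAS" (gNetworkAlerts) { channel="network:pingdevice:nas:online" }

// Companion timer Items, one per device
Switch Printer_Online_OfflineTimer (gNetworkAlertsTimers) { expire="1m,command=OFF" }
Switch NAS_Online_OfflineTimer (gNetworkAlertsTimers) { expire="1m,command=OFF" }
```

With this layout, adding a device means adding two lines to the .items file; the rules themselves stay untouched.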

If you want to use Timers then the code would look like:

import java.util.Map
import java.util.concurrent.ConcurrentHashMap

val Map<String, Timer> timers = new ConcurrentHashMap()

rule "Network device status changed"
when
    Member of gNetworkAlerts changed
then
    val timer = timers.get(triggeringItem.name)

    // only send an online alert if we've already sent an offline alert
    if(triggeringItem.state == ON && timer === null) {
        // send back online alert
    }
    // cancel the timer if it came back online before we sent an offline alert
    else if(triggeringItem.state == ON && timer !== null) {
        timer.cancel
        // ConcurrentHashMap does not accept null values, so remove the entry
        timers.remove(triggeringItem.name)
    }
    // it went offline, set a timer to send an alert in a minute
    else {
        timers.put(triggeringItem.name, createTimer(now.plusSeconds(60), [ |
            // send offline alert
            timers.remove(triggeringItem.name)
        ]))
    }
end

There are thread-safe lists that can be used. The main thread safety problems in Rules DSL revolve around global lambdas and doing operations on members of Groups when the members are actively changing their states. When creating your own lists or maps, you have the choice of thread-safe versions. For something like this I’d be very surprised if one saw a concurrency problem more than once a year, even with a non-thread-safe list or map.

Thanks @rlkoshak, the timer one was what I was after as I was looking for a solution where I would not have to manually define a timer or switch with expire for every device. Appreciate you taking the time to code it out for me. Below is what was implemented and is working well.

import java.util.Map
import java.util.concurrent.ConcurrentHashMap

val Map<String, Timer> timers = new ConcurrentHashMap()

rule "Network device status changed"
when
    Member of gNetworkAlerts changed
then
    val timer = timers.get(triggeringItem.name)

    // only send an online alert if we've already sent an offline alert
    if(triggeringItem.state == ON && timer === null) {
        logInfo("Network Monitoring", "Device has come online: " + triggeringItem.label)
        sendPushoverMessage(pushoverBuilder(triggeringItem.label + " has just come online").withDevice("MyPhone").withPriority(0))
    }
    // cancel the timer if it came back online before we sent an offline alert
    else if (triggeringItem.state == ON && timer !== null) {
        logInfo("Network Monitoring", "Device has come back within timeout: " + triggeringItem.label)
        timer.cancel
        // drop the cancelled timer so a later online event is not mistaken for a flap
        timers.remove(triggeringItem.name)
    }
    // it went offline, set a timer to send an alert in a minute
    else {
        logInfo("Network Monitoring", "Device has just gone offline: " + triggeringItem.label)
        timers.put(triggeringItem.name, createTimer(now.plusSeconds(60), [ |
            logInfo("Network Monitoring", "Device has been offline for too long: " + triggeringItem.label)
            sendPushoverMessage(pushoverBuilder(triggeringItem.label + " has just gone offline").withDevice("MyPhone").withPriority(0))
            // the alert has been sent; clear the entry so the next ON sends the online alert
            timers.remove(triggeringItem.name)
        ]))
    }
end