Iterating over a group, want to check an alternate item, sometimes

Here is an example of what I do, now that I’m at a computer and can copy and paste. Note that I detect whether devices are online using MQTT LWT and heartbeat messages, the Network binding, and other heartbeat-style messages. I’m lucky enough to not need the full Generic Is Alive approach.

Items:

// Set to OFF if any device I want to monitor goes offline
Group:Switch:AND(ON, OFF) gSensorStatus "Sensor's Status [MAP(admin.map):%s]"
  <network>

// Holds an Alerted switch I can use to indicate I've already sent an alert about a given device
Group:Switch gOfflineAlerted

// Sonoff
Switch vSonoff_3157_Online "Powercord 3157 [MAP(admin.map):%s]"
    <network> (gSensorStatus, gResetExpire)
    { mqtt="<[mosquitto:tele/sonoff-3157/LWT:state:MAP(sonoff.map)]", expire="24h,state=OFF" }

String vSonoff_3157_Uptime "Powercord 3157 Uptime [%s hours]"
    <clock> (gResetExpire)
    { mqtt="<[mosquitto:tele/sonoff-3157/STATE:state:JSONPATH($.Uptime)]", expire="1h,state=NA" }

Switch vSonoff_3157_Online_Alerted (gOfflineAlerted)

Switch vSonoff_4079_Online "Powercord 4079 [MAP(admin.map):%s]"
    <network> (gSensorStatus, gResetExpire)
    { mqtt="<[mosquitto:tele/sonoff-4079/LWT:state:MAP(sonoff.map)]", expire="24h,state=OFF" }

String vSonoff_4079_Uptime "Powercord 4079 Uptime [%s hours]"
    <clock> (gResetExpire)
    { mqtt="<[mosquitto:tele/sonoff-4079/STATE:state:JSONPATH($.Uptime)]", expire="1h,state=NA" }

Switch vSonoff_4079_Online_Alerted (gOfflineAlerted)

// more Sonoff devices, using Tasmota firmware

// Nest
Switch vNest_Online "Nest Status [MAP(hvac.map):%s]"
    <network> (gSensorStatus)
    { nest="<[thermostats(Entryway).is_online]" }

Switch vNest_Online_Alerted (gOfflineAlerted)


// Network devices and services
Switch vNetwork_Cerberos "Cerberos Network [MAP(admin.map):%s]"
  <network> (gSensorStatus, gResetExpire)
  { channel="network:servicedevice:cerberos:online", expire="2m" }

Switch vNetwork_Cerberos_Alerted (gOfflineAlerted)

Switch vNetwork_Hydra "Hydra Network [MAP(admin.map):%s]"
  <network> (gSensorStatus, gResetExpire)
  { channel="network:servicedevice:hydra:online", expire="2m" }

Switch vNetwork_Hydra_Alerted (gOfflineAlerted)

// lots more Network binding Switches

// Services
Switch vCerberos_SensorReporter_Online "Cerberos sensorReporter [MAP(admin.map):%s]"
    <network> (gSensorStatus, gResetExpire)
    { mqtt="<[mosquitto:status/sensor-reporters:command:OFF:.*cerberos sensorReporter is dead.*],<[mosquitto:status/cerberos/heartbeat/string:command:ON]", expire="11m,command=OFF" }

String vCerberos_SensorReporter_Uptime "Cerberos sensorReporter Uptime [%s]"
    <clock> (gResetExpire)
    { mqtt="<[mosquitto:status/cerberos/heartbeat/string:state:default]", expire="10m,state=NA" }

Switch vCerberos_SensorReporter_Online_Alerted (gOfflineAlerted)

Switch vCerberos_Camera_Online "Cerberos Camera [MAP(admin.map):%s]"
        <network> (gSensorStatus, gResetExpire)
        { channel="network:servicedevice:garagecamera:online", expire="2m" }

Switch vCerberos_Camera_Online_Alerted (gOfflineAlerted)

// Lots more network services (i.e. Network binding configured to ping specific ports)

// RFM69HW
Switch vMasterBedroom_Sensors_Online "Master Bedroom Sensors [MAP(admin.map):%s]"
    <network> (gSensorStatus, gResetExpire)
    { mqtt="<[mosquitto:rfm69/uber_sensors/mbr/uptime/string:command:ON]", expire="5m,command=OFF" }

String vMasterBedroom_Uptime "Master Bedroom Sensors Uptime [%s]"
    <clock> (gResetExpire)
    { mqtt="<[mosquitto:rfm69/uber_sensors/mbr/uptime/string:state:default]", expire="5m,state=NA" }

Switch vMasterBedroom_Sensors_Online_Alerted (gOfflineAlerted)

// More Items that represent my RFM69HW Arduino sensors

// Smoke/CO Alarms
Group:Switch:OR(OFF,ON) gAlarmStatus "The Smoke/CO Alarms are [MAP(admin.map):%s]"
    <network>

Switch vMainFloorSmokeCOAlarm_Heartbeat "Main Floor Smoke/CO Alarm is [MAP(admin.map):%s]"
    <network> (gAlarmStatus, gSensorStatus, gResetExpire)
    { channel="zwave:device:dongle:node5:alarm_general", expire="24h,command=OFF" }

Switch vMainFloorSmokeCOAlarm_Heartbeat_Alerted (gOfflineAlerted)

// More Zwave smoke alarms which generate a heartbeat message on the alarm_general channel every so often

Key things to note:

  • For MQTT I rely upon LWT messages to immediately detect when a device goes offline. I also use a heartbeat message to keep the online Switch ON and the Expire binding to turn it OFF after a certain amount of time with no updates.

  • The gResetExpire Group is used in a System started Rule that postUpdates each member’s restoreOnStartup state back to the Item, which restarts its Expire timer. Otherwise an Item could stay falsely ON if a device went down while OH was down. You will see this Rule below.

  • I’m lucky that every device and service I want to monitor sends a heartbeat or other regular message I can rely on to detect that it is online. If you have to rely on updates to some other Item you will need the full DP I posted above.
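
The MAP(sonoff.map) transform on the LWT Items above is what turns the device's LWT payload into a Switch state. That map file isn't shown in this post, so this is only a sketch of what it could contain, assuming the Tasmota default LWT payloads of Online and Offline and a file saved as transform/sonoff.map:

```
Online=ON
Offline=OFF
```

If your devices publish different LWT payloads, the keys on the left need to match them exactly.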

Rules:

import java.util.concurrent.locks.ReentrantLock
import java.util.Map

val ReentrantLock statusLock = new ReentrantLock
val Map<String, Timer> timers = newHashMap

rule "A sensor changed its online state"
when
        Item gSensorStatus received update
then
    try {
        statusLock.lock
        Thread::sleep(100) // give persistence time to catch up

        // Process any sensors that have updated in the last second. The Alerted flag will keep us from sending out duplicate alerts
        val recentUpdates = gSensorStatus.members.filter[sensor|sensor.lastUpdate("mapdb") != null && sensor.lastUpdate("mapdb").isAfter(now.minusSeconds(1).millis)]

        recentUpdates.forEach[sensor|

                // Get the alerted flag Item
                val alerted = gOfflineAlerted.members.filter[a|a.name==sensor.name+"_Alerted"].head
                if(alerted == null) logError("admin", "Cannot find Item " + sensor.name+"_Alerted")

                // If we haven't alerted this state change
                if(alerted != null && alerted.state == sensor.state && timers.get(sensor.name) == null){
                        val currState = sensor.state
                        // wait a few seconds and check again before sending alert if it is still changed to avoid flapping
                        timers.put(sensor.name, createTimer(now.plusSeconds(15), [|
                                if(sensor.state == currState) {
                                        var name = transform("MAP", "admin.map", sensor.name) // Use the Human Readable Names DP to get a nice name from the Item name
                                        if(name == "") name = sensor.name
                                        aInfo.sendCommand(name + " is now " + transform("MAP", "admin.map", sensor.state.toString) + "!") // Uses Separation of Behaviors DP for sending alert messages
                                        // Set the flag so we know this state change has already been alerted
                                        alerted.postUpdate(if(sensor.state == ON) OFF else ON)
                                }
                                // Clear the entry so a later state change for this sensor can schedule a new Timer
                                timers.remove(sensor.name)
                        ]))
                }
        ]
    }
    catch(Exception e){
        logError("admin", "Error processing an online status change: " + e.toString)
    }
    finally {
        statusLock.unlock
    }
end

// Sends a digest alert at 08:00 with a list of all devices that are offline
rule "Reminder at 08:00 and system start"
when
        Time cron "0 0 8 * * ? *" or
        System started
then
    val message = new StringBuilder

    val offline = gSensorStatus.members.filter[sensor|sensor.state != ON]
    if(offline.size > 0) {
        message.append("The following sensors are offline: ")
        offline.forEach[sensor|
                var name = transform("MAP", "admin.map", sensor.name)
                if(name == "") name = sensor.name
                message.append(name)
                message.append(", ")
                // Mark the sensor as alerted; ?. skips sensors that have no _Alerted Item
                gOfflineAlerted.members.filter[a|a.name==sensor.name+"_Alerted"].head?.postUpdate(ON)
        ]
        message.delete(message.length-2, message.length)
        aInfo.sendCommand(message.toString)
    }

    gSensorStatus.members.filter[sensor|sensor.state == ON].forEach[sensor |
        sensor.sendCommand(ON) // reset the Expire timer
    ]
end

// If your persistence is slow with restoreOnStartup you may need to add a sleep here so all the Items have been populated from the database
rule "Reset Expire binding at startup"
when
  System started
then
  logInfo("Startup", "Restarting Expire binding Timers")
  gResetExpire.members.forEach[ i | i.postUpdate(i.state)]
end

Theory of operation:

The “A sensor changed its online state” rule triggers whenever any sensor receives an update. It occurs to me that if we defined the Group as Group:Number:SUM gSensorStatus we could trigger the rule using Item gSensorStatus changed, and the Rule would fire needlessly far less often.

Anyway, this rule gets a lock so only one instance of the rule is executing at a given time. This keeps the multiple triggers of the rule caused by one update from running concurrently. If you use the Number:SUM approach, I think you can eliminate the lock.

In the rule, we sleep for a little bit to give persistence a chance to save the most recent updates, then get all the Items that have updated in the past second. For each one we get its Alerted flag using the Associated Items DP.
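
The Associated Items lookup only works if every member of gSensorStatus has a matching Item named <sensor name>_Alerted in gOfflineAlerted. A minimal sketch of a rule that checks this at startup is below. The rule name and log messages are made up for illustration, and findFirst is used as an equivalent of the filter[...].head call in the rule above:

```
rule "Check that every sensor has an Alerted flag"
when
    System started
then
    gSensorStatus.members.forEach[ sensor |
        // Build the flag's name from the sensor's name and look it up in the flag Group
        val alerted = gOfflineAlerted.members.findFirst[ a | a.name == sensor.name + "_Alerted" ]
        if(alerted == null) logWarn("admin", sensor.name + " has no _Alerted Item")
        else logInfo("admin", sensor.name + " uses " + alerted.name + " (" + alerted.state.toString + ")")
    ]
end
```

Running something like this once after editing the Items file catches naming typos before they show up as "Cannot find Item" errors in the log.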

If the Alerted Switch still matches the online Item’s state and there isn’t a running Timer, we know this is a state change we haven’t alerted on yet (after an alert, the Alerted flag is set to the opposite of the state we alerted about). So we capture the current state and create a 15 second timer. When the Timer goes off, we check whether the online Item is still in that state and, if so, send an alert using Design Pattern: Separation of Behaviors. Finally we set the Alerted Item so we don’t alert for further updates to this Item that are not a change. I use Design Pattern: Human Readable Names in Messages to convert the Item name to something more friendly for the alert messages.

The timer helps avoid flapping, which I’ve noticed is a bit of a problem with the Network binding. This Rule will generate an alert whenever a device goes OFF and again when it comes back ON, so we get an alert in both directions.

The “Reminder at 08:00 and system start” rule generates a message listing all the devices that are still offline and sends it at 08:00 every day and at system start. If everything is online then no message gets sent.

The “Reset Expire binding at startup” rule triggers at System start and postUpdates each Item’s restoreOnStartup state back to it to kick off the Expire binding timers again.
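
If restoreOnStartup is slow on your system, the startup rule can run before the Items have been restored. A sketch of a variant that waits first and skips anything still uninitialized is below. It is meant to replace the rule above rather than run alongside it, and the 5 second delay is an assumed value you would need to tune:

```
rule "Reset Expire binding at startup"
when
    System started
then
    // Assumed delay; adjust to how long restoreOnStartup takes on your system
    Thread::sleep(5000)
    logInfo("Startup", "Restarting Expire binding Timers")
    // Only re-post real states so NULL or UNDEF is never written back to an Item
    gResetExpire.members.filter[ i | i.state != NULL && i.state != UNDEF ].forEach[ i | i.postUpdate(i.state) ]
end
```

Items that were never restored stay untouched here, so their Expire timers only start once the first real update arrives.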

Sitemap snippet:

                Text item=gSensorStatus {
                        Frame label="Devices"{
                                Text item=vNest_Online
                                Text item=vTopFloorSmokeCOAlarm_Heartbeat
                                Text item=vMainFloorSmokeCOAlarm_Heartbeat
                                Text item=vBasementFloorSmokeCOAlarm_Heartbeat
                                Text item=vSonoff_3157_Online
                                Text item=vSonoff_3157_Uptime
                                Text item=vSonoff_4079_Online
                                Text item=vSonoff_4079_Uptime
                                Text item=vSonoff_0587_Online
                                Text item=vSonoff_0587_Uptime
                        }
                Frame item=vNetwork_Cerberos {
                        Text item=vCerberos_SensorReporter_Online
                        Text item=vCerberos_SensorReporter_Uptime
                        Text item=vCerberos_Camera_Online
//                      Text item=vCerberos_reelyActive_Online
                        // Switch item=aCerberos // power switch connected to cerberos, currently being used for Christmas lights
                }
                Frame item=vNetwork_Hydra {
                        Text item=vHydra_SensorReporter_Online
                        Text item=vHydra_SensorReporter_Uptime
//                      Text item=vHydra_reelyActive_Online
                        // Switch item=aHydra // power switch connected to hydra, currently being used for Christmas lights

                }
                Frame item=vNetwork_Manticore {
                        Text item=vManticore_SensorReporter_Online
                        Text item=vManticore_SensorReporter_Uptime
                }
                Frame item=vMasterBedroom_Sensors_Online {
                        Text item=vMasterBedroom_Sensors_Online
                        Text item=vMasterBedroom_Uptime
                }
                Frame item=vBasement_Sensors_Online {
                        Text item=vBasement_Sensors_Online
                        Text item=vBasement_Uptime
                }
                Frame item=vNetwork_Medusa {
                        Text item=vGogs_Online
                        Text item=vCalibre_Online
                        Text item=vPlex_Online
                }
                Frame item=vNetwork_Argus {
                        Text item=vAlfresco_Online
                        Text item=vMosquitto_Online
                        Text item=vInfluxdb_Online
                        Text item=vGrafana_Online
                }
...