EDIT: Grrrrr, second try. First version was completely lost.
Rule
This is one of my more complicated Rules but it doesn’t really matter if it doesn’t work right. This is a Rule that alerts me when a service or device goes offline.
import org.eclipse.smarthome.model.script.ScriptServiceUtil
import java.util.Map

val Map<String, Timer> timers = newHashMap

rule "A sensor changed its online state"
when
    Member of gSensorStatus changed
then
    if(previousState == NULL) return;

    // val alerted = gOfflineAlerted.members.findFirst[ a | a.name == triggeringItem.name+"_Alerted"] as SwitchItem
    val alerted = ScriptServiceUtil.getItemRegistry.getItem(triggeringItem.name+"_Alerted") as SwitchItem
    if(alerted === null) {
        logError("admin", "Cannot find Item " + triggeringItem.name+"_Alerted")
        aInfo.sendCommand(triggeringItem.name + " doesn't have an alerted flag, it is now " + transform("MAP", "admin.map", triggeringItem.state.toString) + "!")
        return;
    }

    var n = transform("MAP", "admin.map", triggeringItem.name)
    val name = if(n == "") triggeringItem.name else n

    // If we are flapping, reschedule the timer and exit
    if(timers.get(triggeringItem.name) !== null) {
        timers.get(triggeringItem.name).reschedule(now.plusMinutes(1))
        logWarn("admin", name + " is flapping!")
        return;
    }

    // If alerted == OFF and triggeringItem == OFF then sensor went offline and we have not yet alerted
    // If alerted == ON and triggeringItem == ON then the sensor came back online after we alerted that it was offline
    if(alerted.state == triggeringItem.state) {
        val currState = triggeringItem.state
        // wait one minute before alerting to make sure it isn't flapping
        timers.put(triggeringItem.name, createTimer(now.plusMinutes(1), [ |
            // If the current state of the Item still matches the saved state after one minute, send the alert
            if(triggeringItem.state == currState) {
                aInfo.sendCommand(name + " is now " + transform("MAP", "admin.map", triggeringItem.state.toString) + "!")
                alerted.postUpdate(if(currState == ON) OFF else ON)
            }
            timers.put(triggeringItem.name, null)
        ]))
    }
end
Theory of Operation
The Rule triggers whenever a member of gSensorStatus changes state.
First we check to see if the change was from a state we don’t care about, namely NULL or UNDEF (there is a small bug in this original version as it’s missing UNDEF).
Next we use Design Pattern: Associated Items to get an Item that stores a flag indicating whether we have already alerted for this Item or not. Then we use Design Pattern: Human Readable Names in Messages to convert the Item name to a more friendly name for the logs and alert messages.
If a Timer already exists we know the device is flapping so reschedule the alert (I think this is wrong and the timer should just be canceled).
Finally, if we are not flapping and the alerted state matches the Item state, we set a timer to send the alert message. So if an Item goes OFF and we haven’t alerted, we set the timer. In the timer body we send the alert and set the alerted flag Item to ON. Then when the device comes back online the alerted flag will match the Item state and a new alert will be sent indicating the device is back online. (After all the changes I’ve made over the years, I wonder whether this is still necessary.)
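The alerted-flag comparison is easy to get backwards, so here is a minimal sketch of just that decision in plain Python, outside of openHAB and without any timers. The function names `should_alert`, `next_alerted`, and `simulate` are made up for illustration, and states are modeled as the strings "ON" and "OFF".

```python
def should_alert(alerted, state):
    """Return True when a change to `state` deserves an alert.

    alerted == "OFF" and state == "OFF": went offline, not yet alerted.
    alerted == "ON" and state == "ON": came back after an offline alert.
    """
    return alerted == state


def next_alerted(state):
    """Flag value to store once the alert for `state` has been sent."""
    return "ON" if state == "OFF" else "OFF"


def simulate(changes, alerted="OFF"):
    """Walk through a list of state changes and collect the alerts sent."""
    alerts = []
    for state in changes:
        if should_alert(alerted, state):
            alerts.append(state)
            alerted = next_alerted(state)
    return alerts
```

Calling `simulate(["ON", "OFF", "ON"])` skips the first ON because no offline alert has been sent yet, then alerts for the OFF and for the following ON.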
Python
JSR223 provides access to a newish feature on Items that we can leverage with this Rule to simplify the overall configuration: metadata. Item metadata allows one to define key/value pairs on an Item and with JSR223 the metadata can be accessed and modified from Rules. Sorry Rules DSL, this isn’t supported (yet?) and the REST API is not fine grained enough to be useful.
With metadata we can replace both the associated Item and the map file that translates the Item name to a friendly name. One thing to be aware of with metadata is that while you can define it on an Item in the .items file, any dynamic changes made to that namespace will be overwritten on a reload of that .items file. So for this Rule I’m going to use two namespaces, a static one defined in the .items file and a dynamic one created and managed by the Rule.
Here is an example of an Item with statically defined metadata. See the Helper Library docs for the full documentation of metadata.
Switch vGrafana_Online "Grafana Status [MAP(admin.map):%s]"
    <network> (gSensorStatus, gResetExpire)
    { channel="network:servicedevice:argusgrafana:online",
      expire="2m",
      Static="meta"[name="Grafana"] }
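The split between a file-defined namespace and a Rule-managed one can be pictured with a toy model in plain Python. This is only an illustration built on nested dicts, not the Helper Library API: `reload_items_file` and `lookup` are made-up names, and the reload behavior it mimics is an assumption based on the description above (namespaces defined in the .items file get reset, anything else is assumed to be left alone).

```python
import copy

# Toy stand-in for Item metadata: item -> namespace -> key -> value
STATIC_FROM_ITEMS_FILE = {
    "vGrafana_Online": {"Static": {"name": "Grafana"}},
}


def reload_items_file(registry):
    """Mimic a reload: namespaces defined in the file are reset, others kept."""
    for item, namespaces in STATIC_FROM_ITEMS_FILE.items():
        for namespace, values in namespaces.items():
            registry.setdefault(item, {})[namespace] = copy.deepcopy(values)


def lookup(registry, item, namespace, key, default):
    """Return the stored value, or default when any level is missing."""
    return registry.get(item, {}).get(namespace, {}).get(key) or default


registry = {}
reload_items_file(registry)
# A dynamic edit to the file-defined namespace...
registry["vGrafana_Online"]["Static"]["name"] = "changed by a rule"
# ...and a value in a namespace that only the rule manages
registry["vGrafana_Online"]["Alert"] = {"alerted": "ON"}
reload_items_file(registry)
```

After the second reload the "Static" name is back to what the file says while the "Alert" flag is untouched, which is why the dynamic flag gets its own namespace.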
Here is the Rule.
from core.rules import rule
from core.triggers import when
from core.metadata import get_key_value, set_metadata
from personal.util import send_info
from threading import Timer
from core.actions import Transformation
from core.log import log_traceback

# -----------------------------------------------------------------------------
# Python Timers for online alerts
alertTimers = {}

@log_traceback
def alert_timer_expired(itemName, name, origState):
    status_alert.log.debug("Status alert timer expired for {} {} {}".format(name, origState, items[itemName]))
    del alertTimers[itemName]
    if items[itemName] == origState:
        send_info("{} is now {}".format(name, Transformation.transform("MAP", "admin.map", str(items[itemName]))), status_alert.log)
        # Flip the flag like the Rules DSL version does: ON after an offline alert, OFF after a back-online alert
        set_metadata(itemName, "Alert", { "alerted" : "OFF" if str(origState) == "ON" else "ON"}, overwrite=False)
    else:
        status_alert.log.warn("{} is flapping!".format(itemName))

@rule("Device online/offline", description="A device whose online/offline status we track changed state", tags=["admin"])
@when("Member of gSensorStatus changed")
def status_alert(event):
    status_alert.log.info("Status alert for {} changed to {}".format(event.itemName, event.itemState))

    # Ignore changes from NULL or UNDEF
    if isinstance(event.oldItemState, UnDefType):
        return

    alerted = get_key_value(event.itemName, "Alert", "alerted") or "OFF"
    name = get_key_value(event.itemName, "Static", "name") or event.itemName

    # If the Timer exists and the sensor changed, the sensor is flapping; cancel the Timer
    if event.itemName in alertTimers:
        alertTimers[event.itemName].cancel()
        del alertTimers[event.itemName]
        status_alert.log.warning(name + " is flapping!")
        return

    '''
    If alerted == "OFF" and event.itemState == OFF then the sensor went offline and we have not yet alerted
    If alerted == "ON" and event.itemState == ON then the sensor came back online after we alerted that
    it was offline
    '''
    status_alert.log.debug("Looking to see if we need to create a Timer: {} {}".format(alerted, event.itemState))
    if alerted == str(event.itemState):
        # Wait one minute before alerting to make sure it isn't flapping
        alertTimers[event.itemName] = Timer(60, lambda: alert_timer_expired(event.itemName, name, event.itemState))
        alertTimers[event.itemName].start()
The Rule works pretty much the same as the original Rules DSL Rule, with a couple of minor changes to address the issues I identified above (i.e. cancelling the Timer on flapping and checking for UNDEF as well as NULL).
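The cancel-on-flap behavior can be tried outside of openHAB with nothing but the standard library. This is a sketch under assumptions, not the Rule itself: `Debouncer`, `on_change`, and the `delay` parameter are hypothetical names, and a plain callback stands in for send_info.

```python
from threading import Timer, Lock


class Debouncer(object):
    """Fire a callback only if a key's state survives `delay` seconds."""

    def __init__(self, delay, callback):
        self.delay = delay
        self.callback = callback
        self.timers = {}
        self.lock = Lock()

    def on_change(self, key, state):
        with self.lock:
            pending = self.timers.pop(key, None)
            if pending is not None:
                # A second change arrived before the timer fired: flapping
                pending.cancel()
                return "flapping"
            # The lambda runs later, so `timer` is bound by the time it is used
            timer = Timer(self.delay, lambda: self._expired(key, state, timer))
            timer.daemon = True
            self.timers[key] = timer
            timer.start()
            return "scheduled"

    def _expired(self, key, state, timer):
        with self.lock:
            if self.timers.get(key) is not timer:
                return  # cancelled or replaced while waiting for the lock
            del self.timers[key]
        self.callback(key, state)
```

Because any change that arrives while a timer is pending cancels it, the callback does not need to re-check the state when it finally runs. The lock guards the dict since the timer body runs on its own thread.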
Lessons Learned
- Timers in Python are tricky. See the thread below for details.
- Some errors (e.g. calling status_alert.log("Some log statement") in the function called by the timer lambda) will not be reported in the logs; they just fail silently. In fact it would seem that all errors in a Timer lambda are suppressed. I suspect one could put a try/except in the function to log the errors.
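To illustrate the try/except idea without openHAB, here is a minimal sketch using only the standard library. The decorator name `log_errors` and the logger name are made up for this example; the `log_traceback` decorator imported from core.log in the Rule above appears to serve a similar purpose there.

```python
import functools
import logging
from threading import Timer

log = logging.getLogger("timer_demo")


def log_errors(func):
    """Log any exception raised by func instead of letting it vanish."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # log.exception records the message plus the full traceback
            log.exception("Error in timer function %s", func.__name__)
    return wrapper


@log_errors
def timer_body(item_name):
    # Deliberate mistake: a Logger object is not callable
    log("Timer expired for {}".format(item_name))
```

Starting `Timer(1, timer_body, args=("vGrafana_Online",))` now produces an ERROR log entry with the traceback of the failed call.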
Bonus Rule
I have a simple Rule that runs at System started to reset any Design Pattern: Expire Binding Based Timers on an OH restart. This is done through a Group, Persistence, and restoreOnStartup.
Rules DSL Version
rule "Reset Expire binding at startup"
when
    System started
then
    logInfo("Startup", "Restarting Expire binding Timers")
    gResetExpire.members.forEach[ GenericItem i | i.postUpdate(i.state)]
end
At System start, send an update to each member of gResetExpire with the state that was restored. This will cause any running Expire binding Timers to start again.
Python
# -----------------------------------------------------------------------------
@rule("Reset Expire Binding Timers", description="Sends an update to all members of gResetExpire of their restored state", tags=["admin"])
@when("System started")
def reset_expire(event):
    reset_expire.log.info("Restarting Expire binding Timers")
    for timer in ir.getItem("gResetExpire").members:
        events.postUpdate(timer, timer.getState())
Previous post: Journey to JSR223 Python 1 of 9
Next post: Journey to JSR223 Python 3 of?
Edit: Applied corrections and updates based on feedback from CrazyIvan359 and 5iver below.