You asked for it! 
All of my monitoring is based on Threshold Alert and Open Reminder [4.0.0.0;4.9.9.9]. This is a rule template which you can install from the marketplace and then instantiate and configure as rules. The main thing it does is call another rule when an Item meets certain criteria for a certain amount of time. For example: an Item doesn’t receive an update for a given amount of time, it remains above a certain number for a given amount of time, or it remains in a certain state for a given amount of time.
This is a really common thing one needs to do in openHAB. You can do a lot with this one rule:
- turn on a humidifier when the humidity remains below 35% for five minutes
- send an alert if a window remains open for too long
- control a light based on a motion sensor
- send an alert if a sensor doesn’t report for too long a time
- send an alert if a battery is below a certain level
The rule template only does the part that detects the event. It then calls a rule or script you write to respond to the event.
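To make "meets the criteria for a certain amount of time" concrete, here is a rough sketch of the idea in plain JavaScript. It is not the template's code. `createDetector` and everything inside it are names I made up for illustration, and time is passed in as plain milliseconds so it can run outside of openHAB.

```javascript
// Sketch only: NOT the template's code. All names here are hypothetical.
// Tracks when each Item entered the alerting state and reports the Items
// that have stayed there for at least `delayMs`.
function createDetector(isAlertingState, delayMs) {
  const since = new Map(); // Item name -> time (ms) it entered the alerting state

  return {
    // Record a new state for an Item at time `now` (ms).
    update(name, state, now) {
      if (isAlertingState(state)) {
        if (!since.has(name)) since.set(name, now); // keep the earliest time
      } else {
        since.delete(name); // left the alerting state, so start over
      }
    },
    // Names of the Items that have been alerting for at least delayMs at `now`.
    due(now) {
      return [...since].filter(([, t]) => now - t >= delayMs).map(([n]) => n);
    }
  };
}

// Example: humidity below 35 for five minutes
const humidityDetector = createDetector((h) => h < 35, 5 * 60 * 1000);
humidityDetector.update('Humidity', 30, 0);
humidityDetector.update('Humidity', 32, 60 * 1000); // still low, start time is kept
```

Calling `humidityDetector.due(now)` with a time at least five minutes after the first low reading would return the Item name. The real template does this with timers and then calls your rule.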
Batteries
I don’t have the same situation you do. All my battery Items report in a percentage. I’ve configured the Threshold Alert rule as follows:
- Group: AllBatteries
- Threshold: 10 %
- Operator: <
- Alert Delay: PT1S
- Reminder Period: PT12H
- Alert Rule: battery_alert
- Do Not Disturb Start: 22:00
- Do Not Disturb End: 08:00
All other properties are the defaults. With this configuration, if any member of AllBatteries remains < 10% for at least one second, battery_alert will be called. If battery_alert was previously called and the Item becomes >= 10%, battery_alert will be called. If either of those occurs between 22:00 and 08:00, the rule will not be called until 08:00 (who wants to be woken up at night to be told a battery is low?). If the Item remains < 10% for another 12 hours after that first alert, the alert rule will be called again and repeat every 12 hours until the Item becomes >= 10%.
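The only slightly tricky part of the do not disturb window is that it spans midnight. A minimal sketch of that check, assuming the end time is exclusive (I haven't checked how the template treats the exact boundary) and using made-up function names:

```javascript
// Sketch only; the function names are hypothetical, not part of the template.
// Times are "HH:MM" strings converted to minutes since midnight.
function toMinutes(hhmm) {
  const [h, m] = hhmm.split(':').map(Number);
  return h * 60 + m;
}

// True when `now` falls inside [start, end). Handles windows that span midnight.
function inDoNotDisturb(now, start, end) {
  const n = toMinutes(now), s = toMinutes(start), e = toMinutes(end);
  return s <= e ? (n >= s && n < e) : (n >= s || n < e);
}

// When an alert raised at `now` would actually be delivered.
function deliveryTime(now, start, end) {
  return inDoNotDisturb(now, start, end) ? end : now;
}
```

With the 22:00 to 08:00 window above, `deliveryTime('02:15', '22:00', '08:00')` gives `'08:00'` while an alert at `'09:00'` goes out right away.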
My battery_alert rule is as follows:
configuration: {}
triggers: []
conditions: []
actions:
  - inputs: {}
    id: "1"
    configuration:
      type: application/javascript;version=ECMAScript-2021
      script: >
        var {alerting} = require('rlk_personal');
        var logger = log('Low Battery');
        alerting.sendAlert('The following batteries are below 10%: ' +
        threshItemLabels, logger);
    type: script.ScriptAction
or just the JS code:
var {alerting} = require('rlk_personal');
var logger = log('Low Battery');
alerting.sendAlert('The following batteries are below 10%: ' + threshItemLabels, logger);
The rule only gets called when a battery first goes below 10%. threshItemLabels is passed from Threshold Alert and is a comma separated String of all the Items that are in the alerting state (i.e. < 10%).
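If you'd rather have one battery per line in the message, the String can be split back apart. A small sketch, assuming the separator is a comma optionally followed by a space; `formatBatteryAlert` is a hypothetical helper, not something the template provides:

```javascript
// Sketch only: formatBatteryAlert is a made-up helper name.
// Turns the comma separated String into a bulleted, multi-line message.
function formatBatteryAlert(threshItemLabels) {
  const labels = threshItemLabels
    .split(',')
    .map((label) => label.trim())
    .filter((label) => label.length > 0); // ignore empty entries
  if (labels.length === 0) return 'No batteries are below 10%';
  return 'The following batteries are below 10%:\n' +
    labels.map((label) => '- ' + label).join('\n');
}
```

The result could be passed to whatever you use to send the alert in place of the concatenated String.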
Because I get this alert every 12 hours as long as there is at least one battery < 10%, I don’t need another rule to send me a summary daily or anything like that.
But notice, all this comes from installing the template, setting 8 properties and writing three lines of code.
Online Monitoring
I have two flavors of this use case. A device going offline can be detected in a lot of ways: through the status of its Thing, by pinging it with the Network binding, by noticing that it has stopped reporting for a time, etc. So I use several different ways to detect that a device has gone offline.
But I have a standard approach to representing the online status of my devices. I use the semantic model and each Equipment that I want to track has a Status Item. This Item is ON when the device is online and OFF when the device is offline.
I have a widget on MainUI that shows those status Items that are OFF. It’s not on the marketplace but it works very much like Battery Level Status.
Things
I use Thing Status Reporting [4.0.0.0;4.9.9.9] to call a rule that looks for those Things I care about, then finds and updates the relevant status Item for the equipment based on the status of the Thing.
Network Devices
I use the Network binding to ping servers and services that are relevant and link that directly to the status Item.
Sensors
I use Threshold Alert for this too. This is more of an advanced usage and it does require a little bit of extra config. First I pick one Item from the equipment to monitor that reports a new value relatively frequently.
I use the following config for the Threshold Alert rule:
- Group: Sensors
- Threshold: UnDefType
- Operator: ==
- Invert: true
- Alert Delay: PT15M
- Reminder Period: PT12H
- Reschedule: true
- Metadata Namespace: sensorStatus
- Alert Rule: new-sensor-status-proc
- Initial Alert Rule: new-sensor-status-proc
- Do Not Disturb Start: 22:00
- Do Not Disturb End: 08:00
- Gatekeeper Delay: 500
This configuration will call new-sensor-status-proc immediately when any Item changes from NULL or UNDEF to anything else. It’s a little backwards in that we are treating the online state as the alerting state.
As long as the Item updates to something other than UNDEF or NULL the alerting timer gets rescheduled on every update (that’s what reschedule=true does). If the Item doesn’t update for the alert delay amount of time, new-sensor-status-proc gets called.
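The rescheduling behaves like a watchdog timer. Here is a rough sketch with simulated time. The `Watchdog` class is hypothetical and simplified (it just ignores NULL and UNDEF updates), so treat it as an illustration of the idea rather than how the template is written.

```javascript
// Sketch only: simulated time, hypothetical names; not the template's implementation.
class Watchdog {
  constructor(delayMs) {
    this.delayMs = delayMs;
    this.deadline = null; // no timer until the first valid update
  }

  // Called on every update of the monitored Item.
  update(state, now) {
    if (state === 'NULL' || state === 'UNDEF') return; // simplification: ignore these
    this.deadline = now + this.delayMs; // reschedule: push the deadline out
  }

  // True once the deadline has passed without a further update.
  isOffline(now) {
    return this.deadline !== null && now >= this.deadline;
  }
}

const MINUTE = 60 * 1000;
const sensor = new Watchdog(15 * MINUTE);
sensor.update('21.5', 0);           // deadline is now minute 15
sensor.update('21.7', 10 * MINUTE); // deadline moves to minute 25
```

In this example `sensor.isOffline(20 * MINUTE)` is false because the second update pushed the deadline out, and `sensor.isOffline(25 * MINUTE)` is true.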
To avoid hitting the new-sensor-status-proc rule too fast, we add a half second gatekeeper delay which prevents the rule from being called faster than every half second.
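To show what the gatekeeper delay buys you, here is a sketch that works out when queued calls would run if consecutive runs must be at least half a second apart. I'm assuming calls are queued rather than dropped, and `gatekeep` is a name made up for this example.

```javascript
// Sketch only: gatekeep is a hypothetical function, not the template's API.
// Given the times (ms) at which calls are requested, in order, return the
// times at which they would run when runs must be at least gapMs apart.
function gatekeep(requestTimes, gapMs) {
  const runTimes = [];
  let earliest = -Infinity; // earliest time the next call may run
  for (const requested of requestTimes) {
    const run = Math.max(requested, earliest);
    runTimes.push(run);
    earliest = run + gapMs;
  }
  return runTimes;
}

// Three sensors change within 200 ms of each other, a fourth much later.
const runs = gatekeep([0, 100, 200, 2000], 500);
```

Here `runs` is `[0, 500, 1000, 2000]`: the burst gets spread out and the late call is not delayed at all.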
Metadata can be applied to individual Items to override any of the properties above for that Item so we can have a different alert delay on a per Item basis.
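Conceptually the override is just a key by key merge of the Item's metadata over the rule's properties. A sketch of that, where the key names (`alertDelay`, `reminderPeriod`) are stand-ins I picked for illustration and not necessarily the template's real metadata keys:

```javascript
// Sketch only. The property names below are hypothetical stand-ins,
// not the template's real metadata keys.
const ruleDefaults = { alertDelay: 'PT15M', reminderPeriod: 'PT12H' };

// Item metadata wins over the rule's defaults, key by key.
function effectiveConfig(defaults, itemMetadata) {
  return Object.assign({}, defaults, itemMetadata || {});
}

// Convert a simple ISO 8601 duration (hours, minutes, seconds only) to milliseconds.
function durationToMs(iso) {
  const m = /^PT(?=\d)(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?$/.exec(iso);
  if (!m) throw new Error('Unsupported duration: ' + iso);
  const [hours, minutes, seconds] = m.slice(1).map((v) => Number(v || 0));
  return ((hours * 60 + minutes) * 60 + seconds) * 1000;
}

// A slow sensor that only reports hourly gets a longer alert delay.
const slowSensor = effectiveConfig(ruleDefaults, { alertDelay: 'PT2H' });
```

`slowSensor` ends up with a two hour alert delay but keeps the 12 hour reminder period from the rule.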
The alert rule will be called every 12 hours as long as the sensor doesn’t update to a state different from NULL or UNDEF.
And of course, we don’t need alerts while we are sleeping.
The script is called with several values which can be used to tell if this is when the Item first goes into the alerting state, the Item has been in the alerting state for the alert delay amount of time, or the Item has exited the alert state. We use those to determine if the sensor has gone offline or come online. We use the semantic model actions to get to the status Item for the equipment.
configuration: {}
triggers: []
conditions:
  - inputs: {}
    id: "2"
    configuration:
      type: application/javascript
      script: >
        var equipment = actions.Semantics.getEquipment(items[alertItem]);
        var statusItem = (equipment === null) ? null : items[equipment.name+'_Status'];
        if(equipment === null || statusItem === null) {
          console.warn(alertItem + ' does not belong to an equipment or equipment doesn\'t have a Status Item!');
          false;
        }
        else {
          console.debug('Sensor status reporting called for ' + alertItem + ', equipment ' + equipment.label + ', is alerting ' + isAlerting + ', and is initial alert ' + isInitialAlert
            + ', current equipment state is ' + statusItem.state);
          // Sensor is offline                         Sensor back online
          (isAlerting && statusItem.state != 'OFF') || (!isAlerting && statusItem.state != 'ON');
        }
    type: script.ScriptCondition
actions:
  - inputs: {}
    id: "1"
    configuration:
      type: application/javascript
      script: >
        var {alerting} = require('rlk_personal');
        var logger = log('Sensor Alert');
        var equipment = actions.Semantics.getEquipment(items[alertItem]);
        var statusItem = items[equipment.name+'_Status'];
        if(isAlerting && statusItem.state != 'OFF') {
          statusItem.postUpdate('OFF');
          alerting.sendAlert('Offline: ' + equipment.label + ' has stopped reporting', logger);
        }
        else if(!isAlerting && statusItem.state != 'ON') {
          statusItem.postUpdate('ON');
          alerting.sendAlert('Online: ' + equipment.label + ' is back', logger);
        }
        else {
          console.info('Sensor status update alerting ' + isAlerting + ' initial ' + isInitialAlert + ' equipment ' + equipment.label + ' status ' + statusItem.state);
        }
    type: script.ScriptAction
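The condition of this rule checks to see if the equipment has a status Item. If it does, it checks to ensure that the status Item isn’t already in the proper state. Remember that the alerting state here means the equipment is online, and the rule is called with isAlerting set to true only after the Item has gone the full alert delay without an update. So if isAlerting is true the status should be OFF, otherwise it should be ON.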
var equipment = actions.Semantics.getEquipment(items[alertItem]);
// Only look up the status Item once we know the equipment exists
var statusItem = (equipment === null) ? null : items[equipment.name+'_Status'];
if(equipment === null || statusItem === null) {
  console.warn(alertItem + ' does not belong to an equipment or equipment doesn\'t have a Status Item!');
  false;
}
else {
  console.debug('Sensor status reporting called for ' + alertItem + ', equipment ' + equipment.label + ', is alerting ' + isAlerting + ', and is initial alert ' + isInitialAlert
    + ', current equipment state is ' + statusItem.state);
  // Sensor is offline                         Sensor back online
  (isAlerting && statusItem.state != 'OFF') || (!isAlerting && statusItem.state != 'ON');
}
If the condition passes, the action finds the status Item and updates it as required and sends an alert.
var {alerting} = require('rlk_personal');
var logger = log('Sensor Alert');
var equipment = actions.Semantics.getEquipment(items[alertItem]);
var statusItem = items[equipment.name+'_Status'];
if(isAlerting && statusItem.state != 'OFF') {
  statusItem.postUpdate('OFF');
  alerting.sendAlert('Offline: ' + equipment.label + ' has stopped reporting', logger);
}
else if(!isAlerting && statusItem.state != 'ON') {
  statusItem.postUpdate('ON');
  alerting.sendAlert('Online: ' + equipment.label + ' is back', logger);
}
else {
  console.info('Sensor status update alerting ' + isAlerting + ' initial ' + isInitialAlert + ' equipment ' + equipment.label + ' status ' + statusItem.state);
}
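If you want to play with the decision logic outside of openHAB, it boils down to a small pure function. This is only a sketch: `decideStatus` is a hypothetical name, and the states are plain strings instead of real Item states.

```javascript
// Sketch only: decideStatus is a made-up helper mirroring the branches above.
// Returns what to post and which kind of alert to send, or null when the
// status Item is already in the right state.
function decideStatus(isAlerting, currentState) {
  if (isAlerting && currentState !== 'OFF') {
    return { newState: 'OFF', alert: 'Offline' }; // sensor stopped reporting
  }
  if (!isAlerting && currentState !== 'ON') {
    return { newState: 'ON', alert: 'Online' }; // sensor is back
  }
  return null; // nothing to do
}
```

For example `decideStatus(true, 'ON')` says to post OFF and send the offline alert, while `decideStatus(true, 'OFF')` returns null so no duplicate alert goes out.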
Services
I aggregate all the status reporting from Things, Network, and Sensors into a Services Offline pair of rules. Again I use Threshold Alert.
- Group: ServiceStatuses
- Threshold: OFF
- Operator: ==
- Invert: false
- Alert Delay: PT5M
- Reminder Period: PT12H
- Metadata Namespace: sensorStatus
- Alert Rule: serviceStatusProc
- End Alert Rule: serviceStatusProc
- Do Not Disturb Start: 22:00
- Do Not Disturb End: 08:00
- Gatekeeper Delay: 1000
With this configuration serviceStatusProc is called if any member of ServiceStatuses remains OFF for five minutes. If the Item then becomes something other than OFF (e.g. ON) the rule is called again to indicate the service is back ON.
The called rule is:
configuration: {}
triggers: []
conditions: []
actions:
  - inputs: {}
    id: "1"
    configuration:
      type: application/javascript
      script: >-
        var {alerting} = require('rlk_personal');
        var logger = log('Service Alert');
        var service = items.getItem(alertItem).label.replace(' Online Status',
        '');
        if(isAlerting) {
          cache.private.put('alerted', true);
          alerting.sendAlert('Service offline: ' + service);
        }
        else if(cache.private.get('alerted', () => false)) {
          cache.private.put('alerted', false);
          alerting.sendAlert('Service online: ' + service);
        }
        // else don't send any alert if we haven't alerted offline previously
    type: script.ScriptAction
This just sends an alert when a device goes offline or online with some extra code in there I need to remove (I added it while debugging a problem).
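The flow of that script can be tried out in plain JavaScript with a Map standing in for cache.private. This is a sketch under that assumption; `serviceMessage` is a hypothetical name and it returns the message instead of sending it.

```javascript
// Sketch only: a Map stands in for cache.private and serviceMessage is a
// made-up helper. Returns the alert text, or null when nothing should be sent.
function serviceMessage(cache, isAlerting, itemLabel) {
  const service = itemLabel.replace(' Online Status', '');
  if (isAlerting) {
    cache.set('alerted', true);
    return 'Service offline: ' + service;
  }
  if (cache.get('alerted') === true) {
    cache.set('alerted', false);
    return 'Service online: ' + service;
  }
  return null; // never alerted offline, so stay quiet
}

const fakeCache = new Map();
const first = serviceMessage(fakeCache, false, 'Plex Online Status');  // null
const second = serviceMessage(fakeCache, true, 'Plex Online Status');  // offline alert
const third = serviceMessage(fakeCache, false, 'Plex Online Status');  // online alert
```

Note that, like the script above, this uses a single flag for all services rather than one per service.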
I’m going to do cancelable notifications eventually instead of sending separate alerts for offline and online. I haven’t gotten around to it yet.
Load / Temp
I use Zabbix to get alerts on these. But if I were to put this into OH I’d use good old Threshold Alert again. I’d use the following properties for the temperature:
- Group: doesn’t matter, since there’s only one Item I’d manually change the trigger to just that one Item
- Threshold: 45 °C
- Operator: >=
- Invert: false
- Alert Delay: PT1M
- Alert Rule: cpuTooHot
- End Alert Rule: cpuTooHot
Pretty much the same configuration for the system load too.
The called rule would have some JS code along the following lines:
var {alerting} = require('rlk_personal');
if(isAlerting) alerting.sendAlert("The CPU is too hot!");
else alerting.sendAlert("The CPU temp is now normal.");
Same for load.
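The Threshold, Operator, and Invert properties together amount to a simple comparison. A sketch of that idea using plain numbers and strings rather than openHAB states with units; the names are made up, and apart from <, ==, and >= (used in this post) the operator list is my assumption about what is available:

```javascript
// Sketch only: hypothetical names, plain values instead of openHAB states.
const OPERATORS = {
  '==': (a, b) => a === b,
  '!=': (a, b) => a !== b,
  '<': (a, b) => a < b,
  '<=': (a, b) => a <= b,
  '>': (a, b) => a > b,
  '>=': (a, b) => a >= b
};

// True when `value` is in the alerting state for the given properties.
function isAlertingState(value, operator, threshold, invert = false) {
  const compare = OPERATORS[operator];
  if (!compare) throw new Error('Unknown operator: ' + operator);
  const result = compare(value, threshold);
  return invert ? !result : result;
}
```

So `isAlertingState(47, '>=', 45)` is true for a hot CPU, and `isAlertingState('21.5', '==', 'UNDEF', true)` is true for the inverted sensor case where anything other than UNDEF counts as alerting.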
I also use Threshold Alert to control some dumb humidifiers based on a nearby humidity sensor, to send me an alert if a motion sensor at my dad’s house doesn’t detect motion for too long, and to get an alert if one of the doors remains open for too long.
Yes but I use Zabbix for that. openHAB only monitors those things which are directly related to home automation. All IT/homelab related stuff is monitored outside of OH. And even there, I mainly just get emails when something starts to exceed thresholds. But every now and then it’s handy to see exactly when something went wonky and see what the RAM and CPU were doing at that time.