[SOLVED] Curiously state change / initializing of Items

WeRo · June 15, 2019, 9:19am

I am facing a strange system behavior:
Every night at 04:04 a.m. all items (MAX and Zigbee) seem to be initialized/state changed and therefore item related rules are fired.
None of my rules has a time cron at this time.
I am running a script based backup at 03:00 a.m. stopping the openHAB and the homegear services.

Any hint what and where to look for, to find the cause of this issue?

rossko57 · June 15, 2019, 9:31am

events.log snippet to demonstrate?

WeRo · June 15, 2019, 9:44am

This is the beginning - I hav to admit it starts a few seconds before 04:04 a.m.

2019-06-15 04:03:55.462 [ome.event.ItemCommandEvent] - Item ‘OG_Bad_Heating_SetTemp’ received command 4.5
2019-06-15 04:03:55.466 [ome.event.ItemCommandEvent] - Item ‘OG_Schlafen_Heating_SetTemp’ received command 4.5
2019-06-15 04:03:55.471 [ome.event.ItemCommandEvent] - Item ‘OG_Wohnzimmer_Switch’ received command OFF
2019-06-15 04:03:55.475 [ome.event.ItemCommandEvent] - Item ‘OG_Arbeit_Heating_SetTemp’ received command 4.5
2019-06-15 04:03:55.479 [nt.ItemStatePredictedEvent] - OG_Bad_Heating_SetTemp predicted to become 4.5
2019-06-15 04:03:55.494 [nt.ItemStatePredictedEvent] - OG_Schlafen_Heating_SetTemp predicted to become 4.5
2019-06-15 04:03:55.529 [nt.ItemStatePredictedEvent] - OG_Arbeit_Heating_SetTemp predicted to become 4.5
2019-06-15 04:03:55.552 [ome.event.ItemCommandEvent] - Item ‘EG_Hauseingang_Light_A’ received command 0
2019-06-15 04:03:55.558 [ome.event.ItemCommandEvent] - Item ‘EG_Hauseingang_Light_B’ received command 0
2019-06-15 04:03:55.565 [nt.ItemStatePredictedEvent] - EG_Hauseingang_Light_A predicted to become 0
2019-06-15 04:03:55.585 [nt.ItemStatePredictedEvent] - EG_Hauseingang_Light_B predicted to become 0
2019-06-15 04:03:55.597 [ome.event.ItemCommandEvent] - Item ‘OG_Wohnzimmer_Switch’ received command OFF
2019-06-15 04:03:56.687 [vent.ItemStateChangedEvent] - homematic_HG_BC_RT_TRX_CyN_45d8294b_OEQ1040467_0_SIGNAL_STRENGTH changed from NULL to 4
2019-06-15 04:03:57.116 [me.event.ThingUpdatedEvent] - Thing ‘zigbee:device:000001A5:84182600000f88ee’ has been updated.
2019-06-15 04:03:57.788 [vent.ItemStateChangedEvent] - homematic_HG_BC_RT_TRX_CyN_66e41b58_OEQ1041650_0_SIGNAL_STRENGTH changed from NULL to 3
2019-06-15 04:03:58.062 [me.event.ThingUpdatedEvent] - Thing ‘zigbee:device:000001A5:84182600000f9f2f’ has been updated.
2019-06-15 04:03:58.680 [me.event.ThingUpdatedEvent] - Thing ‘zigbee:device:000001A5:7cb03eaa0a013bf3’ has been updated.
2019-06-15 04:03:58.889 [vent.ItemStateChangedEvent] - homematic_HG_BC_RT_TRX_CyN_45d8294b_OEQ1039365_0_SIGNAL_STRENGTH changed from NULL to 3

WeRo · June 15, 2019, 10:38am

and this is the rule related to the SetTemp of the heating valves:

rule “Thermostate An-Aus”
when
Item Thermostate_AnAus changed
then
if (Thermostate_AnAus.state.toString == “OFF”) {
OG_Bad_Heating_SetTemp.sendCommand(4.5)
OG_Schlafen_Heating_SetTemp.sendCommand(4.5)
OG_Arbeit_Heating_SetTemp.sendCommand(4.5)
Thread::sleep(50000)
OG_Kueche_Heating_SetTemp.sendCommand(4.5)
OG_Wohnen_Heating_SetTemp.sendCommand(4.5)
EG_Arbeit_Heating_SetTemp.sendCommand(4.5)
Thread::sleep(50000)
EG_Waschkueche_Heating_SetTemp.sendCommand(4.5)
EG_Garten_Heating_SetTemp.sendCommand(4.5)
EG_Kinder_Heating_SetTemp.sendCommand(4.5)
EG_Eingang_Heating_SetTemp.sendCommand(4.5)
}
else if (Thermostate_AnAus.state.toString == “ON”) {
OG_Bad_Heating_SetTemp.sendCommand(OG_Bad_Heating_Soll_Jetzt.state as Number)
OG_Schlafen_Heating_SetTemp.sendCommand(OG_Schlafen_Heating_Soll_Jetzt.state as Number)
OG_Arbeit_Heating_SetTemp.sendCommand(OG_Arbeit_Heating_Soll_Jetzt.state as Number)
Thread::sleep(50000)
OG_Kueche_Heating_SetTemp.sendCommand(OG_Kueche_Heating_Soll_Jetzt.state as Number)
OG_Wohnen_Heating_SetTemp.sendCommand(OG_Wohnen_Heating_Soll_Jetzt.state as Number)
EG_Arbeit_Heating_SetTemp.sendCommand(EG_Arbeit_Heating_Soll_Jetzt.state as Number)
Thread::sleep(50000)
EG_Waschkueche_Heating_SetTemp.sendCommand(EG_Waschkueche_Heating_Soll_Jetzt.state as Number)
EG_Garten_Heating_SetTemp.sendCommand(EG_Garten_Heating_Soll_Jetzt.state as Number)
EG_Kinder_Heating_SetTemp.sendCommand(EG_Kinder_Heating_Soll_Jetzt.state as Number)
EG_Eingang_Heating_SetTemp.sendCommand(EG_Eingang_Heating_Soll_Jetzt.state as Number)
}
end

WeRo · June 15, 2019, 10:45am

but the last change of item Thermostate_AnAus was after backup:

2019-06-15 03:04:10.561 [vent.ItemStateChangedEvent] - Thermostate_AnAus changed from NULL to OFF

rossko57 · June 15, 2019, 11:47am

But an Item generally only has a NULL state immediately after system boot. Or … when Items files are reloaded after an edit.
Does your backup ‘touch’ the files or something? What does openhab.log have to say at the time?

WeRo · June 15, 2019, 12:01pm

The rule related item changed at 03:04 a.m. after Backup, but the SetTemp triggered one hour later at 04:03 a.m.
Same with

2019-06-15 04:03:55.552 [ome.event.ItemCommandEvent] - Item ‘EG_Hauseingang_Light_A’ received command 0

It is also triggered by a rule, but one hour after system start after Backup.

rossko57 · June 15, 2019, 12:07pm

Sure. My curiousity is not about the change, but how that Item got to be NULL if it had some valid value since your last reboot or edit.

Changing TO state NULL is abnormal and will not show in events.log, because a boot or ile reload will show in openhab.log

WeRo · June 15, 2019, 12:08pm

This is the part of openHAB.log

„Rasensprenger Timer A aus“ is also triggered by a rule - and I don’t know why.

2019-06-15 03:55:38.566 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = XXXXXXXX, base $
2019-06-15 03:55:42.644 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = XXXXXXXX, base URL =$
2019-06-15 04:04:37.703 [INFO ] [.eclipse.smarthome.model.script.INFO] - Rasensprenger Timer A aus
2019-06-15 04:04:37.707 [INFO ] [.eclipse.smarthome.model.script.INFO] - Rasensprenger Timer B aus
2019-06-15 04:05:59.089 [ERROR] [ematic.handler.HomematicThingHandler] - -100 No answer from device. (sending setValue()
PEQ1010801:1
SET_TEMPERATURE
4.5
)

WeRo · June 15, 2019, 1:30pm

This is my backup script - executed by crontab at 03:00 a.m.

#!/bin/bash
echo 'openHAB Backup-Skript'
# openHAB Service stoppen
echo 'Stoppe openhab2-Service'
sudo systemctl stop openhab2.service
# Homegear Service stoppen
echo 'Stoppe Homegear-Service'
sudo systemctl stop homegear.service

# Current Date And Time
dt=$(date '+%m-%d-%Y_%H-%M-%S');

# Backup-Verzeichnis erstellen
BACKUPDIR="/mnt/OHBackupDrive/OHBackup_"$dt"/"
mkdir -p $BACKUPDIR

# Löschen von Cache und TMP
echo 'Löschen von Cache und TMP'
rm -rf /var/lib/openhab2/tmp/*
rm -rf /var/lib/openhab2/cache/*

# Sichern der openHAB Konfiguration
echo 'Erstelle openHAB Backup'
openhab-cli backup
# echo 'Erstelle TAR von addon-Verzeichnis'
# tar -zcpf /backup/openhab2/openhab2-addons-backup-$dt.tar.gz /usr/share/openhab2
echo 'Erstelle ZIP von addon-Verzeichnis'
zip /backup/openhab2/openhab2-addons-backup-$dt.zip /usr/share/openhab2/*
echo 'Kopiere openhab-Backups auf USB'
cp -arv "/backup/openhab2" "$BACKUPDIR"

# Sichern der HomeGear Konfiguration
echo 'Erstelle Homegear Backup TAR'
tar -zcpf /backup/homegear/homegear-backup-$dt.tar.gz /etc/homegear /var/lib/homegear
echo 'Erstelle Homegear Backup ZIP'
zip /backup/homegear/homegear-backup-$dt.zip /etc/homegear/* /var/lib/homegear/*
echo 'Kopiere Homegear-Backups auf USB'
cp -arv "/backup/homegear" "$BACKUPDIR"

# Sichern sonstiger Konfigurationsdateien
echo 'Erstelle Backup von Konfigurationsdateien'
cp -apv "/bin/OHBackup_script" "/backup/config/OHBackup_script"
cp -apv "/bin/OHRestore_script" "/backup/config/OHRestore_script"
cp -apv "/etc/default/openhab2" "/backup/config/openhab2"
cp -apv "/etc/udev/rules.d/99-usb-serial.rules" "/backup/config/99-usb-serial.rules"
cp -apv "/etc/ssh/sshd_config" "/backup/config/sshd_config"
cp -apv "/home/openhabian/.ssh/authorized_keys" "/backup/config/authorized_keys"
cp -apv "/home/openhabian/.smbcredentials" "/backup/config/.smbcredentials"
cp -apv "/etc/samba/smb.conf" "/backup/config/smb.conf"
cp -apv "/etc/fstab" "/backup/config/fstab"
echo 'Kopiere Config-Backups auf USB'
cp -arv "/backup/config" "$BACKUPDIR"

# Löschen Backup-Dateien
echo 'Löschen Backup-Dateien'
rm -rf /backup/config/*
rm -rf /backup/openhab2/*
rm -rf /backup/homegear/*

# Services wieder starten
echo 'Starte Services wieder'
sudo systemctl start homegear.service
sudo systemctl start openhab2.service

rossko57 · June 15, 2019, 10:10pm

Okay, here’s what I see.

You stop and restart openHAB as part of your backup process, around 03:00

The restart causes all Items to get state NULL. (Maybe you restore-on-startup some)

Item Thermostate_AnAus gets changed NULL to OFF at 03:04
This will trigger your rule “Thermostate An-Aus”, all expected so far.
Or rather, it will trigger the rule when a thread becomes available.

At 04:03 we see the first evidence of that rule actually running, a command it issues.

Also at 04:03 some homematic Items get a routine update from NULL to xxx.
Bearing in mind these will have been NULL since reboot at 03:00, why do these unconnected things suddenly get an update too?

I am deeply suspicious of this almost exactly one hour gap. Do you have anything in logs from between these times? Any chance your system clock is adjusting, maybe some NTP feed or something?

Assuming the one hour gap is genuine, it would seem something has grabbed system resources and blocked any openHAB tasks from running.

The rule you have shown is a horrible thing, it blocks a rule thread for around 5 minutes with its long sleeps. But that is not going to cause an hour loss, unless you have many such rules.

A closer look at this rule’s results is interesting., though.
It should send three heating commands, wait around a minute, send three more, wait, send more.
What we actually see in the events.log timestamps however is just a stream of ALL the rule commands, with no one minute gaps.

I think what has happened here is that the rule is all finished before the logging takes place. Could it actually have finished an hour ago?
I’ve no idea if the bottleneck, be it an hour or be it five minutes, could be just in the logging - waiting for disc write?? - or the openHAB event bus has a queue/backlog of commands?.

For an outright guess - something is hogging your system i/o for an hour after backup. How long does an ordinary reboot take on your system?

WeRo · June 16, 2019, 3:03pm

The reboot just takes a few minutes.
Each item is restored on restart by mapdb.persist:

Strategies {
default = everyChange
}

Items {
 // persist all items on every change and restore them from the db at startup
* : strategy = everyChange, restoreOnStartup
}

Some items are stored every minute by rrd4j.persist.

I do not understand why rules are triggered even though the items didn’t change because of the restore on startup.

I added a loginfo at every rule to check which rule is triggered at which time.
This night it seems like no rule was triggered at 04:00 a.m.
Very curious

Let’s see what kind of story the loginfo of the next night will tell us.

rossko57 · June 16, 2019, 3:17pm

But the Item did change - you showed us that
2019-06-15 03:04:10.561 [vent.ItemStateChangedEvent] - Thermostate_AnAus changed from NULL to OFF

I believe restore on startup never shows in the events.log, I’m sure I read that it acts directly and not via event bus. Normally restore is done before your rules load, so it’s an “invisible” process.

I’m guessing that, like the rule that takes an hour to run, this time the restore was greatly delayed and had not taken place.

Look here, over an hour later and this Item still hasn’t been restored either. It’s NULL before this change.
2019-06-15 04:03:58.889 [vent.ItemStateChangedEvent] - homematic_HG_BC_RT_TRX_CyN_45d8294b_OEQ1039365_0_SIGNAL_STRENGTH changed from NULL to 3

This is a symptom, not a cause. I’m keeping my guess that something is causing a logjam in your system I/O with all kinds of backlog effects.

WeRo · June 16, 2019, 5:35pm

Just found this post about restoreOnStartup:
„When OH restarts there is no specific order of what’s loaded first. If you watch the log it shows items as undef/null then after a minuet or so it’s value will change to the last known state. This works for all item that you have listed in persistence.“

So items changing from NULL to xxx at 03:04 seems to be ok.
But I agree it is strange, if this happens one hour later at 04:04.

WeRo · June 16, 2019, 6:42pm

I checked this. All „signal strength“ items were restored on startup (changed from NULL to a value) immediately after restart at 03 am. About 10 minutes later many of them changed from value to NULL. I guess it’s due to a maximum number of signals per time MAX is able to handle.
That’s why I added some threads to my rule regarding the MAX heating valves.

rossko57 · June 16, 2019, 7:53pm

You’re quite right. I completely misremembered something about restore and events (it suppresses persistence events, not updates, so it doesn’t persist its own changes).
Checked on my own system, and right enough restore updates are visible in events.log. Of course that is finished before any rules are available, that is how it is supposed to work.

That remains surprising, outside of items file reloads.

Flip · June 16, 2019, 9:43pm

I had a problem a while ago that my persistence setup wasn’t working due to a special character in the config. So maybe it helps to take a quick look via openhab-cli if it’s working:

log:set INFO org.openhab.persistence.mapdb

WeRo · June 18, 2019, 7:32pm

And here I am again.
I identified the main cause of my problems.
It was this rule:

rule "Things Offline"
when
   system started or
   Thing 'xxx' changed or
   Thing 'yyy' changed
then
   var status_a = getThingStatusInfo("xxx").getStatus()
   var status_b = getThingStatusInfo("yyy").getStatus()
   if (status_a.toString != "ONLINE") {sendBroadcastNotification("ZigBee-Stick offline!") logInfo("Info","ZigBee-Stick offline")}
   else if (status_b.toString != "ONLINE") {sendBroadcastNotification("Local Weather offline!") logInfo("Info","Local Weather offline")}
end

There are several posts about problems using „system started“ at rules:

I deleted „system started“ in this and other rules and changed the whole when-part of this rule to two time crons.

I made a reboot of the pi and everything was alright.

But what I still don’t understand is the order of starting openHAB after Backup at 03:00 a.m.:
03:02:04 loading rules
03:02:40 loading items
03:02:44 loading persists
03:02:47 starting rules
03:04:37 first zigbee device from unknown to online
03:05:35 first homegear/Max device initializing
Why are the devices started after loading items and after starting rules?