No rule are triggered after some runtime

  • Platform information:
    • Hardware: RaspberryPi 3
    • OS: openHABian
    • Java Runtime Environment: openjdk version “1.8.0_152”
    • openHAB version: 2.3.0

For a few days I have the problem that openHAB after a few minutes or hours of running time no longer works correctly: No more rules are triggered and the persistence (mysql) no longer works.
There are no errors in the log files (openhab.log and events.log). The changes to items through the bindings continue to work (as I can see on the events.log and my sitemap), but they are not persisted in MySQL.

I can restart the openhab service or even reboot the whole RaspberryPi. But after a short time or a few hours this problem is back.

The last rule that I had added, I had already taken out as a test. But this has not changed anything.

I’ve added some log outputs in my rules. So I find out the the processing of rules stops randomly at different points during the execution of rules.

Also, I can not detect any overloading of the hardware.

Is there any other way to find out the cause, if nothing stands in the usual log files?

No, so ramp up log levels to DEBUG first via Karaf console to narrow it down:
org.eclipse.smarthome.model
org.openhab.persistence.mysql

and probably some others, experiment a bit.

Also add debug output lines to your rules as shown in the logging docs link to find out where they stop.

There’s also the chance that your system is corrupted due to SD card wear out, so also check your system logs for errors. SD corruption is infamous for producing seemingly irregular problems like these. Unfortunately there’s no reliable indicator to find out if that happened to you.

If you don’t find anything that helps in investigating, I’d suggest to setup your system from scratch using a new SD card. Read and follow this post this time.
You can use openhab-cli backup/restore to migrate your OH config.

I don’t think this is your problem, but it might be worth having a look at Why have my Rules stopped running? Why Thread::sleep is a bad idea, in particular post 12 where Scott lists some commands you can use to see how many of your rules threads are in use when your rules stop running.

Today, openHAB running fine since last night (last restart was 1.55 am), but stop working again since 7.15 pm.

Now I have a look at this count of rules threads, but there are currently 0 threads, so I think this is not the problem on my installation …

As I wrote, I have already done this. But unfortunately I could not find any connections to the problems: For exampe, there are two postUpdate-commands and I add some log outputs there in one of my rules:

logInfo("locationchange.rules", "I'm here (7)")
Lisa_LocationLat.postUpdate(locationPhone.getLatitude())
logInfo("locationchange.rules", "I'm here (8)")
Lisa_LocationLong.postUpdate(locationPhone.getLongitude())
logInfo("locationchange.rules", "I'm here (9)")

So, I will see in frontail:

I'm here (7)
Lisa_LocationLong changed from xxxxxxxx to xxxxxxx
I'm here (8)
Lisa_LocationLat changed from xxxxxxxx to xxxxxxx

The last log is missing and the second Item change are not persisted in MySQL.

But it was just like this in one of the observed cases. It seems to be random when the problem occurs.

Now, I have changed the log level to DEBUG, restarted the openhab service and waiting that something happen …

So this line is where your rule crashed (at least this time).
What’s that item locationPhone ?
I would suspect a problem with the service that this call is triggering.
Find out the class name this runs at and ramp up that service’s log level to debug, too.

Does the mysql log show the update attempt or any error ?
Using rrd4j persistence (which uses one file per item to store), I once had a persistence file with bad access rights. Try add comment out persistence of these items, just for test.

You don’t ever need to restart if you change rules or log levels (just in case you didn’t know that).

That’s a PointType, comes from the location of a ‘device’-Thing of the icloud-binding.
But, as I wrote, I get the long- and latitude of this Point correctly, because I can postUpdate both values to the other Items and see correct values of this changes on events.log. But the second of this two item-changes are not more persisted.

… after some hours of runtime, the problem comes back last night. The last sign of life of the mysql service are the following lines on openhab.log (This is not at the same position of my rule, that we discussed above):

2018-09-16 05:25:24.743 [DEBUG] [sql.internal.MysqlPersistenceService] - mySQL: Stored item 'Openhab_CPU_Load1' as '0.3'[0.3] in SQL database at 2018-09-16 05:25:24.719.
2018-09-16 05:25:24.745 [DEBUG] [sql.internal.MysqlPersistenceService] - mySQL: query: INSERT INTO Item357 (TIME, VALUE) VALUES(?,?) ON DUPLICATE KEY UPDATE VALUE=?;
2018-09-16 05:25:24.769 [DEBUG] [sql.internal.MysqlPersistenceService] - mySQL: Stored item 'Lisa_LocationLastUpdate' as '2018-09-16T05:24:22'[2018-09-16T05:24:22] in SQL database at 2018-09-16 05:25:24.75.
2018-09-16 05:25:24.772 [DEBUG] [sql.internal.MysqlPersistenceService] - mySQL: query: INSERT INTO Item442 (TIME, VALUE) VALUES(?,?) ON DUPLICATE KEY UPDATE VALUE=?;
2018-09-16 05:25:24.769 [DEBUG] [sql.internal.MysqlPersistenceService] - mySQL query: item is Lisa_LocationLat
2018-09-16 05:25:24.809 [DEBUG] [sql.internal.MysqlPersistenceService] - mySQL: Stored item 'Openhab_Network_PacketsSent' as '8097036'[8097036] in SQL database at 2018-09-16 05:25:24.783.
2018-09-16 05:25:24.809 [DEBUG] [sql.internal.MysqlPersistenceService] - mySQL: query: SELECT Time, Value FROM Item454 WHERE TIME<'2018-09-16 05:20:24' ORDER BY Time DESC LIMIT 0,1
2018-09-16 05:25:24.811 [DEBUG] [sql.internal.MysqlPersistenceService] - mySQL: query: INSERT INTO Item351 (TIME, VALUE) VALUES(?,?) ON DUPLICATE KEY UPDATE VALUE=?;

The last INSERT command was executed correctly, as I can see on by database table. But for the SELECT command there is no more DEBUG-output in the logfile like ‘‘mySQL: query returned 1 rows in 4ms’’

Could there be a problem with to many parallel MySQL queries?

Could it be a problem, that the Item Lisa_LocationLastUpdate, which triggers my rule, is changed again during the execution of my rule?

Are there any other packages, where I should enable DEBUG outputs?

Of course, any of these is a possible explanation. But that’s guessing instead of proper analysis.
If I were you I’d first make sure that the problem really is either with the icloud binding or your MySQL setup so we can exclude rules processing and all the rest of the OH framework from the list of potential causes.
As I already said: try commenting out persistence (or try with a different one), try disabling icloud-binding or at least comment out part of the rules that make use of it to find out if that makes your overall problem go away.
Don’t investigate any further until you’re sure.
Then in turn solve those that I can’t help with, sorry, as I’m not using these.

My problem seems to be solved: I now use the jdbc-persistence (with a Postgres database) instead of the mysql-persistence. And the system has been running stable for almost 48 hours.

So I have to assume that the mysql persistence is faulty or apparently overloaded with the numerous (parallel) queries.

Thank you anyway for your quick help.