Persistence not recording all events

Thinqer · September 25, 2022, 9:48pm

Hi all,

I’ve noticed that the OH3 persistence service doesn’t seem to be logging all events in the database.

Setup is as follows:

OpenHAB 3.3.0 running in a Docker container on Synology
Mosquitto running as native app on same Synology
MariaDB 10 running as native app on same Synology
InfluxDB 2 running in a Docker container on same Synology

I have two persistence services set up, one for MariaDB and one for InfluxDB (the latter for testing to see if this is an issue with MariaDB specifically or with the persistence functionality more generally). Both persistence services are set to log on ‘everyUpdate’.

I have a custom made temperature sensor that logs the temperature every 2 minutes via MQTT. I have a thing (Generic MQTT Thing via the binding) set up for the sensor with an Item linked to the temperature channel.

When I use a PC-based MQTT client I can see that the sensor is correctly delivering payloads to the MQTT broker at 2 minute intervals. When I look in the OpenHAB event log, I can see that it is also picking up the same temperature readings every 2 minutes as ItemStateChangeEvent for the temperature Item.

However, when I look in MariaDB it is missing a lot of the temperature changes and has only recorded about half of them (and to confirm, there have been actual changes in the temperature reading so it should have logged them, even if I wasn’t using ‘everyUpdate’).

I find the same thing in InfluxDB and it has logged the same temperature readings as MariaDB (or to look at it another way, it has missed the same readings as MariaDB did).

Something else I’ve noticed is that the timestamp of the data items in MariaDB are anywhere from 4 to 14 seconds after the data was logged in the OpenHAB event log. It also appears to have logged one item twice, the second instance a minute after the first. There appears to be the same issue in InfluxDB too (including the duplicate log item).

I have also checked a couple of different Thing and Item types, and they also have the same issue.

So from this it looks like:

It’s not a problem with the sensor
It’s not a problem with MQTT
It’s not a problem with the Thing or the Item configuration or type
It’s not a problem with the OpenHAB MQTT binding
It’s not a problem specifically with MariaDB
It’s not a problem specifically with InfluxDB

This means that it’s most likely either a problem with the persistence service in OpenHAB, or something else that affects MariaDB and InfluxDB, or something to do with my Synology/Docker setup (or something else I’ve not thought of). My Synology is pretty well specified and the Docker containers do not seem to be short of RAM or CPU allocation.

I’m not sure where next to look to solve this issue, or whether or not it’s a known problem. Happy to provide any further config data that’s useful.

Thanks in advance for your thoughts and suggestions.

B.

rossko57 · September 25, 2022, 10:57pm

May we see?

How much work have you given them to do i.e. how many Items, estimated update rate. Is the default rrd4j-everything still in place? There’d be quite a lot to do.

Thinqer · September 25, 2022, 11:35pm

jdbc.persist looks like this:

Strategies {
	// for rrd charts, we need a cron strategy
	every15Minutes : "0 0,15,30,45 * * * ?"
	default = everyUpdate
}

Items {
	// persist all items on every change and restore them from the db at startup (note everyChange and restoreOnStartup are Statically defined)
    * : strategy = everyUpdate, restoreOnStartup
}

influxdb.persist looks like this (I’m only logging a few items now for testing purposes):

Strategies {
	// for rrd charts, we need a cron strategy
	default = everyUpdate
}

Items {
	// persist all items on every change and restore them from the db at startup (note everyChange and restoreOnStartup are Statically defined)
	GosundUP11101_Gosund01Power : strategy = everyUpdate
    ShellyM01_Lux : strategy = everyUpdate
	ShellyM02_Lux : strategy = everyUpdate
	ShellyM03_Lux : strategy = everyUpdate
	ShellyM03_Lux -> "M03 everyChange" : strategy = everyChange
	ShellyM04_Lux : strategy = everyUpdate
	ShellyM05_Lux : strategy = everyUpdate
	ShellyM06_Lux : strategy = everyUpdate
	TempSens06_TempSens06Temperature : strategy = everyUpdate
	TempSens01Workshop_Workshopbattery : strategy = everyUpdate
}

Not sure how to tell if rrd4j is still in place - can you tell me how? Default persistence service is set to JDBC.

I have ~300 items. About 70 of those are being updated once every 30 seconds (not in sync with each other, so they will all be logging on different 30 second periods). About 50 of them are being updated every 2 minutes (again, not in sync with each other). The rest are far less frequent. Looking at events.log I can see that for any given 5 minute period there are ~450 entries, suggesting an average of about 1.5 events per second.

My previous OH 1 installation was running on a much older Synology box but with roughly the same number of items with the same refresh frequency and it was fine.

Thanks for any help.

Thinqer · September 25, 2022, 11:39pm

I’ve just pulled data from my old and new systems for the same sensor for the same date and charted them next to each other and you can see the difference:

Actually, looking at my old .items file, I didn’t have all the items that are reporting every 30 seconds, but I did have all the ones reporting every 2 mins. About 200 items in total.

rossko57 · September 25, 2022, 11:55pm

That’s not “excessive” but I do think, especially with multiple databases, that ‘everything’ strategies can lead to traffic jams and discards. And you might have two ‘everythings’ at work.

If you are displaying charts in the UI at the same time, that can add to traffic.

It’s an add-on just like the others, look in the GUI.
“Default persistence” is only about selecting one for charting etc. and does not affect what is installed/active.

I know nothing about Docker, but I assume this results in the databases appearing like “external box” to openHAB. I can’t think that should add much overhead, though.

The defaults are great for getting started, but when develop further to non-trivial configurations you have to start making performance considerations.

Don’t care really, you’re in a different environment now.

I’ve got a nagging worry that one “bad” Item might cause a performance hit, by introducing a wait/timeout situation when its turn comes around. I think you’d need DEBUG persistence logging to spot that kind of thing. At OH3.3 and below, “bad” might include Item names with a leading numeric digit.

Thinqer · September 26, 2022, 10:59pm

Ok, I’ve uninstalled the rrd4j add-on and will see what that does.

I enabled DEBUG logging on the jdbc persistence package and there seems to be huge variability in the time it takes to INSERT data into MariaDB, even for the same item, ranging from 80ms to 4000ms and I can’t spot any pattern. Unfortunately the influxdb logging doesn’t tell me how long it took to insert a datapoint into the DB. Could possibly be a performance issue with MariaDB, which I have noticed being very laggy when accessing via phpMyAdmin. That could be delaying writes to InfluxDB.

If you are displaying charts in the UI at the same time, that can add to traffic.

Nope, I’m not displaying charts in the UI. I do run some Grafana charts off the MariaDB instance but they’re not constantly being displayed anywhere and when I do view them in a browser they’re decently performant.

At OH3.3 and below, “bad” might include Item names with a leading numeric digit.

None of my item names begin with a digit so that shouldn’t be a problem.

Will monitor the situation for a bit and report back if it’s improved with removal of rrd4j. If not then I may disable jdbc altogether and switch to influxdb to see if that is any better.

Thanks for your help so far.

rossko57 · September 26, 2022, 11:23pm

That’s rather sad. Of course, if something is in a queue and complaining about wait time it’s not necessarily the something that is the problem.

It’s going to be hard to identify the bottleneck.

Thinqer · September 27, 2022, 9:35pm

Ok, so something very odd is happening.

I decided to turn off the ‘everything’ strategy for MariaDB and just log the same 10 items as I am for the InfluxDB persistence in the hope that if I reduced the load on the database, it would start logging correctly, or at least I could rule database performance out as the issue.

What’s happened is that the persistence service has gone crazy and is logging dozens of entries for the items, even though they have not received any updates.

For example, per MQTT my temperature sensor reported the following two temperatures, approximately 2 minutes apart:

22:15:53.649 - 19.37
22:17:51.294 - 19.33

When I look in the OpenHAB event.log I see the following corresponding entries (there’s an hour’s difference because OpenHAB is running on UTC/GMT and doesn’t seem to want to take into account DST):

21:15:52.761 - 19.37
21:17:50.397 - 19.33

Then when I look in MariaDB I see 40 entries from 21:15:56.329 to 21:17:46.174 all with the value 19.37. It then starts recording dozens of entries for the next value of 19.33 for the next two minutes until the next value comes in, etc. etc. Note that I’ve forced MariaDB to run on UTC too.

I have exactly the same number of records created in InfluxDB too, so it doesn’t look like it’s a problem just with MariaDB. It seems like the persistence service is trying to continuously write as much as it can to the database constantly, not just on updates. This suggests that when I have the ‘everything’ strategy on, it’s just running as fast as it can to write multiple values for the other 300 items.

Any ideas what’s going on or what else to test?

Thanks again,

B.

Thinqer · September 27, 2022, 9:47pm

Update: I’ve just completely disabled the JDBC persistence by commenting out everything in the Items section and so only have the InfluxDB persistence logging and it’s behaving perfectly - it’s only logging one record every 2 minutes when a new value comes in.

Is this an issue with the JDBC persistence service then?

rossko57 · September 27, 2022, 9:48pm

Timestamping seen in the logs relies on the host Java timezone. That is not influenced by openHABs locale/timezone setting, which is used for displays etc.
Java time is used for other hidden machinery, like scheduling of rules etc.

Who knows what timezone your databases are operating in … but I do wonder what timezone OH persistence service uses and what happens if it disagrees with the database it is trying to interrogate.

I’d fix that before proceeding.
You mentioned docker, see

Thinqer · September 27, 2022, 9:52pm

I think the DBs and OH are all running on UTC, it’s just my PC (on which the MQTT client I was using is running) that is set to BST.

Did you see my message above about what happened when I disabled the JDBC persistence?

rossko57 · September 27, 2022, 9:56pm

Yes, it went screwy. Persistence used to require being stopped to implement any changes. Parts of it still might i.e. reboot OH and see what you got.

Thinqer · September 27, 2022, 10:20pm

Restart of OH seems to have fixed it. Persistence logging looks normal for now and INSERT times look lower than they were.

Will monitor and report back if it goes funky again.

Thanks for your help!

Christian_Pogea · November 11, 2022, 4:21pm

I have pretty much the same Problem.

I can see all events are in events.log but many of them are missing in InfluxDB.

	76958	2022-11-11 09:35:58.232 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 707 to 708
	76973	2022-11-11 09:36:28.242 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 708 to 707
	76989	2022-11-11 09:36:58.250 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 707 to 705
	77005	2022-11-11 09:37:28.223 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 705 to 704
	77019	2022-11-11 09:37:58.261 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 704 to 705
	77038	2022-11-11 09:38:28.264 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 705 to 706
	77058	2022-11-11 09:38:58.252 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 706 to 705
	77076	2022-11-11 09:39:28.272 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 705 to 704
	77097	2022-11-11 09:39:58.272 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 704 to 697
	77111	2022-11-11 09:40:28.231 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 697 to 702
	77135	2022-11-11 09:40:58.232 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 702 to 709
	77152	2022-11-11 09:41:28.233 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 709 to 705
	77167	2022-11-11 09:41:58.281 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 705 to 701
	77188	2022-11-11 09:42:28.285 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 701 to 699
	77209	2022-11-11 09:42:58.251 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 699 to 698
	77225	2022-11-11 09:43:28.262 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 698 to 696
	77241	2022-11-11 09:43:58.242 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 696 to 702
	77259	2022-11-11 09:44:28.262 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 702 to 700
	77276	2022-11-11 09:44:58.271 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 700 to 699
	77293	2022-11-11 09:45:28.271 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 699 to 697
	77317	2022-11-11 09:45:58.272 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 697 to 2

In the InfluxDB is just one single event:

It is not always that bad:

	85549	2022-11-11 16:40:29.344 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 697 to 695
	85573	2022-11-11 16:40:59.345 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 695 to 691
	85588	2022-11-11 16:41:29.304 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 691 to 689
	85608	2022-11-11 16:41:59.335 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 689 to 691
	85624	2022-11-11 16:42:29.294 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 691 to 694
	85667	2022-11-11 16:43:29.304 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 694 to 685
	85683	2022-11-11 16:43:59.304 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 685 to 684
	85704	2022-11-11 16:44:29.334 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 684 to 687
	85721	2022-11-11 16:44:59.354 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 687 to 686
	85738	2022-11-11 16:45:29.315 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 686 to 689
	85751	2022-11-11 16:45:59.326 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 689 to 690
	85794	2022-11-11 16:46:59.315 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 690 to 686
	85813	2022-11-11 16:47:29.295 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 686 to 687
	85848	2022-11-11 16:48:29.313 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 687 to 695
	85867	2022-11-11 16:48:59.357 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 695 to 703
	85893	2022-11-11 16:49:29.344 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 703 to 692
	85916	2022-11-11 16:49:59.354 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 692 to 690
	85929	2022-11-11 16:50:29.336 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'MQTTSoyo_SoyoWatt' changed from 690 to 691

During the same period just 8 events are missing:

openhab.log is empty for the whole day.

this is my influxdb.persist file:

Strategies {
    everyMinute : "0 * * * * ?"
    everyHour   : "0 0 * * * ?"
    everyDay    : "0 0 0 * * ?"
    default = everyChange, restoreOnStartup
}

Items {
    everyMinutePersistance*             : strategy = everyMinute, restoreOnStartup
    everyHourPersistance*               : strategy = everyHour, restoreOnStartup
    everyDayPersistance*               : strategy = everyHour, restoreOnStartup
	jkbms_jkbmsstateofcharge 			: strategy = everyChange, everyHour, restoreOnStartup
    *                                   : strategy = everyChange, restoreOnStartup
}

I have a script inserting data in influxdb (OpenHab is not involved) and thats absolutely reliable. So the Problem must be anywhere in OpenHab3.

Openhab3 is running an Synology / Docker Version 3.3.0
InfluxDB is running on same machin (Synology / Docker) Version 2.4
My Synology has got 6GB of free RAM / CPU hat an avg of 10%
I have like 2000 event changes per hour / 33 events per minute

Christian_Pogea · November 11, 2022, 5:10pm

I made a test application in c# connecting to mqtt and writing directly to influxdb.

In OpenHab events.log we can see 19 events received:

	88135	2022-11-11 17:55:28.132 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'jkbms_jkbmspower' changed from -654.97 to -641.16
	88146	2022-11-11 17:55:43.388 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'jkbms_jkbmspower' changed from -641.16 to -641.14
	88156	2022-11-11 17:55:58.586 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'jkbms_jkbmspower' changed from -641.14 to -640.58
	88163	2022-11-11 17:56:13.821 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'jkbms_jkbmspower' changed from -640.58 to -631.52
	88172	2022-11-11 17:56:29.063 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'jkbms_jkbmspower' changed from -631.52 to -649.74
	88180	2022-11-11 17:56:44.317 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'jkbms_jkbmspower' changed from -649.74 to -635.72
	88192	2022-11-11 17:56:59.576 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'jkbms_jkbmspower' changed from -635.72 to -644.70
	88201	2022-11-11 17:57:14.762 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'jkbms_jkbmspower' changed from -644.70 to -644.34
	88212	2022-11-11 17:57:30.030 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'jkbms_jkbmspower' changed from -644.34 to -644.03
	88216	2022-11-11 17:57:45.258 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'jkbms_jkbmspower' changed from -644.03 to -635.01
	88228	2022-11-11 17:58:00.509 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'jkbms_jkbmspower' changed from -635.01 to -634.68
	88232	2022-11-11 17:58:15.774 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'jkbms_jkbmspower' changed from -634.68 to -634.66
	88241	2022-11-11 17:58:30.964 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'jkbms_jkbmspower' changed from -634.66 to -634.33
	88248	2022-11-11 17:58:46.202 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'jkbms_jkbmspower' changed from -634.33 to -634.40
	88260	2022-11-11 17:59:01.448 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'jkbms_jkbmspower' changed from -634.40 to -638.82
	88268	2022-11-11 17:59:16.691 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'jkbms_jkbmspower' changed from -638.82 to -638.54
	88277	2022-11-11 17:59:31.956 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'jkbms_jkbmspower' changed from -638.54 to -642.90
	88284	2022-11-11 17:59:47.204 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'jkbms_jkbmspower' changed from -642.90 to -642.95
	88301	2022-11-11 18:00:02.408 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'jkbms_jkbmspower' changed from -642.95 to -638.11

My dotnet application has successfully inserted 19 values but OpenHab inserted only 2 values:

So we can say:

Openhab is receiving exactly the same amount of messages from mqtt
Openhab doesn’t insert all values
My dotnet application is able to insert all values, so there can’t be a performance problem in influxdb
it seams that might be a OpenHab bug?