RRD4J stopped working ... UPDATE: again

Hello.

(My system is openHABian on a Raspberry Pi 4, 4 GB, current stable OH3.)

For reasons unknown to me, the RRD4J system stopped working. I haven’t changed anything in the persistence settings for a long time, so I can’t explain this.

Every time I save the file “rrd4j.persist” again, I see in the log that it is read in again without errors. Shortly after that, however, the following error message always appears:

2022-08-24 16:15:01.105 [WARN ] [ore.internal.scheduler.SchedulerImpl] - Scheduled job '<unknown>' failed and stopped
java.lang.IllegalArgumentException: Null consolidation function specified
	at org.rrd4j.core.ArcDef.<init>(ArcDef.java:43) ~[?:?]
	at org.rrd4j.core.RrdDb.getRrdDef(RrdDb.java:1288) ~[?:?]
	at org.openhab.persistence.rrd4j.internal.RRD4jPersistenceService.getConsolidationFunction(RRD4jPersistenceService.java:397) ~[?:?]
	at org.openhab.persistence.rrd4j.internal.RRD4jPersistenceService.store(RRD4jPersistenceService.java:142) ~[?:?]
	at org.openhab.core.persistence.internal.PersistItemsJob.run(PersistItemsJob.java:60) ~[?:?]
	at org.openhab.core.internal.scheduler.CronSchedulerImpl.lambda$0(CronSchedulerImpl.java:62) ~[?:?]
	at org.openhab.core.internal.scheduler.CronSchedulerImpl.lambda$1(CronSchedulerImpl.java:69) ~[?:?]
	at org.openhab.core.internal.scheduler.SchedulerImpl.lambda$12(SchedulerImpl.java:191) ~[?:?]
	at org.openhab.core.internal.scheduler.SchedulerImpl.lambda$1(SchedulerImpl.java:88) ~[?:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
	at java.lang.Thread.run(Thread.java:829) [?:?]

I also removed and reinstalled the RRD4J persistence add-on, but nothing changed.

How can I solve that?

My rrd4j.persist file (with which I had no problems for months):

Strategies {
    // for rrd charts, we need a cron strategy
    everyMinute : "0 * * * * ?"
}

Items {
    *  : strategy = everyMinute
}

If you let us have a look at that .persist file, we might find a clue.
Additionally, please post the contents of your rrd4j.cfg (any consolidation function would be set in there).

Here is my “rrd4j.cfg” file (the rrd4j.persist is already in the initial post of this thread):

# configure specific rrd properties for given items in this file.
# please refer to the documentation available at
# https://www.openhab.org/addons/persistence/rrd4j/
#
# default_numeric and default_other are internally defined defnames and are used as
# defaults when no other defname applies

#<dsName>.def=[ABSOLUTE|COUNTER|DERIVE|GAUGE],<heartBeat>,[<minValue>|U],[<maxValue>|U],<sampleInterval>
#<dsName>.archives=[AVERAGE|MIN|MAX|LAST|FIRST|TOTAL],<xff>,<samplesPerBox>,<boxCount>
#<dsName>.items=<comma separated list of items for this dsName>

My persistence settings:

You haven’t changed anything in the persistence settings, but did you change anything else?
Changing an item’s type, for example, can cause a “stumble”.

I would set the log level of org.openhab.persistence.rrd4j to DEBUG (in the openHAB console: log:set DEBUG org.openhab.persistence.rrd4j) and restart the rrd4j bundle. That way we should see more log entries regarding rrd4j.

I changed the level to DEBUG, and after re-saving the .persist file I got the following output.

log.txt (74.0 KB)

Any ideas?

That log shows:

Several items can’t be persisted, probably due to an unsupported item type (my guess: groups?).

Several items are persisted, although the reported error shows up for some of them. In other words, rrd4j is working!

I am missing the lines that show the creation of the archives; they would be printed if the rrd4j bundle were stopped and started again. The archives use consolidation functions, and it is the consolidation function that is reported as Null in the error.
I know you didn’t change any of the default setups however…
Additionally, please state the item types for those items that receive the error message.

yep

I keep the configuration files under version control with git and have now simply reset them to their last committed state. It now seems to work again.

It seems that with my latest feature addition (a mode for heating maintenance) I added some definitions that RRD4J did not like.

Unfortunately, I still don’t know the specific cause, but at least the system is running cleanly again for now.

Thank you for your support :)

So, here is the sequel:
I wanted to add another feature and created some groups and items for it.

Everything works except for the creation of the last items in the groups, and I don’t understand why at all.
I have already deleted the new rrd files of these items and then restarted rrd4j. As long as I don’t include any of the remaining items, everything works, but if I include even one of them, it crashes immediately.

This is the config of one of the six failing new items (IFSR_HeatRequested, commented out here):


Group:Number:SUM              gIR_HeatRequests         
Group:Number:COUNT("ON")      gIBR_HeatRequests         (gIR_HeatRequests)
Group:Number:COUNT("ON")      gIGR_HeatRequests         (gIR_HeatRequests)
Group:Number:COUNT("ON")      gIFR_HeatRequests         (gIR_HeatRequests)

Group IFS_Radiator        "HK Schlafzimmer"       <radiator>  (IF_Schlafzimmer,OG_Radiators)     ["RadiatorControl"]
    Number  IFSR_Current          "Temperatur"        <temperature>  	  (IFS_Radiator,OG_RadiatorCurrents)        ["Temperature"]         { channel="zwave:device:z_wave_ctrl_01:node13:sensor_temperature" }
    Number  IFSR_Dest             "Soll"              <temperature>       (IFS_Radiator,OG_RadiatorDestinations)        ["Temperature"]         { channel="zwave:device:z_wave_ctrl_01:node13:thermostat_setpoint_heating" }
    Number  IFSR_Mode_Internal    "Thermostat-Modus"  <radiator>          (IFS_Radiator)        ["Temperature"]         { channel="zwave:device:z_wave_ctrl_01:node13:thermostat_mode" }
    Number  IFSR_Day              "Soll (Tag)"        <temperature_hot>   (IFS_Radiator)        ["Temperature"]    
    Number  IFSR_Night            "Soll (Nacht)"      <temperature_cold>  (IFS_Radiator)        ["Temperature"]    
    String  IFSR_Mode_Active      "Modus (soll)"      <radiator>  	      (IFS_Radiator,OG_RadiatorModeActives)         
    String  IFSR_Mode_Valve       "Ventilmodus"       <radiator>          (IFS_Radiator,OG_RadiatorModeValves)    
    Number  IFSR_Batterie         "HK Schlafzimmer"          <batterylevel>      (IFS_Radiator,IH_BatteryLevels)        ["Energy"]         { channel="zwave:device:z_wave_ctrl_01:node13:battery-level" }
    String  IFSR_TLP_TransferItem "[%s]"                                  (IFS_Radiator,gTimepicker)
    Dimmer  IFSR_ValveOpening     "Ventilöffnung"                         (IFS_Radiator,OG_RadiatorValveOpenings)        ["Opening"]       { channel="zwave:device:z_wave_ctrl_01:node13:switch_dimmer"}
//    Switch  IFSR_HeatRequested    "Wärmeanforderung"          <heating>   (IFS_Radiator,gIFR_HeatRequests)                            ["Energy"]        

This is the log without the item IFSR_HeatRequested:

2022-09-17 12:44:01.309 [TRACE] [d4j.internal.RRD4jPersistenceService] - Ignoring item 'IFSR_Mode_Active' since its type String is not supported
2022-09-17 12:44:01.310 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'IFSR_Day' as value '19.0' in rrd4j database (again)
2022-09-17 12:44:01.310 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'IFSR_Day' as value '19.0' in rrd4j database
2022-09-17 12:44:01.312 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'IFSR_Night' as value '16.0' in rrd4j database (again)
2022-09-17 12:44:01.312 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'IFSR_Night' as value '16.0' in rrd4j database
2022-09-17 12:44:01.313 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'IFSR_Current' as value '18.75' in rrd4j database (again)
2022-09-17 12:44:01.314 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'IFSR_Current' as value '18.75' in rrd4j database
2022-09-17 12:44:01.315 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'IFSR_Batterie' as value '50.0' in rrd4j database (again)
2022-09-17 12:44:01.316 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'IFSR_Batterie' as value '50.0' in rrd4j database
2022-09-17 12:44:01.317 [TRACE] [d4j.internal.RRD4jPersistenceService] - Ignoring item 'IFSR_TLP_TransferItem' since its type String is not supported
2022-09-17 12:44:01.318 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'IFSR_Dest' as value '16.0' in rrd4j database (again)
2022-09-17 12:44:01.318 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'IFSR_Dest' as value '16.0' in rrd4j database
2022-09-17 12:44:01.320 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'IFSR_ValveOpening' as value '0.0' in rrd4j database (again)
2022-09-17 12:44:01.320 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'IFSR_ValveOpening' as value '0.0' in rrd4j database
2022-09-17 12:44:01.321 [TRACE] [d4j.internal.RRD4jPersistenceService] - Ignoring item 'IFSR_Mode_Valve' since its type String is not supported
2022-09-17 12:44:01.322 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'IFSR_Mode_Internal' as value '0.0' in rrd4j database (again)
2022-09-17 12:44:01.322 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'IFSR_Mode_Internal' as value '0.0' in rrd4j database
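With a long DEBUG/TRACE log, it can help to summarize which items were stored and which were ignored. The following is only a sketch: the two regular expressions are modeled on the log lines shown above, and the function names summarize and unaccounted are made up for illustration.

```python
import re

# Patterns modeled on the rrd4j log lines quoted in this thread.
STORED = re.compile(r"Stored '([^']+)' as value '([^']*)' in rrd4j database")
IGNORED = re.compile(r"Ignoring item '([^']+)' since its type (\w+) is not supported")


def summarize(lines):
    """Return (stored, ignored): last stored value and unsupported type per item."""
    stored, ignored = {}, {}
    for line in lines:
        match = STORED.search(line)
        if match:
            stored[match.group(1)] = match.group(2)
            continue
        match = IGNORED.search(line)
        if match:
            ignored[match.group(1)] = match.group(2)
    return stored, ignored


def unaccounted(expected_items, stored, ignored):
    """Items from expected_items that appear in neither the stored nor the ignored set."""
    return sorted(set(expected_items) - set(stored) - set(ignored))


if __name__ == "__main__":
    sample = [
        "... - Ignoring item 'IFSR_Mode_Active' since its type String is not supported",
        "... - Stored 'IFSR_Day' as value '19.0' in rrd4j database",
    ]
    stored, ignored = summarize(sample)
    print("stored:", stored)
    print("ignored:", ignored)
    print("unaccounted:", unaccounted(["IFSR_Day", "IFSR_HeatRequested"], stored, ignored))
```

Items that turn up as unaccounted are only candidates for the one that triggers the exception: if the scheduled job stops at the first failure, items that would have been processed after it are missing from that run as well.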

When IFSR_HeatRequested is not commented out, I get the following logs:

2022-09-17 12:52:15.340 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'og_schlafzimmer.items'

==> /var/log/openhab/events.log <==

2022-09-17 12:52:15.372 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'IFSR_HeatRequested' changed from NULL to OFF
...
2022-09-17 12:53:00.842 [WARN ] [ore.internal.scheduler.SchedulerImpl] - Scheduled job '<unknown>' failed and stopped
java.lang.IllegalArgumentException: Null consolidation function specified
	at org.rrd4j.core.ArcDef.<init>(ArcDef.java:43) ~[?:?]
	at org.rrd4j.core.RrdDb.getRrdDef(RrdDb.java:1288) ~[?:?]
	at org.openhab.persistence.rrd4j.internal.RRD4jPersistenceService.getConsolidationFunction(RRD4jPersistenceService.java:397) ~[?:?]
	at org.openhab.persistence.rrd4j.internal.RRD4jPersistenceService.store(RRD4jPersistenceService.java:142) ~[?:?]
	at org.openhab.core.persistence.internal.PersistItemsJob.run(PersistItemsJob.java:60) ~[?:?]
	at org.openhab.core.internal.scheduler.CronSchedulerImpl.lambda$0(CronSchedulerImpl.java:62) ~[?:?]
	at org.openhab.core.internal.scheduler.CronSchedulerImpl.lambda$1(CronSchedulerImpl.java:69) ~[?:?]
	at org.openhab.core.internal.scheduler.SchedulerImpl.lambda$12(SchedulerImpl.java:191) ~[?:?]
	at org.openhab.core.internal.scheduler.SchedulerImpl.lambda$1(SchedulerImpl.java:88) ~[?:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
	at java.lang.Thread.run(Thread.java:829) [?:?]
...

When I delete the rrd file and restart rrd4j, I get the following additional trace line, followed by an error:

2022-09-17 13:06:01.062 [TRACE] [d4j.internal.RRD4jPersistenceService] - Using rrd definition 'default_other = GAUGE heartbeat = 3600 min/max = NaN/NaN step = 5 4 archives(s) = [ LAST xff = 0.5 steps = 1 rows = 720 LAST xff = 0.5 steps = 12 rows = 10080 LAST xff = 0.5 steps = 180 rows = 35040 LAST xff = 0.5 steps = 2880 rows = 21900] 0 items(s) = []' for item 'IFSR_HeatRequested'.

2022-09-17 13:06:01.256 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception: 
java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
	at jdk.internal.misc.Unsafe.putLongUnaligned(Unsafe.java:3553) ~[?:?]
	at java.nio.ByteBufferAsDoubleBufferB.put(ByteBufferAsDoubleBufferB.java:145) ~[?:?]
	at java.nio.DoubleBuffer.put(DoubleBuffer.java:887) ~[?:?]
	at org.rrd4j.core.ByteBufferBackend.writeDouble(ByteBufferBackend.java:81) ~[?:?]
	at org.rrd4j.core.RrdPrimitive.writeDouble(RrdPrimitive.java:78) ~[?:?]
	at org.rrd4j.core.RrdDoubleMatrix.<init>(RrdDoubleMatrix.java:15) ~[?:?]
	at org.rrd4j.core.Archive.<init>(Archive.java:66) ~[?:?]
	at org.rrd4j.core.RrdDb.<init>(RrdDb.java:482) ~[?:?]
	at org.rrd4j.core.RrdDb.of(RrdDb.java:417) ~[?:?]
	at org.openhab.persistence.rrd4j.internal.RRD4jPersistenceService.getDB(RRD4jPersistenceService.java:331) ~[?:?]
	at org.openhab.persistence.rrd4j.internal.RRD4jPersistenceService.store(RRD4jPersistenceService.java:140) ~[?:?]
	at org.openhab.core.persistence.internal.PersistItemsJob.run(PersistItemsJob.java:60) ~[?:?]
	at org.openhab.core.internal.scheduler.CronSchedulerImpl.lambda$0(CronSchedulerImpl.java:62) ~[?:?]
	at org.openhab.core.internal.scheduler.CronSchedulerImpl.lambda$1(CronSchedulerImpl.java:69) ~[?:?]
	at org.openhab.core.internal.scheduler.SchedulerImpl.lambda$12(SchedulerImpl.java:191) ~[?:?]
	at org.openhab.core.internal.scheduler.SchedulerImpl.lambda$1(SchedulerImpl.java:88) ~[?:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
	at java.lang.Thread.run(Thread.java:829) [?:?]
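As a side note on the trace line above, the numbers in the 'default_other' definition can be turned into time spans. This is a rough sketch, assuming that 'step' is the sample interval in seconds, 'steps' is the number of samples consolidated into one archive row (samplesPerBox in the rrd4j.cfg comments), and 'rows' is the number of rows kept (boxCount). The function name archive_coverage is made up for illustration.

```python
# Values copied from the 'default_other' trace line: step = 5 (assumed to be
# seconds) and four LAST archives, each given as (steps, rows).
STEP_SECONDS = 5
ARCHIVES = [(1, 720), (12, 10080), (180, 35040), (2880, 21900)]


def archive_coverage(step_seconds, steps, rows):
    """Return (seconds per archive row, total seconds covered by the archive)."""
    resolution = step_seconds * steps
    return resolution, resolution * rows


if __name__ == "__main__":
    for steps, rows in ARCHIVES:
        resolution, span = archive_coverage(STEP_SECONDS, steps, rows)
        print(f"steps={steps:>4} rows={rows:>5}: "
              f"one row per {resolution} s, covers {span / 86400:g} days")
```

Under these assumptions the four archives cover about one hour, 7 days, 365 days and 3650 days. The trace line therefore shows a complete definition with LAST as the consolidation function in every archive.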

How can I solve this problem?

PS: It is also interesting that I have already successfully integrated exactly these items on other radiators.

PPS: OK, there is a small update:
For testing, I wanted to back up all rrd files and then delete them (not only those of the problem items, as I did before).

But that caused other problems (e.g., I now know that the persistence files are in an overlay directory).

The result is that I lost all my historical data - but I was now able to create the last items without any problems.

If anyone has any tips on how I should have proceeded better in this case or what the cause probably was - please let me know.
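On the question of how to proceed more safely: one option is to copy the rrd files to a location outside the persistence directory before deleting anything. The following is a minimal sketch, not a tested procedure for openHABian. The function name backup_rrd_files is made up, and both paths in the usage note are assumptions that need to be checked on the actual system, especially since the persistence files sit in an overlay directory here.

```python
import shutil
import time
from pathlib import Path


def backup_rrd_files(source_dir, backup_root):
    """Copy every *.rrd file from source_dir into a new timestamped folder.

    Returns the created folder and the list of copied file names.
    """
    source = Path(source_dir)
    target = Path(backup_root) / time.strftime("rrd4j-%Y%m%d-%H%M%S")
    # Fail rather than overwrite if a backup with this name already exists.
    target.mkdir(parents=True, exist_ok=False)
    copied = []
    for rrd_file in sorted(source.glob("*.rrd")):
        shutil.copy2(rrd_file, target / rrd_file.name)  # copy2 keeps timestamps
        copied.append(rrd_file.name)
    return target, copied
```

A call could look like backup_rrd_files("/var/lib/openhab/persistence/rrd4j", "/home/openhabian/rrd4j-backups"), where both paths are assumed defaults rather than values confirmed in this thread. It is probably safest to stop openHAB first, so that no file is copied while it is being written, and to check that the number of copied files is plausible before deleting the originals.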