Migration OH2 to OH3 - Unknown errors - no more ideas

Peter_Novotny · February 27, 2023, 9:07pm

Hello everybody

In recent days/weeks/months I tried to migrate my small automation project that moves rollershutter in our house depending on light and other conditions. The result is after more than 100 hours of work I ended up with unbelievable mess.
It does not want to start and after start rules are not triggered, UI shows incorrect values etc…
My time I’m willing to spend on the migration is really run out together with mental power. I’m not saying OH3 is bad product, but I’m in place where I can not move forward with my knowledge.
Either someone here will be able and willing to help or I will change back to OH2. I’m ready to accept also non-free support, but please note that I do not earn money by this and IT is only my hobby.

List of known issues I can not get rid of:

Startup: I ended with very ugly workaround Binding not yet ready in startup sequence, how to detect
Numbers: In sitemap I have Switch item=AUTO_MODE mappings=[0="Neutral", 1="Heating", 2="Cooling"], when I change AUTO_MODE no indication is made in UI, if I log the value it is logged as 2.0 (on OH2 it was 2), this might be the reason but have no idea how to fix and did not find anything in documentation about number format change, this was observed on all number variables and maybe is also reason for the next point
Rules that are not directly triggered by UI (e.g. rule to start when light reaches certain level) are not triggered at all. There is absolutely no record in log file and do not know where to continue search
Out of OPENHAB scope, there is C program that send 433 and NRF wireless, NRF works, but 433 does not work. Hardware and wiring is for sure OK because I’m using the same HW connected to the same pins, this HW is moved between old Raspberry PI and new Raspberry PI and on old it works without problem

If anyone is interested repo is here: GitHub - IcePlanet/HomeAutomation: Home automation with Arduino and Pi

master is guaranteed to work on OH2 and Raspberry PI 2B
branch OH3_GW is guaranteed to NOT work on OH3 and Raspberry PI 3B
please note that it is under documented, but I can explain anything unclear
code is not nice, but again it is not my job and I’m trying my best

Anyone interested/able/willing to help?

rlkoshak · February 27, 2023, 9:50pm

This seems overly complex and brittle. If your system can’t run while those Exec Things are OFFLINE, why not just simply do the following:

Create a rule that triggers at system runlevel 100 and when the Exec Thing goes ONLINE and update a Switch Item to ON.
Create a rule that triggers when the Exec Thing goes OFFLINE and updates the Switch to OFF.

In the rest of your rules where it matters, check to see if the Switch is ON or OFF and behave as appropriate.

Another thing you can do is:

trigger a rule at system runlevel 40 (I think that’s the earliest runlevel) and when the Exec Thing goes OFFLINE and disable all the Rules that depend on it.
trigger a rule when the Exec Things goes ONLINE to enable the rules that depend on it.

All these sleeps and locks and other timing things are just way to complicated and way to brittle. With this approach you not only get a more robust system that doesn’t depend on timing but you can only enable/disable those rules that really care (e.g. some rules may care about this Exec binding Thing but another set may care about the MQTT Broker Thing).

But in general, you cannot guarantee that any Thing is initialized and ONLINE at any point in a rule’s life cycle, but particularly at startup since the rule engine is available (runlevel 40) even before the Things are loaded (runlevel 80).

I’ve not seen this one reported. What happens if you use a Dimmer for the Item instead of a Number? PercentType, state carried by Dimmer, can only be integers. That would confirm or eliminate the decimal point as being the problem. I don’t know how you use these, but a String Item would also work here without problem. You’d have to change your mappings by adding quotes and your rules of course.

In $OH_USERDATA/etc/log4j2.xml find the line

<Logger level="ERROR" name="openhab.event.RuleStatusInfoEvent"/>

and change the level to DEBUG. While you are at it, change the StartlevelEvent logger to DEBUG too. That will add the rule state changes to events.log as well as the runlevel events so you can see if perhaps your system never reaches that runlevel.

Alternatively, you can open the Developer Sidebar in MainUI (alt-shift-d) and use the event stream there to see all the events that happen in OH (there’s a lot more than just what you see in events.log). You can set up a filter to see just the rule events. But this won’t help you much during a system startup.

There’s also the good old technique of adding logInfo lines to the top of all your rules before any of the logic. That can show when the rules actually trigger.

Without more details, such as confirmation that the rules are indeed not running, I doubt anyone will be able to help.

Peter_Novotny · February 27, 2023, 10:25pm

Hello rlkoshak I very much appreciate all the effort and help you are trying to give me. For sure your knowlege is much better than my, but please can you be more specific what to try in the parts I will comment below:

STARTUP: Any operation will trigger Java garbage, simple comparison to null, for cycle… create errors. How in this state can I check if service is online (btw this was my first attempt as you can see in another thread because I also do not like delays and hoping, but really have no idea how to write it in a way that it will not trigger errors.

NUMBERS: For the one type mentioned in previous example your solution will work, I can use more or less anything, but I’m using number constructs across all rules comparing them, multiplying etc… none of this works now, or to be more precise works ‘randomly’ for example log from today:

2023-02-26 21:05:54.215 [INFO ] [penhab.core.model.script.Persistence] - Persistence refreshing S_rain_southffxRAIN in state 0
2023-02-26 21:05:54.225 [INFO ] [penhab.core.model.script.Persistence] - Persistence refreshing S_light_southffxSLD in state 2.0
2023-02-26 21:05:54.260 [INFO ] [penhab.core.model.script.Persistence] - Persistence refreshing AUTO_MODE in state 0.0
2023-02-26 21:21:18.596 [INFO ] [org.openhab.core.model.script.Light ] - Light changed input is 2.0/0

You can see sometimes there is 0, sometimes 0.0, sometimes 2.0 in the Light input, first value (with decimal places) is OH state based on second value that is raw number received from API measured by sensor.

DEBUGGING: I admit that I have not used the DEBUG level because it floods very quickly and even if there might be something useful with my knowledge I’m not able to find it. With loginfo I feel like I’m already overusing and it confuses me even more, lets take the RealStartup rule. It contains

logInfo ("STARTUP", "System Starting... home_away status: "+HOME_AWAY.state)

as first line and

logInfo ("STARTUP", "Startup finished...")

as almost last line. System is rebooted multiple times and each time the appearance is different. First line (System Starting) is logged 80% cases with the delay workaround it increased to 100%, but last line (startup finished) is logged in cca 50% cases. No file or config has changed, only raspberry pi rebooted. How can I debug something when each run has non deterministic result???

Because of your knowledge I have seen here on forum when I was often following your advice to other people would you be interested to help me on non-free basis?

rlkoshak · February 27, 2023, 10:57pm

In Rules DSL:

rule "Exec Thing ONLINE"
when
    Thing <UID:OF:EXEC:THING> changed to ONLINE or
    System reached start level 50
then
    if(getThingStatusInfo(<UID:OF:EXEC:THING>).toString == "ONLINE") {
        ExecThingOnline.postUpdate(ON);
    }
    else {
        ExecThingOnline.postUpdate(OFF);
    }
end

rule "Exec Thing ONLINE"
when
    Thing <UID:OF:EXEC:THING> changed
then
    if(getThingStatusInfo(<UID:OF:EXEC:THING).toString != "ONLINE") {
        ExecThingOnline.postUpdate(OFF);
    }
end

Obviously replace <UID:OF:EXEC:THING> with the actual ID of your Exec Thing.

These are driven by events. Except in the runlevel case, the rule won’t run unless the Thing is already loaded so you shouldn’t encounter any errors related to that. If you do encounter errors, keep upping the runlevel until the errors go away or you reach 100.

If the errors continue after you get to 100, your system is very sick and there really isn’t anything you can do from rules to truly address the problems.

Note: that “Java garbage” is there for a reason. It’s meaningful. It’s likely going to tell us what is really wrong.

Yes, the state is what ever the sensor, UI, or rule updates the Item to and exactly that. Maybe there was a change in the binding you are using and now it’s sending floats instead of ints.

Honestly, the fact that you’ve implemented your own “persistence” raises all sorts of red flags anyway. It’s really hard, with the limited information provided, to even begin to tell what’s failing because you’ve some sort of timing issue or the like that has lain dormant until now, a simple configuration error, or a bug in OH (least likely or else others who have been running OH 3 for two years now would have reported it).

The fact that you’ve gone and implemented your own persistence in rules really points to the first option more than anything. But you’ve kept so much from us who can tell.

Well, first of all, it’s not non-determinisitc. An error is happening of some sort causing the rule to either never stop or exit before that last line. Maybe an error occurs (one of those “Java garbage” things) and the rule exits prematurely. Without looking at logs

Peter_Novotny · February 27, 2023, 11:31pm

I understand that my lame coding practices might raise red flags by you, but I would like to point out that system was running without problem for more than 2 years on OpenHab2. Please take into account the working state on OH2 and any modification made was only because of migration, no new functions or changes to logic have been introduced.

You can see my primitive persistence (that was working for 2 years without issue on OH2) here HomeAutomation/simple_persistence.sh at OH3_GW · IcePlanet/HomeAutomation · GitHub

In things I have:

Thing exec:command:persistence [
            command="/usr/local/bin/simple_persistence.sh %2$s",
            interval=0,
            timeout=33,
            autorun=false]

In items there is persistence as group, anything what should be persistent is assigned to group gPersistent like this:

Number S_rain_southffxRAIN "Rain sensor SOUTH [%s]" (gS_TRIGGER_ALL, gPersistent) // 0:dry; 1:rain
Switch HOME_AWAY "Away" <lock> (gS_TRIGGER_ALL, gPersistent) // ON:away; OFF:home

The rules taking care about persistence are really simple:

// reading persistence at startup
rule "RealStartup"
  when 
    Item LATE_STARTUP received command ON
  then
// skipped no persistence relevant code
    gPersistent.getMembers.forEach [p | 
      logInfo ("Persistence", "Persistence refreshing "+p.name+" in state "+p.state.toString())
      if ( p.state == NULL ) {
        try {
          persistence_lock.lock()
          PERSISTENCE_Args.sendCommand("-R -a "+p.name)
          PERSISTENCE.sendCommand(ON)
          while (PERSISTENCE.state == ON) {
            Thread::sleep (100)
          }
          Thread::sleep (1000)
          if ( PERSISTENCE_Out.state.toString.length > 0 ) {
            logInfo ("Persistence", "Persistence read for "+p.name+" as "+PERSISTENCE_Out.state+" ("+PERSISTENCE_Out.state.toString.length+")")
            postUpdate (p.name.toString, PERSISTENCE_Out.state.toString)
          }
          else {
            logInfo ("Persistence", "Persistence NOT read for "+p.name+" as "+PERSISTENCE_Out.state+" ("+PERSISTENCE_Out.state.toString.length+")")
          }
        } catch(Throwable t) { logInfo ("Rollershutter", "ERROR EXCEPTION IN Timer_expired") }
        finally {
            persistence_lock.unlock()
        } //finally
      }
    ]
    Thread::sleep (1000)
    running_persistence_read = 0
// skipped no persistence relevant code
end

// and writing persistence
rule "Write_Persistence"
  when
    Member of gPersistent changed
  then
    if ( running_persistence_read == 0 ) {
      try {
        logInfo ("Rollershutter", "Writing persistence for:" + triggeringItem.name)
        persistence_lock.lock()
        PERSISTENCE_Args.sendCommand("-W -a "+triggeringItem.name+" -v "+triggeringItem.state)
        PERSISTENCE.sendCommand(ON)
        while (PERSISTENCE.state == ON) {
          Thread::sleep (500)
        }
      } catch(Throwable t) { logInfo ("Rollershutter", "ERROR EXCEPTION IN write persistence") }
      finally {
          persistence_lock.unlock()
      } //finally
    }
end

Peter_Novotny · February 27, 2023, 11:36pm

It is vice versa, in the log

2023-02-26 21:21:18.596 [INFO ] [org.openhab.core.model.script.Light ] - Light changed input is 2.0/0

the 1st number 2.0 (in this case) is what openhab has calculated based on value received from sensor 0 via API

rlkoshak · February 27, 2023, 11:46pm

It has nothing to do with “primitive code.” I has to do with the fact that you’ve reimplemented something that comes with OH out of the box. I’m sure you had your reasons but I guarantee you’ve spent more time and effort creating and maintaining it than if you used what’s built into OH.

What raises the red flags are all the sleeps and locks. If this worked in OH 2, it was only a trick of timing and you were luck. You are lucky you didn’t starve out your rules after consuming all the rules threads while rules are sleeping (something that is impossible to do in OH 3 now that each rule gets it’s own thread).

By all appearances it worked only by accident in OH 2 and now you’ve moved to a different system with different timing, you are suffering from timing issues.

But :shrug. Still haven’t seen a single log from a startup.

And this rule is?

It is now and has always been the case that the value stored by the Item is the state it was updated to. There is no calculations that take place when an Item is updated. It gets a little thornier when dealing with UoM but even then, the units might get thrown away or converted. But OH never changes the state.

Note, if you called MyLightItem.postUpdate(2) and then immediately logged out MyLightItem.state the log is almost certainly going to be the old state of the Item, not the new state. Updates are processed asynchronously. If this worked before, again, it was a trick of timing. OH 3 executes rules more efficiently than OH 2, it now doesn’t take as long as it used to and your log is happening before the Item updates now. But again :shrug. You’ve not posted the rule. You’ve not posted relevant logs. I’m only guessing what could be happening here.

But all this is already too confusing. Let’s stick to one thing at a time. Do you want to address the startup behavior or the number behavior. Going after both at the same time will lead us nowhere.

If startup, post the full openhab.log of the startup sequence along with the corresponding events.log and any rules that throw errors.

If you want to go after the Number issue, post the definition of these Items, relevant rules, relevant parts of openhab.log and relevant parts of events.log.

Wolfgang_S · February 28, 2023, 6:36am

I don’t see the /etc/openhab/misc/exec.whitelist mentioned in the github repo which is required to run exec commands ( Exec - Bindings | openHAB ).
In case you don’t have it I would expect to see messages in the logfile.

Max_G · February 28, 2023, 10:07pm

I can feel your pain. I really do.
I recently migrated from OH2 to OH3 with a similar effort in hours (and all sorts of hair-pulling and ideas of giving up).

Based on what I read here, my suggestion would be to go back to OH2.
This will give you back a working system. It allows you to relax and recuperate from the experience you had so far.

I would then get another rPi (or whatever you’re running it on), and rebuild the system in OH3, but the proper way, as documented.

I used openHABian; migrated rule by rule, while fixing any errors. This will get easier as you progress. I also documented the changes in command for ease of repeatability.

By doing so you end up with a system more people are willing to support you.
You then have something that is set-up properly and you are on the last version, people are also more inclined to help you with.

On a side note: if Rich would tell me I am doing something wrong or unusual, I would follow his advice, and change it to what it should be.

All the very best.

Peter_Novotny · March 1, 2023, 9:20pm

First of all I would like to ensure rlkoshak and Max_G that I strongly respect knowledge, experience and all the effort of rlkoshak and also other members of this community. I have often used the hints in other threads (where I have not participated) and the hints and pieces of code helped me a lot. With all the respect it does not mean that I would blindly implement something without thinking. And asking, arguing, discussing various approaches does not mean I do not respect the other opinion. Maybe I do not understand it sufficiently, maybe it is in different situation… If you feel offended that I’m not willing to listen to you then I apologize because it was not my intention and I have never thought of you and/or your hints like this.

Now to the practical points

Persistence If I would be starting today I would do it differently, at that time I was starting with openhab and whole automation. It has taken me lot of effort to understand the basic principles of OH, rules, api, sensors, arduino, radios, HW… and I need really the most primitive system to keep variable values, for this reason I re-invented the wheel, in fact at that time I was not knowing that the wheel is already invented.

Sleeps and locks For many reasons (starting by variables race conditions and finishing by relay currents…) it was big benefit in my setup to serialize all (almost) rules that can move the rollershutter (price for this is slower reaction to commands, but time is not important for my use case). This in turn can easily use all available execution threads as rlkoshak correctly mentioned. To prevent this if rule is going to trigger multiple rules that might result in rollershutter movement it starts them only if the lock is open (this is the loop with delay inside), because of race conditions this results in max 2 execution threads (little bit simplified, but it is not that important to go into deep details, was tested for more than 300 motors control and worked perfectly) occupied by rules that can move rollershutter and this threads are serialized by locks.
Result are long running rules (I was not aware this causes problems to OH as I have not experienced problems with this until now) but keeping still threads free for manually triggered rules, that can even ‘overtake’ the automated rules.

Numbers with(out) decimal places
Rule is here:

rule "South SLD"
// 0:Sunshine 1:Light 2:Dark
  when
    Item S_light_southffxINRAW changed
  then
    logInfo("Light", "Light changed input is " + S_light_southffxSLD.state + "/" + S_light_southffxINRAW.state)
    if ( S_light_southffxDarkTreshold.state == NULL ) { S_light_southffxDarkTreshold.postUpdate(sensor_default_dark) }
    if ( S_light_southffxSunshineTreshold.state == NULL ) { S_light_southffxSunshineTreshold.postUpdate(sensor_default_sunshine) }
    if ( S_light_southffxHysteresis.state == NULL ) { S_light_southffxHysteresis.postUpdate(sensor_default_hyst) }
    if ( S_light_southffxSLD.state == NULL ) {
      if ( (S_light_southffxINRAW.state as Number) <= (S_light_southffxDarkTreshold.state as Number) ) { S_light_southffxSLD.sendCommand (2) }
      else if ( (S_light_southffxINRAW.state as Number) <= (S_light_southffxSunshineTreshold.state as Number) ) { S_light_southffxSLD.sendCommand (1) }
      else if ( (S_light_southffxINRAW.state as Number) > (S_light_southffxSunshineTreshold.state as Number) ) { S_light_southffxSLD.sendCommand (0) }
    }
    else {
      if ( (S_light_southffxSLD.state as Number) == 0 ) {
        if ( (S_light_southffxINRAW.state as Number) <= (S_light_southffxDarkTreshold.state as Number) ) { S_light_southffxSLD.sendCommand (2) }
        else if ( (S_light_southffxINRAW.state as Number) <= ((S_light_southffxSunshineTreshold.state as Number) - (S_light_southffxHysteresis.state as Number)) ) { S_light_southffxSLD.sendCommand (1) }
      }
      if ( (S_light_southffxSLD.state as Number) == 1 ) {
        if ( (S_light_southffxINRAW.state as Number) >= ((S_light_southffxSunshineTreshold.state as Number) + (S_light_southffxHysteresis.state as Number)) ) { S_light_southffxSLD.sendCommand (0) }
        else if ( (S_light_southffxINRAW.state as Number) <= ((S_light_southffxDarkTreshold.state as Number) - (S_light_southffxHysteresis.state as Number)) ) { S_light_southffxSLD.sendCommand (2) }
      }
      if ( (S_light_southffxSLD.state as Number) == 2 ) {
        if ( (S_light_southffxINRAW.state as Number) >= (S_light_southffxSunshineTreshold.state as Number) ) { S_light_southffxSLD.sendCommand (0) }
        else if ( (S_light_southffxINRAW.state as Number) >= ((S_light_southffxDarkTreshold.state as Number) + (S_light_southffxHysteresis.state as Number)) ) { S_light_southffxSLD.sendCommand (1) }
      }
    }
end

The value of S_light_southffxINRAW is changed by API from C code as the C code is receiving data from arduinos that measure light intensity and after it is checked and validated it is submitted to openhab and openhab by rule above prepares cleaned value of S_light_southffxSLD. Depending on the change it might trigger rollershutter movement.
This is code scheduled to refactor, because with this approach each sensor would require own rule (not ideal), for this reason the RAW variables will be in group and if member of group will be changed the corresponding ‘cleaned’ variable will be updated. After refactor rain sensors + north and east light sensors will be connected.

Exec whitelist Sorry my mistake, not in GIT, forgot to add, now I’m not at my development machine, but will fix.

START Yes we should ge one by one and correct start is good start, as soon as I move the OH3 to development area (made change too fast and my OH3 has now HDMI turned off, wifi off and live IP address = need to change IP but without monitor and network ) I will post fresh log with all the details as mentioned by rlkoshak . In live is currently OH2 back.

Thanks for the help and hope to have some news by next week (this week I have very intensive work in my real work and not sufficient time for my hobby).

rlkoshak · March 1, 2023, 10:09pm

Except that a rule waiting for a lock is consuming a thread.

When you move over, forget about the hardware to start. Us the ~Scratchpad~ or build a Sitemap/MainUI interface where you can modify Item states and watch the logs as the rules execute. Then you can have both systems running in parallel and when it’s time to switch over to the new system, everything but that hardware interface will have been tested.

Peter_Novotny · March 4, 2023, 7:30pm

Adjusted code of startup to trigger on runlevel as suggested by rlkoshak few posts above
100 (thinking 100 is safer than 50)
Added equivalent of GreenFlag from nyholm, called SystemStarted
For now the ‘watch’ for exec status is NOT implemented as it requires rework, what is in current startup sequence can not be run more than once.

var SystemStarted = 0
rule "Startup"
  when 
    System reached start level 100
  then

First restart:

2023-03-04 20:00:25.919 [WARN ] [el.core.internal.ModelRepositoryImpl] - Ignoring file 'rollershutter.rules' as we do not have a parser for it.

Second restart (please note that according to my knowledge there is NO RRD4jPersistenceService used in my rules)

2023-03-04 20:05:55.685 [INFO ] [rg.openhab.core.model.script.STARTUP] - System Starting... home_away status: NULL System Started value:0
2023-03-04 20:06:00.245 [ERROR] [internal.handler.ScriptActionHandler] - Script execution of rule with UID 'rollershutter-1' failed: The name '<unkown>' cannot be resolved to an item or type; line 0, column 0 in rollershutter
2023-03-04 20:06:09.390 [ERROR] [nal.common.AbstractInvocationHandler] - An error occurred while calling method 'QueryablePersistenceService.query()' on 'org.openhab.persistence.rrd4j.internal.RRD4jPersistenceService@7904a1a0': Value must be between 0 and 100
<Java garbage follows>

Maybe runlevel 100 was over optimistic, changed startup runlevel to 50 as suggested originally by rlkoshak.

rule "Startup"
  when 
    System reached start level 50

Log file on 1st reboot

2023-03-04 20:13:34.406 [WARN ] [el.core.internal.ModelRepositoryImpl] - Ignoring file 'rollershutter.rules' as we do not have a parser for it.

Log file on 2nd reboot has this as last message in log and then system is dead, does not react to anything, attempt to reach UI ends with 404 Not Found

2023-03-04 20:16:26.120 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'rollershutter.rules'

Source used located in GIT branch OH3_GW.

Peter_Novotny · March 7, 2023, 8:06pm

Want to make today new test, modified rules to follow exec readiness as Rich suggested but still there is java garbage on QueryablePersistenceService.query?

ice@OpenHab:/etc/init.d $ sudo apt list --installed | grep openhab
openhab-addons/stable,stable,now 3.4.2-1 all [installed]
openhab/stable,stable,now 3.4.2-1 all [installed]

ice@OpenHab:/etc/init.d $ sudo grep DEBUG /var/lib/openhab/etc/log4j2.xml # log configuration
                <Logger level="DEBUG" name="openhab.event.RuleStatusInfoEvent"/>
                <Logger level="DEBUG" name="openhab.event.StartlevelEvent"/>

ice@OpenHab:/etc/init.d $ sudo systemctl restart openhab.service

In log files I do not see any single line named DEBUG. Do I made correct modification of the /var/lib/openhab/etc/log4j2.xml? Attached complete log and source posted on github branch OH3_GW.

Interesting error line is:

2023-03-07 20:50:04.239 [ERROR] [nal.common.AbstractInvocationHandler] - An error occurred while calling method 'QueryablePersistenceService.query()' on 'org.openhab.persistence.rrd4j.internal.RRD4jPersistenceService@688a70fb': Value must be between 0 and 100

Full
openhab.log (6.7 KB)
and
events.log (10.0 KB)
attached.
Does anybody knows where it comes from? Is it part of Openhab core?

rlkoshak · March 8, 2023, 3:32pm

The error is coming from rrd4j. It’s a query so it’s failing to get data from rrd4j. The specific error is complaing that the value must be between 0 and 100 which implies the following:

You’ve a Dimmer Item that is stored in rrd4j.
Somehow a value that is not supported by a Dimmer Item was stored in rrd4j. This could happen if you previously defined an Item as a Number and later changed it to be a Dimmer.
Assuming this appears at system startup, when the saved value is read from the DB to initialize the Item with the saved state (restoreOnStartup) that fails because the value isn’t an integer between 0 and 100.

You wouldn’t. The openhab.event loggers don’t log anything below the INFO level. But you should start to see StartlevelEvents and RuleStatusInfoEvents logged (at the INFO level) in events.log.

Peter_Novotny · March 8, 2023, 10:25pm

With all the respect I believe this is not the case for following reasons:

I checked with my eyes, checked with grep, checked in notepad++ and there is no single occurrence of string rrd4j in my code (and never was)
I checked with my eyes, checked with grep, checked with notepad++ and there is exactly one occurrence of string dimmer in my code and it has nothing to do with openhab as it is used within comment to C code. (search returns file Server_HomeAutomation.c)
In items only Number, Switch, Group, Rollershutter, DateTime types are used NO dimmer
If I use startup sequence with delays the error is not shown

I know you might not believe me, but please look at the search in repo where rrd4j does not find any occurrences.

The above reasons lead me to conclusion that the java garbage is startup problem and the error message is not anyhow related to the real error.

About the log can you please be more specific where to search for the levels beyond INFO? I can not find them. Maybe if you have sample how these DEBUG lines should look like.

rlkoshak · March 8, 2023, 11:49pm

It’s not coming from your code. Rrd4j is a persistence engine. In OH 3 rrd4j comes with OH and is installed by default as the default persistence engine and it’s configured to save all Items it supports on every change, every minute (a requirement for rrd4j) and it does restoreOnStartup.

But even if you were interacting with persistence in your rules, i"rrd4j" is unlikely to show up anywhere. It’s the default. If you call MyItem.lastUpdate in a rule, without changing from the defaults in OH 3, you will be interacting with rrd4j. No grep would show you that.

It could be any Item type or any other place that uses a PercentType as its state or as part of its state. Dimmer and Color are two off the top of my head. But I’m only guessing as to a reason why rrd4j would report that specific error.

Again, I wouldn’t expect it to be. I use the heck out of rrd4j. It drives my charts and all my persistence based calls in all my rules. rrd4j doesn’t appear in any of my configs either.

That may or may not be the case. It’s hard to say. But based on the behaviors you’ve reported I have strong suspicions that your whole SD card is worn out, corrupted because of a power failure, or the machine is so utterly underpowered that OH is unlikely to ever work properly. With the way you’ve written your rules I can’t eliminate those as potential causes either. You are tying up a whole lot of resources doing nothing with all those sleeps and reeentrant locks which may be interfearing with core stuff in OH too.

One thing I am certain of, there isn’t the real error, there are likely several errors.

You will not find the word DEBUG anywhere in events.log. Ever. Period.

A runlevel log looks like this:

2023-03-08 16:10:48.137 [INFO ] [openhab.event.StartlevelEvent       ] - Startlevel '40' reached.

A RuleStatusInfoEvent looks like this:

2023-03-08 16:11:33.433 [INFO ] [openhab.event.RuleStatusInfoEvent   ] - sensor_status_detection updated: RUNNING
2023-03-08 16:11:33.471 [INFO ] [openhab.event.RuleStatusInfoEvent   ] - sensor_status_detection updated: IDLE

Notice, “DEBUG” appears nowhere.

You could set the entries in log4j2.xml to INFO instead of DEBUG and nothing would change. The setting suppresses logs below a certain level. It doesn’t change those levels.

Peter_Novotny · March 10, 2023, 12:27pm

Thank you for your response, I have learned something new again.

Thank you, I missed this information when checking OH3 docs. Was looking for possibility to turn it off or uninstall. It is not clear to me from manual how to do it correctly? It seems (?) that deleting rrd4j.persist should be sufficient?

Sorry, no dimmer, no percentage, have double checked and switch and number are the main items + DateTime and Group.

SD card is bought on 20.11.2022 (from that time I’m trying to make OH3 running) and it is Kingston Industrial rated for “up to” 1920 TBW (I can not find anything better, if you have tip for better card would be glad to know), created backup image now, filled with 1, read, filled with 0, read and no errors found (image flashed back). The card was not used before and is whole time used only for OH3 migration, maybe 200 hours runtime (no 24/7 run, I manually started the Rapberry when I was working on the migration and shutdown afterwards)

Apologize for stupid questions and thank you for explanation. I was simply used to see DEBUG in log when I turn on DEBUG. Was not expecting (maybe have not read the manual carefully enough) that all records will be info regardless of level.
Is there a way to verify my change of log level configuration was correct and taken into account by OpenHab?

I see in the logs runlevel reached

2023-03-07 20:50:05.752 [INFO ] [openhab.event.StartlevelEvent       ] - Startlevel '40' reached.
2023-03-07 20:50:05.756 [INFO ] [openhab.event.StartlevelEvent       ] - Startlevel '50' reached.
2023-03-07 20:50:05.760 [INFO ] [openhab.event.StartlevelEvent       ] - Startlevel '70' reached.
2023-03-07 20:50:05.769 [INFO ] [openhab.event.StartlevelEvent       ] - Startlevel '80' reached.
2023-03-07 20:50:05.773 [INFO ] [openhab.event.StartlevelEvent       ] - Startlevel '100' reached.

and this is the Java garbage first occurence.

2023-03-07 20:50:04.239 [ERROR] [nal.common.AbstractInvocationHandler] - An error occurred while calling method 'QueryablePersistenceService.query()' on 'org.openhab.persistence.rrd4j.internal.RRD4jPersistenceService@688a70fb': Value must be between 0 and 100

Comparing timing, the Java error 20:50:04.239 is earlier than start level 40 20:50:05.752.
If I remember correctly items does not exists prior to runlevel 80 20:50:05.769(?)

hafniumzinc · March 10, 2023, 2:45pm

Settings → Other Add-ons → Persistence Services → RRD4j Persistence → Remove

rlkoshak · March 10, 2023, 3:49pm

It’s an addon like any other. Go to Settings → Other Addons and click “Remove” next to rrd4j.

That will just cause it to save and restore Item states using the default strategy.

You’ve just done it. You change the setting to print out logs above DEBUG level. You are not seeing INFO level logs for that logger.

No, Items are loaded and restoreOnStartup occurs arouund level 20 or 30. Definitely before your rules are loaded at level 40. It’s the Things that are loaded later at runlevel 80. The sequence is:

Items loaded → Items populated with states from restoreOnStartup (the part that is generating this error) → Rules are loaded → Rule Engine Starts → UI is loaded → Things are loaded

The previous startlevel need not complete before the next one starts.

Peter_Novotny · March 11, 2023, 8:02pm

I understand it is nice to have clickable interface, but if the steps should be repeatable it can be problem. Is there any command line option to delete the add-on? Because my screen with other add-ons is empty.

Part bindings contain some items, but for example I know that I have installed exec binding and RegEx transformation and now can not find them any more.