Error since 4.1.2 update

Rbartlett · April 2, 2024, 4:45pm

I am getting this error since doing the 4.1.2 update. It is locking up constantly.

2024-04-02 10:26:06.155 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception: 
java.lang.OutOfMemoryError: Java heap space
Caused by: com.oracle.truffle.api.TruffleStackTrace$LazyStackTrace
2024-04-02 10:28:16.161 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception: 
java.lang.OutOfMemoryError: Java heap space
Caused by: com.oracle.truffle.api.TruffleStackTrace$LazyStackTrace
2024-04-02 10:30:55.772 [WARN ] [org.eclipse.jetty.server.HttpChannel] - /rest/events/states
java.lang.OutOfMemoryError: Java heap space
Caused by: com.oracle.truffle.api.TruffleStackTrace$LazyStackTrace
2024-04-02 10:44:08.129 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - 
java.lang.OutOfMemoryError: Java heap space
Caused by: com.oracle.truffle.api.TruffleStackTrace$LazyStackTrace
2024-04-02 10:46:57.570 [ERROR] [org.apache.felix.fileinstall        ] - In main loop, we have serious trouble
java.lang.OutOfMemoryError: Java heap space
Caused by: com.oracle.truffle.api.TruffleStackTrace$LazyStackTrace
2024-04-02 10:46:57.570 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception: 
java.lang.OutOfMemoryError: Java heap space
Caused by: com.oracle.truffle.api.TruffleStackTrace$LazyStackTrace
2024-04-02 10:49:39.231 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception: 
java.lang.OutOfMemoryError: Java heap space
Caused by: com.oracle.truffle.api.TruffleStackTrace$LazyStackTrace
2024-04-02 11:00:12.960 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception:

mstormi · April 2, 2024, 5:37pm

You should at least be providing details on your system.

Rbartlett · April 2, 2024, 5:57pm

I am sorry I don’t speak as nicely as you do. I thought you wanted to hear when people have issues with an update.

JustinG · April 2, 2024, 6:16pm

I am not sure if you intended this reply to be rude, but it does read that way. We do try to keep these forums civil so in the future it would be appreciated if you put a little more effort into making sure your posts are not taken incorrectly.

You are absolutely correct: it is very useful to get reports of issues that users are having both so that they can get help here and so that OH devs can identify areas to target. Markus’s point, however, is that an issue, particularly one that appears that it might be related to hardware or configuration, is not nearly as useful when it is out of context of the hardware it is running on.

So, if you are trying to be helpful with this report, it will be much more helpful to include your system details.

If you are trying to get help about this issue, it will be much easier for others to help you if you include your system details.

Rbartlett · April 2, 2024, 6:35pm

(1) OH4.1.2 Master on RPI4 4gig with Zibgee dongle
(3) OH4.1.1 Remotes on RPI3B+ with Zibgee dongle
(2) OH3.4.2 Remotes on RPI3B+ with Zibgee dongle due to Amazon binding issue with OH4.
(46) TP-Link devices
(19) Zigbee Contact devices for Doors and Gates
(3) Zigbee Temp / Humidity Sensors
(1) Zigbee Relay board
(14) Amazon Echo devices
(122) Things
(493) Items
(153) Rules
Multiple devices controlled through Alexa Skills

This issue is on my Master unit. It started after updating to OH 4.1.2.

Rbartlett · April 2, 2024, 6:46pm

I have also had a lot of rules lose the “things”.

2024-04-02 13:41:03.033 [ERROR] [b.automation.script.javascript.stack] - Failed to execute script:
org.graalvm.polyglot.PolyglotException: ReferenceError: "things" is not defined
	at <js>.:program(<eval>:1) ~[?:?]
	at org.graalvm.polyglot.Context.eval(Context.java:399) ~[?:?]
	at com.oracle.truffle.js.scriptengine.GraalJSScriptEngine.eval(GraalJSScriptEngine.java:458) ~[?:?]
	at com.oracle.truffle.js.scriptengine.GraalJSScriptEngine.eval(GraalJSScriptEngine.java:426) ~[?:?]
	at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:262) ~[java.scripting:?]
	at org.openhab.automation.jsscripting.internal.scriptengine.DelegatingScriptEngineWithInvocableAndAutocloseable.eval(DelegatingScriptEngineWithInvocableAndAutocloseable.java:53) ~[?:?]
	at org.openhab.automation.jsscripting.internal.scriptengine.InvocationInterceptingScriptEngineWithInvocableAndAutoCloseable.eval(InvocationInterceptingScriptEngineWithInvocableAndAutoCloseable.java:78) ~[?:?]
	at org.openhab.automation.jsscripting.internal.scriptengine.DelegatingScriptEngineWithInvocableAndAutocloseable.eval(DelegatingScriptEngineWithInvocableAndAutocloseable.java:53) ~[?:?]
	at org.openhab.automation.jsscripting.internal.scriptengine.InvocationInterceptingScriptEngineWithInvocableAndAutoCloseable.eval(InvocationInterceptingScriptEngineWithInvocableAndAutoCloseable.java:78) ~[?:?]
	at org.openhab.core.automation.module.script.internal.handler.ScriptActionHandler.lambda$0(ScriptActionHandler.java:71) ~[?:?]
	at java.util.Optional.ifPresent(Optional.java:178) [?:?]
	at org.openhab.core.automation.module.script.internal.handler.ScriptActionHandler.execute(ScriptActionHandler.java:68) [bundleFile:?]
	at org.openhab.core.automation.internal.RuleEngineImpl.executeActions(RuleEngineImpl.java:1188) [bundleFile:?]
	at org.openhab.core.automation.internal.RuleEngineImpl.runRule(RuleEngineImpl.java:997) [bundleFile:?]
	at org.openhab.core.automation.internal.TriggerHandlerCallbackImpl$TriggerData.run(TriggerHandlerCallbackImpl.java:87) [bundleFile:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:840) [?:?]
2024-04-02 13:41:03.037 [ERROR] [internal.handler.ScriptActionHandler] - Script execution of rule with UID 'RelayBoardStatusRule' failed: org.graalvm.polyglot.PolyglotException: ReferenceError: "things" is not defined

if (things.getThing('zigbee:device:6776ba0f5a:a4c138595e201468').status != 'ONLINE') {
  items.getItem('Relay_Board_Status').postUpdate('OFF');
}
if (things.getThing('zigbee:device:6776ba0f5a:a4c138595e201468').status == 'ONLINE') {
  items.getItem('Relay_Board_Status').postUpdate('ON');
}

mstormi · April 2, 2024, 8:28pm

That’s what I thought when reading, too.

Update of what ? I don’t know how you come to say that, but it’s all about context and that’s what your posts are clearly lacking.

This means your java process is crashing because of one of two things:
either you have a memory leak, or you are hitting some too low system limit on memory.
To tell, we need you to provide all the relevant details upfront. This is not about hardware (4G should be enough), it’s about OS and SW limits.
Tell us the exact OS version, 32/64 bit? The Java parameters you use, “details” such as if you use Docker or other virtualization.
How long does it take for a crash to happen and how is memory usage developing after a restart?
Use top or some other tool to graph java process memory size.

Now since you say it started after an OH upgrade, you should be explaining what you did to upgrade.
What was the original version you came from, what commands did you use exactly, which parameters have been changed, did you add any bindings, change configuration and so on.

What have you tried so far to find out yourself? Tried removing any bindings ?

Rbartlett · April 2, 2024, 9:23pm

Thank you for the clarification. I will put this together and respond tomorrow. Thank you.

Rbartlett · April 3, 2024, 7:52pm

Sorry I was so rude. I will try to make my post clearer in the future.

When OH 4.0 came out, I decided to totally rebuild my system. I used new SD cards and installed the stable image. I have since been installing the updates using Openhabian-config option 2. All my settings are the defaults and I do not make any changes. I do have 2 Overview pages, the default one that has a lot of information on it and a secondary formatted for my iPad with the same info. Openhab 4.1.1 was very stable, and I did not have any issues with it, except the Amazon Control Binding. When FiremotD showed I had apt updates I applied them and that took me to 4.1.2. That is when I started having issues. It did take about an hour, but all the errors cleared up. A few days later is when it started locking up.

I just cleared my Cache, and the Out of Memory errors went away but I still have the rule errors. It has been 60 minutes, and I am going to reboot now and see what happens. After rebooting the system, I could not get back into the Overview page. I stopped OH and cleared the cache again and restarted OH. I can now get into the Overview page. After another hour all the errors have cleared up and it is working as it should. I have now gone another 90 minutes without an error or warning.

This is what I saw immediately after reboot as soon as I could get into it.

##        Ip = 192.168.1.8
##   Release = Raspbian GNU/Linux 11 (bullseye)
##    Kernel = Linux 6.1.21-v8+
##  Platform = Raspberry Pi 4 Model B Rev 1.1
##    Uptime = 0 day(s). 0:7:30
## CPU Usage = 25% avg over 4 cpu(s) (4 core(s) x 1 socket(s))
##  CPU Load = 1m: 1.19, 5m: 2.48, 15m: 1.56
##    Memory = Free: 2.13GB (57%), Used: 1.62GB (43%), Total: 3.75GB
##      Swap = Free: 2.99GB (100%), Used: 0.00GB (0%), Total: 2.99GB
##      Root = Free: 22.76GB (81%), Used: 5.11GB (19%), Total: 29.10GB
##   Updates = 0 apt updates available.
##  Sessions = 1 session(s)
## Processes = 129 running processes of 32768 maximum processes

Now that it is back up running properly, this is what I have.

#        Ip = 192.168.1.8
##   Release = Raspbian GNU/Linux 11 (bullseye)
##    Kernel = Linux 6.1.21-v8+
##  Platform = Raspberry Pi 4 Model B Rev 1.1
##    Uptime = 0 day(s). 1:30:32
## CPU Usage = 7.02% avg over 4 cpu(s) (4 core(s) x 1 socket(s))
##  CPU Load = 1m: 0.53, 5m: 0.48, 15m: 1.04
##    Memory = Free: 0.83GB (22%), Used: 2.91GB (78%), Total: 3.75GB
##      Swap = Free: 2.99GB (100%), Used: 0.00GB (0%), Total: 2.99GB
##      Root = Free: 22.58GB (81%), Used: 5.29GB (19%), Total: 29.10GB
##   Updates = 0 apt updates available.
##  Sessions = 1 session(s)
## Processes = 132 running processes of 32768 maximum processes

I see the used memory has gone from 43% to 78%. I will keep an eye on this and see if it continues to grow. I never see anything listed as used in Swap.

Here is the TOP capture.

top - 14:25:16 up  1:39,  1 user,  load average: 0.57, 0.47, 0.79
Tasks: 128 total,   1 running, 127 sleeping,   0 stopped,   0 zombie
%Cpu(s):  6.3 us,  3.9 sy,  0.0 ni, 89.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   3844.2 total,    856.7 free,   1200.4 used,   1787.2 buff/cache
MiB Swap:   3072.0 total,   3072.0 free,      0.0 used.   2566.6 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 4393 openhab   20   0 1301436   1.0g  11832 S  38.4  27.5  38:50.10 java
  407 avahi     20   0    7112   3040   2608 S   0.3   0.1   0:35.25 avahi-daemon
 9943 root      20   0       0      0      0 I   0.3   0.0   0:00.19 kworker/0:1-events_power_efficient
10272 openhab+  20   0   11224   3108   2560 R   0.3   0.1   0:00.16 top
10342 openhab   20   0    8508    668    572 S   0.3   0.0   0:00.01 ping
10344 openhab   20   0    8644   4612   4344 S   0.3   0.1   0:00.01 arping

I will check this again tomorrow and see what it shows.

Thank you.

Rbartlett · April 5, 2024, 8:57pm

Here is the results 2 days later. I see memory used has gone to 86%. I also see that there are now 12 new Apt updates available.

##    Uptime = 2 day(s). 3:6:32
## CPU Usage = 6.09% avg over 4 cpu(s) (4 core(s) x 1 socket(s))
##  CPU Load = 1m: 0.51, 5m: 0.37, 15m: 0.30
##    Memory = Free: 0.54GB (14%), Used: 3.21GB (86%), Total: 3.75GB
##      Swap = Free: 2.99GB (100%), Used: 0.00GB (0%), Total: 2.99GB
##      Root = Free: 22.58GB (81%), Used: 5.29GB (19%), Total: 29.10GB
##   Updates = 12 apt updates available.

top - 15:56:23 up 2 days,  3:10,  1 user,  load average: 0.40, 0.38, 0.31
Tasks: 129 total,   1 running, 128 sleeping,   0 stopped,   0 zombie
%Cpu(s):  5.3 us,  4.7 sy,  0.0 ni, 90.0 id,  0.0 wa,  0.0 hi,  0.1 si,  0.0 st
MiB Mem :   3844.2 total,    558.9 free,   1249.3 used,   2036.0 buff/cache
MiB Swap:   3072.0 total,   3072.0 free,      0.0 used.   2517.6 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 4393 openhab   20   0 1326068   1.1g  11832 S  37.4  28.3 699:06.76 java
  407 avahi     20   0    7112   3040   2608 S   0.7   0.1  18:52.39 avahi-daemon
31548 openhab   20   0    8644   4608   4344 S   0.7   0.1   0:00.02 arping
   15 root      20   0       0      0      0 I   0.3   0.0   3:42.08 rcu_preempt
29472 root      20   0       0      0      0 I   0.3   0.0   0:00.06 kworker/2:1-events
31526 openhab+  20   0   11224   3064   2512 R   0.3   0.1   0:00.12 top

mstormi · April 6, 2024, 9:33am

That’s not really the details it takes. It doesn’t say what image you installed (creation date?), was it 64 or 32 bit, what’s the EXTRA_JAVA_OPTS environment varible set (see /etc/default/openhab or better check what parameters the running java process has).
Either way it says it’s bullseye. If I were you and if the issue still persists, I’d reinstall using openHABian v1.9b which is based on the newer Linux distribution.
Anything else requires others like myself to invest much more time to identify and check all the settings and possible causes that could be of relevance to your issue.

In case it’s about a mem leak, you need to watch development of mem usage, not a single state in time.
You didn’t tell what you changed in terms of bindings. When it’s a mem leak it’s usually a single binding leaking so you need to find out which one by running, watching, and selectively leaving one or more out.

matt1 · April 10, 2024, 3:04am

Free and Available memory are two very different things, the fact your swap is not in use proves everything is fine. Since Java uses the HEAP and not ram like most people expect, I recommend you check your system by using this binding. It will give you advice if something needs changing via WARN messages in the logs.

Rbartlett · April 19, 2024, 3:02pm

I totally rebuilt my system from scratch when 4.0 came out. I downloaded the image for Openhabian v1.8b and built my systems with this. With 1 primary and 4 secondary servers this took weeks to do. Since then, I have installed all updates as it said they were available. Apparently getting off of Bullseye was not one of the updates. As I get a chance, I will update them again.

Rbartlett · April 19, 2024, 3:13pm

My issues have cleared up since installing the latest updates that came out about 2 weeks ago. There must have been a bug in the previous updates. Yesterday I was having a lot of Amazon control errors and they seemed to be having trouble finding Amazon.com. I think I tracked it back to a DNS issue. I added 8.8.8.8 to the resolv.conf file and that cleared up the errors.

Matt I will check out the Doctor binding.

Thank you all for your great support. You have a wonderful product and I enjoy seeing what all I can do with it. Even if at times it is frustrating.

mstormi · April 20, 2024, 11:05am

There won’t be an upgrade off bullseye.
Instead, simply install 1.9c from scratch onto a new SD card to get bookworm, then import your config. Pretty straightforward and low risk as you don’t change OH versions.

Rbartlett · April 22, 2024, 9:25pm

I am trying this now. I did run into one issue where there is not enough space on /boot to copy the initial.zip file to it. So, I am going the long way around.

Rbartlett · April 22, 2024, 11:27pm

This worked great. I am now on 1.9c. I do have a time issue and have not found a solution in the forum. My logs and rules are running on UTC time instead of Central.

Last login: Mon Apr 22 17:43:50 2024

openhabian@openhab9:~ $ date
Mon Apr 22 05:52:47 PM CDT 2024

Event Log
2024-04-22 23:52:04.600 [INFO ] [openhab.event.ItemStateChangedEvent ]

Openhabian-conf
timezone=America/Chicago
locales=en_US.UTF-8
system_default_locale=en_US.UTF-8

After a little more poking around, I found /etc/timezone was set to Europe/London. After updating this to America/Chicago the time on my logs and rules is correct now.

I later found that this could be updated in Openhabian-config. But I would have thought openhabian.conf would have fixed this.

This is one of my small servers I have tested this on. Now I will update the rest. Thank you for all your help.

Rbartlett · April 26, 2024, 2:23am

Upgrading to 1.9c definitely fixed a lot of issues I was having. I now have all my 4.1.2 servers running on 1.9c.
I have noticed a couple of issues with 1.9c.

/boot is not big enough to allow initial.zip to be copied to it. It needs about another 150 meg. Most of my zip files were about 500 meg, but I only had about 400 meg available.
During initial boot the timezone says it is being set by openhabian.conf, but I had America/Chicago in the file, and it still set it to Europe/London. I had to manually go in and change the timezone afterwards. It did use the locale settings I had in the file.
It skipped the username password setup using openhabian.conf. I had to change it afterwards.

Thanks again for all the great help in getting to this point.

system · June 6, 2024, 6:23pm

This topic was automatically closed 41 days after the last reply. New replies are no longer allowed.