Hi
I need some help in identifying an issue that regularly requires a reboot of OH.
The issue started after upgrading to OH 4.2 and has continued into OH 4.3.
Hardware: I was previously running OH on an RPi 3 B+ with 1GB RAM. After these issues started I decided it was time for more RAM, so I’m currently running an RPi 4 with 8GB RAM. The problem started on the former and has continued on the latter.
JVM version: openjdk 17.0.13
This may only be a symptom, but I start getting the following in the logs every second:
2024-12-29 07:07:40.421 [ERROR] [e.ember.internal.ash.AshFrameHandler] - AshReceiveParserThread Exception:
java.lang.IllegalStateException: Queue full
at java.util.AbstractQueue.add(AbstractQueue.java:98) ~[?:?]
at java.util.concurrent.ArrayBlockingQueue.add(ArrayBlockingQueue.java:329) ~[?:?]
at com.zsmartsystems.zigbee.dongle.ember.internal.ash.AshFrameHandler.handleIncomingFrame(AshFrameHandler.java:242) ~[bundleFile:?]
at com.zsmartsystems.zigbee.dongle.ember.internal.ash.AshFrameHandler.access$700(AshFrameHandler.java:59) ~[bundleFile:?]
at com.zsmartsystems.zigbee.dongle.ember.internal.ash.AshFrameHandler$AshReceiveParserThread.run(AshFrameHandler.java:319) [bundleFile:?]
and
2025-01-05 03:31:38.943 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception:
java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
at java.lang.Thread.start0(Native Method) ~[?:?]
at java.lang.Thread.start(Thread.java:809) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:945) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.ensurePrestart(ThreadPoolExecutor.java:1593) ~[?:?]
at java.util.concurrent.ScheduledThreadPoolExecutor.reExecutePeriodic(ScheduledThreadPoolExecutor.java:360) ~[?:?]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:307) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:840) [?:?]
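Since the second trace mentions process/resource limits as well as memory, one thing that could be checked is how close the OH process is to its thread limit. Below is a minimal sketch in Python, assuming a Linux host where `/proc/<pid>/status` and `/proc/<pid>/limits` are readable. The function names are hypothetical and the PID of the OH Java process has to be supplied by hand. Other limits (for example a systemd task limit or the kernel-wide thread maximum) would not show up here.

```python
import re
from pathlib import Path


def parse_thread_count(status_text):
    """Return the 'Threads:' value from /proc/<pid>/status text, or None."""
    match = re.search(r"^Threads:\s+(\d+)", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def parse_max_processes(limits_text):
    """Return the soft 'Max processes' limit from /proc/<pid>/limits text.

    Returns None if the line is missing or the limit is 'unlimited'.
    """
    match = re.search(r"^Max processes\s+(\S+)", limits_text, re.MULTILINE)
    if not match or match.group(1) == "unlimited":
        return None
    return int(match.group(1))


def report(pid):
    """Build a one-line summary of thread usage for the given PID (Linux only)."""
    proc = Path("/proc") / str(pid)
    threads = parse_thread_count((proc / "status").read_text())
    limit = parse_max_processes((proc / "limits").read_text())
    return f"pid={pid} threads={threads} max_processes_soft_limit={limit}"
```

Calling `print(report(1234))` with the real PID in place of 1234 prints the current thread count next to the soft limit. If the two numbers are close when the errors begin, that would point at a thread limit rather than heap memory.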
When these errors happen, the behaviour varies. Sometimes no rules run at all. Other times rules run but I get no notifications, or rules run but their actions (such as updating the state of an item) aren’t completed. Sometimes rules work after 2-3 attempts.
I came across the following (old) topic, "Zigbee devices offline after migration to openhabian 1.7" (post #3 by domofl), and although it didn’t offer any real solution, I:
- Removed the two Zigbee devices I have.
- Removed the Zigbee Controller.
- Re-added the Zigbee Controller.
- Hard reset the two Zigbee devices.
- Re-added the two Zigbee devices to the Controller.
None of this has made a difference.
So, in order to try and get to the bottom of it, I set up Zabbix with additional JVM Monitoring and I’m still confused.
I am far from a Java expert; however, nothing stands out to me in terms of resource constraints. I’ve had the issue twice in the last 7 days and during that time:
- CPU utilisation avg. is 2.5% and has been a max of 5.7%.
- RAM utilisation has been an avg. of 17% with a maximum of 19%.
- Committed Heap Memory is 250MB and the avg used by the JVM is 210MB with a max of 216MB.
What I have seen in Zabbix following the latest incident, which may be expected (I’m not sure):
- Garbage collections per second climb at the point of JVM start; however, 3 days before I start seeing the errors, the rate drops to 0 and stays there until the next reboot.
- Threads max out at approx. 4,900.
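To see which kind of thread is behind a count like that, the threads could be grouped by name. The following is a rough sketch in Python, assuming Linux and that the per-thread names under `/proc/<pid>/task/*/comm` are readable. The function names are hypothetical. Names in `comm` are cut off at 15 characters, so a full thread dump may give clearer names.

```python
import re
from collections import Counter
from pathlib import Path


def normalise(name):
    """Replace each run of digits with '#' so numbered threads group together."""
    return re.sub(r"\d+", "#", name.strip())


def count_by_pattern(names):
    """Count thread names after normalising away their numeric parts."""
    return Counter(normalise(n) for n in names)


def read_thread_names(pid):
    """Read the names of all threads of a process from /proc (Linux only)."""
    names = []
    for comm in Path(f"/proc/{pid}/task").glob("*/comm"):
        try:
            names.append(comm.read_text())
        except OSError:
            continue  # the thread exited while we were reading
    return names
```

Something like `count_by_pattern(read_thread_names(1234)).most_common(10)`, with the real PID in place of 1234, lists the ten most common name patterns. If one pattern accounts for most of the threads, that may hint at which component keeps creating them.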
I also added the following line to the Java options file:
EXTRA_JAVA_OPTS="-Xms192m -Xmx768m -XX:+ExitOnOutOfMemoryError"
However, the JVM does not restart. Looking at the data, I assume that is because this doesn’t look like a memory issue, despite what the errors in the logs say.
Any help is much appreciated.