OH4 runs out of memory

Hello @markus7017
I wanted to try it also to test the issue with the Wall Screen, but installing the new jar gives me this error:

Error while starting bundle: file:/usr/share/openhab/addons/org.openhab.binding.shelly-4.1.0-SNAPSHOToom.jar

org.osgi.framework.BundleException: Could not resolve module: org.openhab.binding.shelly [305]
Unresolved requirement: Import-Package: com.google.gson; version="[2.10.0,3.0.0)"
at org.eclipse.osgi.container.Module.start(Module.java:463) ~[org.eclipse.osgi-3.18.0.jar:?]
at org.eclipse.osgi.internal.framework.EquinoxBundle.start(EquinoxBundle.java:445) ~[org.eclipse.osgi-3.18.0.jar:?]
at org.apache.felix.fileinstall.internal.DirectoryWatcher.startBundle(DirectoryWatcher.java:1260) ~[?:?]
at org.apache.felix.fileinstall.internal.DirectoryWatcher.startBundles(DirectoryWatcher.java:1233) ~[?:?]
at org.apache.felix.fileinstall.internal.DirectoryWatcher.doProcess(DirectoryWatcher.java:520) ~[?:?]
at org.apache.felix.fileinstall.internal.DirectoryWatcher.process(DirectoryWatcher.java:365) ~[?:?]
at org.apache.felix.fileinstall.internal.DirectoryWatcher.run(DirectoryWatcher.java:316) ~[?:?]

Maybe because I’m on OH 4.0.3?

Urgh, gson is one of the core packages and should always be there.

4.0.3 should be fine

try

  • delete the jar, stop OH
  • openhab-cli clean-cache
  • start OH, wait 5min until cache is recreated
  • open OH console
  • enter “feature:install oh-transport-coap”
  • copy jar to addons folder
  • wait a bit and check with bundle:list that binding is Active

I had that same problem. I resolved it by first upgrading my whole openHAB to the latest snapshot release (changing my sources.list to include unstable), then upgrading to the new jar. I think a newer version of gson is required, so you have it, but not a new enough version.

I did manually upgrade gson, which you can do, but it then asked for a different jar to be upgraded as well, and I decided I’d be chasing my tail.
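
For anyone puzzled by the version string in the error: `[2.10.0,3.0.0)` is an OSGi version range, meaning 2.10.0 or newer but below 3.0.0. Here is a minimal Python sketch of that check. It is not the actual resolver logic; the helper names `parse_version` and `in_range` are made up for illustration, version qualifiers are ignored, and 2.9.1 is only an example of an older version, not necessarily what any given openHAB release ships.

```python
def parse_version(text):
    """Turn '2.10.0' into (2, 10, 0); missing parts default to 0."""
    parts = [int(p) for p in text.strip().split(".")[:3]]
    return tuple(parts + [0] * (3 - len(parts)))


def in_range(version, range_spec):
    """Check a version against an OSGi-style range such as '[2.10.0,3.0.0)'.

    '[' and ']' are inclusive bounds, '(' and ')' are exclusive bounds.
    """
    spec = range_spec.strip()
    low_inclusive = spec[0] == "["
    high_inclusive = spec[-1] == "]"
    low_text, high_text = spec[1:-1].split(",")
    v = parse_version(version)
    low = parse_version(low_text)
    high = parse_version(high_text)
    above = v >= low if low_inclusive else v > low
    below = v <= high if high_inclusive else v < high
    return above and below


# Example values only: an older 2.9.x does not satisfy the import, 2.10.x does.
print(in_range("2.9.1", "[2.10.0,3.0.0)"))
print(in_range("2.10.1", "[2.10.0,3.0.0)"))
```

Comparing the parts as integers matters here: a plain string comparison would rank "2.9.1" above "2.10.0".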

Overnight, no new threads created. Working well so far.

I created PR #15198, which includes those fixes. There is a good chance that it also goes into 4.0 (besides the next 4.1 milestone build).

I have been suffering from OOM since 4.0.3. I tried to use your jar under 4.0.3 without luck. Updated to 4.1 unstable, but no luck either. Got massive exceptions. Is there any special version of 4.1 to use your jar with?

I just ran directly against unstable - so I updated using APT to the full unstable, then used the console to update just that one jar using the link provided.

What exception exactly are you getting? Is it the Gson one?

Updating via console did the trick. Thank you. Fixed my OOMs instantly.

Might not be related since I am still running OH 3.4.4, but I was suffering from OOM for a few weeks and it was definitely the Nest binding.
I stopped that binding for several days and the problem went away. Today I restarted it and the out-of-memory problem came back within a few hours.
It’s a real pity that this device had such troubles with account, support and functionality confusion.

This isn’t the nest binding though, it’s the Shelly binding.

I’m just finalizing the PR

It’s the shelly binding for me too.
Markus’s solution works for me. No more out-of-memory errors.

PR #15798 is ready to merge, check out: [shelly] Fix resource leak, BLU script installation, TRV init, NPE on IPv6 mDNS discovery by markus7017 · Pull Request #15798 · openhab/openhab-addons · GitHub
which includes OOM Fix #2 and #3 and some other important fixes.
Until then you can use https://github.com/markus7017/myfiles/blob/master/shelly/org.openhab.binding.shelly-4.1.0-SNAPSHOToom.jar?raw=true

Hi there,

I have been reading through this discussion with great interest as I think I am facing a similar problem.

However, I am not quite sure and am hoping to get some advice from this group here…

I am running

  • on openHAB 4.2.0,
  • using the appropriate Shelly binding,
  • I have 17 Shelly devices, such as EM3PRO, PM-Mini, 1PM-Mini, etc.

I often see the following warnings

2024-08-09 09:11:43.179 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Stopped without executing or closing null
2024-08-09 09:15:50.812 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Stopped without executing or closing null
2024-08-09 09:25:44.333 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Stopped without executing or closing null
2024-08-09 09:27:05.907 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Stopped without executing or closing null
2024-08-09 09:44:05.394 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Stopped without executing or closing null
2024-08-09 09:46:18.686 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Stopped without executing or closing null

and my openHAB system runs out of memory after a few hours, causing it to become unresponsive and require a restart.

As I have “TheDoctor” add-on installed, I also see it telling me that there is a problem with the heap size.

Trying to get to the root cause, I found

  • First, that it does not occur at all when I remove the Shelly binding,
  • Second, that I seem to have problems with the stability of the Shellys’ wifi connection.

Whenever I run a ping against a wifi connected Shelly device, I see the ping time out from time to time (infrequently, but somehow occurring in intervals between 20 and 120 seconds).

I suppose the answer is simple and straightforward - fix the wifi problem and it won’t happen again :wink: - but I was wondering if the Shelly binding is robust enough to deal with unexpectedly non-answering devices in such a way that the heap size does not fill up over time?

I could be completely wrong, but wanted to throw this into the discussion in the hope of getting some more insight and/or advice…

Thanks and best regards
Karwak

PS: In the meantime I will continue to try to find the reason behind the wifi issue I experience :slight_smile:

In general recovery should work, but this sounds like there is a resource leak when xxx happens

Are you using Gen 1 or 2 devices?

Please enable DEBUG logging and check the logs to narrow the causes.

Dear Markus,

Thanks for your prompt reply! I am traveling at the moment and might need some more time to answer…

I am using both Gen 2 and Gen 3 devices.

I will try to enable debug logging as soon as possible. However, I have no idea how to enable it yet. I will read up on it and find out. Any hint is appreciated :slight_smile:

Regards.
Karwak

Hi,
The easiest way to enable debug logging for an installed binding is from the UI: go to Settings, then Add-on Settings, click on the binding name (e.g. Shelly Binding), and change the Add-on log setting from info to debug.

Hi there,

after I came back from a long trip, I managed to take a closer look at the problem I described before.

I was able to enable the debug logging and also found a way to split the logging output into separate files for a better overview :-).

The amount of logging information generated by the DEBUG level is overwhelming. However, looking at the combination of messages thrown by the log for jetty and the debug log for the shelly addon, I seem to have found the following relationship:

Whenever jetty reports “Stopped without executing or closing null”,

  • there is a preceding message from shelly about “Disconnecting WebSocket” for a specific connection,
  • and a subsequent message from shelly about “WebSocket connection closed”.

In general, this does not look wrong, right?

However, one thing I also found is:

  • The assumption that this must be related to the WiFi connection instabilities of my Shelly devices is wrong.
  • The reason is that some of the reported devices are connected via LAN cable.

I am continuing my testing to see if there is any specific relationship to when the heap size blows out, but thought to share this already. Any comments/advice would be greatly appreciated.

Thanks and best regards
Karsten

PS: There is much more logging info to share. Just need a hint where to look and what is meaningful to provide.
PPS: I removed / changed serial numbers and ip addresses on purpose.

2024-09-10 12:41:03.104 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellypro3em-abcdef111111: Disconnecting WebSocket (/aaa.bbb.ccc.249:35656 -> /aaa.bbb.ccc.72:80)
2024-09-10 12:41:03.118 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Stopped without executing or closing null
2024-09-10 12:41:03.119 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellypro3em-abcdef111111: WebSocket connection closed, status = 1006/Disconnected

2024-09-10 12:58:03.340 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellypro3em-abcdef222222: Disconnecting WebSocket (/aaa.bbb.ccc.249:53486 -> /aaa.bbb.ccc.73:80)
2024-09-10 12:58:03.343 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Stopped without executing or closing null
2024-09-10 12:58:03.344 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellypro3em-abcdef222222: WebSocket connection closed, status = 1006/Disconnected

2024-09-10 13:50:02.839 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellypro3em-abcdef222222: Disconnecting WebSocket (/aaa.bbb.ccc.249:48682 -> /aaa.bbb.ccc.73:80)
2024-09-10 13:50:02.843 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Stopped without executing or closing null
2024-09-10 13:50:02.843 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellypro3em-abcdef222222: WebSocket connection closed, status = 1006/Disconnected

2024-09-10 14:32:06.191 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellypro3em-abcdef222222: Disconnecting WebSocket (/aaa.bbb.ccc.249:57270 -> /aaa.bbb.ccc.73:80)
2024-09-10 14:32:06.193 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Stopped without executing or closing null
2024-09-10 14:32:06.194 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellypro3em-abcdef222222: WebSocket connection closed, status = 1006/Disconnected

2024-09-10 14:41:14.803 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellypro3em-abcdef111111: Disconnecting WebSocket (/aaa.bbb.ccc.249:57690 -> /aaa.bbb.ccc.72:80)
2024-09-10 14:41:14.811 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Stopped without executing or closing null
2024-09-10 14:41:14.811 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellypro3em-abcdef111111: WebSocket connection closed, status = 1006/Disconnected

2024-09-10 14:43:14.803 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellypro3em-abcdef111111: Disconnecting WebSocket (/aaa.bbb.ccc.249:36482 -> /aaa.bbb.ccc.72:80)
2024-09-10 14:43:14.805 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Stopped without executing or closing null
2024-09-10 14:43:14.806 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellypro3em-abcdef111111: WebSocket connection closed, status = 1006/Disconnected

2024-09-10 16:50:16.933 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellypro3em-abcdef333333: Disconnecting WebSocket (/aaa.bbb.ccc.249:51552 -> /aaa.bbb.ccc.71:80)
2024-09-10 16:50:16.936 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Stopped without executing or closing null
2024-09-10 16:50:16.936 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellypro3em-abcdef333333: WebSocket connection closed, status = 1006/Disconnected

2024-09-10 16:55:16.738 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellypro3em-abcdef111111: Disconnecting WebSocket (/aaa.bbb.ccc.249:44250 -> /aaa.bbb.ccc.72:80)
2024-09-10 16:55:16.740 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Stopped without executing or closing null
2024-09-10 16:55:16.740 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellypro3em-abcdef111111: WebSocket connection closed, status = 1006/Disconnected

2024-09-10 17:00:20.233 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellypro3em-abcdef222222: Disconnecting WebSocket (/aaa.bbb.ccc.249:51886 -> /aaa.bbb.ccc.73:80)
2024-09-10 17:00:20.237 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Stopped without executing or closing null
2024-09-10 17:00:20.239 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellypro3em-abcdef222222: WebSocket connection closed, status = 1006/Disconnected

2024-09-10 17:10:20.897 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellypro3em-abcdef111111: Disconnecting WebSocket (/aaa.bbb.ccc.249:38710 -> /aaa.bbb.ccc.72:80)
2024-09-10 17:10:20.909 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Stopped without executing or closing null
2024-09-10 17:10:20.909 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellypro3em-abcdef111111: WebSocket connection closed, status = 1006/Disconnected

2024-09-10 17:30:27.114 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellypro3em-abcdef333333: Disconnecting WebSocket (/aaa.bbb.ccc.249:45442 -> /aaa.bbb.ccc.71:80)
2024-09-10 17:30:27.116 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Stopped without executing or closing null
2024-09-10 17:30:27.116 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellypro3em-abcdef333333: WebSocket connection closed, status = 1006/Disconnected

2024-09-10 17:43:23.550 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellypro3em-abcdef111111: Disconnecting WebSocket (/aaa.bbb.ccc.249:60944 -> /aaa.bbb.ccc.72:80)
2024-09-10 17:43:23.555 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Stopped without executing or closing null
2024-09-10 17:43:23.557 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellypro3em-abcdef111111: WebSocket connection closed, status = 1006/Disconnected

2024-09-10 18:04:11.470 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellypro3em-abcdef111111: Disconnecting WebSocket (/aaa.bbb.ccc.249:54218 -> /aaa.bbb.ccc.72:80)
2024-09-10 18:04:11.472 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Stopped without executing or closing null
2024-09-10 18:04:11.473 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellypro3em-abcdef111111: WebSocket connection closed, status = 1006/Disconnected

2024-09-10 18:35:28.158 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellypro3em-abcdef222222: Disconnecting WebSocket (/aaa.bbb.ccc.249:40842 -> /aaa.bbb.ccc.73:80)
2024-09-10 18:35:28.162 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Stopped without executing or closing null
2024-09-10 18:35:28.163 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellypro3em-abcdef222222: WebSocket connection closed, status = 1006/Disconnected

2024-09-10 18:44:28.602 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellypro3em-abcdef111111: Disconnecting WebSocket (/aaa.bbb.ccc.249:34792 -> /aaa.bbb.ccc.72:80)
2024-09-10 18:44:28.604 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Stopped without executing or closing null
2024-09-10 18:44:28.605 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellypro3em-abcdef111111: WebSocket connection closed, status = 1006/Disconnected

Hi all,

I’ve been experiencing a very similar / the same problem with a Shelly Pro 4PM on my OH installation.

However, I am using OH 3.4.2.
I know this thread is about OH4, but it is what finally helped me to isolate the issue.

The history: I installed a Shelly Pro 4PM, updated FW to 1.4.2.
The day after, OH began to crash. Once restarted, it would crash again after approx. 1h or so.
As there were some other (most likely unrelated, but time-wise correlated) issues with my knx interface, I was led down the wrong track at first.
After a lot of googling, I finally found the clue: it is actually creating new threads over and over, up to the task limit, and then it crashes. The threads were always related to WebSockets.
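
In case it helps others spot the same pattern: one way to see which kind of thread is piling up is to take a thread dump with the JDK's `jstack` tool and group the thread names. The Python sketch below is only an illustration. The function name is made up, and the thread names in the sample dump are hypothetical; the real names on your system will differ.

```python
import re
from collections import Counter

# Thread header lines in a jstack dump look like: "name" #12 prio=5 ...
# JVM-internal threads without a "#<id>" part are skipped by this pattern.
THREAD_HEADER = re.compile(r'^"(.+?)" #\d+')

# Trailing "@<hex>" and "-<number>" suffixes are treated as per-thread counters.
SUFFIX = re.compile(r"(@[0-9a-f]+)?(-\d+)*$")


def count_thread_groups(dump_text):
    """Group thread names by their prefix and count the threads in each group."""
    counts = Counter()
    for line in dump_text.splitlines():
        m = THREAD_HEADER.match(line)
        if m:
            counts[SUFFIX.sub("", m.group(1))] += 1
    return counts


# Hypothetical dump excerpt, for illustration only.
SAMPLE_DUMP = '''"main" #1 prio=5 os_prio=0 waiting on condition
   java.lang.Thread.State: WAITING
"WebSocketClient@1a2b3c-45" #101 prio=5 os_prio=0 runnable
"WebSocketClient@4d5e6f-46" #102 prio=5 os_prio=0 runnable
"OH-thingHandler-3" #57 daemon prio=5 os_prio=0 waiting on condition
'''

for group, count in count_thread_groups(SAMPLE_DUMP).most_common():
    print(group, count)
```

Taking two dumps some minutes apart and comparing the counts should show which group keeps growing.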

This could be narrowed down to the new Shelly Pro. I thought that disabling the related thing would fix it, but it actually doesn’t. The only “fix” is to unpower / physically remove the Shelly.

Reading this thread it seems that the underlying issue has been solved in OH4.2 and the new Shelly binding for that release. However, this is probably not ported back to the Shelly binding for 3.4.x, correct?

tl;dr: The Shelly binding (the “official” one from the openHAB distribution) has a bug in connection with a Shelly Pro 4PM (creating WebSockets endlessly until crash) in OH 3.4.2.

So my question: is there any other solution to my problem other than upgrading to OH4? Because there is no other reason for me to upgrade OH (never touch a working system) and I would want to avoid it.

Thanks for your advice!

EDIT:
Just checked a couple of things, all to no avail:

  1. removed thing in OH config
  2. ignored thing in Inbox
  3. removed thing from Inbox

None of that helps; it still creates WebSockets and subsequently crashes OH.