java.lang.OutOfMemoryError: unable to create new native thread

Team, I’m running OH2.5.2 on RPI3 with Raspbian Buster. Since upgrade to 2.4.2 and then later 2.5.1/2 I’m seeing this kind of error after couple of days from OH restart (~5-6 days).

2020-03-26 21:45:50.171 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception:
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method) ~[?:1.8.0_201]
        at java.lang.Thread.start(Thread.java:717) ~[?:1.8.0_201]
        at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957) ~[?:1.8.0_201]
        at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1367) ~[?:1.8.0_201]
        at org.openhab.binding.network.internal.PresenceDetection.performPresenceDetection(PresenceDetection.java:362) ~[?:?]
        at org.openhab.binding.network.internal.PresenceDetection.lambda$10(PresenceDetection.java:606) ~[?:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_201]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) ~[?:1.8.0_201]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_201]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) ~[?:1.8.0_201]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
2020-03-26 21:48:03.066 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception:
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method) ~[?:1.8.0_201]
        at java.lang.Thread.start(Thread.java:717) ~[?:1.8.0_201]
        at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957) ~[?:1.8.0_201]
        at java.util.concurrent.ThreadPoolExecutor.ensurePrestart(ThreadPoolExecutor.java:1603) ~[?:1.8.0_201]
        at java.util.concurrent.ScheduledThreadPoolExecutor.reExecutePeriodic(ScheduledThreadPoolExecutor.java:350) ~[?:1.8.0_201]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:296) ~[?:1.8.0_201]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]

I browsed similar topics in the forum but they didn’t bring me closer to any resolution. I had a look a the suggestion to increase TasksMax parameter (currently for me is 2200), 1 day from restart current tasks are 175 but it is growing endlessly. The problem is not a hard limit but tasks leaking.
I didn’t observe this problem before upgrade, my rules/other setup haven’t changed, it looks like leak introduced by one of the add-ons
Very first occurance of this problem happened after disconnecting/connecting to cloud service:

2020-03-26 21:10:48.893 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = xxxxhidden, base URL = http://localhost:8080)
2020-03-26 21:10:50.705 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = xxxxhidden, base URL = http://localhost:8080)
2020-03-26 21:10:51.036 [ERROR] [io.socket.thread.EventThread        ] - Task threw exception
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method) ~[?:1.8.0_201]
        at java.lang.Thread.start(Thread.java:717) ~[?:1.8.0_201]
        at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957) ~[?:1.8.0_201]
        at java.util.concurrent.ThreadPoolExecutor.ensurePrestart(ThreadPoolExecutor.java:1603) ~[?:1.8.0_201]
        at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:334) ~[?:1.8.0_201]
        at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533) ~[?:1.8.0_201]
        at java.util.concurrent.ScheduledThreadPoolExecutor.execute(ScheduledThreadPoolExecutor.java:622) ~[?:1.8.0_201]
        at okhttp3.internal.ws.RealWebSocket.runWriter(RealWebSocket.java:423) ~[?:?]
        at okhttp3.internal.ws.RealWebSocket.send(RealWebSocket.java:380) ~[?:?]
        at okhttp3.internal.ws.RealWebSocket.send(RealWebSocket.java:359) ~[?:?]
        at io.socket.engineio.client.transports.WebSocket$3.call(WebSocket.java:145) ~[bundleFile:?]
        at io.socket.engineio.parser.Parser.encodePacket(Parser.java:68) ~[bundleFile:?]
        at io.socket.engineio.parser.Parser.encodePacket(Parser.java:47) ~[bundleFile:?]
        at io.socket.engineio.client.transports.WebSocket.write(WebSocket.java:140) ~[bundleFile:?]
        at io.socket.engineio.client.Transport$3.run(Transport.java:100) ~[bundleFile:?]
        at io.socket.thread.EventThread.exec(EventThread.java:55) ~[bundleFile:?]
        at io.socket.engineio.client.Transport.send(Transport.java:95) ~[bundleFile:?]
        at io.socket.engineio.client.Socket$7.call(Socket.java:371) ~[bundleFile:?]
        at io.socket.emitter.Emitter$OnceListener.call(Emitter.java:164) ~[bundleFile:?]
        at io.socket.emitter.Emitter.emit(Emitter.java:117) ~[bundleFile:?]
        at io.socket.engineio.client.Transport.onOpen(Transport.java:114) ~[bundleFile:?]
        at io.socket.engineio.client.transports.WebSocket.access$000(WebSocket.java:24) ~[bundleFile:?]
        at io.socket.engineio.client.transports.WebSocket$1$1.run(WebSocket.java:58) ~[bundleFile:?]
        at io.socket.thread.EventThread$2.run(EventThread.java:80) [bundleFile:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]

but I don’t know whether it is a reason or just symptom.
Can anyone hint me how I can narrow down to the root cause?

1 Like

How did you install openHAB? I know openHABian adds some EXTRA_JAVA_OPTS settings.

In Raspbian I used apt-get

I had a problem like this not long ago. The issue was in my own binding not closing connections properly and because of that a lot of threads was created. Configure htop so that it shows thread names and hopefully you’ll get a clue whats creating those threads.

2 Likes

@gitMiguel, any other hint how to find it out? I didn’t manage to clearly identify the problem. My tasks are still growing, but after disabling CloudConnector (which was my suspect but not proven) are growing slower. I can’t conclude binding or add-on name from htop, I can only see the main process repeated X times or simply if configured thread names, just java in the tree of main process

openhab  19948     1 25 22:23 ?        00:11:02 /usr/bin/java -Dopenhab.home=/usr/share/openhab2 -Dopenhab.conf=/etc/openhab2 -Dopenhab.runtime=/usr/share/openhab2/runtime -Dopenhab.userdata=/var/lib/openhab2 -Dopenhab.logdir=/var/log/openhab2 -Dfelix.cm.dir=/var/lib/openhab2/config -Djava.library.path=/var/lib/openhab2/tmp/lib -Djetty.host=0.0.0.0 -Djetty.http.compliance=RFC2616 -Dorg.ops4j.pax.web.listening.addresses=0.0.0.0 -Dorg.osgi.service.http.port=8080 -Dorg.osgi.service.http.port.secure=8443 -Djava.awt.headless=true -Duser.country=PL -Duser.language=pl -Djava.endorsed.dirs=/usr/lib/jvm/java-8-oracle/jre/jre/lib/endorsed:/usr/lib/jvm/java-8-oracle/jre/lib/endorsed:/usr/share/openhab2/runtime/lib/endorsed -Djava.ext.dirs=/usr/lib/jvm/java-8-oracle/jre/jre/lib/ext:/usr/lib/jvm/java-8-oracle/jre/lib/ext:/usr/share/openhab2/runtime/lib/ext -Dkaraf.instances=/var/lib/openhab2/tmp/instances -Dkaraf.home=/usr/share/openhab2/runtime -Dkaraf.base=/var/lib/openhab2 -Dkaraf.data=/var/lib/openhab2 -Dkaraf.etc=/var/lib/openhab2/etc -Dkaraf.log=/var/log/openhab2 -Dkaraf.restart.jvm.supported=true -Djava.io.tmpdir=/var/lib/openhab2/tmp -Djava.util.logging.config.file=/var/lib/openhab2/etc/java.util.logging.properties -Dkaraf.startLocalConsole=false -Dkaraf.startRemoteShell=true -classpath /usr/share/openhab2/runtime/lib/boot/org.apache.karaf.diagnostic.boot-4.2.7.jar:/usr/share/openhab2/runtime/lib/boot/org.apache.karaf.jaas.boot-4.2.7.jar:/usr/share/openhab2/runtime/lib/boot/org.apache.karaf.main-4.2.7.jar:/usr/share/openhab2/runtime/lib/boot/org.apache.karaf.specs.activator-4.2.7.jar:/usr/share/openhab2/runtime/lib/boot/osgi.core-6.0.0.jar org.apache.karaf.main.Main

1 Like

You followed my advice in the second post & it did not help?

Sorry @Bruce_Osborne what was your advice? I understood it was just questions how I installed my oh

Installing openHABian as recommended for the Pi sets some memory options for Java since the Pi has very limited memory.

In my initial post I mentioned I am on RPi Raspbian Buster…

You can install openHABian on Raspbian Buster. Just follow the instructions for a generic Linux install. I have done that many times.

Bruce… op states in first post

op states

me thinks this is not a matter of an improper installation

@Andrew_Rowe, exactly. I haven’t seen this problem before upgrade therefore I think one or more binding leak. Don’t know though how to find them

Your snippets from logs are symptoms. They just show the exception thrown when something tries to create a thread and there is no more available. They don’t show what is using them in the first place.

Is everything up to date? Disable all your bindings and then enable them one by one. When the thread count spikes you might have found the root cause/binding.

Yes exactly. I was after the tree view and thread names. When I had almost 2k “dead” threads I could spot it from the listing that there’s just too many.

See my topic about OOM.

I had very similar errors to this a few weeks back when I moved apartments. The reason seems to be that one or more bindings worked incorrectly when some devices/bridges/vacuum cleaner etc were disconnected. OpenHAB ate all memory and eventually had these errors. After connecting my devices or setting them as disabled in PaperUI everything has been running fine for about 2 weeks now.

I.e my best tip if you have not tried is to disable all devices that might not respond/be connected but still be included in your configurations/setup.

2 Likes

Thanks guys, thanks to your help, especially @gitMiguel latest post, it seems I identified the leaking binding. Need some more time to confirm it 100% as eventhough it stopped growing like before every couple of seconds, maybe I entered some coincidental period. Need some more time for observation but it seemed to be Xiaomi Mi IO Binding which started opening connections endlessly (more then 850 in the last 12 hours). I’m not solving it yet to confirm although I think this was the problem

3 Likes

Hey,

did you mention it to the binding creator. I am observing the same behaviour. Don’t want to hassle him twice.

2 Likes

Create an issue on the GitHub page and give a link to this thread in the issue, plus post a link to the issue here in this thread. It is helpful to know how many people get issues like this and if they are not easily reproducible to look at what is common between effected peoples setups.

Also worth trying older versions of a binding, if you find version x does not have the issue it helps find the cause quicker.

1 Like

@Simsal I haven’t mentioned it to the creator as there is a thread (where creator is involved) which mentions that in the recent release there is a need to configure some device Id or some other parameter otherwise binding will behave improperly (I tried to find this thread to quote it here but for some reason can’t do it easily). I don’t have that device anymore to test biding deeply therefore I haven’t mentioned anything yet as maybe it is abovementioned problem. I suggest you raise it with binding creator. I will keep digging for the thread with problem description and post it here if find anything.

@Simsal this may happen if you have text config and/or connectivity issues.
If you have text config, please take a look at the readme and fully define the thing according to the example. If you have defined your thing with text config, ensure your IP and token are correctly populated. If you continue to have the issue, please send a debug log (longer period) so we can see what is happening that may trigger the behaviour.

Hi Marcel,
thank you for reaching out to me. Appreciate it.

I recreated all my things this morning. Now I have just Paper UI created things with token from the cloud.
Items are via text config. This shouldn’t be a problem?
I will monitor now if the problem keeps coming up, if yes i will get you a debug log.