java.lang.OutOfMemoryError: unable to create new native thread

Nope… that’s just fine. It is what I use.


I posted my logs in another thread.
The Mi IO binding already consumes 1900 threads for me after 12 hours of uptime.

You can see the threads causing it there.
And the thread count starts growing right after a restart of openHAB. I did not see this before I upgraded the binding.

So, after two days, I don’t see the problem anymore.
For me, the solution was to delete all the old miio things and add new ones with the token.


Huh, I have the same issue but I don’t use the Xiaomi binding… any ideas?

EDIT:
If you are using the Amazon Echo binding, this might be the reason:
check here:


The reason seems to be that one or more bindings behave incorrectly when some devices/bridges/vacuum cleaners etc. are disconnected. openHAB ate all the memory and eventually ran into these errors.

Same here. For me it was a vacuum I left switched off. My openHABian (Pi3B, 1GB) ran into OOM errors within 24 hours!

I wish openHAB would detect and warn about this scenario somehow, since that OOM error isn’t trivial to hunt down.

Cheers

@brevilo which version of the miio binding are you using?

There were two fixes for this included in version 2.5.8.
Do you already have this version of the binding and still run into the problem?

Hey Marcel. Yep, I’m on 2.5.8. I don’t have hard proof that your binding was indeed the culprit, but the vacuum is the only device/item that can get (and was) disconnected. I’ll keep an eye on things over the next few days and report back.

Cheers

If indeed you have the issue, you may be able to run dev:dump-create in the Karaf console, which creates a zip file. Inside that zip there is a file threads.txt.

If you see hundreds of threads related to miio in there, then… Houston, we have a problem.

Getting the same error on my Ubuntu 20.04 VIM3 with openHABian 3.

Any clue what could be wrong?

Same problem on FreeBSD 12, OpenJDK 11.0.11+9-1. shell:threads shows them all like this:

"Mi IO MessageSenderThread" Id=2143 in TIMED_WAITING
    at java.base@11.0.11/java.lang.Thread.sleep(Native Method)
    at org.openhab.binding.miio.internal.transport.MiIoAsyncCommunication$MessageSenderThread.run(MiIoAsyncCommunication.java:275)

From a quick look at the code, I wonder if the thread leak happens when isAlive() returns false.
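
For what it’s worth, a quick standalone check (plain Java, nothing binding-specific, the demo class name is made up) confirms that a thread sleeping in Thread.sleep() is in TIMED_WAITING but still reports isAlive() == true, so the waiting threads in the dump should not by themselves trigger new ones:

    // Standalone demo, not binding code: a sleeping thread is TIMED_WAITING yet still alive.
    public class IsAliveCheck {
        public static void main(String[] args) throws InterruptedException {
            Thread t = new Thread(() -> {
                try {
                    Thread.sleep(5000); // same state as the threads in the dump
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "Mi IO MessageSenderThread (demo)");
            t.start();
            Thread.sleep(100);                 // let it enter the sleep
            System.out.println(t.getState());  // TIMED_WAITING
            System.out.println(t.isAlive());   // true
            t.interrupt();
            t.join();
            System.out.println(t.isAlive());   // false, only once it has finished
        }
    }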

Did you do the dev:dump-create action? That should give a good hint.

In case the running-out-of-threads issue is related to the miio binding, you should see hundreds or thousands of waiting threads related to miio.

Expected behaviour is 2 or 3 threads per miio thing, a few more for the HTTP part (cloud), and 2 for discovery.

Yes, the more than a thousand waiting "Mi IO MessageSenderThread" threads are only present in shell:threads (and at the system level, e.g. top -H) but not in threads.txt of the dump. (Though the style of output is the same between shell:threads and the dump; aren’t they supposed to be the same thing?)

That’s very odd; if I run it on my system the two are equal: what I see in the console is what I see in the dump. Can you share the zip file, to see if anything can be learned from it? (The hprof file included can be loaded in Eclipse, which has a suspected-leak tool.)

Btw, are you running with text-file config or with GUI config for your things?

Oops, I was looking at the wrong dump. Yes, it’s all in there too. Could the hprof file contain sensitive information? (Also, thread leaks aren’t like memory leaks anyway…)

Using GUI config for everything.

I don’t know… sorry, this kind of analysis is rather new to me. Maybe check that first before sharing.

I’m trying to understand the issue that you have and what might cause it.

Wrt your remark… my understanding of isAlive(): if it is false, the thread is finished, hence the binding starts a new receiver thread. A waiting thread should report alive and not cause another thread to be started. The function is a synchronised one, meaning there should not be multiple instances of the function running at the same time.
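
Roughly, the pattern being discussed looks like this (a simplified sketch only, not the literal binding code; class and method names here are illustrative):

    // Simplified sketch of the "respawn the sender thread when it has finished"
    // pattern described above. Class and method names are illustrative only.
    public class SenderRespawnSketch {

        private Thread senderThread = newSenderThread();

        private Thread newSenderThread() {
            Thread t = new Thread(this::senderLoop, "Mi IO MessageSenderThread");
            t.setDaemon(true);
            return t;
        }

        // synchronised: only one caller at a time can decide whether to respawn
        private synchronized Thread ensureSenderThread() {
            if (!senderThread.isAlive()) {
                // only reached when the previous thread has finished (or was never
                // started); a thread waiting in Thread.sleep() is still alive and
                // should not end up here
                senderThread = newSenderThread();
                senderThread.start();
            }
            return senderThread;
        }

        private void senderLoop() {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    Thread.sleep(1000); // placeholder for the send/receive loop
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        public static void main(String[] args) {
            SenderRespawnSketch sketch = new SenderRespawnSketch();
            System.out.println("Started: " + sketch.ensureSenderThread().getName());
        }
    }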

For me the most likely scenario is that something goes wrong at the very start of the thing… that is the time when OH sends many, many refresh commands for the channels, maybe with some strange race condition and the device not responding or parameters not being available (in the getConnection part maybe something goes wrong). I would expect you to see evidence of that in the logs though (things offline, repeated failed pings/deviceId messages, etc.).