Too much time before a Sonos thing becomes definitively ONLINE

For anyone who is having this issue, please try to move to 3.1.0-S2130 or later. Work has been done to resolve thread exhaustion issues which should be good after there.

Hi @morph166955

Just tested latest sonos binding version (I use build 2142).
I wanted to test the behaviour of my first post of this thread:
I switched of the power of my sonos device (unplugged device) and switched it on again. I still get the messages:

2021-01-14 16:51:46.130 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'sonos:CONNECTAMP:celler' changed from OFFLINE (COMMUNICATION_ERROR): Der Sonos Speaker RINCON_000 ist nicht im Netzwerk verfĂŒgbar. to OFFLINE (COMMUNICATION_ERROR)
2021-01-14 16:51:46.141 [INFO ] [openhab.event.InboxRemovedEvent     ] - Discovery Result with UID 'sonos:CONNECTAMP:RINCON_000' has been removed.
2021-01-14 16:51:49.715 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'sonos:CONNECTAMP:celler' changed from OFFLINE (COMMUNICATION_ERROR) to ONLINE
2021-01-14 16:51:49.722 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'sonos:CONNECTAMP:celler' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): Der Sonos Speaker RINCON_000 ist nicht im Netzwerk verfĂŒgbar.

After about 10 minutes the device is recognized:

2021-01-14 17:00:42.673 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing ‘sonos:CONNECTAMP:celler’ changed from OFFLINE (COMMUNICATION_ERROR): Der Sonos Speaker RINCON_000 ist nicht im Netzwerk verfĂŒgbar. to ONLINE

Do you think the rediscovery of the ONLINE status can be improved, to become quicker?

Thanks,
Thomas

I submitted a PR for JuPnP yesterday to make this setting configurable. It should seriously reduce the timer.

1 Like

Do you have a link to the PR?

Thanks a lot!
Can you give an example how and where to set the configuration via config file?

in services/runtime.cfg:

org.jupnp:retryAfterSeconds=10

For those interested in testing, this is a very experimental image (although it seems to be stable on my system) that may resolve this issue. In fact, it may significantly improve performance of upnp devices.
This version of jupnp has some extra code on the threads that let them work without a pool. There is a chance that this can cause instant spikes in memory or cpu, but they should be short lived.

In addition to retryAfterSeconds, add:

org.jupnp:asyncThreadPool=false
org.jupnp:mainThreadPool=false

Please make the config addition with OH stopped, it will be highly disruptive if it’s active.

Please report success or failure so we can determine if this should get merged. Thanks!

To test the jar, do I have to copy it to the addon dir, like I do for bindings?

No. You need to login to the karaf console and do bundle:update.

1 Like

I tested it but still no improvement. I do see the old behaviour after plug out power and plug it in again. The Thing needs about 10minutes after getting online again.

Log:

2021-02-03 09:33:05.347 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing sonos:CONNECTAMP:celler' changed from OFFLINE (COMMUNICATION_ERROR): Der Sonos Speaker RINCON_000E58D0E73001400 ist nicht im Netzwerk verfĂŒgbar. to OFFLINE (COMMUNICATION_ERROR)
2021-02-03 09:33:08.243 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing sonos:CONNECTAMP:celler' changed from OFFLINE (COMMUNICATION_ERROR) to ONLINE
2021-02-03 09:33:08.249 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing sonos:CONNECTAMP:celler' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): Der Sonos Speaker RINCON_000E58D0E73001400 ist nicht im Netzwerk verfĂŒgbar.
......
2021-02-03 09:41:41.307 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing sonos:CONNECTAMP:celler' changed from OFFLINE (COMMUNICATION_ERROR): Der Sonos Speaker RINCON_000E58D0E73001400 ist nicht im Netzwerk verfĂŒgbar. to ONLINE

Setup runtime.cfg

org.jupnp:asyncThreadPool=false
org.jupnp:mainThreadPool=false
org.jupnp:retryAfterSeconds

JUPnP/Sonos version
242 │ Active │ 80 │ 3.1.0.202102030342 │ openHAB Add-ons :: Bundles :: Sonos Binding
229 │ Active │ 80 │ 2.6.0.202101312125 │ JUPnP Library

Here the syntax how I have updated the lib (thanks @morph166955):
Login into karaf and get the ID of the JUPnP Lib, in my case 229.
Link to new lib was posted by morph.
bundle:update 229 https://github.com/morph166955/jupnp/releases/download/2.6.0-d073e87/org.jupnp-2.6.0-SNAPSHOT-d073e87.jar

You need to set retryAfterSeconds to a value that you want it to recover after. That will bring them back faster.

I’ll post a log config later when I can copy paste it in. I want to see that the threads are working as expected.

Of course I have to set it, blame on me :yum:
Totally forget that, sorry!

But now, IT IS WORKING :smiling_face_with_three_hearts:

I will keep running the system and report after some days if I face some unwanted side effects!
I have set the value to 10. What value do you recommend?

Mine is set to 10. I’m curious how you know it’s working already? Do your devices fail that quickly?

Ok let’s make sure this is all working as expected. Please do the following (assuming you’re on a linux):

  1. Shutdown openhab

  2. edit /var/lib/openhab/etc/log4j2.xml

At the top add:

<RollingRandomAccessFile fileName="${sys:openhab.logdir}/jupnp.log" filePattern="${sys:openhab.logdir}/jupnp.log.%i" name="JUPNP">
<PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss.SSS} [%-5.5p] [%-36.36c] - %m%n"/>
<Policies>
<OnStartupTriggeringPolicy/>
<SizeBasedTriggeringPolicy size="16 MB"/>
</Policies>
</RollingRandomAccessFile>

Then on down add (you may already have the first jupnp line.):

<Logger level="FATAL" name="org.jupnp"/>
<Logger additivity="false" level="DEBUG" name="org.jupnp.OSGiUpnpServiceConfiguration">
<AppenderRef ref="JUPNP"/>
</Logger>
  1. Start openhab

  2. tail -f /var/log/openhab/jupnp.log | grep -i thread

You should see logs that look like:

2021-02-03 08:45:29.986 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Skipping mainThreadPool creation.
2021-02-03 08:45:29.986 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Skipping asyncThreadPool creation.
2021-02-03 08:45:30.018 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-registry
2021-02-03 08:45:30.026 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-router
2021-02-03 08:45:30.028 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-stream
2021-02-03 08:45:30.033 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-router
2021-02-03 08:45:30.033 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-router
2021-02-03 08:45:30.038 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-stream
2021-02-03 08:45:30.049 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-control
2021-02-03 08:45:30.065 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-router
2021-02-03 08:45:30.066 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-router
2021-02-03 08:45:30.079 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-router
2021-02-03 08:45:30.087 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-router
2021-02-03 08:45:30.094 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-router
2021-02-03 08:45:30.105 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-search
2021-02-03 08:45:30.106 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-router
2021-02-03 08:45:30.109 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-router
2021-02-03 08:45:30.122 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-search
2021-02-03 08:45:30.123 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-search
2021-02-03 08:45:30.123 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-search
2021-02-03 08:45:30.138 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-router
2021-02-03 08:45:30.139 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-router
2021-02-03 08:45:30.155 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-router
2021-02-03 08:45:30.174 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-router
2021-02-03 08:45:30.174 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-router

If you see anything about queueing, upnp-main, or upnp-async then the threadpools are still happening. If you see the “New Named Thread” then you’re on the test code. To note, it can take a minute or two before anything shows up. JuPnP doesn’t start until after all your configs are loaded and bindings are about to load.

The old behaviour was, the device came back after about 10min, after poweroff/on. That really means pull out the power plug and plug in again.
Using the new lib, the device comes back immediately.

I will give you feedback tomorrow

I do see the “New Named Thread” entries.

2021-02-04 11:08:50.731 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-notification
2021-02-04 11:08:50.812 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-router
2021-02-04 11:08:50.815 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-router
2021-02-04 11:08:50.816 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-notification
2021-02-04 11:08:50.819 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-router
2021-02-04 11:08:50.821 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-notification
2021-02-04 11:08:50.821 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-router
2021-02-04 11:08:50.824 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-router
2021-02-04 11:08:50.825 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-notification
2021-02-04 11:08:50.826 [DEBUG] [g.jupnp.OSGiUpnpServiceConfiguration] - Create New Named Thread upnp-router

In the standard openhab.log I have seen:
I do not know if this belongs to the jupnp lib change?

2021-02-04 10:30:01.279 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception:
java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
at java.lang.Thread.start0(Native Method) ~[?:?]
at java.lang.Thread.start(Thread.java:803) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:937) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.ensurePrestart(ThreadPoolExecutor.java:1583) ~[?:?]
at java.util.concurrent.ScheduledThreadPoolExecutor.reExecutePeriodic(ScheduledThreadPoolExecutor.java:360) ~[?:?]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:307) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:834) [?:?]

Have you only seen the OutOfMemoryError since loading the new jar? I’ve seen that recently on other threads but it’s not related to this AFAIK.

Yes, but I did also update to latest openhab snapshot yesterday - so hard to say, maybe also the openhab update had an influence

Till yesterday I have seen this message only once.