Too much time before a Sonos thing becomes definitively ONLINE

HALLO01 · January 2, 2019, 2:01pm

Since I have upgraded from 2.4.0 snapshot to 2.4.0 stable I have a strange behaviour with the sonos binding:

When I switch off the power of the sonos amp and switch it on I see in the openhab log with stable 2.4.0 and latest snapshot:

2019-01-01 14:20:07.058 [hingStatusInfoChangedEvent] - 'sonos:CONNECTAMP:celler' changed from OFFLINE (COMMUNICATION_ERROR) to OFFLINE (COMMUNICATION_ERROR): Der UPnP
Speaker RINCON_000E58D0E73001400 ist nicht registriert.
2019-01-01 14:20:08.189 [hingStatusInfoChangedEvent] - 'sonos:CONNECTAMP:celler' changed from OFFLINE (COMMUNICATION_ERROR): Der UPnP Speaker RINCON_000E58D0E73001400
ist nicht registriert. to ONLINE
2019-01-01 14:20:08.211 [hingStatusInfoChangedEvent] - 'sonos:CONNECTAMP:celler' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): Der Sonos Speaker RINCON_000E58D
0E73001400 ist nicht im Netzwerk verfügbar.

It is ONLINE and immediately it goes OFFLINE.
Only a restart of openhab helps to get it online (or I wait 10 minutes). But when I switch off the power and switch it on I have the same behaviour again.

The old and correct behaviour (2.4.0 snapshot):

2019-01-01 14:08:19.986 [hingStatusInfoChangedEvent] - 'sonos:CONNECTAMP:RINCON_000E58D0E73001400' changed from OFFLINE (COMMUNICATION_ERROR): Der Sonos Speaker RINCON
_000E58D0E73001400 ist nicht im Netzwerk verfügbar. to OFFLINE (COMMUNICATION_ERROR)
2019-01-01 14:08:22.940 [hingStatusInfoChangedEvent] - 'sonos:CONNECTAMP:RINCON_000E58D0E73001400' changed from OFFLINE (COMMUNICATION_ERROR) to ONLINE

I have already posted this in this thread, but in the meantime I am not sure if this "[WARN ] [e.jetty.util.thread.ThreadPoolBudget] " problem belongs to my problem.

H102 · January 2, 2019, 2:31pm

Maybe this topic can help with the warning.

HALLO01 · January 2, 2019, 7:08pm

Thanks a lot, I have set the values and till now I haven’t seen the warnings.

The sonos problem is still there.

Lolodomo · January 2, 2019, 7:11pm

I will check if I can reproduce.

HALLO01 · January 2, 2019, 7:24pm

Thank’s a lot for your quick support

For your info: There is also another thread regarding this topic ([OH 2.4.0 M7] Sonos online and immediately offline again) where other users face the same issue

Lolodomo · January 2, 2019, 8:40pm

I can confirm the issue.
Considering only the logs and the code, my analysis is that the UPnP discovery code notifies the binding that the Sonos device is present; then any invocation of the action DeviceProperties:GetZoneInfo by the binding is failing !
It looks like a bug in the jUPnP library.
I will investigate a little more.

Lolodomo · January 2, 2019, 8:41pm

After several minutes, the thing finally goes ONLINE.

HALLO01 · January 2, 2019, 8:47pm

I can confirm that

But I would like to have the old behavior where it goes ONLINE immediately

Lolodomo · January 2, 2019, 9:38pm

Of course.

Here is the sequence of logs with more detailed logs:

22:05:30.807 [DEBUG] [covery.ZonePlayerDiscoveryParticipant] - Discovered a Sonos 'sonos:PLAY5' thing with UDN 'RINCON_000E588EAE6601400'
22:05:30.813 [DEBUG] [sport.upnp.internal.UpnpIOServiceImpl] - Device 'RINCON_000E588EAE6601400' reachability status changed to 'true'
22:05:30.831 [DEBUG] [os.internal.handler.ZonePlayerHandler] - Polling job
22:05:30.837 [INFO ] [ome.event.ThingStatusInfoChangedEvent] - 'sonos:PLAY5:chambre' changed from OFFLINE (COMMUNICATION_ERROR): Le périphérique UPnP RINCON_000E588EAE6601400 n'est pas encore découvert. to ONLINE
22:05:30.855 [TRACE] [sport.upnp.internal.UpnpIOServiceImpl] - Invoking Action 'GetZoneInfo' of service 'DeviceProperties' for participant 'RINCON_000E588EAE6601400'
22:05:30.871 [DEBUG] [sport.upnp.internal.UpnpIOServiceImpl] - Current state of service prevents invoking that action. Connection error or no response received.
22:05:30.878 [DEBUG] [covery.ZonePlayerDiscoveryParticipant] - Created a DiscoveryResult for device '192.168.x.xx - Sonos Play:5' with UDN 'RINCON_000E588EAE6601400'
22:05:30.886 [DEBUG] [os.internal.handler.ZonePlayerHandler] - Sonos player RINCON_000E588EAE6601400 is not available in local network

The problem is that error: “Current state of service prevents invoking that action”.

What I can see 4 seconds before in the logs:

22:05:26.332 [DEBUG] [covery.ZonePlayerDiscoveryParticipant] - Discovered a Sonos 'sonos:PLAY5' thing with UDN 'RINCON_000E588EAE6601400'
22:05:26.329 [DEBUG] [sport.upnp.internal.UpnpIOServiceImpl] - A GENA subscription 'ZoneGroupTopology' for device 'uuid:RINCON_000E588EAE6601400' was ended
22:05:26.329 [DEBUG] [sport.upnp.internal.UpnpIOServiceImpl] - A GENA subscription 'AudioIn' for device 'uuid:RINCON_000E588EAE6601400' was ended
22:05:26.323 [DEBUG] [sport.upnp.internal.UpnpIOServiceImpl] - A GENA subscription 'DeviceProperties' for device 'uuid:RINCON_000E588EAE6601400' was ended
22:05:26.347 [DEBUG] [sport.upnp.internal.UpnpIOServiceImpl] - Device 'RINCON_000E588EAE6601400' reachability status changed to 'false'

As soon as the device is detected through UPnP, some subscriptions are ended and the current state is first set to not reachable.

4 seconds later, the state is set to reachable but action invocation fails and I don’t know why.

I think it could be a problem either in the jUPnP library.

@Kai for information

Lolodomo · January 2, 2019, 10:07pm

I forgot to mention but I reproduced the issue in 2.5 snapshot 480.
So the problem is certainly still present in current master.

Lolodomo · January 2, 2019, 10:10pm

Could you rename the topic into “Too much time before a Sonos thing becomes definitively ONLINE” ?

Issue created, in ESH as I don’t know exactly where is the problem except that it was reproduced for Sonos.

HALLO01 · January 5, 2019, 1:16pm

Is it possible to downgrade the installed 2.5.1 jupnp library in current 2.5.0 snapshot?
If so, I would try to install the 2.4.0 version.

HALLO01 · January 8, 2019, 10:24am

Thanks to @Lolodomo I got the information that the jUPnP bundle is somehow broken - see ESH issue above.
So I was able to implement a hopefully temporary () workaround:
I implemented a rule which always restarts the jUPnP bundle when the sonos Thing goes offline.

This is working because the first time after a power on the sonos thing is always recognized immediately. The problem starts with the next power off/on. You then can either restart openhab or just restart the jUPnP bundle - that fixes the behaviour.

General question: How is the status regarding a possible jUPnP fix, does anybody know?

khb3bw2 · January 14, 2019, 8:29pm

Hello Thomas,

I would be interessted in some more details about your workaround - is there a chance you share it here? I guess an offical fix will definetly take some time and I would like to see if this could help me too.

I’m in the same situation like you, since I updated to OH2.4 my Sonos speakers can no longer be controlled in a stable way via Basic UI.

My log is flooded ramdonly with below errors for any of my speakers:

[hingStatusInfoChangedEvent] - ‘sonos:PLAY3:XYZ’ changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): The Sonos player RINCON_XBE932481F6601400 is not available in local network.

[hingStatusInfoChangedEvent] - ‘sonos:PLAY3:XYZ’ changed from OFFLINE (COMMUNICATION_ERROR): The Sonos player RINCON_XBE932481F6601400 is not available in local network. to ONLINE

It looks even like the error message is partially upside down with that “to ONLINE” statement at the end of the string.

Thanks & Regards,
Christoph

HALLO01 · January 14, 2019, 9:08pm

Sure, I will do so.
I will post it tomorrow, now it’s time for bed

HALLO01 · January 15, 2019, 7:37am

My rule to restart the jUPnP bundle:

rule "jupnp bundle restart"
when
  Thing "sonos:CONNECTAMP:amp" changed from ONLINE to OFFLINE
then
executeCommandLine("sudo -u openhab ssh -p 8101 -i /home/openhabian/rsh/openhab.id_rsa openhab@localhost bundle:restart org.jupnp",6000)
end

I am using openhabian on an raspberry pi 3.
That the rule is able to call the karaf call without entering a password you have to generate a ssh key.
I used this tutorial

I created a directory called rsh:
mkdir /home/openhabian/rsh
enter the directory and generate ssh key:
sudo -u openhab ssh-keygen -t rsa -f openhab.id_rsa
Add content of the public key file to /srv/openhab2-userdata/etc/keys.properties
openhab=<yourkey>,_g_:admingroup

Test it:
sudo -u openhab ssh -p 8101 -i /home/openhabian/rsh/openhab.id_rsa openhab@localhost

Additionally to the tutorial I had to do the following, that I was able to restart the bundle without entering a password:
First reset root password:
sudo passwd root
Next, add openhab to sudoers file:
su -
visudo
Finally add openhab user to sudoers file:
add: openhab ALL=(ALL) ALL

Then I was able to restart the bundle from the rule

HALLO01 · April 15, 2019, 12:49pm

@Lolodomo: Do you have any news about a possible fix of the UPnP library ?

morph166955 · June 8, 2019, 5:01pm

I’m having a similar but slightly different issue. My sonos speakers come on relatively quickly. After a few days they all go offline and stay offline. A reload of org.jupnp as mentioned above resolves the issue for a few days. I have a rule now that monitors the thing state as mentioned above and reloads it when the crash happens. I noticed this happened when I moved from 2.5.0M1 to one of the recent nightly builds.

jwiseman · November 20, 2019, 3:04pm

Best, Jay

morph166955 · February 2, 2020, 8:08pm

I’ve updated the bug with the following info:

I decided to turn on jupnp debugging for a few minutes while my Sonos PlayBar was in a bad state. Every second I get this message set:

2020-02-02 13:40:34.657 [DEBUG] [org.jupnp.transport.Router ] - Sending via TCP unicast stream: (OutgoingActionRequestMessage) POST http://REDACTED:1400/DeviceProperties/Control
2020-02-02 13:40:34.657 [DEBUG] [org.jupnp.transport.spi.StreamClient] - Will not attempt request because it failed in the last 600 seconds: (OutgoingActionRequestMessage) POST http://REDACTED:1400/DeviceProperties/Control

Then, every 600 seconds I get:

2020-02-02 13:44:11.804 [DEBUG] [org.jupnp.transport.Router ] - Sending via TCP unicast stream: (OutgoingActionRequestMessage) POST http://REDACTED:1400/DeviceProperties/Control
2020-02-02 13:44:21.804 [INFO ] [org.jupnp.transport.spi.StreamClient] - Timeout of 10 seconds while waiting for HTTP request to complete, aborting: (OutgoingActionRequestMessage) POST http://REDACTED:1400/DeviceProperties/Control

2020-02-02 13:54:22.222 [DEBUG] [org.jupnp.transport.Router ] - Sending via TCP unicast stream: (OutgoingActionRequestMessage) POST http://REDACTED:1400/DeviceProperties/Control
2020-02-02 13:54:32.222 [INFO ] [org.jupnp.transport.spi.StreamClient] - Timeout of 10 seconds while waiting for HTTP request to complete, aborting: (OutgoingActionRequestMessage) POST http://REDACTED:1400/DeviceProperties/Control

A CURL (with POST) to the URL gives me:
<s:Envelope xmlns:s=“http://schemas.xmlsoap.org/soap/envelope/” s:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"><s:Body><s:Fault>s:ClientUPnPError401</s:Fault></s:Body></s:Envelope>
NOTE: I get this from ALL Sonos speakers I have. I’m not sure if I’m missing something on the query to get a bad authentication.

I have verified that I have <1ms ping times to the bar (no packet loss) and they are on the same network so the Sonos TTL=1 is not an issue here.

I let this go for about an hour with the same result. Then I restarted OH2. All of my speakers immediately came online and are stable. Given this, I do NOT believe the issue to be the Sonos itself as a “random” restart of OH2 should not clear out the issue on the Sonos.

Going through the debug of the OH2 restart it looks like OH2 effectively registers as an endpoint with the Sonos speaker. My assumption (since I can’t just make one fail at will) is that this registration becomes invalid at some point. Assuming that this theory is correct, at this point I would suggest that the Sonos binding be modified to detect this issue and “self heal” by restarting the registration process with that specific speaker when the HTTP request to the control URL fails to reply.