Non deterministic malfunction of shelly 1 plus

Hello,

I’m not sure if this is an OpenHAB related bug or a bug in the Shelly device.

I have about 20 Shelly 1 Plus devices running in detached mode. Each action triggers a script in OpenHAB (e.g. switching the light). Two devices are behaving strangely. Sometimes after a few days the rule is not triggered (also the log in OpenHAB shows nothing). After rebooting the Shelly everything works fine again.

Here is the debug code from the shelly web ui when the shelly is working properly (with one button press):

  {
    "ts": 1702828414.872,
    "level": 2,
    "data": "shelly_notification:209 Event from input:0: {\"component\":\"input:0\", \"id\":0, \"event\":\"btn_down\", \"ts\":1702828414.86}"
  },
  {
    "ts": 1702828414.973,
    "level": 2,
    "data": "shelly_notification:209 Event from input:0: {\"component\":\"input:0\", \"id\":0, \"event\":\"btn_up\", \"ts\":1702828414.96}"
  },
  {
    "ts": 1702828415.373,
    "level": 2,
    "data": "shelly_notification:209 Event from input:0: {\"component\":\"input:0\", \"id\":0, \"event\":\"single_push\", \"ts\":1702828415.36}"
  },
  {
    "ts": 1702828418.302,
    "level": 1,
    "data": "shos_rpc_inst.c:355     0x3ffdc800: duplicate id 'shellyplus1-7c87ce58b858'"
  },
  {
    "ts": 1702828418.316,
    "level": 2,
    "data": "shos_rpc_inst.c:230     Shelly.GetStatus via HTTP_in POST 10.240.0.201:45848"
  },
  {
    "ts": 1702828463.361,
    "level": 1,
    "data": "shos_rpc_inst.c:355     0x3ffdc800: duplicate id 'shellyplus1-7c87ce58b858'"
  },
  {
    "ts": 1702828463.375,
    "level": 2,
    "data": "shos_rpc_inst.c:230     Shelly.GetStatus via HTTP_in POST 10.240.0.201:36032"
  },

The string “duplicate id” is noticeable. The debug output for the shelly if it does not work properly:

  {
    "ts": 1702828075.604,
    "level": 2,
    "data": "shelly_notification:209 Event from input:0: {\"component\":\"input:0\", \"id\":0, \"event\":\"btn_down\", \"ts\":1702828075.59}"
  },
  {
    "ts": 1702828075.736,
    "level": 2,
    "data": "shelly_notification:209 Event from input:0: {\"component\":\"input:0\", \"id\":0, \"event\":\"btn_up\", \"ts\":1702828075.72}"
  },
  {
    "ts": 1702828076.105,
    "level": 2,
    "data": "shelly_notification:209 Event from input:0: {\"component\":\"input:0\", \"id\":0, \"event\":\"single_push\", \"ts\":1702828076.09}"
  },
  {
    "ts": 1702828102.399,
    "level": 2,
    "data": "shos_rpc_inst.c:230     Shelly.GetStatus via HTTP_in POST 10.240.0.201:34740"
  },

There is no “duplicate id” string here. I have no idea how to do any more debugging on this problem. Do you have any ideas?

Greetings,
Mario

I think device-id is not about the second part in the string like shellyplus1-7c87ce58b858 but about the id you find in the section device info if you log in to the device where you find either a 6 or a 12 digit string/number.
Have a look to this blogpost RobIII: Shelly: The device is already owned by another user/account

1 Like

Hi @Wolfgang_S
Thanks for your reply.
I can confirm, that the second part in the string “shellyplus1-7c87ce58b858” is the device id that is also shown in the shelly UI.
So the ID is a 12 digit string and the shelly is not connected to the cloud, so I think the linked blogpost is not relevant to my problem.

The strange thing is, that in my example where the “duplicate ID” message appears, this device works fine. When this message is missing, the switching is not recognized.

Any ideas?? :face_with_diagonal_mouth:

At this moment I have no other idea than trying to contact the Shelly support.

Can anyone confirm, that there is the “duplicate id” debug message, when using the shelly in detached mode?

I’ve got a strangely behaving shelly 1 plus. Switch is in detached mode. It was working fine when I did the setup until it stopped working properly a couple of hours later.

Connected.
08:31:22
shelly_debug.cpp:135 Streaming logs to 192.168.178.20:61877 (ws 1 json 1)
08:31:11
shos_dns_sd.cpp:519 0x3ffd73b8(st): Announced ShellyPlus1-xxx61FD926C4 any@any (192.168.178.46)
08:31:12
shos_rpc_inst.c:355 0x3ffde010: duplicate id 'shellyplus1-xxx61fd926c4'

I once had a device that was indeed owned by another account, but back then I got a message from Shelly App - this isn’t the case now.

Thing is - the channel of the input isn’t updating anymore which is bad.

@gerrit Is it working again after a reboot of the Shelly?

No, rebooting the device didn’t make it work again, but it seems like the “duplicate ID” messages are gone now.
But deactivating and activating the thing seems to solve the problem of the channel no longer updating - at least for now.

…hold on. the messages are back again:

hos_rpc_inst.c:230 Shelly.GetStatus via HTTP_in POST 192.168.178.200:57796
10:01:31
shos_rpc_inst.c:355 0x3ffdcdb0: duplicate id 'shellyplus1-xxx61fd926c4'
10:02:31
shos_rpc_inst.c:230 Shelly.GetStatus via HTTP_in POST 192.168.178.200:39086
10:02:31
shos_rpc_inst.c:355 0x3ffdf20c: duplicate id 'shellyplus1-xxx61fd926c4'

But the channel is still being updated. Seems as if those two issues aren’t related.

Regarding the “duplicate id” messages, the shelly documentation says the RPC request frame should have : https://shelly-api-docs.shelly.cloud/gen2/General/RPCProtocol

src string Name of the source of the request (you can choose whatever string you like to identify you as the source of the request).

The shelly binding sends the device name in the frame:

request.src = thingName;

I can see this in log messages (the code is in Shelly2ApiClient.java)

Could this be the source of the duplicate ID message?

I have several Shelly i4 that show the “duplicate ID” message when they work.
When they stopp working the “duplicate ID” message is not seen in the shelly debug log.

@markus7017 please comment on this. The Shelly 1 plus stop working correctly irregullarly. A reboot brings things back. I noticed this “duplicate ID” and think the request.src should be different.

@PaulFQ For me it also works again, if I pause the Thing and then start it again in OpenHAB.

I need to check that
AFAIK ir was fixed, but maybe never merged

turn on DEBUG log for the binding and check output

1 Like

@markus7017 The failure occurs infrequently, only once a week or once every two weeks. So its difficult to debug.

I updated the Dev build to use some other value for .src, please check
https://github.com/markus7017/myfiles/blob/master/shelly/org.openhab.binding.shelly-4.2.0-SNAPSHOT.jar?raw=true

Hi all,

I have a Shelly Plus I4 and want to remotely control some Shelly Plus 1. I had the same issues as stated above and unfortunately, in my case the patched dev build didn’t help. I had to reset the corresponding thing nearly every day as a workaround.

So I did some investigations myself and it seems to me that the Shelly is closing the websocket after some idle time. In that case openhab does not receive NotifyEvents (i.e. button presses) from the Shelly anymore. The workaround (resetting the corresponding thing in openhab) leads to a re-establishment of the websocket and openhab again is notified in case of button presses.

Apparently, there is missing a ping/pong mechanism to keep the websocket alive. For testing purposes, based on the 4.1.2 version of the shelly binding I added a Shelly.GetStatus websocket request in the getStatus() method (Shelly2ApiRpc.java) which seems to be called in order to poll the device’s status. I ran this setup for two weeks now while I wasn’t at home: So far everything still works. It is definitely not the most elegant solution to the problem and it also does not get rid of the “duplicate id” issue, but it works for now.

Speaking of which, in the meanwhile I reproduced the “duplicate id” message in the Shelly logs. It occurs if the same src is used in HTTP POST requests as well as in the websocket communication. Unclear, if that’s on purpose or a bug on Shelly side…

Long story short: This problem looks solvable and I am happy to contribute a PR. But firstly, feedback is highly appreciated @markus7017

I think that’s not the solution

How long api does it take until the WS gets disconnected (check logs)

I use @WebSocket(maxIdleTime = Integer.MAX_VALUE) as annotation, which should set the Jetty WebSocket idle timeout to 2147483647 msec, which should set the idle timeout to 24,8 days. The refresh should happy by the standard device poll every 60sec (detecting a non-responding device after 70sec). You should see those GetStatus requests. Do you see those in the log? Every time the binding receives a command response or notification that timer is restarted. So it show never timeout, but obviously it does.

  • turn TRACE log on
  • wait 3 minutes
  • check OH log for those pings every 60sec or notifications (which restart the timer)

make sure to use the latest DEV build.

4.2-DEV | README | READMEbeta
Avdanced Users - Shelly Manager - Bugs/Features - API Doc

Also the duplicate id messages should have gone. You should see in the commands that src id should be some random number and dst the device id.

Hello,

are there any news on this topic?
For me it doesn’t seem to work with openHAB version 4.2.2.
Still the same issue as described in the first post.

Is it working for all others now or is there a known workaround?

I’m running openHAB in a docker container and I’m not soooo familiar with openHAB.
Where can I activate the debug log for specific items?

Is the (possible running) solution from @baschdi verified and integrated to current stable build?

Edit: Where can I modify the socket idle timeout?

Hi Patrick,

if you want to see debug logs from your Shelly binding you need to:

  • log in to your host/docker container running openhab
  • log in to your openhab console via ssh -p 8101 openhab@localhost
  • there you can change the log level via log:set debug org.openhab.binding.shelly
  • have a look at the logs via log:tail org.openhab.binding.shelly

Unfortunately, when I update to the most recent versions of openhab and the Shelly binding (which was 4.2.1 at the time) I still have the same issue. In the device’s logs I still read ‘duplicate id’ messages and after some idle time the device renders inoperable (then, no ‘duplicate id’ messages occur anymore).

I checked on log level TRACE and, as before, only see the HTTP GetStatus requests every 60 s according to the update interval setting of the binding. I then double-checked the Websocket maxIdleTime annotation in the source code, that’s also fine and you shouldn’t need to modify it.

So then, I cherry-picked my changes described in my prior post, build a custom version of the binding and since then I have a stable connection again. Since nobody else seemed to have this issue anymore I didn’t press the issue any further.

1 Like

@baschdi I am still experiencing the original problem. Could you please provide your custom build and explain how to install it? I would really appreciate it!