OH4 runs out of memory

PR has been resubmitted for review. :slight_smile:

https://github.com/openhab/openhab-addons/pull/15721

I’ve got thread list from memory dump in earlier post. It was visible through standard views available in Eclipse MAT. On other hand, making dev:dump-create Will give you thread names and their state, which could be much quicker way. What was most helpful for me was stack browser which lead me to bindings which launched particular threads.

Apropos the Shelly issue, I recall from memory from some other thread, that the problem is particularly present if there is a Shelly thing related to a URL that does not actually exist, or the URL is only intermittently accessible; which means sockets are tried to be opened and fail, and are subsequently not properly closed.

cool, I’ll look into it soon (had this morning, but it was still pending)

another good hint I need to check, thx

@markus7017 : there is also this PR waiting for your review since few weeks:

Hello, @markus7017, @wborn , @Lolodomo and all the others

Thank you all for support. I think I have found the responsible device. I think it is the new Shelly Wall display (With last FW). Even if I had no things with it, but it seems that discovering it is enough to create problems. I have shutdown the panel and OH is now running with Shelly binding (Default one from marketplace with OH 4.0.3) (Still without any things created) for more than 30mn and the thread numbers are still stable around 320. Will let it like this for the night and see how it is tomorrow.

good progess :+1: could you please provide the TRACE log, so I could check what’s going on

Yes of course,… will do that tomorrow. Never done it before, only have to type log:set TRACE org.openhab.binding.shelly and copie you the output?

That command just sets the level. After entering it, you need to reproduce the issue again (probably by running another discovery cycle?), and afterwards copy the relevant parts from openhab.log.

you need to enter this on the OH console (openhab-cli console).
the output will be written to openhab.log
after reproducing the issues (I think you’ll see reinit/reconnect way before OH running into OOM) edit openhab.log and copy the relevant part into a text file. Make sure no credentials are included. post the log here or send me a PM, make sure to copy a relevant chunk, not a few lines. I’m fine with a 30min log send as PM

@markus7017 I have send you a private message with logs

On this thread I think I have a similar issue. Openhab out of memory, looks to have a lot of threads waiting on web socket - #6 by justaoldman

I have 5 x Shelly Motion devices. Similar to the wall display, these are battery powered devices. That means they sleep for a period of time, and then respond when they wake up. I think that is also true of the wall display - it’s battery powered, therefore its web server is not always on. That could be a pattern.

This comment is a bit of a stream of consciousness - I find it helpful to write what I think as I poke around, it makes me think a bit more clearly, and also allows someone else to look at it and point out my obvious errors. It can however make it a bit hard for people to follow, I apologise for that.

I hypothesise that there is some scenario that creates a web socket, attempts to communicate, then stalls. It’s either giving up on that web socket (timing out perhaps), or the web socket is blocking waiting for a response that never comes.

Presumably when the next polling cycle comes it creates a new web socket, but that web socket is blocked by some form of lock - so it’s waiting for the first web socket to release the lock. The thread traces I have show it’s waiting on a concurrency lock of some sort.

I’ll look in the code to see what I can see, but it’s really not my area other than at a conceptual level, I’m not sure I will be able to point to anything. I am however quite happy to run TRACE logging, or run a dev version of the binding that has extra logging around likely culprits, and see if we can find it.

Reading the code hasn’t told me much, well out of my depth. I can’t easily see the point where it attempts to connect to the web socket - I can see it being created, but the logic flow isn’t clear to me.

I restarted about 12 hours ago. I have 730 web socket threads again already. I will turn on TRACE logging and see what it tells me.

I’ve turned on TRACE logging. I can see that my powered devices have a watchdog that expires every 70 seconds. The Shelly motions seem to have a watchdog that expires every 3660 seconds.

The number of threads is increasing, after about 10 minutes I now have 752 web socket threads. The trace log associated with the Shelly Motions doesn’t show anything that I would consider interesting/smoking gun. There are two basic patterns I see.

Firstly, one that has a COIT message:

2023-10-14 09:29:10.825 [DEBUG] [lly.internal.api1.Shelly1CoapHandler] - shellymotion2-8cf681cd2138: CoIoT Message from /192.168.1.242:5683 (MID=1846): {"G":[[0,3101,16.2],[0,3102,61.1],[0,6107,0],[0,3119,1697223
405],[0,3120,1],[0,6110,0],[0,3106,1627],[0,3111,99],[0,9103,6]]}
2023-10-14 09:29:10.827 [TRACE] [y.internal.handler.ShellyBaseHandler] - shellymotion2-8cf681cd2138: Watchdog restarted (expires in 3660 sec)
2023-10-14 09:29:10.829 [DEBUG] [lly.internal.api1.Shelly1CoapHandler] - shellymotion2-8cf681cd2138: CoIoT Sensor data {"G":[[0,3101,16.2],[0,3102,61.1],[0,6107,0],[0,3119,1697223405],[0,3120,1],[0,6110,0],[0,310
6,1627],[0,3111,99],[0,9103,6]]} (serial=1846)
2023-10-14 09:29:10.830 [DEBUG] [lly.internal.api1.Shelly1CoapHandler] - shellymotion2-8cf681cd2138: 9 CoAP sensor updates received
2023-10-14 09:29:10.832 [TRACE] [lly.internal.api1.Shelly1CoapHandler] - shellymotion2-8cf681cd2138:  Sensor value[0]: id=3101, Value=16.2 (temp, Type=T, Range=-55/125;999, Link=1: sensor_0)
2023-10-14 09:29:10.834 [TRACE] [lly.internal.api1.Shelly1CoapHandler] - shellymotion2-8cf681cd2138:  Sensor value[1]: id=3102, Value=61.1 (temp, Type=T, Range=-67/257;999, Link=1: sensor_0)
2023-10-14 09:29:10.835 [TRACE] [lly.internal.api1.Shelly1CoapHandler] - shellymotion2-8cf681cd2138:  Sensor value[2]: id=6107, Value=0.0 (motion, Type=A, Range=0/1;-1, Link=1: sensor_0)
2023-10-14 09:29:10.837 [TRACE] [lly.internal.api1.Shelly1CoapHandler] - shellymotion2-8cf681cd2138:  Sensor value[3]: id=3119, Value=1.697223405E9 (timestamp, Type=S, Range=U32;-1, Link=1: sensor_0)
2023-10-14 09:29:10.838 [TRACE] [lly.internal.api1.Shelly1CoapHandler] - shellymotion2-8cf681cd2138:  Sensor value[4]: id=3120, Value=1.0 (motionActive, Type=A, Range=0/1;-1, Link=1: sensor_0)
2023-10-14 09:29:10.840 [TRACE] [lly.internal.api1.Shelly1CoapHandler] - shellymotion2-8cf681cd2138:  Sensor value[5]: id=6110, Value=0.0 (vibration, Type=A, Range=0/1;-1, Link=1: sensor_0)
2023-10-14 09:29:10.841 [TRACE] [lly.internal.api1.Shelly1CoapHandler] - shellymotion2-8cf681cd2138:  Sensor value[6]: id=3106, Value=1627.0 (luminosity, Type=L, Range=U32;-1, Link=1: sensor_0)
2023-10-14 09:29:10.843 [TRACE] [lly.internal.api1.Shelly1CoapHandler] - shellymotion2-8cf681cd2138:  Sensor value[7]: id=3111, Value=99.0 (battery, Type=B, Range=0/100;-1, Link=2: device)
2023-10-14 09:29:10.844 [TRACE] [lly.internal.api1.Shelly1CoapHandler] - shellymotion2-8cf681cd2138:  Sensor value[8]: id=9103, Value=6.0 (cfgChanged, Type=EVC, Range=U16, Link=2: device)
2023-10-14 09:29:10.848 [DEBUG] [y.internal.handler.ShellyBaseHandler] - shellymotion2-8cf681cd2138: Channel sensors#motionTimestamp updated with 2023-10-13T18:56:45.000+0000 (type class org.openhab.core.library.
types.DateTimeType).
2023-10-14 09:29:10.851 [DEBUG] [lly.internal.api1.Shelly1CoapHandler] - shellymotion2-8cf681cd2138: 3 channels updated from CoIoT status, serial=1846

Secondly one that doesn’t appear to have the COIT:

2023-10-14 09:27:06.043 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellymotion2-8cf681cd2138: HTTP GET for http://192.168.1.242/status 
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Authorization: Basic YWRtaW46YWRtaW4=
Content-Type: application/x-www-form-urlencoded


2023-10-14 09:27:07.474 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellymotion2-8cf681cd2138: HTTP Response 200: {"wifi_sta":{"connected":true,"ssid":"OscarSlow","ip":"192.168.1.242","rssi":-56},"cloud":{"
enabled":true,"connected":true},"mqtt":{"connected":false},"time":"09:26","unixtime":1697228816,"serial":1679,"has_update":true,"mac":"8CF681CD2138","cfg_changed_cnt":0,"actions_stats":{"skipped":0},"sleep_time":
0,"lux":{"value":1385,"illumination":"bright","is_valid":true},"tmp":{"value":16.2,"units":"C","is_valid":true},"sensor":{"vibration":false,"motion":false,"timestamp":1697223405,"active":true,"is_valid":true},"ba
t":{"value":99,"voltage":4.074},"charger":false,"update":{"status":"unknown","has_update":true,"new_version":"20231009-064331/v2.2.1@1e8f2795","old_version":"20220811-152232/v2.1.8@5afc928c","beta_version":null},
"ram_total":97280,"ram_free":22728,"fs_size":65536,"fs_free":59416,"uptime":1751139,"fw_info":{"device":"shellymotion2-8CF681CD2138","fw":"20220811-152232/v2.1.8@5afc928c"},"ps_mode":0,"dbg_flags":0}
Server: lwIP/2.1.2 (http://savannah.nongnu.org/projects/lwip)
Content-Type: application/json


2023-10-14 09:27:07.476 [TRACE] [y.internal.handler.ShellyBaseHandler] - shellymotion2-8cf681cd2138: Watchdog restarted (expires in 3660 sec)
2023-10-14 09:27:07.478 [TRACE] [.shelly.internal.api1.Shelly1HttpApi] - shellymotion2-8cf681cd2138: HTTP GET for http://192.168.1.242/status
2023-10-14 09:27:07.654 [TRACE] [.shelly.internal.api1.Shelly1HttpApi] - shellymotion2-8cf681cd2138: HTTP Response 200: {"wifi_sta":{"connected":true,"ssid":"OscarSlow","ip":"192.168.1.242","rssi":-56},"cloud":{"
enabled":true,"connected":true},"mqtt":{"connected":false},"time":"09:26","unixtime":1697228816,"serial":1679,"has_update":true,"mac":"8CF681CD2138","cfg_changed_cnt":0,"actions_stats":{"skipped":0},"sleep_time":
0,"lux":{"value":1385,"illumination":"bright","is_valid":true},"tmp":{"value":16.2,"units":"C","is_valid":true},"sensor":{"vibration":false,"motion":false,"timestamp":1697223405,"active":true,"is_valid":true},"ba
t":{"value":99,"voltage":4.074},"charger":false,"update":{"status":"unknown","has_update":true,"new_version":"20231009-064331/v2.2.1@1e8f2795","old_version":"20220811-152232/v2.1.8@5afc928c","beta_version":null},
"ram_total":97280,"ram_free":22728,"fs_size":65536,"fs_free":59416,"uptime":1751139,"fw_info":{"device":"shellymotion2-8CF681CD2138","fw":"20220811-152232/v2.1.8@5afc928c"},"ps_mode":0,"dbg_flags":0}

I don’t see any so far that appear to have different behaviour. Yet the web sockets are increasing.

I don’t see any errors or timeouts, and it’s actually getting responses back from the device. That would lead me to a new hypothesis - something about the battery powered devices (which I can see have a different logic pathway) is creating new web sockets where it shouldn’t / not closing/destroying web sockets when it’s done with them. I’m now up to 776 web sockets after a few minutes, so I’d guess it’s creating 5 new web sockets every 3660 seconds.

Looking further into the logs, I do see some web socket related logs. This one looks to be a normal close:

2023-10-14 09:40:20.274 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-083af2029730: Closing Rpc API (socket is connected, discovery=true)
2023-10-14 09:40:20.275 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellyplus1-083af2029730: Disconnecting WebSocket (/192.168.1.15:45734 -> /192.168.1.225:80)
2023-10-14 09:40:20.276 [TRACE] [helly.internal.api2.Shelly2RpcSocket] - shellyplus1-083af2029730: Rpc connection closed: 1006 - Disconnected
2023-10-14 09:40:20.279 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-083af2029730: WebSocket connection closed, status = 1006/Disconnected

This one, however, looks interesting. It is reporting an error, but not reporting that it has closed the socket:

2023-10-14 09:40:20.288 [TRACE] [discovery.ShellyDiscoveryParticipant] - shellyplus2pm-80646fdb89e0: Shelly device discovered with empty IP address (service-name=[ServiceInfoImpl@14711726 name: 'shellyplus2pm-806
46fdb89e0._http._tcp.local.' address: '(null):0' status: 'DNS: raspberrypi2-planar-id-au.local. [raspberrypi2.planar.id.au/192.168.1.15] state: probing 1 task: null', has NO data
        shellyplus2pm-80646fdb89e0._http._tcp.local.: ])
2023-10-14 09:40:20.289 [DEBUG] [discovery.ShellyDiscoveryParticipant] - shellyplus2pm-80646fdb89e0: Shelly device discovered: IP-Adress=[fe80:0:0:0:8264:6fff:fedb:89e0], type=shellyplus2pm
2023-10-14 09:40:20.298 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus2pm-80646fdb89e0: Connect Rpc Socket (discovery = true)
2023-10-14 09:40:20.299 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellyplus2pm-80646fdb89e0: Connect WebSocket, URI=ws://[fe80:0:0:0:8264:6fff:fedb:89e0]/rpc
2023-10-14 09:40:20.315 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus2pm-80646fdb89e0: HTTP POST for http://[fe80:0:0:0:8264:6fff:fedb:89e0]/rpc {"id":26410396,"src":"shellyplus2pm-80646fdb89e0","me
thod":"Shelly.GetConfig"}
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Content-Type: application/json; charset=UTF-8


2023-10-14 09:40:20.316 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus2pm-80646fdb89e0: WebSocket error
2023-10-14 09:40:20.318 [DEBUG] [discovery.ShellyDiscoveryParticipant] - shellyplus2pm-80646fdb89e0: Device discovery of device  with address [fe80:0:0:0:8264:6fff:fedb:89e0] failed: SocketException: Could not connect to /[fe80:0:0:0:8264:6fff:fedb:89e0]:80
2023-10-14 09:40:20.319 [DEBUG] [discovery.ShellyDiscoveryParticipant] - shellyplus2pm-80646fdb89e0: Shelly device discovered: IP-Adress=192.168.1.221, type=shellyplus2pm
2023-10-14 09:40:20.328 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus2pm-80646fdb89e0: Connect Rpc Socket (discovery = true)
2023-10-14 09:40:20.329 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellyplus2pm-80646fdb89e0: Connect WebSocket, URI=ws://192.168.1.221/rpc
2023-10-14 09:40:20.344 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus2pm-80646fdb89e0: HTTP POST for http://192.168.1.221/rpc {"id":162862681,"src":"shellyplus2pm-80646fdb89e0","method":"Shelly.GetConfig"}
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Content-Type: application/json; charset=UTF-8


2023-10-14 09:40:20.379 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellyplus2pm-80646fdb89e0: WebSocket connected /192.168.1.15:48360<-/192.168.1.221:80, Idle Timeout=2147483647
2023-10-14 09:40:20.424 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus2pm-80646fdb89e0: HTTP Response 200: {"id":162862681,"src":"shellyplus2pm-80646fdb89e0","dst":"shellyplus2pm-80646fdb89e0","result":{"ble":{"enable":false,"rpc":{"enable":true},"observer":{"enable":false}},"cloud":{"enable":false,"server":"shelly-78-eu.shelly.cloud:6022/jrpc"},"input:0":{"id":0,"name":null,"type":"button","invert":false,"factory_reset":true},"input:1":{"id":1,"name":null,"type":"button","invert":false,"factory_reset":true},"mqtt":{"enable":false,"server":null,"client_id":"shellyplus2pm-80646fdb89e0","user":null,"topic_prefix":"shellyplus2pm-80646fdb89e0","rpc_ntf":true,"status_ntf":false,"use_client_cert":false,"enable_rpc":true,"enable_control":true},"switch:0":{"id":0, "name":null,"in_mode":"detached","initial_state":"off", "auto_on":false, "auto_on_delay":60.00, "auto_off":false, "auto_off_delay":60.00,"power_limit":2800,"voltage_limit":280,"undervoltage_limit":0,"autorecover_voltage_errors":false,"current_limit":10.000},"switch:1":{"id":1, "name":null,"in_mode":"detached","initial_state":"off", "auto_on":false, "auto_on_delay":60.00, "auto_off":false, "auto_off_delay":60.00,"power_limit":2800,"voltage_limit":280,"undervoltage_limit":0,"autorecover_voltage_errors":false,"current_limit":10.000},"sys":{"device":{"name":"LightsGarage","mac":"80646FDB89E0","fw_id":"20230912-081945/1.0.3-g6176478","discoverable":true,"eco_mode":true,"profile":"switch","addon_type":null},"location":{"tz":"Pacific/Auckland","lat":-36.845300,"lon":174.757520},"debug":{"level":2,"file_level":null,"mqtt":{"enable":false},"websocket":{"enable":false},"udp":{"addr":null}},"ui_data":null,"rpc_udp":{"dst_addr":null,"listen_port":null},"sntp":{"server":"time.google.com"},"cfg_rev":19},"wifi":{"ap":{"ssid":"ShellyPlus2PM-80646FDB89E0","is_open":true, "enable":false, "range_extender": {"enable":false}},"sta":{"ssid":"OscarSlow","is_open":false, "enable":true, "ipv4mode":"dhcp","ip":null,"netmask":null,"gw":null,"nameserver":null},"sta1":{"ssid":null,"is_open":true, "enable":false, "ipv4mode":"dhcp","ip":null,"netmask":null,"gw":null,"nameserver":null},"roam":{"rssi_thr":-80,"interval":60}},"ws":{"enable":false,"server":null,"ssl_ca":"ca.pem"}}}
Content-Type: application/json
Content-Length: 2129
Server: ShellyHTTP/1.0.0
Connection: close


2023-10-14 09:40:20.430 [TRACE] [helly.internal.api2.Shelly2ApiClient] - shellyplus2pm-80646fdb89e0: API value detached was mapped to momentary
2023-10-14 09:40:20.431 [TRACE] [helly.internal.api2.Shelly2ApiClient] - shellyplus2pm-80646fdb89e0: API value detached was mapped to momentary
2023-10-14 09:40:20.432 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus2pm-80646fdb89e0: HTTP GET for http://192.168.1.221/shelly 
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Content-Type: application/x-www-form-urlencoded


2023-10-14 09:40:20.448 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus2pm-80646fdb89e0: HTTP Response 200: {"name":"LightsGarage","id":"shellyplus2pm-80646fdb89e0","mac":"80646FDB89E0","slot":1,"model":"SNSW-102P16EU","gen":2,"fw_id":"20230912-081945/1.0.3-g6176478","ver":"1.0.3","app":"Plus2PM","auth_en":false,"auth_domain":null,"profile":"switch"}
Content-Type: application/json
Content-Length: 244
Pragma: no-cache
Server: ShellyHTTP/1.0.0
Connection: close


2023-10-14 09:40:20.449 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus2pm-80646fdb89e0: Closing Rpc API (socket is connected, discovery=true)
2023-10-14 09:40:20.450 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellyplus2pm-80646fdb89e0: Disconnecting WebSocket (/192.168.1.15:48360 -> /192.168.1.221:80)
2023-10-14 09:40:20.451 [TRACE] [helly.internal.api2.Shelly2RpcSocket] - shellyplus2pm-80646fdb89e0: Rpc connection closed: 1006 - Disconnected
2023-10-14 09:40:20.455 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus2pm-80646fdb89e0: WebSocket connection closed, status = 1006/Disconnected
2023-10-14 09:40:20.457 [DEBUG] [discovery.ShellyDiscoveryParticipant] - shellyplus2pm-80646fdb89e0: Shelly settings : {"ble":{"enable":false},"cloud":{"enable":false,"server":"shelly-78-eu.shelly.cloud:6022/jrpc"},"mqtt":{"enable":false,"rpc_ntf":"true","status_ntf":"false"},"sys":{"cfg_rev":19,"device":{"name":"LightsGarage","mac":"80646FDB89E0","fw_id":"20230912-081945/1.0.3-g6176478","profile":"switch","eco_mode":true,"discoverable":true},"location":{"tz":"Pacific/Auckland","lat":-36.8453,"lon":174.75752},"sntp":{"server":"time.google.com"},"debug":{"mqtt":{"enable":false},"websocket":{"enable":false},"udp":null},"ui_data":null,"rpc_udp":{}},"wifi":{"ap":{"enable":false,"ssid":"ShellyPlus2PM-80646FDB89E0","is_open":true,"range_extender":{"enable":false}},"sta":{"ssid":"OscarSlow","is_open":false,"enable":true,"ipv4mode":"dhcp"},"sta1":{"is_open":true,"enable":false,"ipv4mode":"dhcp"},"roam":{"rssi_thr":-80,"interval":60}},"input:0":{"id":0,"type":"button","invert":false,"factory_reset":true},"input:1":{"id":1,"type":"button","invert":false,"factory_reset":true},"switch:0":{"id":0,"in_mode":"detached","initial_state":"off","auto_on":false,"auto_on_delay":60.0,"auto_off":false,"auto_off_delay":60.0,"power_limit":2800,"voltage_limit":280,"current_limit":10.0},"switch:1":{"id":1,"in_mode":"detached","initial_state":"off","auto_on":false,"auto_on_delay":60.0,"auto_off":false,"auto_off_delay":60.0,"power_limit":2800,"voltage_limit":280,"current_limit":10.0}}
2023-10-14 09:40:20.458 [TRACE] [discovery.ShellyDiscoveryParticipant] - shellyplus2pm-80646fdb89e0: thingType=shellyplus2pm, deviceType=SNSW-102P16EU, mode=relay, symbolic name=LightsGarage
2023-10-14 09:40:20.460 [DEBUG] [discovery.ShellyDiscoveryParticipant] - shellyplus2pm-80646fdb89e0: Adding Shelly LightsGarage, UID=shelly:shellyplus2pm-relay:80646fdb89e0

This might lead me to say the whole battery powered devices thing is a red herring, and the problem is in the discovery process. I’m not sure why it would be discovering a new device though - all my devices should be known and connected. Perhaps one device has lost config, and it’s re-discovering it? I only have two shelly2pms, so should be easy to work out.

I can see which device. There’s nothing in the inbox, and the device clearly has an IP address.
Looking at the log more closely, what I can see is that it is on an IPv6 IP address when the discovery occurs. Then it gets an IPv4 IP address for the device. I’m wondering if the device is either losing its DHCP IP address, then presenting as a new device when it connects via IPv6, or it’s just getting tied up in IPv6 vs IPv4. So a new hypothesis could be two part:

  • Something going on with IP addressing, leading to the device presenting as a new IPv6 device
  • Something going on in the discovery code, where it gets an error on the socket and doesn’t close it properly

The second of those is probably the one to fix, the first is maybe a problem, but IPv6 just doesn’t really work for me at all, so working out how to trouble shoot that is a bigger issue, and probably something we couldn’t expect of everyone’s home network.

Looking through the log, I see those errors a few times, not enough perhaps to be the number of new sockets I’m getting. I see maybe 5 instances in the log, whereas I have 50 new web sockets in that time. Having said that, the flow I think I see is:

  1. Device discovered with IPv6 address. Fails to connect
  2. Same device (same MAC address) then discovered on IPv4, new socket created and connected

That is at least a plausible flow where it’s creating a new web socket for the device without closing the old one and/or not closing the web socket from discovery.

I’m now up to 792 web socket threads, an increase of 60 over perhaps half an hour to an hour.

If you correct the unconfigured device and allow it to reacquire a valid iPv4 address assignment does the waiting thread count decrease?
I do not have any battery operated shelly devices so I can not say much on whether that has a bearing or not as I have not been able to reproduce the run away thread count condition my self .
It maybe I do not have enough devices or the fact that I do not have battery operated devices.
As for discovery for my test system with 2 shelly devices a smart plug and a Shellyplus4i I have not been able to complete a discovery when trying to use them on 4.XX
But if added manually they seem to work fine.
however, both devices discover just fine on OH3.4XX version no manual addition was required.
Not sure how relevant that is to that you see or not.
Also, the binding doc does mention that only IPv4 is supported so any IPv6 assigned device is going to have issues I would think.
I also have IPv6 disabled throughout my network on every NIC and in the wireless access point.

I haven’t consciously activated IPv6, nor configured it. But it’s always there I guess.

I’ve never had to manually add devices.

I suspect I’m seeing enough that I could probably resolve the issue for myself by turning off IPv6 or allocating static IP addresses or some other action that could stop the ip address gap (if that’s the issue). But that won’t help anyone else - if this is a problem other people get but that is hard to diagnose, then I probably should help to get to a point where we know what it is and can fix it - I’m probably more able to help diagnose than many people are.

I’ve run a grep over the log file to see how many disconnects I’m really getting.

grep "WebSocket\|discovery" /var/log/openhab/openhab.log > websocket.log

That gives 530 lines, which represent about 53 total web socket connections. Only a handful of them have the Websocket error - most of them seem to connect fine. It’s sort of possible that this is enough to be all my excess web sockets, I have about 100 extra in that time.

Example of one that looks fine:

2023-10-14 09:27:55.262 [TRACE] [discovery.ShellyDiscoveryParticipant] - shellyplus2pm-80646fdb89e0: Shelly device discovered with empty IP address (service-name=[ServiceInfoImpl@28507710 name: 'shellyplus2pm-806
46fdb89e0._http._tcp.local.' address: '(null):0' status: 'DNS: raspberrypi2-planar-id-au.local. [raspberrypi2.planar.id.au/192.168.1.15] state: probing 1 task: null', has NO data
        shellyplus2pm-80646fdb89e0._http._tcp.local.: ])
2023-10-14 09:27:55.264 [DEBUG] [discovery.ShellyDiscoveryParticipant] - shellyplus2pm-80646fdb89e0: Shelly device discovered: IP-Adress=192.168.1.221, type=shellyplus2pm
2023-10-14 09:27:55.284 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus2pm-80646fdb89e0: Connect Rpc Socket (discovery = true)
2023-10-14 09:27:55.286 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellyplus2pm-80646fdb89e0: Connect WebSocket, URI=ws://192.168.1.221/rpc
2023-10-14 09:27:55.319 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus2pm-80646fdb89e0: HTTP POST for http://192.168.1.221/rpc {"id":970475598,"src":"shellyplus2pm-80646fdb89e0","method":"Shelly.GetC
onfig"}
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Content-Type: application/json; charset=UTF-8


2023-10-14 09:27:55.403 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellyplus2pm-80646fdb89e0: WebSocket connected /192.168.1.15:47678<-/192.168.1.221:80, Idle Timeout=2147483647
2023-10-14 09:27:55.439 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus2pm-80646fdb89e0: HTTP Response 200: {"id":970475598,"src":"shellyplus2pm-80646fdb89e0","dst":"shellyplus2pm-80646fdb89e0","resul
t":{"ble":{"enable":false,"rpc":{"enable":true},"observer":{"enable":false}},"cloud":{"enable":false,"server":"shelly-78-eu.shelly.cloud:6022/jrpc"},"input:0":{"id":0,"name":null,"type":"button","invert":false,"f
actory_reset":true},"input:1":{"id":1,"name":null,"type":"button","invert":false,"factory_reset":true},"mqtt":{"enable":false,"server":null,"client_id":"shellyplus2pm-80646fdb89e0","user":null,"topic_prefix":"she
llyplus2pm-80646fdb89e0","rpc_ntf":true,"status_ntf":false,"use_client_cert":false,"enable_rpc":true,"enable_control":true},"switch:0":{"id":0, "name":null,"in_mode":"detached","initial_state":"off", "auto_on":fa
lse, "auto_on_delay":60.00, "auto_off":false, "auto_off_delay":60.00,"power_limit":2800,"voltage_limit":280,"undervoltage_limit":0,"autorecover_voltage_errors":false,"current_limit":10.000},"switch:1":{"id":1, "n
ame":null,"in_mode":"detached","initial_state":"off", "auto_on":false, "auto_on_delay":60.00, "auto_off":false, "auto_off_delay":60.00,"power_limit":2800,"voltage_limit":280,"undervoltage_limit":0,"autorecover_vo
ltage_errors":false,"current_limit":10.000},"sys":{"device":{"name":"LightsGarage","mac":"80646FDB89E0","fw_id":"20230912-081945/1.0.3-g6176478","discoverable":true,"eco_mode":true,"profile":"switch","addon_type"
:null},"location":{"tz":"Pacific/Auckland","lat":-36.845300,"lon":174.757520},"debug":{"level":2,"file_level":null,"mqtt":{"enable":false},"websocket":{"enable":false},"udp":{"addr":null}},"ui_data":null,"rpc_udp
":{"dst_addr":null,"listen_port":null},"sntp":{"server":"time.google.com"},"cfg_rev":19},"wifi":{"ap":{"ssid":"ShellyPlus2PM-80646FDB89E0","is_open":true, "enable":false, "range_extender": {"enable":false}},"sta"
:{"ssid":"OscarSlow","is_open":false, "enable":true, "ipv4mode":"dhcp","ip":null,"netmask":null,"gw":null,"nameserver":null},"sta1":{"ssid":null,"is_open":true, "enable":false, "ipv4mode":"dhcp","ip":null,"netmas
k":null,"gw":null,"nameserver":null},"roam":{"rssi_thr":-80,"interval":60}},"ws":{"enable":false,"server":null,"ssl_ca":"ca.pem"}}}
Content-Type: application/json
Content-Length: 2129
Server: ShellyHTTP/1.0.0
Connection: close


2023-10-14 09:27:55.446 [TRACE] [helly.internal.api2.Shelly2ApiClient] - shellyplus2pm-80646fdb89e0: API value detached was mapped to momentary
2023-10-14 09:27:55.447 [TRACE] [helly.internal.api2.Shelly2ApiClient] - shellyplus2pm-80646fdb89e0: API value detached was mapped to momentary
2023-10-14 09:27:55.448 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus2pm-80646fdb89e0: HTTP GET for http://192.168.1.221/shelly 
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Content-Type: application/x-www-form-urlencoded


2023-10-14 09:27:55.462 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus2pm-80646fdb89e0: HTTP Response 200: {"name":"LightsGarage","id":"shellyplus2pm-80646fdb89e0","mac":"80646FDB89E0","slot":1,"model":"SNSW-102P16EU","gen":2,"fw_id":"20230912-081945/1.0.3-g6176478","ver":"1.0.3","app":"Plus2PM","auth_en":false,"auth_domain":null,"profile":"switch"}
Content-Type: application/json
Content-Length: 244
Pragma: no-cache
Server: ShellyHTTP/1.0.0
Connection: close

2023-10-14 09:27:55.463 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus2pm-80646fdb89e0: Closing Rpc API (socket is connected, discovery=true)
2023-10-14 09:27:55.464 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellyplus2pm-80646fdb89e0: Disconnecting WebSocket (/192.168.1.15:47678 -> /192.168.1.221:80)
2023-10-14 09:27:55.465 [TRACE] [helly.internal.api2.Shelly2RpcSocket] - shellyplus2pm-80646fdb89e0: Rpc connection closed: 1006 - Disconnected
2023-10-14 09:27:55.471 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus2pm-80646fdb89e0: WebSocket connection closed, status = 1006/Disconnected
2023-10-14 09:27:55.473 [DEBUG] [discovery.ShellyDiscoveryParticipant] - shellyplus2pm-80646fdb89e0: Shelly settings : {"ble":{"enable":false},"cloud":{"enable":false,"server":"shelly-78-eu.shelly.cloud:6022/jrpc
"},"mqtt":{"enable":false,"rpc_ntf":"true","status_ntf":"false"},"sys":{"cfg_rev":19,"device":{"name":"LightsGarage","mac":"80646FDB89E0","fw_id":"20230912-081945/1.0.3-g6176478","profile":"switch","eco_mode":tru
e,"discoverable":true},"location":{"tz":"Pacific/Auckland","lat":-36.8453,"lon":174.75752},"sntp":{"server":"time.google.com"},"debug":{"mqtt":{"enable":false},"websocket":{"enable":false},"udp":null},"ui_data":n
ull,"rpc_udp":{}},"wifi":{"ap":{"enable":false,"ssid":"ShellyPlus2PM-80646FDB89E0","is_open":true,"range_extender":{"enable":false}},"sta":{"ssid":"OscarSlow","is_open":false,"enable":true,"ipv4mode":"dhcp"},"sta
1":{"is_open":true,"enable":false,"ipv4mode":"dhcp"},"roam":{"rssi_thr":-80,"interval":60}},"input:0":{"id":0,"type":"button","invert":false,"factory_reset":true},"input:1":{"id":1,"type":"button","invert":false,
"factory_reset":true},"switch:0":{"id":0,"in_mode":"detached","initial_state":"off","auto_on":false,"auto_on_delay":60.0,"auto_off":false,"auto_off_delay":60.0,"power_limit":2800,"voltage_limit":280,"current_limi
t":10.0},"switch:1":{"id":1,"in_mode":"detached","initial_state":"off","auto_on":false,"auto_on_delay":60.0,"auto_off":false,"auto_off_delay":60.0,"power_limit":2800,"voltage_limit":280,"current_limit":10.0}}
2023-10-14 09:27:55.474 [TRACE] [discovery.ShellyDiscoveryParticipant] - shellyplus2pm-80646fdb89e0: thingType=shellyplus2pm, deviceType=SNSW-102P16EU, mode=relay, symbolic name=LightsGarage
2023-10-14 09:27:55.475 [DEBUG] [discovery.ShellyDiscoveryParticipant] - shellyplus2pm-80646fdb89e0: Adding Shelly LightsGarage, UID=shelly:shellyplus2pm-relay:80646fdb89e0
2023-10-14 09:27:55.482 [DEBUG] [discovery.ShellyDiscoveryParticipant] - shellyplus2pm-80646fdb89e0: Shelly device discovered: IP-Adress=192.168.1.221, type=shellyplus2pm
2023-10-14 09:27:55.570 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus2pm-80646fdb89e0: Connect Rpc Socket (discovery = true)
2023-10-14 09:27:55.575 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellyplus2pm-80646fdb89e0: Connect WebSocket, URI=ws://192.168.1.221/rpc
2023-10-14 09:27:55.590 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus2pm-80646fdb89e0: HTTP POST for http://192.168.1.221/rpc {"id":223212011,"src":"shellyplus2pm-80646fdb89e0","method":"Shelly.GetConfig"}
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Content-Type: application/json; charset=UTF-8


2023-10-14 09:27:55.732 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellyplus2pm-80646fdb89e0: WebSocket connected /192.168.1.15:47684<-/192.168.1.221:80, Idle Timeout=2147483647
2023-10-14 09:27:55.743 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus2pm-80646fdb89e0: HTTP Response 200: {"id":223212011,"src":"shellyplus2pm-80646fdb89e0","dst":"shellyplus2pm-80646fdb89e0","result":{"ble":{"enable":false,"rpc":{"enable":true},"observer":{"enable":false}},"cloud":{"enable":false,"server":"shelly-78-eu.shelly.cloud:6022/jrpc"},"input:0":{"id":0,"name":null,"type":"button","invert":false,"factory_reset":true},"input:1":{"id":1,"name":null,"type":"button","invert":false,"factory_reset":true},"mqtt":{"enable":false,"server":null,"client_id":"shellyplus2pm-80646fdb89e0","user":null,"topic_prefix":"shellyplus2pm-80646fdb89e0","rpc_ntf":true,"status_ntf":false,"use_client_cert":false,"enable_rpc":true,"enable_control":true},"switch:0":{"id":0, "name":null,"in_mode":"detached","initial_state":"off", "auto_on":false, "auto_on_delay":60.00, "auto_off":false, "auto_off_delay":60.00,"power_limit":2800,"voltage_limit":280,"undervoltage_limit":0,"autorecover_voltage_errors":false,"current_limit":10.000},"switch:1":{"id":1, "name":null,"in_mode":"detached","initial_state":"off", "auto_on":false, "auto_on_delay":60.00, "auto_off":false, "auto_off_delay":60.00,"power_limit":2800,"voltage_limit":280,"undervoltage_limit":0,"autorecover_voltage_errors":false,"current_limit":10.000},"sys":{"device":{"name":"LightsGarage","mac":"80646FDB89E0","fw_id":"20230912-081945/1.0.3-g6176478","discoverable":true,"eco_mode":true,"profile":"switch","addon_type":null},"location":{"tz":"Pacific/Auckland","lat":-36.845300,"lon":174.757520},"debug":{"level":2,"file_level":null,"mqtt":{"enable":false},"websocket":{"enable":false},"udp":{"addr":null}},"ui_data":null,"rpc_udp":{"dst_addr":null,"listen_port":null},"sntp":{"server":"time.google.com"},"cfg_rev":19},"wifi":{"ap":{"ssid":"ShellyPlus2PM-80646FDB89E0","is_open":true, "enable":false, "range_extender": {"enable":false}},"sta":{"ssid":"OscarSlow","is_open":false, "enable":true, "ipv4mode":"dhcp","ip":null,"netmask":null,"gw":null,"nameserver":null},"sta1":{"ssid":null,"is_open":true, "enable":false, "ipv4mode":"dhcp","ip":null,"netmask":null,"gw":null,"nameserver":null},"roam":{"rssi_thr":-80,"interval":60}},"ws":{"enable":false,"server":null,"ssl_ca":"ca.pem"}}}
Content-Type: application/json
Content-Length: 2129
Server: ShellyHTTP/1.0.0
Connection: close
2023-10-14 09:27:55.749 [TRACE] [helly.internal.api2.Shelly2ApiClient] - shellyplus2pm-80646fdb89e0: API value detached was mapped to momentary
2023-10-14 09:27:55.751 [TRACE] [helly.internal.api2.Shelly2ApiClient] - shellyplus2pm-80646fdb89e0: API value detached was mapped to momentary
2023-10-14 09:27:55.752 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus2pm-80646fdb89e0: HTTP GET for http://192.168.1.221/shelly 
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Content-Type: application/x-www-form-urlencoded


2023-10-14 09:27:55.765 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus2pm-80646fdb89e0: HTTP Response 200: {"name":"LightsGarage","id":"shellyplus2pm-80646fdb89e0","mac":"80646FDB89E0","slot":1,"model":"SNSW-102P16EU","gen":2,"fw_id":"20230912-081945/1.0.3-g6176478","ver":"1.0.3","app":"Plus2PM","auth_en":false,"auth_domain":null,"profile":"switch"}
Content-Type: application/json
Content-Length: 244
Pragma: no-cache
Server: ShellyHTTP/1.0.0
Connection: close


2023-10-14 09:27:55.766 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus2pm-80646fdb89e0: Closing Rpc API (socket is connected, discovery=true)
2023-10-14 09:27:55.768 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellyplus2pm-80646fdb89e0: Disconnecting WebSocket (/192.168.1.15:47684 -> /192.168.1.221:80)
2023-10-14 09:27:55.769 [TRACE] [helly.internal.api2.Shelly2RpcSocket] - shellyplus2pm-80646fdb89e0: Rpc connection closed: 1006 - Disconnected
2023-10-14 09:27:55.773 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus2pm-80646fdb89e0: WebSocket connection closed, status = 1006/Disconnected
2023-10-14 09:27:55.776 [DEBUG] [discovery.ShellyDiscoveryParticipant] - shellyplus2pm-80646fdb89e0: Shelly settings : {"ble":{"enable":false},"cloud":{"enable":false,"server":"shelly-78-eu.shelly.cloud:6022/jrpc"},"mqtt":{"enable":false,"rpc_ntf":"true","status_ntf":"false"},"sys":{"cfg_rev":19,"device":{"name":"LightsGarage","mac":"80646FDB89E0","fw_id":"20230912-081945/1.0.3-g6176478","profile":"switch","eco_mode":true,"discoverable":true},"location":{"tz":"Pacific/Auckland","lat":-36.8453,"lon":174.75752},"sntp":{"server":"time.google.com"},"debug":{"mqtt":{"enable":false},"websocket":{"enable":false},"udp":null},"ui_data":null,"rpc_udp":{}},"wifi":{"ap":{"enable":false,"ssid":"ShellyPlus2PM-80646FDB89E0","is_open":true,"range_extender":{"enable":false}},"sta":{"ssid":"OscarSlow","is_open":false,"enable":true,"ipv4mode":"dhcp"},"sta1":{"is_open":true,"enable":false,"ipv4mode":"dhcp"},"roam":{"rssi_thr":-80,"interval":60}},"input:0":{"id":0,"type":"button","invert":false,"factory_reset":true},"input:1":{"id":1,"type":"button","invert":false,"factory_reset":true},"switch:0":{"id":0,"in_mode":"detached","initial_state":"off","auto_on":false,"auto_on_delay":60.0,"auto_off":false,"auto_off_delay":60.0,"power_limit":2800,"voltage_limit":280,"current_limit":10.0},"switch:1":{"id":1,"in_mode":"detached","initial_state":"off","auto_on":false,"auto_on_delay":60.0,"auto_off":false,"auto_off_delay":60.0,"power_limit":2800,"voltage_limit":280,"current_limit":10.0}}
2023-10-14 09:27:55.777 [TRACE] [discovery.ShellyDiscoveryParticipant] - shellyplus2pm-80646fdb89e0: thingType=shellyplus2pm, deviceType=SNSW-102P16EU, mode=relay, symbolic name=LightsGarage
2023-10-14 09:27:55.778 [DEBUG] [discovery.ShellyDiscoveryParticipant] - shellyplus2pm-80646fdb89e0: Adding Shelly LightsGarage, UID=shelly:shellyplus2pm-relay:80646fdb89e0
2023-10-14 09:27:59.041 [TRACE] [helly.internal.api2.Shelly2RpcSocket] - shellyplus2pm-LightsMaster: Inbound Rpc message: {"src":"shellyplus2pm-80646fdbe00c","dst":"shellyplus2pm-80646fdbe00c","method":"NotifyStatus","params":{"ts":1697228880.59,"switch:0":{"id":0,"aenergy":{"by_minute":[0.000,0.000,0.000],"minute_ts":1697228879,"total":99.718}}}}
2023-10-14 09:27:59.046 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus2pm-80646fdbe00c: NotifyStatus update received: {"src":"shellyplus2pm-80646fdbe00c","dst":"shellyplus2pm-80646fdbe00c","method":"NotifyStatus","params":{"ts":1.69722888059E9,"switch:0":{"id":0,"aenergy":{"total":99.718,"by_minute":[0.0,0.0,0.0],"minute_ts":1697228879}}}}
2023-10-14 09:27:59.048 [TRACE] [y.internal.handler.ShellyBaseHandler] - shellyplus2pm-80646fdbe00c: Watchdog restarted (expires in 70 sec)
20

Example of one that doesn’t look fine:

2023-10-14 09:29:29.659 [DEBUG] [discovery.ShellyDiscoveryParticipant] - shellyplus1-083af2028610: Shelly device discovered: IP-Adress=[fe80:0:0:0:a3a:f2ff:fe02:8610], type=shellyplus1
2023-10-14 09:29:29.677 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-083af2028610: Connect Rpc Socket (discovery = true)
2023-10-14 09:29:29.678 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellyplus1-083af2028610: Connect WebSocket, URI=ws://[fe80:0:0:0:a3a:f2ff:fe02:8610]/rpc
2023-10-14 09:29:29.703 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-083af2028610: HTTP POST for http://[fe80:0:0:0:a3a:f2ff:fe02:8610]/rpc {"id":2069135274,"src":"shellyplus1-083af2028610","metho
d":"Shelly.GetConfig"}
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Content-Type: application/json; charset=UTF-8


2023-10-14 09:29:29.707 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-083af2028610: WebSocket error
2023-10-14 09:29:29.708 [DEBUG] [discovery.ShellyDiscoveryParticipant] - shellyplus1-083af2028610: Device discovery of device  with address [fe80:0:0:0:a3a:f2ff:fe02:8610] failed: SocketException: Could not connect to /[fe80:0:0:0:a3a:f2ff:fe02:8610]:80
2023-10-14 09:29:29.709 [DEBUG] [discovery.ShellyDiscoveryParticipant] - ServiceInfo [ServiceInfoImpl@32229825 name: 'shellyplus1-083af2028610._http._tcp.local.' address: '(null):0' status: 'DNS: raspberrypi2-planar-id-au.local. [raspberrypi2.planar.id.au/192.168.1.15] state: probing 1 task: null', has NO data
        shellyplus1-083af2028610._http._tcp.local.: ]
2023-10-14 09:29:34.239 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-441793a94fe8: HTTP POST for http://192.168.1.232/rpc {"id":1556476533,"src":"shellyplus1-441793a94fe8","method":"Shelly.GetStatus"}
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Content-Type: application/json; charset=UTF-8


2023-10-14 09:29:34.381 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-441793a94fe8: HTTP Response 200: {"id":1556476533,"src":"shellyplus1-441793a94fe8","dst":"shellyplus1-441793a94fe8","result":{"ble":null,"cloud":{"connected":false},"input:0":{"id":0,"state":null},"mqtt":{"connected":false},"switch:0":{"id":0, "source":"HTTP_in", "output":false,"temperature":{"tC":29.0, "tF":84.2}},"sys":{"mac":"441793A94FE8","restart_required":false,"time":"09:29","unixtime":1697228975,"uptime":2352238,"ram_size":244884,"ram_free":146300,"fs_size":458752,"fs_free":159744,"cfg_rev":14,"kvs_rev":0,"schedule_rev":0,"webhook_rev":0,"available_updates":{"beta":{"version":"1.0.7-beta1"}}},"wifi":{"sta_ip":"192.168.1.232","status":"got ip","ssid":"OscarSlow","rssi":-59},"ws":{"connected":false}}}
Content-Type: application/json
Content-Length: 681
Server: ShellyHTTP/1.0.0
Connection: close


2023-10-14 09:29:34.383 [TRACE] [y.internal.handler.ShellyBaseHandler] - shellyplus1-441793a94fe8: Watchdog restarted (expires in 70 sec)
2023-10-14 09:29:34.384 [TRACE] [.internal.handler.ShellyRelayHandler] - shellyplus1-441793a94fe8: Updating 1 relay(s)
2023-10-14 09:29:34.738 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-083af2029730: HTTP POST for http://192.168.1.225/rpc {"id":1887434093,"src":"shellyplus1-083af2029730","method":"Shelly.GetStatus"}

I think the difference between those two is the IPv6 address. Having said that, it doesn’t look frequent enough to be causing the thread growth I’m getting.

OK. Doing a more systematic analysis. I’ve restarted openhab from scratch and let it stabilise, I have full trace log of that restart.

At 10:47am I had 125 threads that were WebSocket related. Between 10:48:42 and 10:48:48 I got 8 more threads, and that stayed static for a while. (I got very lucky hitting the exact time it jumped) Whatever it is, it’s more than 5 minutes apart:

pi@raspberrypi2:~ $ sudo jstack 24195 | grep WebSocket | wc -l
125
pi@raspberrypi2:~ $ date; sudo jstack 24195 | grep WebSocket | wc -l
Sat 14 Oct 2023 10:47:47 NZDT
125
pi@raspberrypi2:~ $ date; sudo jstack 24195 | grep WebSocket | wc -l
Sat 14 Oct 2023 10:48:42 NZDT
125
pi@raspberrypi2:~ $ date; sudo jstack 24195 | grep WebSocket | wc -l
Sat 14 Oct 2023 10:48:48 NZDT
133
pi@raspberrypi2:~ $ date; sudo jstack 24195 | grep WebSocket | wc -l
Sat 14 Oct 2023 10:49:41 NZDT
133
pi@raspberrypi2:~ $ date; sudo jstack 24195 | grep WebSocket | wc -l
Sat 14 Oct 2023 10:50:34 NZDT
133
pi@raspberrypi2:~ $ date; sudo jstack 24195 | grep WebSocket | wc -l
Sat 14 Oct 2023 10:52:08 NZDT
133
pi@raspberrypi2:~ $ date; sudo jstack 24195 | grep WebSocket | wc -l
Sat 14 Oct 2023 10:52:26 NZDT
133
pi@raspberrypi2:~ $ date; sudo jstack 24195 | grep WebSocket | wc -l
Sat 14 Oct 2023 10:53:12 NZDT
133
pi@raspberrypi2:~ $ date; sudo jstack 24195 | grep WebSocket | wc -l
Sat 14 Oct 2023 10:53:42 NZDT
133
pi@raspberrypi2:~ $ date; sudo jstack 24195 | grep WebSocket | wc -l
Sat 14 Oct 2023 10:53:51 NZDT
133

In the log in that time slice there are two interesting things:

  1. Exactly 8 HTTP post commands. And 8 new threads. Problem is there’s heaps of HTTP posts everywhere else and the thread counts aren’t increasing
  2. Exactly a single WebSocket error. Perhaps that error causes 8 threads?
2023-10-14 10:48:42.238 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus2pm-80646fdb89e0: HTTP POST for http://192.168.1.221/rpc {"id":1112583018,"src":"shellyplus2pm-80646fdb89e0","method":"Shelly.Get
Status"}
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Content-Type: application/json; charset=UTF-8


2023-10-14 10:48:42.238 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-441793a86d40: HTTP POST for http://192.168.1.149/rpc {"id":1479566910,"src":"shellyplus1-441793a86d40","method":"Shelly.GetStat
us"}
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Content-Type: application/json; charset=UTF-8


2023-10-14 10:48:42.241 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-083af2028610: HTTP POST for http://192.168.1.235/rpc {"id":1841861816,"src":"shellyplus1-083af2028610","method":"Shelly.GetStat
us"}
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Content-Type: application/json; charset=UTF-8


2023-10-14 10:48:42.241 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-441793aa6bf8: HTTP POST for http://192.168.1.120/rpc {"id":931069525,"src":"shellyplus1-441793aa6bf8","method":"Shelly.GetStatu
s"}
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Content-Type: application/json; charset=UTF-8


2023-10-14 10:48:42.242 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-a8032aba16f8: HTTP POST for http://192.168.1.226/rpc {"id":1606624934,"src":"shellyplus1-a8032aba16f8","method":"Shelly.GetStat
us"}
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Content-Type: application/json; charset=UTF-8

2023-10-14 10:48:42.357 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-083af2028610: HTTP Response 200: {"id":1841861816,"src":"shellyplus1-083af2028610","dst":"shellyplus1-083af2028610","result":{"
ble":null,"cloud":{"connected":false},"input:0":{"id":0,"state":null},"mqtt":{"connected":false},"switch:0":{"id":0, "source":"HTTP_in", "output":false,"temperature":{"tC":28.8, "tF":83.8}},"sys":{"mac":"083AF202
8610","restart_required":false,"time":"10:48","unixtime":1697233723,"uptime":2357108,"ram_size":244888,"ram_free":146332,"fs_size":458752,"fs_free":159744,"cfg_rev":14,"kvs_rev":0,"schedule_rev":0,"webhook_rev":0
,"available_updates":{"beta":{"version":"1.0.7-beta1"}}},"wifi":{"sta_ip":"192.168.1.235","status":"got ip","ssid":"OscarSlow","rssi":-67},"ws":{"connected":false}}}
Content-Type: application/json
Content-Length: 681
Server: ShellyHTTP/1.0.0
Connection: close


2023-10-14 10:48:42.360 [TRACE] [y.internal.handler.ShellyBaseHandler] - shellyplus1-083af2028610: Watchdog restarted (expires in 70 sec)
2023-10-14 10:48:42.361 [TRACE] [.internal.handler.ShellyRelayHandler] - shellyplus1-083af2028610: Updating 1 relay(s)
2023-10-14 10:48:42.369 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus2pm-80646fdb89e0: HTTP Response 200: {"id":1112583018,"src":"shellyplus2pm-80646fdb89e0","dst":"shellyplus2pm-80646fdb89e0","result":{"ble":null,"cloud":{"connected":false},"input:0":{"id":0,"state":null},"input:1":{"id":1,"state":null},"mqtt":{"connected":false},"switch:0":{"id":0, "source":"HTTP_in", "output":false, "apower":0.0, "voltage":242.8, "freq":50.0, "current":0.000, "pf":0.00, "aenergy":{"total":268.909,"by_minute":[0.000,0.000,0.000],"minute_ts":1697233721},"temperature":{"tC":22.8, "tF":73.0}},"switch:1":{"id":1, "source":"HTTP_in", "output":false, "apower":0.0, "voltage":242.9, "freq":50.0, "current":0.000, "pf":0.00, "aenergy":{"total":206.781,"by_minute":[0.000,0.000,0.000],"minute_ts":1697233721},"temperature":{"tC":22.8, "tF":73.0}},"sys":{"mac":"80646FDB89E0","restart_required":false,"time":"10:48","unixtime":1697233722,"uptime":2356741,"ram_size":243372,"ram_free":140076,"fs_size":458752,"fs_free":143360,"cfg_rev":19,"kvs_rev":0,"schedule_rev":0,"webhook_rev":0,"available_updates":{"beta":{"version":"1.0.7-beta1"}}},"wifi":{"sta_ip":"192.168.1.221","status":"got ip","ssid":"OscarSlow","rssi":-56},"ws":{"connected":false}}}
Content-Type: application/json
Content-Length: 1122
Server: ShellyHTTP/1.0.0
Connection: close


2023-10-14 10:48:42.373 [TRACE] [y.internal.handler.ShellyBaseHandler] - shellyplus2pm-80646fdb89e0: Watchdog restarted (expires in 70 sec)
2023-10-14 10:48:42.374 [TRACE] [.internal.handler.ShellyRelayHandler] - shellyplus2pm-80646fdb89e0: Updating 2 relay(s)
2023-10-14 10:48:42.419 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-441793aa6bf8: HTTP Response 200: {"id":931069525,"src":"shellyplus1-441793aa6bf8","dst":"shellyplus1-441793aa6bf8","result":{"ble":null,"cloud":{"connected":false},"input:0":{"id":0,"state":null},"mqtt":{"connected":false},"switch:0":{"id":0, "source":"HTTP_in", "output":false,"temperature":{"tC":30.0, "tF":86.1}},"sys":{"mac":"441793AA6BF8","restart_required":false,"time":"10:48","unixtime":1697233722,"uptime":1539341,"ram_size":244880,"ram_free":146912,"fs_size":458752,"fs_free":159744,"cfg_rev":17,"kvs_rev":0,"schedule_rev":0,"webhook_rev":0,"available_updates":{"beta":{"version":"1.0.7-beta1"}}},"wifi":{"sta_ip":"192.168.1.120","status":"got ip","ssid":"OscarSlow","rssi":-52},"ws":{"connected":false}}}
Content-Type: application/json
Content-Length: 680
Server: ShellyHTTP/1.0.0
Connection: close


2023-10-14 10:48:42.423 [TRACE] [y.internal.handler.ShellyBaseHandler] - shellyplus1-441793aa6bf8: Watchdog restarted (expires in 70 sec)
2023-10-14 10:48:42.424 [TRACE] [.internal.handler.ShellyRelayHandler] - shellyplus1-441793aa6bf8: Updating 1 relay(s)
2023-10-14 10:48:43.435 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-441793a86d40: HTTP Response 200: {"id":1479566910,"src":"shellyplus1-441793a86d40","dst":"shellyplus1-441793a86d40","result":{"ble":null,"cloud":{"connected":false},"input:0":{"id":0,"state":null},"mqtt":{"connected":false},"switch:0":{"id":0, "source":"HTTP_in", "output":false,"temperature":{"tC":27.6, "tF":81.6}},"sys":{"mac":"441793A86D40","restart_required":false,"time":"10:48","unixtime":1697233724,"uptime":1539532,"ram_size":244884,"ram_free":146928,"fs_size":458752,"fs_free":159744,"cfg_rev":17,"kvs_rev":0,"schedule_rev":0,"webhook_rev":0,"available_updates":{"beta":{"version":"1.0.7-beta1"}}},"wifi":{"sta_ip":"192.168.1.149","status":"got ip","ssid":"OscarSlow","rssi":-65},"ws":{"connected":false}}}
Content-Type: application/json
Content-Length: 681
Server: ShellyHTTP/1.0.0
Connection: close
2023-10-14 10:48:43.438 [TRACE] [y.internal.handler.ShellyBaseHandler] - shellyplus1-441793a86d40: Watchdog restarted (expires in 70 sec)
2023-10-14 10:48:43.439 [TRACE] [.internal.handler.ShellyRelayHandler] - shellyplus1-441793a86d40: Updating 1 relay(s)
2023-10-14 10:48:43.448 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-a8032aba16f8: HTTP Response 200: {"id":1606624934,"src":"shellyplus1-a8032aba16f8","dst":"shellyplus1-a8032aba16f8","result":{"ble":null,"cloud":{"connected":false},"input:0":{"id":0,"state":null},"mqtt":{"connected":false},"switch:0":{"id":0, "source":"HTTP_in", "output":false,"temperature":{"tC":28.9, "tF":84.0}},"sys":{"mac":"A8032ABA16F8","restart_required":false,"time":"10:48","unixtime":1697233724,"uptime":1539451,"ram_size":244892,"ram_free":147116,"fs_size":458752,"fs_free":159744,"cfg_rev":14,"kvs_rev":0,"schedule_rev":0,"webhook_rev":0,"available_updates":{"beta":{"version":"1.0.7-beta1"}}},"wifi":{"sta_ip":"192.168.1.226","status":"got ip","ssid":"OscarSlow","rssi":-41},"ws":{"connected":false}}}
Content-Type: application/json
Content-Length: 681
Server: ShellyHTTP/1.0.0
Connection: close


2023-10-14 10:48:43.451 [TRACE] [y.internal.handler.ShellyBaseHandler] - shellyplus1-a8032aba16f8: Watchdog restarted (expires in 70 sec)
2023-10-14 10:48:43.452 [TRACE] [.internal.handler.ShellyRelayHandler] - shellyplus1-a8032aba16f8: Updating 1 relay(s)
2023-10-14 10:48:45.242 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-a8032aba1704: HTTP POST for http://192.168.1.171/rpc {"id":1112383377,"src":"shellyplus1-a8032aba1704","method":"Shelly.GetStatus"}
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Content-Type: application/json; charset=UTF-8


2023-10-14 10:48:45.242 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-083af202a350: HTTP POST for http://192.168.1.200/rpc {"id":1015284757,"src":"shellyplus1-083af202a350","method":"Shelly.GetStatus"}
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Content-Type: application/json; charset=UTF-8


2023-10-14 10:48:45.330 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-083af202a350: HTTP Response 200: {"id":1015284757,"src":"shellyplus1-083af202a350","dst":"shellyplus1-083af202a350","result":{"ble":null,"cloud":{"connected":false},"input:0":{"id":0,"state":null},"mqtt":{"connected":false},"switch:0":{"id":0, "source":"HTTP_in", "output":false,"temperature":{"tC":27.4, "tF":81.4}},"sys":{"mac":"083AF202A350","restart_required":false,"time":"10:48","unixtime":1697233725,"uptime":2356944,"ram_size":244884,"ram_free":146256,"fs_size":458752,"fs_free":159744,"cfg_rev":19,"kvs_rev":0,"schedule_rev":0,"webhook_rev":0,"available_updates":{"beta":{"version":"1.0.7-beta1"}}},"wifi":{"sta_ip":"192.168.1.200","status":"got ip","ssid":"OscarSlow","rssi":-44},"ws":{"connected":false}}}
Content-Type: application/json
Content-Length: 681
Server: ShellyHTTP/1.0.0
Connection: close


2023-10-14 10:48:45.335 [TRACE] [y.internal.handler.ShellyBaseHandler] - shellyplus1-083af202a350: Watchdog restarted (expires in 70 sec)
2023-10-14 10:48:45.337 [TRACE] [.internal.handler.ShellyRelayHandler] - shellyplus1-083af202a350: Updating 1 relay(s)
2023-10-14 10:48:45.364 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-a8032aba1704: HTTP Response 200: {"id":1112383377,"src":"shellyplus1-a8032aba1704","dst":"shellyplus1-a8032aba1704","result":{"ble":null,"cloud":{"connected":false},"input:0":{"id":0,"state":null},"mqtt":{"connected":false},"switch:0":{"id":0, "source":"HTTP_in", "output":false,"temperature":{"tC":29.3, "tF":84.7}},"sys":{"mac":"A8032ABA1704","restart_required":false,"time":"10:48","unixtime":1697233725,"uptime":1000688,"ram_size":244884,"ram_free":146292,"fs_size":458752,"fs_free":159744,"cfg_rev":15,"kvs_rev":0,"schedule_rev":0,"webhook_rev":0,"available_updates":{"beta":{"version":"1.0.7-beta1"}}},"wifi":{"sta_ip":"192.168.1.171","status":"got ip","ssid":"OscarSlow","rssi":-44},"ws":{"connected":false}}}
Content-Type: application/json
Content-Length: 681
Server: ShellyHTTP/1.0.0
Connection: close
2023-10-14 10:48:45.369 [TRACE] [y.internal.handler.ShellyBaseHandler] - shellyplus1-a8032aba1704: Watchdog restarted (expires in 70 sec)
2023-10-14 10:48:45.371 [TRACE] [.internal.handler.ShellyRelayHandler] - shellyplus1-a8032aba1704: Updating 1 relay(s)
2023-10-14 10:48:45.376 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-441793a94fe8: HTTP POST for http://192.168.1.232/rpc {"id":1407649818,"src":"shellyplus1-441793a94fe8","method":"Shelly.GetStatus"}
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Content-Type: application/json; charset=UTF-8


2023-10-14 10:48:45.379 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-083af2029730: HTTP POST for http://192.168.1.225/rpc {"id":1087308892,"src":"shellyplus1-083af2029730","method":"Shelly.GetStatus"}
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Content-Type: application/json; charset=UTF-8


2023-10-14 10:48:45.536 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-083af2029730: HTTP Response 200: {"id":1087308892,"src":"shellyplus1-083af2029730","dst":"shellyplus1-083af2029730","result":{"ble":null,"cloud":{"connected":false},"input:0":{"id":0,"state":null},"mqtt":{"connected":false},"switch:0":{"id":0, "source":"HTTP_in", "output":false,"temperature":{"tC":29.9, "tF":85.8}},"sys":{"mac":"083AF2029730","restart_required":false,"time":"10:48","unixtime":1697233726,"uptime":2357121,"ram_size":244888,"ram_free":146488,"fs_size":458752,"fs_free":159744,"cfg_rev":22,"kvs_rev":0,"schedule_rev":0,"webhook_rev":0,"available_updates":{"beta":{"version":"1.0.7-beta1"}}},"wifi":{"sta_ip":"192.168.1.225","status":"got ip","ssid":"OscarSlow","rssi":-55},"ws":{"connected":false}}}
Content-Type: application/json
Content-Length: 681
Server: ShellyHTTP/1.0.0
Connection: close


2023-10-14 10:48:45.538 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-441793a94fe8: HTTP Response 200: {"id":1407649818,"src":"shellyplus1-441793a94fe8","dst":"shellyplus1-441793a94fe8","result":{"ble":null,"cloud":{"connected":false},"input:0":{"id":0,"state":null},"mqtt":{"connected":false},"switch:0":{"id":0, "source":"HTTP_in", "output":false,"temperature":{"tC":29.6, "tF":85.4}},"sys":{"mac":"441793A94FE8","restart_required":false,"time":"10:48","unixtime":1697233726,"uptime":2356994,"ram_size":244884,"ram_free":146288,"fs_size":458752,"fs_free":159744,"cfg_rev":14,"kvs_rev":0,"schedule_rev":0,"webhook_rev":0,"available_updates":{"beta":{"version":"1.0.7-beta1"}}},"wifi":{"sta_ip":"192.168.1.232","status":"got ip","ssid":"OscarSlow","rssi":-55},"ws":{"connected":false}}}
Content-Type: application/json
Content-Length: 681
Server: ShellyHTTP/1.0.0
Connection: close
2023-10-14 10:48:45.541 [TRACE] [y.internal.handler.ShellyBaseHandler] - shellyplus1-083af2029730: Watchdog restarted (expires in 70 sec)
2023-10-14 10:48:45.542 [TRACE] [.internal.handler.ShellyRelayHandler] - shellyplus1-083af2029730: Updating 1 relay(s)
2023-10-14 10:48:45.543 [TRACE] [y.internal.handler.ShellyBaseHandler] - shellyplus1-441793a94fe8: Watchdog restarted (expires in 70 sec)
2023-10-14 10:48:45.545 [TRACE] [.internal.handler.ShellyRelayHandler] - shellyplus1-441793a94fe8: Updating 1 relay(s)
2023-10-14 10:48:45.737 [DEBUG] [discovery.ShellyDiscoveryParticipant] - shellyplus1-a8032aba1704: Shelly device discovered: IP-Adress=[fe80:0:0:0:aa03:2aff:feba:1704], type=shellyplus1
2023-10-14 10:48:45.784 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-a8032aba1704: Connect Rpc Socket (discovery = true)
2023-10-14 10:48:45.787 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellyplus1-a8032aba1704: Connect WebSocket, URI=ws://[fe80:0:0:0:aa03:2aff:feba:1704]/rpc
2023-10-14 10:48:45.819 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-a8032aba1704: HTTP POST for http://[fe80:0:0:0:aa03:2aff:feba:1704]/rpc {"id":1386121328,"src":"shellyplus1-a8032aba1704","method":"Shelly.GetConfig"}
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Content-Type: application/json; charset=UTF-8


2023-10-14 10:48:45.821 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-a8032aba1704: WebSocket error
2023-10-14 10:48:45.824 [DEBUG] [discovery.ShellyDiscoveryParticipant] - shellyplus1-a8032aba1704: Device discovery of device  with address [fe80:0:0:0:aa03:2aff:feba:1704] failed: SocketException: Could not connect to /[fe80:0:0:0:aa03:2aff:feba:1704]:80
2023-10-14 10:48:45.826 [DEBUG] [discovery.ShellyDiscoveryParticipant] - ServiceInfo [ServiceInfoImpl@12166329 name: 'shellyplus1-a8032aba1704._http._tcp.local.' address: '(null):0' status: 'DNS: raspberrypi2-planar-id-au.local. [raspberrypi2.planar.id.au/192.168.1.15] state: probing 1 task: null', has NO data
        shellyplus1-a8032aba1704._http._tcp.local.: ]
2023-10-14 10:48:59.879 [TRACE] [helly.internal.api2.Shelly2RpcSocket] - shellyplus2pm-LightsMaster: Inbound Rpc message: {"src":"shellyplus2pm-80646fdbe00c","dst":"shellyplus2pm-80646fdbe00c","method":"NotifyStatus","params":{"ts":1697233740.90,"switch:0":{"id":0,"aenergy":{"by_minute":[0.000,0.000,0.000],"minute_ts":1697233739,"total":99.718}}}}

There are three WebSocket errors in the log in total. The first two are before the count. The third one coincides with the thread growth.

Since that point I’ve had no more thread growth, and no more errors of that form.

pi@raspberrypi2:~ $ while true; do date; sudo jstack 24195 | grep WebSocket | wc -l; sleep 60; done > threadcount.txt &  tail -f threadcount.txt
[1] 30478
Sat 14 Oct 2023 11:37:56 NZDT
133

I’ll leave the monitoring running, and see if I can correlate any future growth in threads to web socket errors.

Addendum. More thread growth - 24 threads in total:

Sat 14 Oct 2023 11:54:17 NZDT
133
Sat 14 Oct 2023 11:55:18 NZDT
157

In the openhab log there are three instances of WebSocket error in that time period. No other instances of that error since the previous report. So, at 8 threads per socket error, that’s pretty indicative.

2023-10-14 11:54:55.796 [DEBUG] [discovery.ShellyDiscoveryParticipant] - shellyplus2pm-80646fdb89e0: Shelly device discovered: IP-Adress=[fe80:0:0:0:8264:6fff:fedb:89e0], type=shellyplus2pm
2023-10-14 11:54:55.816 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus2pm-80646fdb89e0: Connect Rpc Socket (discovery = true)
2023-10-14 11:54:55.818 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellyplus2pm-80646fdb89e0: Connect WebSocket, URI=ws://[fe80:0:0:0:8264:6fff:fedb:89e0]/rpc
2023-10-14 11:54:55.863 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus2pm-80646fdb89e0: HTTP POST for http://[fe80:0:0:0:8264:6fff:fedb:89e0]/rpc {"id":1582461854,"src":"shellyplus2pm-80646fdb89e0","
method":"Shelly.GetConfig"}
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Content-Type: application/json; charset=UTF-8


2023-10-14 11:54:55.866 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus2pm-80646fdb89e0: WebSocket error
2023-10-14 11:54:55.868 [DEBUG] [discovery.ShellyDiscoveryParticipant] - shellyplus2pm-80646fdb89e0: Device discovery of device  with address [fe80:0:0:0:8264:6fff:fedb:89e0] failed: SocketException: Could not connect to /[fe80:0:0:0:8264:6fff:fedb:89e0]:80
2023-10-14 11:54:55.870 [DEBUG] [discovery.ShellyDiscoveryParticipant] - shellyplus1-083af2028610: Shelly device discovered: IP-Adress=[fe80:0:0:0:a3a:f2ff:fe02:8610], type=shellyplus1
2023-10-14 11:54:55.888 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-083af2028610: Connect Rpc Socket (discovery = true)
2023-10-14 11:54:55.890 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellyplus1-083af2028610: Connect WebSocket, URI=ws://[fe80:0:0:0:a3a:f2ff:fe02:8610]/rpc
2023-10-14 11:54:55.918 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-083af2028610: HTTP POST for http://[fe80:0:0:0:a3a:f2ff:fe02:8610]/rpc {"id":1487149131,"src":"shellyplus1-083af2028610","method":"Shelly.GetConfig"}
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Content-Type: application/json; charset=UTF-8


2023-10-14 11:54:55.921 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-083af2028610: WebSocket error
2023-10-14 11:54:55.923 [DEBUG] [discovery.ShellyDiscoveryParticipant] - shellyplus1-083af2028610: Device discovery of device  with address [fe80:0:0:0:a3a:f2ff:fe02:8610] failed: SocketException: Could not connect to /[fe80:0:0:0:a3a:f2ff:fe02:8610]:80
2023-10-14 11:54:55.925 [DEBUG] [discovery.ShellyDiscoveryParticipant] - ServiceInfo [ServiceInfoImpl@24504214 name: 'shellyplus2pm-80646fdb89e0._http._tcp.local.' address: '(null):80' status: 'DNS: raspberrypi2-planar-id-au.local. [raspberrypi2.planar.id.au/192.168.1.15] state: probing 1 task: null', has NO data, empty]
2023-10-14 11:54:55.927 [DEBUG] [discovery.ShellyDiscoveryParticipant] - shellyplus1-a8032aba1704: Shelly device discovered: IP-Adress=[fe80:0:0:0:aa03:2aff:feba:1704], type=shellyplus1
2023-10-14 11:54:55.943 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-a8032aba1704: Connect Rpc Socket (discovery = true)
2023-10-14 11:54:55.943 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellyplus1-a8032aba1704: Connect WebSocket, URI=ws://[fe80:0:0:0:aa03:2aff:feba:1704]/rpc
2023-10-14 11:54:55.964 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-a8032aba1704: HTTP POST for http://[fe80:0:0:0:aa03:2aff:feba:1704]/rpc {"id":748622270,"src":"shellyplus1-a8032aba1704","method":"Shelly.GetConfig"}
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Content-Type: application/json; charset=UTF-8


2023-10-14 11:54:55.964 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-a8032aba1704: WebSocket error
2023-10-14 11:54:55.967 [DEBUG] [discovery.ShellyDiscoveryParticipant] - shellyplus1-a8032aba1704: Device discovery of device  with address [fe80:0:0:0:aa03:2aff:feba:1704] failed: SocketException: Could not connect to /[fe80:0:0:0:aa03:2aff:feba:1704]:80
2023-10-14 11:54:55.968 [DEBUG] [discovery.ShellyDiscoveryParticipant] - ServiceInfo [ServiceInfoImpl@1287577 name: 'shellyplus1-441793a94fe8._http._tcp.local.' address: '(null):80' status: 'DNS: raspberrypi2-planar-id-au.local. [raspberrypi2.planar.id.au/192.168.1.15] state: probing 1 task: null', has NO data, empty]
2023-10-14 11:54:55.969 [DEBUG] [discovery.ShellyDiscoveryParticipant] - ServiceInfo [ServiceInfoImpl@28777544 name: 'shellyplus1-083af2028610._http._tcp.local.' address: '(null):0' status: 'DNS: raspberrypi2-planar-id-au.local. [raspberrypi2.planar.id.au/192.168.1.15] state: probing 1 task: null', has NO data
        shellyplus1-083af2028610._http._tcp.local.: ]
2023-10-14 11:54:55.971 [DEBUG] [discovery.ShellyDiscoveryParticipant] - ServiceInfo [ServiceInfoImpl@22705168 name: 'shellyplus1-a8032aba1704._http._tcp.local.' address: '(null):0' status: 'DNS: raspberrypi2-planar-id-au.local. [raspberrypi2.planar.id.au/192.168.1.15] state: probing 1 task: null', has NO data
        shellyplus1-a8032aba1704._http._tcp.local.: ]
2023-10-14 11:54:56.359 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-a8032aba1704: HTTP POST for http://192.168.1.171/rpc {"id":1738018206,"src":"shellyplus1-a8032aba1704","method":"Shelly.GetStatus"}
Accept-Encoding: gzip
User-Agent: Jetty/9.4.50.v20221201
Content-Type: application/json; charset=UTF-8


2023-10-14 11:54:56.526 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-a8032aba1704: HTTP Response 200: {"id":1738018206,"src":"shellyplus1-a8032aba1704","dst":"shellyplus1-a8032aba1704","result":{"ble":null,"cloud":{"connected":false},"input:0":{"id":0,"state":null},"mqtt":{"connected":false},"switch:0":{"id":0, "source":"HTTP_in", "output":false,"temperature":{"tC":28.7, "tF":83.7}},"sys":{"mac":"A8032ABA1704","restart_required":false,"time":"11:54","unixtime":1697237697,"uptime":1004663,"ram_size":244884,"ram_free":146292,"fs_size":458752,"fs_free":159744,"cfg_rev":15,"kvs_rev":0,"schedule_rev":0,"webhook_rev":0,"available_updates":{"beta":{"version":"1.0.7-beta1"}}},"wifi":{"sta_ip":"192.168.1.171","status":"got ip","ssid":"OscarSlow","rssi":-49},"ws":{"connected":false}}}
Content-Type: application/json
Content-Length: 681
Server: ShellyHTTP/1.0.0
Connection: close

Another 24 threads in the 11:56 window:

Sat 14 Oct 2023 11:56:20 NZDT
157
Sat 14 Oct 2023 11:57:21 NZDT
181

Again three errors in the log in that 1 minute slot. They’re a bit more spaced out in the log so I won’t include them, but they look similar.

@laursen @markus7017 you guys following this?
sorry to ping you guys directly but this is getting interesting

Adding detail. I have:

  • 24 threads between 11:56:20 and 11:57:21
  • 16 threads between 12:01:26 and 12:02:28
  • 16 threads between 12:02:28 and 12:03:29
  • 16 threads between 12:11:40 and 12:12:42
  • 8 threads between 12:12:42 and 12:13:43, but then a drop back again between 12:13:43 and 12:14:44
  • 16 threads between 12:14:44 and 12:15:46
  • 8 threads between 12:19:51 and 12:20:53
  • 8 threads between 12:20:53 and 12:21:54
  • 8 threads between 12:24:58 and 12:26:00
  • 16 threads between 12:30:06 and 12:31:07
  • 16 threads between 12:35:13 and 12:36:14
  • 16 threads between 12:38:17 and 12:39:18
    None since.

Running a grep over the log for ā€œWebSocket errorā€, I get the following list:

2023-10-14 10:46:35.836 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus2pm-80646fdb89e0: WebSocket error
2023-10-14 10:46:35.916 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-083af2029730: WebSocket error
2023-10-14 10:46:55.789 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-083af2028610: WebSocket error
2023-10-14 10:48:45.821 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-a8032aba1704: WebSocket error
2023-10-14 11:54:55.866 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus2pm-80646fdb89e0: WebSocket error
2023-10-14 11:54:55.921 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-083af2028610: WebSocket error
2023-10-14 11:54:55.964 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-a8032aba1704: WebSocket error
2023-10-14 11:56:56.112 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus2pm-80646fdb89e0: WebSocket error
2023-10-14 11:56:56.460 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-441793aa9040: WebSocket error
2023-10-14 11:56:56.663 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-083af2029730: WebSocket error
2023-10-14 12:02:26.128 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-083af202a350: WebSocket error
2023-10-14 12:02:26.333 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-083af2029730: WebSocket error
2023-10-14 12:02:36.115 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-a8032aba1704: WebSocket error
2023-10-14 12:02:36.659 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-083af2028610: WebSocket error
2023-10-14 12:12:35.862 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus2pm-80646fdb89e0: WebSocket error
2023-10-14 12:12:35.903 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-a8032aba1704: WebSocket error
2023-10-14 12:15:16.204 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus2pm-80646fdb89e0: WebSocket error
2023-10-14 12:15:16.481 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-441793a94fe8: WebSocket error
2023-10-14 12:20:05.866 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-083af2028610: WebSocket error
2023-10-14 12:21:16.592 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-083af2028610: WebSocket error
2023-10-14 12:25:36.105 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-083af2029730: WebSocket error
2023-10-14 12:30:15.873 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus2pm-80646fdb89e0: WebSocket error
2023-10-14 12:30:35.873 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-a8032aba1704: WebSocket error
2023-10-14 12:36:06.150 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-441793aa9040: WebSocket error
2023-10-14 12:36:06.458 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus2pm-80646fdb89e0: WebSocket error
2023-10-14 12:38:26.086 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-083af2029730: WebSocket error
2023-10-14 12:38:26.122 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-a8032aba1704: WebSocket error

This aligns in each time period:

  • 3 at 11:56
  • 2 at 12:02:26
  • 2 at 12:02:36
  • 2 at 12:12:35
  • Assume the 12:12 set were just timing - the threads were there but correctly destroyed again
  • 2 at 12:15:16
  • 1 at 12:20:05
  • 1 at 12:21:16
  • 1 at 12:25:36
  • 2 in the 12:30 slot (12:30:15 and 12:30:35)
  • 2 at 12:36:06
  • 2 at 12:38:26

In short, every instance of ā€œWebsocket errorā€ results in 8 additional threads, no thread increase occurs other than at a time that error message appears in the log.

In each case it was looking for a device on an IPv6 address. In each case the messages are broadly similar to:

2023-10-14 12:15:16.481 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-441793a94fe8: WebSocket error
2023-10-14 12:15:16.483 [DEBUG] [discovery.ShellyDiscoveryParticipant] - shellyplus1-441793a94fe8: Device discovery of device  with address [fe80:0:0:0:4617:93ff:fea9:4fe8] failed: SocketException: Could not connect to /[fe80:0:0:0:4617:93ff:fea9:4fe8]:80
2023-10-14 12:15:16.484 [DEBUG] [discovery.ShellyDiscoveryParticipant] - shellyplus1-441793a94fe8: Shelly device discovered: IP-Adress=192.168.1.232, type=shellyplus1
2023-10-14 12:15:16.494 [DEBUG] [g.shelly.internal.api2.Shelly2ApiRpc] - shellyplus1-441793a94fe8: Connect Rpc Socket (discovery = true)
2023-10-14 12:15:16.495 [DEBUG] [helly.internal.api2.Shelly2RpcSocket] - shellyplus1-441793a94fe8: Connect WebSocket, URI=ws://192.168.1.232/rpc
2023-10-14 12:15:16.510 [TRACE] [shelly.internal.api.ShellyHttpClient] - shellyplus1-441793a94fe8: HTTP POST for http://192.168.1.232/rpc {"id":1961832137,"src":"shellyplus1-441793a94fe8","method":"Shelly.GetConfig"}

I note that sometimes the ā€œDevice discovery failedā€ message comes before the WebSocket error, sometimes afterwards. That may be a logging anomaly rather than indicating it sometimes runs out of sequence, but it’s worth noting.

The log is a bit large to upload (5.1MB), I can load any excerpts that are useful.

Edit: I’ve continued monitoring for another 3-4 hours. There are plenty of times it’s stable for an hour or more, also periods where it’s creating new threads. All those instances still correlate with this error in the log. Current total @ 16:04 is 565 threads.

If I was guessing, I’d say around line 145 of ShellyDiscoveryParticipant.java, where we fail discovery but don’t terminate the connection:

            try {
                ShellyApiInterface api = gen2 ? new Shelly2ApiRpc(name, config, httpClient)
                        : new Shelly1HttpApi(name, config, httpClient);
                api.initialize();
                profile = api.getDeviceProfile(thingType);
                api.close();
                logger.debug("{}: Shelly settings : {}", name, profile.settingsJson);
                deviceName = profile.name;
                model = profile.deviceType;
                mode = profile.mode;
                properties = ShellyBaseHandler.fillDeviceProperties(profile);
                logger.trace("{}: thingType={}, deviceType={}, mode={}, symbolic name={}", name, thingType,
                        profile.deviceType, mode.isEmpty() ? "<standard>" : mode, deviceName);

                // get thing type from device name
                thingUID = ShellyThingCreator.getThingUID(name, model, mode, false);
            } catch (ShellyApiException e) {
                ShellyApiResult result = e.getApiResult();
                if (result.isHttpAccessUnauthorized()) {
                    logger.info("{}: {}", name, messages.get("discovery.protected", address));

                    // create shellyunknown thing - will be changed during thing initialization with valid credentials
                    thingUID = ShellyThingCreator.getThingUID(name, model, mode, true);
                } else {
                    logger.debug("{}: {}", name, messages.get("discovery.failed", address, e.toString()));
                }
            } catch (IllegalArgumentException e) { // maybe some format description was buggy
                logger.debug("{}: Discovery failed!", name, e);
            }

Arguably line 169 could be where we put a WebSocket close.

Alternatively, it might be better handled in api2/Shelly2ApiRpc.java around line 723, perhaps we should be closing the web socket when we get a throwable error. Not sure what other errors would turn up in here, and whether some of them assume you don’t close the web socket.

    public void onError(Throwable cause) {
        logger.debug("{}: WebSocket error", thingName);
        if (thing != null && thing.getProfile().alwaysOn) {
            thingOffline("WebSocket error");
        }
    }

Meantime, I have disabled IPv6 on my RaspberryPi and restarted. If that solves my issue it also provides a pointer, although I’d suggest that while IPv6 discovery may be the proximate cause, the real cause is that the generic error handler isn’t closing the WebSocket. My particular errors relate to IPv6, but other people may have other errors that are occurring.