I’ve reverted my IPv6 disabling, as it makes my MQTT unhappy (it’s trying to bind to an IPv6 address for some reason, which I could probably also disable, but since it doesn’t fix the leak anyway not much point).
I think the key, which is what started this overall thread, is to make sure all exception handlers clean up the WebSockets, otherwise we get leaks. And I think this particular one is traceable to an exception that didn’t clean up the WebSocket.
I have no real ability to build and test, and Java isn’t a language I’ve spent any time with. I could probably have a go at a pull request, but in reality I’d just be chucking stuff at a wall that someone who knows what they’re doing could do way better.
All of which is fine. I can put in a daily/overnight restart of openhab and it will probably work adequately for the interim period.
OK. Working on that. It triggers a dependency on Gson, so working through how to resolve that.
Resolved Gson. But now have an unresolved dependency on slf4j. It looks to me like slf4j should be always included already, so I’m suspecting I’m breaking my installation. I can’t find just org.slf4j.jar, only things like org.slf4j-api.jar.
I think I’m out of my depth here unfortunately. I’m not sure I want to just randomly pull in additional modules until it works.
I’ve also downloaded the entire snapshot build, and unpacked it. There’s no slf4j jar in there, so I’m dubious that it’s required. Can anyone provide a hint as to what I may be missing?
OK, looks like I can install an entire snapshot version using apt. That seems safer in terms of it working, less safe in terms of upgrading my whole installation. Ah well.
I will wait to see what Paul’s results are as well, but here are mine. I used a fresh snapshot install on a freshly built, fully updated Debian 12 image, with zulu17jdk installed via apt and the latest openHAB snapshot (Build 3675) also installed via apt. I added the Shelly binding during initial setup; the console reports the binding as version 4.1.0.202310130405. I see no change in discovery; manually adding a new Shelly thing works OK. Still unable to repro a leak, no orphan socket…
On a separate note, using openHAB 3.4 and Java 11, discovery works fine. Using openHAB 3.4 and Java 17 (yes, I know it is not a supported configuration), discovery does not find any device.
As far as I have been able to tell from a testing perspective:
openHAB 3.4, normal build with the recommended dependencies, seems to work and discovers Shelly devices fine.
Any version of openHAB 4.x I have tested (using the exact set of instructions in the documentation) does not work with discovery on a freshly built clean load of openHAB (not an upgrade), regardless of the underlying OS.
I saw the same behavior on Linux as well as Windows.
Same hardware for all tests.
Heap dump review does not indicate any anomalies, and 168 threads were counted.
On a side note, I really like the new 4.1; it is sweet being able to toggle the logging via the GUI.
For me, same behaviour with the snapshot build. I have openHAB 4.1.0 (Build #3675).
The Shelly binding reports being 202310130405:
openhab> list -s |grep shelly
280 │ Active │ 80 │ 4.1.0.202310130405 │ org.openhab.binding.shelly
I’m seeing thread growth in increments of 8, with the same symptom: it is associated with failed discovery and “WebSocket error” in the log at the time the threads grow.
Sorry guys, I was traveling and therefore couldn’t participate in debugging, but a great community has already made progress.
That’s a good finding and matches the symptoms with the Shelly Wall Display, which is not active as a thing, but causes the problem → because it sends mDNS discovery packets
I need to place the api.close() in a finally block so it gets called when the request was processed, but also when it failed.
ShellyApiInterface api = null; // declared outside the try so the finally block can see it
try {
    api = gen2 ? new Shelly2ApiRpc(name, config, httpClient)
            : new Shelly1HttpApi(name, config, httpClient);
    api.initialize();
    profile = api.getDeviceProfile(thingType);
    logger.debug("{}: Shelly settings : {}", name, profile.settingsJson);
    deviceName = profile.name;
    model = profile.deviceType;
    mode = profile.mode;
    properties = ShellyBaseHandler.fillDeviceProperties(profile);
    logger.trace("{}: thingType={}, deviceType={}, mode={}, symbolic name={}", name, thingType,
            profile.deviceType, mode.isEmpty() ? "<standard>" : mode, deviceName);
    // get thing type from device name
    thingUID = ShellyThingCreator.getThingUID(name, model, mode, false);
} catch (ShellyApiException e) {
    ShellyApiResult result = e.getApiResult();
    if (result.isHttpAccessUnauthorized()) {
        logger.info("{}: {}", name, messages.get("discovery.protected", address));
        // create shellyunknown thing - will be changed during thing initialization with valid credentials
        thingUID = ShellyThingCreator.getThingUID(name, model, mode, true);
    } else {
        logger.debug("{}: {}", name, messages.get("discovery.failed", address, e.toString()));
    }
} catch (IllegalArgumentException e) { // maybe some format description was buggy
    logger.debug("{}: Discovery failed!", name, e);
} finally {
    // always release the API (and its WebSocket), whether discovery succeeded or failed
    if (api != null) {
        api.close();
    }
}
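The effect of closing in a finally block can be checked in isolation. The following is only a minimal sketch, not the binding's code: FakeApi is a hypothetical stand-in that counts open connections, and the "simulated WebSocket error" is invented for illustration. It shows that the connection is released both when the request succeeds and when it fails.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class DiscoveryCleanupSketch {
    /** Hypothetical stand-in for the API object; it only counts open connections. */
    static final class FakeApi {
        static final AtomicInteger OPEN = new AtomicInteger();
        private final boolean failOnInit;

        FakeApi(boolean failOnInit) {
            this.failOnInit = failOnInit;
            OPEN.incrementAndGet(); // pretend a WebSocket was opened
        }

        void initialize() {
            if (failOnInit) {
                throw new IllegalStateException("simulated WebSocket error");
            }
        }

        void close() {
            OPEN.decrementAndGet(); // pretend the WebSocket was released
        }
    }

    /** Runs one simulated discovery attempt and returns whether it succeeded. */
    static boolean discover(boolean failOnInit) {
        FakeApi api = null; // declared outside the try so finally can reach it
        try {
            api = new FakeApi(failOnInit);
            api.initialize();
            return true;
        } catch (IllegalStateException e) {
            return false;
        } finally {
            if (api != null) {
                api.close(); // runs on the success path and on the failure path
            }
        }
    }

    public static void main(String[] args) {
        discover(false);
        discover(true);
        System.out.println("open connections: " + FakeApi.OPEN.get());
    }
}
```

If the close() call sat at the end of the try block instead, the failing attempt would leave the counter at 1, which is the kind of leak described in this thread.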
and also this is a good catch
The binding currently does not support IPv6. Therefore I need to check the address family on discovery requests and refuse IPv6.
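As a rough illustration of such an address family check, here is a minimal sketch using only the standard java.net classes. The class and method names (AddressFamilyCheck, isSupportedAddress) are hypothetical and not taken from the binding, and it assumes the discovery result hands over the address as a literal IP string.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.UnknownHostException;

final class AddressFamilyCheck {
    /**
     * Returns true only for IPv4 literals, so IPv6 (and unparsable input) is refused.
     * Assumption: the argument is a literal IP address. A host name would trigger
     * a DNS lookup in InetAddress.getByName, which is not wanted here.
     */
    static boolean isSupportedAddress(String address) {
        if (address == null || address.isEmpty()) {
            return false; // getByName would silently map this to the loopback address
        }
        try {
            return InetAddress.getByName(address) instanceof Inet4Address;
        } catch (UnknownHostException e) {
            return false; // not a valid address at all
        }
    }

    public static void main(String[] args) {
        for (String candidate : new String[] { "192.168.1.10", "fe80::1" }) {
            System.out.println(candidate + " accepted: " + isSupportedAddress(candidate));
        }
    }
}
```

A discovery handler could call a check like this first and skip the device before any API object or WebSocket is created.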
Sometime in the last couple of days the thread count stopped increasing. This was not associated with a config or install change; it just stopped, which is very unusual. It also means a test may not be proof, since the count can stop increasing on its own without a code change. Having said that, it looks like the message “WebSocket error” has also stopped occurring in the log. So presumably, if I see that message without a change in the thread count, then the fix works. I’ll see today if that message is still occurring.
For those who may also want to test, the process here is:
Assuming you’re using apt to install, you want to change your sources.list to include unstable. In my case that’s in /etc/apt/sources.list.d/openhab.list, and the line looks like:
deb [signed-by=/usr/share/keyrings/openhab.gpg] https://openhab.jfrog.io/artifactory/openhab-linuxpkg unstable main
Upgrade, which should give you the snapshot version
Go to the openhab console: ssh -p 8101 openhab@localhost
(password is habopen)
Turn on trace logging for the binding
log:set TRACE org.openhab.binding.shelly
Upgrade the binding to the version provided by Markus above
openhab> bundle:list |grep Shelly
280 │ Active │ 80 │ 4.1.0.202310211629 │ openHAB Add-ons :: Bundles :: Shelly Binding Gen1+2
openhab> bundle:update 280 https://github.com/markus7017/myfiles/blob/master/shelly/org.openhab.binding.shelly-4.1.0-SNAPSHOToom.jar?raw=true
I always restart my openhab so I can start clean:
sudo service openhab restart
Then you need to get the process id of your openhab instance
ps -ef |grep openhab
Use the process id to get a count of the threads that refer to “WebSocket”
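A minimal sketch of one way to do that count, assuming a thread dump for that process id has been produced with the JDK's jstack tool and piped to standard input, and assuming the threads of interest carry "WebSocket" in their name. The class name WebSocketThreadCount is made up for this example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

final class WebSocketThreadCount {
    /**
     * Counts threads in a jstack-style dump whose name contains the given text.
     * In such a dump every thread starts with a header line of the form
     * "thread name" #id ..., so only lines starting with a double quote are inspected.
     */
    static long countThreads(String dump, String needle) {
        return dump.lines()
                .filter(line -> line.startsWith("\""))
                .filter(line -> {
                    int end = line.indexOf('"', 1); // closing quote of the thread name
                    return end > 0 && line.substring(1, end).contains(needle);
                })
                .count();
    }

    public static void main(String[] args) throws IOException {
        // Read the whole dump from standard input, e.g. the output of jstack for the openHAB pid.
        String dump = new String(System.in.readAllBytes(), StandardCharsets.UTF_8);
        System.out.println(countThreads(dump, "WebSocket"));
    }
}
```

Running it a few times over a day and comparing the numbers shows whether the count is still growing.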
I did. No errors nor additional threads so far. Would there be anything different in the log that would tell me I’m definitely running the right version?
So it’s the new one. Would the changes have removed the WebSocket error, or just caused it to not leak when there was the error? I haven’t seen any discovery messages at all since the upgrade - which fixes it I guess, just not what I expected.
Hello @markus7017
I wanted to try it as well, to test the issue with the Wall Screen, but installing the new jar gives me this error:
Error while starting bundle: file:/usr/share/openhab/addons/org.openhab.binding.shelly-4.1.0-SNAPSHOToom.jar
org.osgi.framework.BundleException: Could not resolve module: org.openhab.binding.shelly [305]
Unresolved requirement: Import-Package: com.google.gson; version="[2.10.0,3.0.0)"
at org.eclipse.osgi.container.Module.start(Module.java:463) ~[org.eclipse.osgi-3.18.0.jar:?]
at org.eclipse.osgi.internal.framework.EquinoxBundle.start(EquinoxBundle.java:445) ~[org.eclipse.osgi-3.18.0.jar:?]
at org.apache.felix.fileinstall.internal.DirectoryWatcher.startBundle(DirectoryWatcher.java:1260) ~[?:?]
at org.apache.felix.fileinstall.internal.DirectoryWatcher.startBundles(DirectoryWatcher.java:1233) ~[?:?]
at org.apache.felix.fileinstall.internal.DirectoryWatcher.doProcess(DirectoryWatcher.java:520) ~[?:?]
at org.apache.felix.fileinstall.internal.DirectoryWatcher.process(DirectoryWatcher.java:365) ~[?:?]
at org.apache.felix.fileinstall.internal.DirectoryWatcher.run(DirectoryWatcher.java:316) ~[?:?]
I had that same problem. I resolved it by first upgrading my whole openhab to the latest snapshot release (changing my sources.list to include unstable), then upgrading to the new jar. I think a newer version of gson is required, so you have it, but not a new enough version.
I did manually upgrade gson, which you can do, but it then asked for a different jar to be upgraded as well, and I decided I’d be chasing my tail.
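For anyone puzzled by the version="[2.10.0,3.0.0)" part of that error: in OSGi notation the square bracket means the lower bound is included and the round bracket means the upper bound is excluded. A minimal sketch of that comparison follows, assuming plain numeric major.minor.micro versions. The versions tried in main are hypothetical examples, not what any particular installation ships.

```java
import java.util.Arrays;

final class VersionRangeSketch {
    /** Parses "major.minor.micro" into numbers; missing parts count as 0. */
    static int[] parse(String version) {
        String[] parts = version.split("\\.");
        int[] numbers = new int[3];
        for (int i = 0; i < 3 && i < parts.length; i++) {
            numbers[i] = Integer.parseInt(parts[i]);
        }
        return numbers;
    }

    /** True if lowInclusive <= version < highExclusive, compared part by part as numbers. */
    static boolean inRange(String version, String lowInclusive, String highExclusive) {
        int[] v = parse(version);
        return Arrays.compare(v, parse(lowInclusive)) >= 0
                && Arrays.compare(v, parse(highExclusive)) < 0;
    }

    public static void main(String[] args) {
        // Hypothetical installed versions checked against the range from the error message.
        for (String installed : new String[] { "2.9.1", "2.10.1", "3.0.0" }) {
            System.out.println(installed + " satisfies the import: "
                    + inRange(installed, "2.10.0", "3.0.0"));
        }
    }
}
```

Note that the parts are compared as numbers, so a 2.9.x release is older than 2.10.0 even though it would sort later as text.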