UsbSerialDiscovery - what should it register?

I’m currently looking at WindowsUsbSerialDiscovery, which implements UsbSerialDiscovery, as a result of what I found during this probably false “memory leak alert”:

What this shows me is that, most likely, WindowsUsbSerialDiscovery processes roughly 65 KB every 15 seconds just to poll for USB devices. So, I’m looking at a different way to do this.

While doing that, I see that the current implementation “registers” every USB device it finds, whether they are currently plugged in or not (the registry keys are kept regardless, so that the settings are remembered the next time it’s plugged in) and whether it has an assigned serial port or not.

I’m wondering if this is the way it’s supposed to be, or if I should change this. From what I can understand, UsbSerialDiscovery is only supposed to handle USB devices with an associated serial port - which is a small part of USB devices in general. Does listing other USB devices here serve any purpose at all? I’m already assuming that it’s pointless to list unconnected devices.

@AndrewFG Do you have any knowledge here? The UsbSerialDiscovery doesn’t state it explicitly, but I think it’s kind of implied that it should only be those with serial ports?

@Mherwege I see that you have also been involved in this in the past, so maybe you have some input as well?

This comment seems to be on the topic, but I’m not sure that I see any conclusion:

Is it correct to interpret it as if #3922 in a way “promoted” UsbSerialDiscovery from a USB serial discovery service to a USB discovery service, but that this was never documented/indicated in UsbSerialDiscovery itself?

My memory is very long gone on this. But reading it again, is it not just stating that it actually discovers more than USB devices connected to a serial port?

I’ve largely rewritten WindowsUsbSerialDiscovery to use SetupAPI for finding USB devices instead of traversing the registry. It should lead to a substantial reduction in the information that must be processed each time, not to mention that it’ll only get actually plugged-in devices.

I’m also trying to find a way to listen to changes from Windows instead of polling, which would make it even more efficient.

What I’m trying to figure out is whether it should in fact register USB devices without a serial port, like it currently does, or if that is just “noise” that traverses the system without being useful.

So, don’t pay too much attention to that sentence, I just saw that it was on the topic. My question is what it should do, not what it actually does.

edit: The change notification is looking a bit bleak. The notifications are delivered using window messages, and you thus need a window handle to subscribe (or a service handle). Since OH is neither a GUI application nor a Windows service, there’s no obvious way to register to receive these. There is a “trick” where you can make an invisible “message only” window, just for receiving messages, but I think it sounds risky. GUI elements on Windows are usually bound to the current (user) session, so such a window would probably be invalidated if a user logged out. That said, I’m not sure exactly how OH runs under Windows. Back when I ran it under Windows, I think the Karaf console ran as a “console window”. If so, OH would be killed if the user logged out anyway, so it might not matter. But, I generally don’t like to make “non-GUI” applications depend on the desktop environment, so I’m hesitant to even attempt this.

No. It reads the Manufacturer Id and Chip Id and perhaps other properties of the USB chip. The addon suggestion finder checks for chips that match those attributes.

See this..

I understand how vendor/product is used to match, I’m just wondering: Is there any way for OH to communicate with USB devices that don’t have a serial port? That would normally require communication with a specific driver, something that isn’t very convenient from Java. I’m just wondering if the “devices of interest” will always have a serial port mapping.

I have no idea. But I can certainly imagine bindings that talk to the chip via other means than formal serial ports. You should certainly not exclude that possibility.

I feel that I’m not getting my point through here. It’s easy for me to do either, what I want to figure out is what is the correct thing to do. The reason for this is that I generally try to follow up on things that make OH slow, unresponsive etc. There’s no doubt that there can be many reasons for this, but in my experience, if you “clean up” enough small issues, things improve generally.

This is exactly the type of thing that is easily overlooked. “Does it matter?” I think it does, because it’s very hard to predict the exact consequences of things in other parts of the system, written by other people that might have made completely different assumptions. So, sending out a “new USB device discovered” message every 15 seconds for the same USB devices, which might not even be connected to the computer and which OH has no meaningful way to interact with, can in fact make things slower.

A very “easy” fix to such issues is to make sure that you only send out the information when it’s in fact needed. This is the task I’m trying to do. I’m planning to stop sending the same devices over and over again. I’m hoping to get rid of the polling altogether, but even if I have to keep it, it should only send newly discovered (or removed) devices. I have already removed devices not currently connected, and then there’s the question of devices without a serial port.
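To illustrate the “only send changes” idea, here is a minimal sketch that compares the result of the previous scan with the current one and reports only what was added or removed. The names `ScanDiff`, `Delta` and `diff` are made up for this example and are not part of openHAB. The device identifiers are plain strings here, where the real code would presumably compare UsbSerialDeviceInformation instances.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not openHAB code: computes what changed between two scans.
public class ScanDiff {

    /** Holds the devices that appeared and disappeared since the previous scan. */
    public static final class Delta<T> {
        public final Set<T> added;
        public final Set<T> removed;

        Delta(Set<T> added, Set<T> removed) {
            this.added = added;
            this.removed = removed;
        }
    }

    /** Returns the elements only in {@code current} (added) and only in {@code previous} (removed). */
    public static <T> Delta<T> diff(Set<T> previous, Set<T> current) {
        Set<T> added = new HashSet<>(current);
        added.removeAll(previous);
        Set<T> removed = new HashSet<>(previous);
        removed.removeAll(current);
        return new Delta<>(added, removed);
    }

    public static void main(String[] args) {
        // Illustrative identifiers only; the format is an assumption for this example.
        Set<String> firstScan = Set.of("0658:0200@COM5");
        Set<String> secondScan = Set.of("0658:0200@COM5", "0403:6001@COM7");

        Delta<String> delta = diff(firstScan, secondScan);
        System.out.println("added=" + delta.added + " removed=" + delta.removed);
    }
}
```

With something like this, listeners would only need to be notified for the entries in `added` and `removed`, so an unchanged set of devices would produce no notifications at all.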

I suspect that they serve no purpose. What seems quite clear to me is that before #3922, USB devices without a serial port were of no interest. After that, it gets blurry, and I’m trying to figure it out in order to stop doing unnecessary things.

I get the impression that nobody quite cares enough to help me find the answer. I assume that’s either because I’m the only one that thinks slowdowns from doing unnecessary operations matter, or because I’m the only one that thinks this is such a situation.

Regardless, this is where I am, this is what I’m trying to do. If I have to figure it all out alone, I might opt to not do it - I can’t tell right now. It’s hard to get the overview over all the moving parts involved in discovery and “finders”, because you’d effectively have to have a very good overview over what the bindings do. So, I might give up on getting to the bottom of this.

But, if anybody has some information on the matter to share, that helps me get some confidence about what is right and wrong, it would be greatly appreciated.

Indeed I don’t get the point. If it ain’t broke, don’t fix it.

Exactly. It is broke as I see it, there are many “levels” of broke - it’s not binary. I’m not saying that it doesn’t work at all, I’m saying that it has “unfortunate effects”, and that many small streams…

I don’t think it’s reasonable that the “USB detection feature” has this much of an impact on a running OH instance, especially considering that no actual change to USB devices happened during the sampling (so no actual action took place):


(the peaks every 15th second are the “USB detections”, the other, lower peaks are other tasks)

@Nadahar look if you want to fix it, I have no objection. My only concern is that your fix shall not break the actual finder service. And your current insistence that the main purpose of the finder is to discover serial ports gives me a strong suspicion that your fix does indeed risk breaking that finder service. But anyway please proceed with your PR and I shall be happy to test it.

I’m not insisting, I’m trying to find out. I’m asking: Can OH utilize USB devices without serial ports? If the answer is no, it’s hard to see why they should be included, if the answer is yes, they have to be included and the JavaDocs for UsbSerialDiscovery should probably make that clear for anyone asking the same question in the future. That’s all I’m trying to do.

I have no intention of breaking anything, I have Windows on my development computer, and I have a couple of serial-enabled USB devices I use for testing, so I’m not doing this “blind”.

I was contemplating making a similar solution for macOS, but that is much harder for me to test, since I only have VMs running macOS, and no easy way to forward such USB devices to VMs running on a server in a remote location.

You could do a grep search over all the bindings’ addon.xml looking for <service-type>usb</service-type> and then read their respective README files. Probably you should then check with the respective addon code owners. Include the Zigbee and both of the Z-Wave bindings.
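If grep isn’t at hand (e.g. on Windows), the same search can be sketched in Java. This is only a rough illustration: the class name `UsbAddonSearch` is made up, the root folder is whatever checkout of the add-ons you point it at, and it does a plain exact-string match, so an addon.xml that formats the element differently (extra whitespace, for instance) would be missed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: lists every addon.xml below a folder that declares the "usb" service type.
public class UsbAddonSearch {

    private static final String MARKER = "<service-type>usb</service-type>";

    /** Walks the tree below {@code root} and returns the matching addon.xml files, sorted. */
    public static List<Path> findUsbAddons(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                    .filter(p -> "addon.xml".equals(p.getFileName().toString()))
                    .filter(UsbAddonSearch::containsMarker)
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    /** Returns true if the file contains the marker; unreadable files are skipped. */
    private static boolean containsMarker(Path file) {
        try {
            return Files.readString(file, StandardCharsets.UTF_8).contains(MARKER);
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the root of the add-ons checkout as the first argument; defaults to the current folder.
        Path root = Path.of(args.length > 0 ? args[0] : ".");
        findUsbAddons(root).forEach(System.out::println);
    }
}
```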

Just for reference, this is the JavaDoc for SysfsUsbSerialScanner (as far as I can tell, the only other UsbSerialDiscovery implementation, the one used on Linux):

A UsbSerialScanner that scans the system for USB devices which provide a serial port by inspecting the so-called ‘sysfs’ (see also sysfs - Wikipedia) provided by Linux (a pseudo file system provided by the Linux kernel usually mounted at ‘/sys’).

A scan starts by inspecting the contents of the directory ‘/sys/class/tty’. This directory contains a symbolic link for every serial port style device that points to the device information provided by the sysfs in some subdirectory of ‘/sys/devices’.

The scan considers only those serial ports for which the corresponding device file (in folder ‘/dev’; e.g.: ‘/dev/ttyUSB0’) is both readable and writable, as otherwise the serial port cannot be used by any binding. For those serial ports, the scan checks whether the serial port actually originates from a USB device, by inspecting the information provided by the sysfs in the folder pointed to by the symbolic link.

If the device providing the serial port is a USB device, information about the device (vendor ID, product ID, etc.) is collected from the sysfs and returned together with the name of the serial port in form of a UsbSerialDeviceInformation.

It doesn’t give the impression that USB devices without serial ports are registered, so I’m wondering if the Windows implementation might be the only one that does this.

Further, I find this code:

    @Override
    public Set<UsbSerialDeviceInformation> scan() throws IOException {
        Set<UsbSerialDeviceInformation> result = new HashSet<>();

        for (SerialPortInfo serialPortInfo : getSerialPortInfos()) {
            try {
                UsbSerialDeviceInformation usbSerialDeviceInfo = tryGetUsbSerialDeviceInformation(serialPortInfo);
                if (usbSerialDeviceInfo != null) {
                    result.add(usbSerialDeviceInfo);
                }
            } catch (IOException e) {
                logger.warn("Could not extract USB device information for serial port {}: {}", serialPortInfo,
                        e.getMessage());
            }
        }

        return result;
    }

So, first it scans for serial ports on the system. From the serial ports found, it then tries to find a “linked” USB device. Only if both are successful, a discovery result is created. As far as I can tell, this doesn’t look for USB devices without serial ports at all.

    /**
     * Checks whether the provided device path in sysfs points to a folder within the sysfs description of a USB device;
     * if so, extracts the USB device information from sysfs and constructs a {@link UsbSerialDeviceInformation} using
     * the {@link SerialPortInfo} and the information about the USB device gathered from sysfs.
     * <p/>
     * Returns null if the path does not point to a folder within the sysfs description of a USB device.
     */
    private @Nullable UsbSerialDeviceInformation tryGetUsbSerialDeviceInformation(SerialPortInfo serialPortInfo)
            throws IOException {
        Path usbInterfacePath = getUsbInterfaceParentPath(serialPortInfo.getSysfsPath());

        if (usbInterfacePath == null) {
            return null;
        }

        Path usbDevicePath = usbInterfacePath.getParent();
        if (isUsbDevicePath(usbDevicePath)) {
            return createUsbSerialDeviceInformation(usbDevicePath, usbInterfacePath,
                    serialPortInfo.getDevicePath().toString());
        } else {
            return null;
        }
    }

I’ve discovered another problem. USB device serial numbers aren’t reliably available on Windows. Neither the current implementation nor my “new one” gets it right.

Here is what’s called the “device path” in Windows for an Aeotec Gen-5 Z-Wave stick I have.
\\?\usb#vid_0658&pid_0200#5&1c80a72c&0&2#{a5dcbf10-6530-11d2-901f-00c04fb951ed}

The same basic information is available with slightly different formatting, e.g.

  • symbolic name: \??\USB#VID_0658&PID_0200#5&1c80a72c&0&2#{a5dcbf10-6530-11d2-901f-00c04fb951ed}
  • hardware ID: USB\VID_0658&PID_0200&REV_0000

..and there are others like “instance ID” that’s just a variation of the same information in upper or lower case. The “device path” is the one that consistently has the most information from what I’ve found.

This is how the “old”/current implementation parses this device:

UsbSerialDeviceInformation [vendorId=0x0658, productId=0x0200, serialNumber=null, manufacturer=Sigma Designs, product=UZB, interfaceNumber=0x00, interfaceDescription=5&1c80a72c&0&2, serialPort=COM5, remote=false]

This is how my “new” implementation parses this device:

UsbSerialDeviceInformation [vendorId=0x0658, productId=0x0200, serialNumber=5&1c80a72c&0&2, manufacturer=Sigma Designs, product=UZB (COM5), interfaceNumber=0x00, interfaceDescription=5&1c80a72c&0&2, serialPort=COM5, remote=false]

The results are identical, except for the serial number. From what I can tell, the current implementation will never find a serial number, because it assumes a syntax that I don’t think exists (it’s impossible to be 100% sure of anything here). “My” implementation uses the “unique device ID” as the serial number, but this is where it gets difficult.

It isn’t exactly easy to get implementation details about closed source projects like Windows, so I’ve just had to try to pick up some information here and there. The consensus seems to be that this “ID” is guaranteed to be unique for that device on any given Windows installation, but it’s not guaranteed to match the actual device serial number. In fact, it often won’t be. Not all USB devices even have a serial number, and even if they do, it’s “up to Windows” whether that serial number will be used as the “unique ID” or not. It is in some cases, in which “my” implementation will get it right, but it doesn’t for any of my test devices. As far as I can tell, there’s absolutely no way to tell if the information we acquire is a genuine serial number or not. It also seems like it depends on the driver exactly how this ID is derived, so you might consistently get a real serial number for some drivers.

You can get the actual serial number from the device, but that requires opening the device and querying it for information. This procedure is driver-dependent, so it would have to be implemented for each potential driver we’d be interested in. “WinUSB” is a generic driver that is used for many devices, but that’s not something you can rely on to always be true. Even if you figure out the driver and use the correct approach to query that driver, there’s an exclusive lock on USB devices by default on Windows. So, only one process can “open” the device at any one time. That means that if any other software has the device open, OH would be blocked from doing this - and vice versa - if OH did this to read the serial number, it might block other applications from accessing the device. For all the complications and caveats, I consider querying the driver “unfit for purpose” here.

This might look completely different on Linux, maybe the device serial number is routinely available from the “sysfs” - I don’t know. After looking around in the /sys/devices/platform/ structure on openHABian, it seems like much of the “raw USB descriptor” is found in there, so the serial number found there would probably be genuine.

The question now is what to do with USB device serial numbers under Windows. I can’t find a reliable way to acquire them, but the “ID” I get should be a reliable identifier for that particular device on a given installation. Do we want to use that or not, given that it might not correspond to the actual device serial number? What exactly is it used for?

edit: The current implementation gets it wrong when a device has multiple interfaces, because it then parses the interface indication as the serial number, e.g. MI_02.
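For illustration, this is roughly how the fields discussed here could be pulled out of a Windows “device path” with a regular expression. It is a sketch under assumptions, not the actual WindowsUsbSerialDiscovery code: the class `DevicePathParser` and its field names are made up, and it assumes that the interface marker of a multi-interface device appears as `&mi_xx` right after the product ID. The third segment is returned as an “instance ID”, which, as explained above, may or may not be the real serial number.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: splits a Windows USB device path into its parts.
public class DevicePathParser {

    // usb#vid_XXXX&pid_XXXX[&mi_XX]#<instance id>#{<interface class GUID>}
    private static final Pattern USB_PATH = Pattern.compile(
            "usb#vid_([0-9a-f]{4})&pid_([0-9a-f]{4})(?:&mi_([0-9a-f]{2}))?#([^#]+)#\\{[0-9a-f-]+\\}",
            Pattern.CASE_INSENSITIVE);

    public static final class Parsed {
        public final int vendorId;
        public final int productId;
        public final Integer interfaceNumber; // null when the path has no MI_xx part
        public final String instanceId;       // unique per installation, not necessarily a serial number

        Parsed(int vendorId, int productId, Integer interfaceNumber, String instanceId) {
            this.vendorId = vendorId;
            this.productId = productId;
            this.interfaceNumber = interfaceNumber;
            this.instanceId = instanceId;
        }
    }

    /** Returns the parsed parts, or an empty Optional if the path doesn't look like a USB device path. */
    public static Optional<Parsed> parse(String devicePath) {
        Matcher m = USB_PATH.matcher(devicePath);
        if (!m.find()) {
            return Optional.empty();
        }
        int vendorId = Integer.parseInt(m.group(1), 16);
        int productId = Integer.parseInt(m.group(2), 16);
        // Keep the interface number separate so that it is never mistaken for a serial number
        Integer interfaceNumber = m.group(3) == null ? null : Integer.valueOf(m.group(3), 16);
        return Optional.of(new Parsed(vendorId, productId, interfaceNumber, m.group(4)));
    }

    public static void main(String[] args) {
        String path = "\\\\?\\usb#vid_0658&pid_0200#5&1c80a72c&0&2#{a5dcbf10-6530-11d2-901f-00c04fb951ed}";
        parse(path).ifPresent(p -> System.out.printf("vendorId=0x%04X, productId=0x%04X, interface=%s, instanceId=%s%n",
                p.vendorId, p.productId, p.interfaceNumber, p.instanceId));
    }
}
```

Because the match is case-insensitive, the “symbolic name” variant of the same string is parsed the same way.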

I didn’t pick up on the wording previously, but I saw it now, and would just like to emphasize that I’ve at no point thought that the purpose of the finder is to find serial ports. I’m talking about USB devices, some of which can emulate serial ports. Serial port emulation provides a “generic interface” to communicate with the device. There are some other “generic interfaces” like mass storage controllers, but they are irrelevant here. If a USB device doesn’t have such a “generic interface”, I know of no other way to utilize it than to communicate with the driver directly. And, I’m very skeptical as to whether OH actually does this.

To do that, you’d probably have to use C APIs. To do that, you’d need to use JNA or something similar. You’d have to deal with completely different driver communication processes for different platforms, and with specific driver versions and their APIs. All this while having to deal with “native” APIs with manual memory management and “foreign” types, not all of which are very easy to deal with from Java.

I’m very skeptical that anybody has actually gone down this road and done this. And if not, I don’t quite understand how OH can communicate with devices that don’t have a relevant “generic interface”, like serial port emulation. I can’t actually think of any other such “generic interface” that would be of use for bindings to communicate with controllers/bridges. This is very far from “discovering serial ports”.

For reference, this lists the following bindings:

  • org.openhab.binding.bluetooth
  • org.openhab.binding.elerotransmitterstick
  • org.openhab.binding.enocean
  • org.openhab.binding.tellstick
  • org.openhab.binding.zigbee
  • org.openhab.binding.zwave

When looking at the use of the serial number field in those bindings, I can only find that the EnOcean binding uses the serial number field at all - and that’s in two situations: As a part of the ThingUID and as a part of the USB300 dongle label. Both of those uses should probably be fine with using an OS provided ID instead of the actual serial number.

So, this is in the binding itself, and not in the finder. Does the EnOcean binding use UsbSerialDiscovery for discovery (and not just with a pattern for the finder)?

You are probably right about the inconsistencies of serial numbers on Windows. As long as you don’t rely on it for real discovery (and not just the finder), it won’t bite us. In the worst case, it cannot be suggested (found by the finder) reliably on Windows if that is in the regex. But I also fail to see how matching on a serial number can be relevant for a finder.

I do believe the reason to implement this in the first place was to find devices on USB that have a serial port. If none of the finders currently uses anything that doesn’t have a serial port, and there would be a major performance improvement if only looking for serial on USB, we could limit it to that. As you have well established, on Linux it would only come up with devices having a serial port anyway.

Did you check how this one communicates? This probably uses a Bluetooth driver on each platform that may not use a serial emulation, in which case the finder will fail to find it if you limit to serial.

Yes. What I did is that I opened all the bindings listed above as projects in Eclipse, and then asked Eclipse to show all uses of UsbSerialDeviceInformation.getSerialPort(). What it doesn’t reveal is information that is extracted through reflection, but except for that, it should reveal all code where this is used, regardless of if it’s in discovery or finder code. I got one hit in the finder, but that’s in the “merge” functionality where it merges all information from two UsbSerialDeviceInformation instances into one, so it’s not really a “real use”.

I can’t really understand how a serial number could be used in a meaningful way in a regex, it would have to be some very particular circumstance, so I can’t see how the serial number can be used for either discovery or the finder, but I might lack imagination.

There’s not much any of us can do about the issue with serial numbers in Windows though. I’ve looked hard for a way to get it, and I could only find the approach of querying the driver, locking the device and depending on different implementations for different drivers. It just doesn’t sound like a path worth trying to me. Without that, I can’t see a way.

So the question here is quite simple: Do we add the “ID” field to the serial number field, which will sometimes be the actual serial number, or do we just leave it as null as the current implementation does? Does it have any value at all to have the actual serial number available for some devices, when we can’t tell which ones that is?

I don’t know that, it’s part of what I’m trying to figure out. The Bluetooth binding might be an example of just this, but this immediately leads to: Does this mean that the Bluetooth binding doesn’t get suggested or doesn’t discover devices on Linux? Because it is my impression that a relatively small percentage of OH installations run on Windows in the first place. So, if this is the case, it would leave things “mostly broken”, and there would be a need to make changes to the Linux discovery.

I can’t promise that any of this will result in a major performance improvement. I doubt it, I’m chasing “many small streams”. But, I don’t think it’s insignificant to not notify all the “subscribing” discovery implementations of devices that are irrelevant to them over and over again every 15th second. The discovery implementations don’t know that the device is irrelevant, or that it has been evaluated before, so they must do what they must do to establish this again. In most cases, it will probably be a quick evaluation of some metadata, but there can be cases where this sets off a more involved process.

I think that not doing the polling, not traversing the registry for a lot of information that isn’t of interest, and not including devices that aren’t currently connected will be the most important factors when it comes to performance.

But, that’s not the only factor that I think is relevant. This is also about consistency and clarity. There are some obvious “wrongs” here now, where what’s documented in the code and what’s done isn’t coherent. This is confusing for anybody that wants to work with this information, and could lead to future bugs or just wasted time. So, I think it would be good to figure this out and correct the inconsistencies, regardless of what impact it would have on performance.

I’ll try to investigate. It could be an “interesting case” - but it still leaves us in the situation described above: Either Windows reports too many devices, or Linux reports too few. One of them must be “wrong”.

edit: This is quite confusing to investigate, as I don’t really know what part of the “bluetooth binding structure” is for “radio”/dongles and what is for devices. But, I’ve found one radio implementation, BlueGigaBridgeHandler, and it indeed uses SerialPortManager for communication.

When it comes to the divide between discovery and finders, I’d just like to explain my thinking: I understand that a finder might suggest a binding even if that binding doesn’t have discovery for said device, and that this is still useful. What I don’t understand is how it can be useful for a finder to react to a device that the binding can’t communicate with, regardless of discovery. I might be missing something essential, but this is the basis for my reflections around serial or driver based communication.