Add-on Connection Types

Nadahar · January 24, 2025, 9:47pm

There is a property, “Connection Type”, for all add-ons. If it isn’t specified by the add-on, I think it defaults to “Cloud”.

MainUI offers the following options, and you can filter add-ons by them:

cloud - Cloud allowed
hybrid - Optional cloud functionality
local - No cloud allowed
none - No LAN access

In Core this property is just a string, so it can be set to anything. I can’t find any real definition of allowed values in Core. The JavaDocs on the Addon class suggest “local or cloud, push or pull…”.

The Add-on Definitions documentation states:

none for add-ons that have no interactions with external systems at all, local for add-ons that only interact locally without internet access, hybrid for add-ons that interact locally without internet access, but can optionally use a cloud connection for extended functionality (such as discovery), cloud for add-ons that require a cloud connection

The documentation and MainUI agree, while Core basically allows anything. When searching through the official add-ons, I can only find the four documented options in use. In principle, though, add-ons can specify any string there, and there’s no “enforcement” of this, so add-ons from the community and JSON marketplace can still specify values that MainUI and users don’t expect.

My reasons for writing this are:

I think it would be a good idea to use an Enum or a “validated String” in Core to make sure that only a specific set of values is allowed. If this is implemented, there should probably be an unknown option too, which would be used for add-ons that don’t specify the connection type.
I think there is one essential option missing from the documentation/MainUI options: “pull”/“read-only Internet” or whatever one should call it. It is explicitly stated that local is for add-ons that only interact locally. That means that any add-on that retrieves information from the Internet, without sending any information to a “cloud” or other service, would have to use cloud (hybrid wouldn’t really fit either). That misleads users, many of whom don’t want to allow “cloud” use but wouldn’t mind if e.g. weather information or other publicly accessible information is retrieved.
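
As a rough illustration of the Enum idea above, here is a minimal Java sketch. The ConnectionType enum and its parse method are hypothetical and not part of openHAB Core; the values are the four documented ones plus the proposed unknown fallback. Mapping unrecognized strings to unknown (rather than rejecting them) is an assumption made for this sketch.

```java
import java.util.Locale;

/** Hypothetical enum for illustration only; this is not an existing openHAB Core type. */
enum ConnectionType {
    NONE("none"),
    LOCAL("local"),
    HYBRID("hybrid"),
    CLOUD("cloud"),
    UNKNOWN("unknown"); // proposed fallback for missing or unrecognized values

    private final String id;

    ConnectionType(String id) {
        this.id = id;
    }

    /** The lowercase identifier as it would appear in add-on metadata. */
    String id() {
        return id;
    }

    /**
     * Maps a raw string to a connection type.
     * Null, blank, or unrecognized input yields UNKNOWN instead of throwing.
     */
    static ConnectionType parse(String raw) {
        if (raw == null) {
            return UNKNOWN;
        }
        String normalized = raw.trim().toLowerCase(Locale.ROOT);
        for (ConnectionType type : values()) {
            if (type.id.equals(normalized)) {
                return type;
            }
        }
        return UNKNOWN;
    }
}
```

With something like this, ConnectionType.parse("Cloud") would give CLOUD, while a missing value or an unexpected string from a marketplace add-on would end up as UNKNOWN instead of reaching MainUI as an arbitrary string.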

laursen · January 25, 2025, 11:01pm

This should be validated already by the schema, see https://openhab.org/schemas/addon-1.0.0.xsd

What is your definition of “read only Internet” and why do you think it’s important to distinguish something like a weather service from other cloud services?

Nadahar · January 26, 2025, 12:10am

The schema only applies to bundles; there are other types of add-ons on the marketplace.

Because the big problem with “cloud” services is that they are usually run by big companies that require a login to access the service, and then log/track everything you do, often to sell that information to others. Very often, they also require a cell phone number to sign up, which enables them to link you to who you “really are”.

Any service that doesn’t require a login/an account or other identifiable information like a device serial number, that they can track you by, is much safer to use. I think there’s a big difference between only retrieving information from the Internet and doing two-way communication. I see that there are some gray areas, but I think that for most add-ons it should be easy to determine what fits.

I would automatically reject anything that says that it uses “cloud”, and I very much doubt I’m the only one.

laursen · January 26, 2025, 12:09pm

There are also some things all services on the internet have in common:

They won’t work if your internet connection is down for whatever reason.
They can be in maintenance mode or having issues preventing them from working.
They can suddenly disappear/cease to exist, announced or unannounced.
They might be slower than local services, even if just because of their distance (speed of light).
They can become upset when you don’t use them right, for example react with HTTP/429 when you perform too many calls.

I agree about everything you said about being tracked etc. But even a weather service can track you. First, you expose your IP address and some user-agent information. On top of that it might also use cookies. Then there are some parameters you send. For a weather service, you probably need to send your location, even if not accurate coordinates for your home, you still need to provide some location or area of interest. They probably cannot use that to identify you as a person without cross-referencing with other services you are using, but they can still track you as a distinct person.

That’s the technical part of it. Maybe they do, maybe they don’t. Then there’s the intent. A service requiring a login with e-mail address might not keep personal data about you besides that account information, or track you in any way - even if they can. Services are different. Some are quite innocent, while some are almost evil.

All of this is just to say that it’s very hard to categorize such services. For now, I think “cloud” is “good enough”, since at least it signals awareness about all those things all internet services have in common, and the user then needs to make informed decisions about which services to use. It’s used as a filter in the UI, and you can quickly get an overview of local integrations by excluding “cloud”.
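
To make the filtering idea concrete, here is a small Java sketch of excluding “cloud” from a list of add-ons. The AddonFilter and AddonEntry names are made up for this example and are not the Core Addon class or MainUI code. Note the assumption that entries with no connection type are kept; whether they should instead be treated as “cloud” is an open question.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper for illustration; not part of openHAB. */
final class AddonFilter {

    /** Minimal stand-in for an add-on entry: an id plus its raw connection string. */
    static final class AddonEntry {
        final String id;
        final String connection; // may be null if the add-on does not specify it

        AddonEntry(String id, String connection) {
            this.id = id;
            this.connection = connection;
        }
    }

    private AddonFilter() {
    }

    /**
     * Returns the ids of all add-ons whose connection type is not exactly "cloud".
     * Assumption: "hybrid" and unspecified (null) entries are kept.
     */
    static List<String> withoutCloud(List<AddonEntry> addons) {
        return addons.stream()
                .filter(addon -> !"cloud".equals(addon.connection))
                .map(addon -> addon.id)
                .collect(Collectors.toList());
    }
}
```

Because the comparison is against a plain string, any unexpected value (for example a misspelled “Cloud”) would slip through such a filter.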

Home Assistant has something similar, but now I can’t find it. It could be used for comparison/inspiration.

If you have concrete suggestions for changing/extending the connection types, that’s welcome. Just keep in mind to retain some simplicity and not muddy the choices too much. If developers have a hard time figuring out the correct connection type, users probably will too.

Nadahar · January 26, 2025, 5:19pm

My reason for creating this topic is two-fold, so I basically have two different, independent, “arguments”:

1 I’m dabbling with some changes to the marketplace, which might or might not ever see the light of day. As a result, I have to deal with this field. Since its “primary use” is for filtering add-ons, fixed options are the only thing that makes sense (as is already the case in practice). I “dislike” that the options aren’t clearly defined in Core, because it means that when I want to do validation, I can’t use an “authoritative source” to validate against, but I must instead try to research what options are viable/in use (I even missed the schema) and then hard-code the validation based on that. It just feels wrong, it feels like making unmaintainable code.

2 As a user I have problems with the term “cloud”. Technically I agree with what you say, the term “cloud” is very vague and one can interpret it as meaning pretty much what one wants, so that it can end up being “a synonym” for an online service/resource.

I have a very different association though, I see the term as a pure marketing term, with very little actual meaning. As such, I see those that use it as wanting to portray what they are doing in that marketing sense. The idea is something like “we’ll handle everything for you, and the future will be fantastic. Just lay everything in our hands, and don’t think about it”. Especially when it comes to “smart devices”, this is a very popular thing for the time being. Things that would have been accessible locally a few years ago are now often locked in to a “cloud” where you have to give them all kinds of information they don’t need to provide the service (like name, e-mail, phone number, birth date etc). The “privacy policy” is often unreadable, and they change it often and with very short notice. People would have to hire a lawyer to keep up with their “terms”. So, in reality, they do pretty much what they want with your information.

When I read “cloud” on an add-on, that is my association. I don’t think about it for another second, I reject it without even “investigating” what information is actually sent, because this is a common arrangement these days, and I assume that’s what “cloud” means. But, this might just be me for all I know. I just assume that others see the world in a similar way.

If you dig into the technical details, there are all sorts of gray areas. I would say that requiring personal or device identifiable information is a significant threshold though. Even though you can track IPs, query parameters and whatnot, linking that information to an individual so that it can be used for “profiling” of that individual isn’t necessarily that easy. Even if they use cookies, I doubt the HTTP Java clients store these, and there’s not much to “fingerprint” like they do with browsers. They will probably be able to figure out that it’s a Java program using a certain HTTP client, from a given IP address and that’s it. If you’re worried about them logging your IP, you can use a VPN service. As soon as you need a login or must send e.g. a device serial number, all that is out the window. They don’t have to “work” to make something out of very limited information, they can link it all directly to you. For me, that makes a big difference. The information can then easily be sold, or stolen, and used together with a lot of other data that is being gathered, to “profile” you. That’s what we should all, in my opinion, be worried about, and that’s why I personally would like there to be something that differentiates one from the other.

Regarding HA, I have just tried it once several years ago, so I know very little about it (the whole world of Python where “everything” is more or less broken at all times because of buggy code with no type safety written in a hurry wasn’t very appealing to me). When I look at what I think is their equivalent of the “community marketplace”, I can’t really find anything like “connection type” there at least. The closest thing is their tags, there is a “cloud” tag, but I’m not sure you could really use it for much. To find more information, I guess I’d have to install HA, but that’s a step further than I’m eager to go for this.

I know that currently, “connection type” isn’t supported by OH’s community marketplace either. But what I’m working on would make it possible for add-ons to specify it, which is why I think of the field/property as potentially becoming more relevant.

laursen · January 26, 2025, 8:55pm

Ad 1) You are probably right it’s not fully evolved. I don’t have anything to add here.

Ad 2) I get the confusion from the term “cloud”, and it doesn’t really - in its original meaning - capture the intended categorization here. See for example Cloud computing - Wikipedia. So I think we agree. “internet” would probably have been more accurate in terms of “connection” type, would you agree?

Even though that might remove some confusion, it still doesn’t describe differences between services in terms of privacy etc. That should probably in theory be a sub category or entirely different property, but IMHO we (as a community) cannot provide that in a meaningful way.

Nadahar · January 26, 2025, 9:30pm

“Cloud” is indeed very vague and is “problematic” in many ways, but given that many bindings (have to) work with some kind of “manufacturer provided cloud service”, I can see how it’s relevant/used in this context - although imprecise. “Internet” would definitely be more precise as a connection type, but would prevent the ability to filter out add-ons that rely on “cloud services”.

It might not be that big of a deal, because for the add-on to use your account information, you would have to configure the add-on to use it. THAT should be the “red flag”. As I see it, it’s more of a convenience to be able to have all those add-ons filtered out, but as you point out, it’s very difficult to draw the line. Whether or not a binding is sending device information is more opaque to the user, and I think it would be beneficial if that was indicated in some way other than users having to study the source code themselves.

Privacy in general is extremely difficult, because everything is so opaque for those outside “the service”, and I feel that the legal protection is all but a joke - especially since so many things are international and it often isn’t even clear what laws apply, not to mention that very often there is no enforcement. That’s why my “pragmatic approach” is: We will never know what they do with the information, all we can control is what information we give to them. That’s in a way what I’m “wishing for”, that the add-on author can give the user some hint about what type of information is exchanged with external services.

You can of course say, why does that mean anything? The author can easily lie about that. But personally, I trust community members much more than commercial operators…

laursen · January 26, 2025, 10:13pm

In reality, “cloud” currently means “internet”. I’m pretty sure that’s the interpretation used for all bindings, and I think I’ve more or less seen them all. As an example, Energi Data Service is declared as “cloud” because it’s the closest match. For sure no one can locally come up with the settled electricity prices for tomorrow, but on the other hand the binding doesn’t share any data except “openHAB” and version number as user-agent. And I don’t know anything about their infrastructure. So you are right that currently you cannot filter out add-ons that rely on cloud services, since cloud doesn’t really mean that. Renaming “cloud” to “internet” in the current situation wouldn’t change anything, only make the name correspond to the actual meaning.

It would be nice being able to declare that somehow, although I don’t have any concrete idea right now how to do that in a structured and manageable way capturing all the possibilities.