Digital Twins and Machine Learning

I would like to start developing a machine-learning based platform that would create a “Digital Twin” of humans by modeling their preferences, with the goal of controlling and monitoring devices. I like calling it “Nest for the rest” because it would be a generalization of Nest’s approach to learning and adaptation, which could be employed in several other scenarios. Here you can find a very simple example of what I did so far:

I find the challenge very interesting (despite the fact that I’m not much into automation) and I’d like to know if someone else would be interested in collaborating, providing suggestions or at least testing prototypes. Also, I’m not aware of the potential market: does anybody have an estimate? Within the smart-home segment, I guess the market is fairly small, but Gartner identifies digital twins as a Strategic Technology Trend for 2017, with the potential to “become proxies for the combination of skilled individuals and traditional monitoring devices and controls”. Can anybody comment on that? What’s your take?

Thank you!



I think you will find that at least in this community there are people of two minds when it comes to home automation:

  • The automation should be deterministic and predictable.

  • The automation should learn what I want and do it.

Certain types of automation work better with the former (e.g. alarm systems) and certain types work better with the latter (e.g. HVAC).

The challenge with the deterministic approach is that one ultimately has to define all the possible states, plus rules describing what should happen at each state transition. Most of the time this is relatively simple, but it can become untenable really quickly.

The challenge with the learning approach is threefold. Firstly, it has to work well enough from the start to be functional; otherwise no one will use it long enough to give it a chance to learn. Secondly, you have to decide what factors are relevant to feed into your learning algorithms. For example, the temperature in the house probably doesn’t have much to do with whether you turn on the hall light, but the time of day could be very relevant. Thirdly, who takes precedence? If there are multiple people in the house, it needs to learn from all of them to discover their preferences. But when multiple people are home, only one can take precedence.

It’s an interesting problem but I’m not certain there is one universal approach that works for all types of automation. HVAC, lighting, perhaps ambient music, maybe home entertainment systems (e.g. user one likes the lights dimmed when watching a movie, user two likes them turned all the way up) might make good candidates. Alarm systems, irrigation systems (which would use a different type of algorithm not based on a user’s preferences) and the like do not make good candidates.

And even for the automation types that are good candidates, the approach to the learning may not be the same.

The fact that you are using MQTT as the interface to the AI makes it very easy to integrate with OH. You can either set up specific Items that pub/sub to the topics, or set up a MQTT Event Bus which will send everything in OH to the AI.
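The difference between the two options comes down to which topics the AI subscribes to: one filter per Item, or a single wildcard filter that catches everything. Here is a minimal sketch of MQTT topic-filter matching in plain Python. The topic names are hypothetical (they depend on how the binding or event bus is configured), and the special rules for topics starting with `$` are left out.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a concrete topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: matches the parent level and everything below it.
            return True
        if i >= len(topic_levels):
            return False
        # "+" matches exactly one level, whatever its name.
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


# Hypothetical topic layout: openhab/out/<Item>/state
single_item_filter = "openhab/out/KitchenLight/state"  # one specific Item
all_states_filter = "openhab/out/+/state"              # every Item's state
event_bus_filter = "openhab/#"                         # everything, event-bus style
```

With a layout like this, the AI could start with one or two specific filters and widen to the catch-all filter once it is ready to sort out relevant inputs by itself.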

It is also possible to interface with NodeRed if you wanted to go that way.

Personally, as a user, I’d be most interested in understanding how I would use such a system in practice. How would I communicate my preferences to the AI (I assume it would be the sitemap) and how does the AI drive my automation (does it inject messages or does it only respond to requests?)


Hi Rich

thank you for your reply. Here are my comments:

  • Deterministic/adaptive automation: you’re correct, but I think there might be some need to “teach” rules by example as an alternative to coding. Not everybody is able or willing to configure a rule engine and that might be a hindrance to the wider adoption of home automation.
  • “it has to work well enough from the start to be functional”: that’s right. Fortunately, some learning algorithms allow the learning rate to be adapted, so that they learn faster at the beginning of training. Alternatively, the learning system could be provided with a bespoke initial “memory” to start with.
  • “you have to decide what factors are relevant to feed into your learning algorithms”: that’s not completely correct. Some algorithms can automatically detect relevant inputs and ignore the others.
  • “who takes precedence?”: I guess that’s an issue that even a human butler would face, so, although it’s relevant to the case, it’s not really specific to AI-powered automation.
  • “I’m not certain there is one universal approach”: no there isn’t. A working system might need to be “assembled” from a set of independent and occasionally communicating subsystems. Maybe it’s time to dust off Rodney Brooks’ subsumption architecture.
  • MQTT/NodeRed: I’m already using MQTT and NodeRed for “dry” simulations and I’m now looking for the right project to implement a real-world prototype.
  • “how I would use such a system in practice”: My idea would be to have the AI learn from the human user by subscribing to the events published to MQTT topics, automatically find the relevant inputs (e.g., time of the day, day of the week, weather conditions/forecast, temperature, occupancy, etc) and simply replicate the patterns that can be detected with high confidence (that’s what could be called a digital twin of the human user) by publishing to MQTT topics. In a second phase, it could be possible to apply model predictive control to do more fancy things. In no case would coding or direct configuration be needed, but the AI engine might need to ask questions from time to time to disambiguate cases for which it has little confidence in the computed prediction.
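To make the last point concrete, here is a minimal sketch of the "replicate only high-confidence patterns, otherwise ask" idea. It uses plain frequency counts per context as a stand-in for a real learning algorithm, so it is not the method itself, only an illustration. The class name, the thresholds and the `(weekday, hour)` context are all assumptions made for the example.

```python
from collections import Counter, defaultdict


class PatternLearner:
    """Counts which command the user issued in each context and replays
    only the patterns it has seen often and consistently."""

    def __init__(self, min_confidence=0.9, min_samples=5):
        self.min_confidence = min_confidence  # hypothetical threshold
        self.min_samples = min_samples        # hypothetical threshold
        self.counts = defaultdict(Counter)

    def observe(self, context, command):
        """Record one user action, e.g. taken from an MQTT message."""
        self.counts[context][command] += 1

    def decide(self, context):
        """Return (action, command, confidence); action is 'act' or 'ask'."""
        seen = self.counts.get(context)
        if not seen:
            return ("ask", None, 0.0)
        total = sum(seen.values())
        command, hits = seen.most_common(1)[0]
        confidence = hits / total
        if total >= self.min_samples and confidence >= self.min_confidence:
            return ("act", command, confidence)
        # Too little or too inconsistent evidence: ask the user instead.
        return ("ask", command, confidence)


learner = PatternLearner()
for _ in range(9):
    learner.observe(("mon", 7), "ON")
learner.observe(("mon", 7), "OFF")
for command in ["ON", "ON", "ON", "OFF", "OFF"]:
    learner.observe(("sat", 7), command)
```

In this toy run the Monday pattern is consistent enough to be replayed, while the Saturday one would trigger a question to the user.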

I’m experienced in ML/AI, but not much in home automation; that’s why I’m asking the OH community for suggestions: what’s the lowest-hanging fruit? What’s the worst pain point I could try to relieve?

Thank you!


True, but my point is at least until there is some major unforeseen breakthrough, one cannot get away from configuring a rule engine. A system like this can minimize the need for coding rules but not completely eliminate it.

In my experience, which is admittedly in a vastly different and more difficult domain (i.e. computer security), such algorithms are not very good at deciding what is relevant and what is not. Perhaps in the somewhat less noisy and chaotic world of home automation they would work better. But I’ve been burned before, so I am skeptical.

The problem is a bit more nuanced than that and this is a pet peeve of mine. Far too often designers and builders of home automation systems (from the DIY to the commercial offerings) make two fundamentally incorrect assumptions:

  1. Their system is the only one operating.

  2. There is only one “user”.

The first one is sometimes addressed by releasing an API and is almost the main raison d’etre for OH.

The second one is almost completely ignored by most and it results in user interfaces that are awkward at best and unusable at worst for any but the primary user of the house. You will often see that referred to around here as the WAF (Wife Acceptance Factor, which would probably be better put as SAF, Spouse Acceptance Factor).

To me the answer “that’s a universal problem so I’m going to ignore it” is unsatisfactory. There has to be some sort of answer or approach, even if it is something like the Nest approach, which simply ignores multiple users and treats all inputs the same. But make that a deliberate design decision; don’t just ignore it because it is hard. And do realize that using the Nest approach and treating all users as one may limit the sorts of automations that can be learned.

That’s a hard question to answer. Every home automation deployment is a unique system with unique requirements. One person’s major pain point is another person’s don’t care. For example, many users spend huge amounts of time and effort getting their lighting working the way they want it. I have three lamps which I’m happy controlling with simple timer type rules.

That being said, I would probably prove the system out using lighting, particularly colored lighting. This would give you a problem space rich enough to explore the system (on/off, dimmer values, color) yet simple enough to set up in OH without equipment. You can create a model house (see the demo) with lighting, weather, time, etc. but not have it connected to anything physical, which should give you a low barrier to entry. Using the REST API you could even write scripts to exercise the house as if a person were actually interacting with the devices to drive your machine learning, at least in the initial development stages.
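As a rough sketch of such a script: the OH REST API accepts a command for an Item as a plain-text `POST` to `/rest/items/<ItemName>`. The host, port, Item names and the little "schedule" below are assumptions for illustration, and depending on version and settings the API may also require authentication. The sketch only builds the requests; sending them is left as a commented-out step.

```python
import urllib.parse
import urllib.request


def build_command_request(base_url, item, command):
    """Build (but do not send) a POST that sends `command` to `item`."""
    url = f"{base_url.rstrip('/')}/rest/items/{urllib.parse.quote(item, safe='')}"
    return urllib.request.Request(
        url,
        data=command.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def scripted_evening():
    """Hypothetical sequence imitating a person using the lights."""
    return [
        ("HallLight", "ON"),
        ("LivingRoomLamp", "60"),   # dimmer value in percent
        ("LivingRoomLamp", "OFF"),
        ("HallLight", "OFF"),
    ]


requests_to_send = [
    build_command_request("http://localhost:8080", item, command)
    for item, command in scripted_evening()
]

# To drive a running instance, send each request (ideally with delays
# in between so the timestamps look like real usage):
# for req in requests_to_send:
#     urllib.request.urlopen(req)
```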


Agreed: coding might be completely avoided only in some cases, in other ones a point-n-click interface could help configure the rules. Of course, each solution would be very case-specific… no silver bullet here. Anyway, I’m not after a universal solution: I’m happy with little problems and simple, usable solutions.

Computer security is a harder nut to crack… not being predictable is the name of the game, whereas at home we can hopefully expect to have more or less regular patterns.

Honestly, I doubt that even a human could solve this class of problems in a way that does not entail some level of compromise between the people sharing the same living space. I think most companies prefer to tackle simpler problems, since an R&D investment in the harder ones is more difficult to justify.

I already have a sort of virtual bench of the kind you describe, but the main problem I see now is to solve a real problem, instead of tinkering with sensors and algorithms, which can be fun as a hobby, but pretty useless as a business.

any news on this topic? just out of interest…

Not really, but I’d still like to give it a go, although I think it might be an easier business to use data from commercial buildings rather than residential ones.

If you do come up with something, I’d certainly be interested in testing it!

I have been working on a binary neural network cluster to predict some things in my smart home. So far, testing seems pretty promising. Basically, the system logs everything to a MySQL database, and from there a training corpus is built every few hours and training is done. This means the system adapts to changes, though it can be 4 hours behind (think of this like data being saved into long-term memory when you sleep). The system uses all the historical data, and the first neuron reduces the variables to the 10 most important. The system uses TensorFlow to create the training model and saves the created model to file for quick predictions. The predictions are recalculated every 30 seconds to 1 minute.

So basically, the neural network trains on the history of an item and uses that model to predict what state the item should be in. Currently the predictions seem pretty accurate most of the time and the scores are quite high (about 80-100% on most scores).

Currently I have a sitemap dashboard that shows how confident the AI is about what state an item should be in, alongside its current state, for example: “AI is 68.95% confident the kitchen light should be ON and the current state is ON”
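The post does not say how the reduction to the 10 most important variables is done. As a simple stand-in (not the method used here), one could rank the logged variables by the absolute Pearson correlation with the item being predicted and keep the top k. The column names and data below are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no information
    return cov / sqrt(var_x * var_y)


def top_k_features(columns, target, k=10):
    """Return the names of the k columns most correlated with the target."""
    ranked = sorted(
        columns,
        key=lambda name: abs(pearson(columns[name], target)),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical history: each list is one logged variable over time.
history = {
    "motion_kitchen": [0, 1, 1, 0, 1, 0],
    "tv_power": [1, 0, 1, 0, 1, 0],
    "always_five": [5, 5, 5, 5, 5, 5],
}
kitchen_light = [0, 1, 1, 0, 1, 0]  # the binary item to predict
selected = top_k_features(history, kitchen_light, k=2)
```

Correlation only captures linear, one-variable-at-a-time relationships, so a learned first layer can pick up things this ranking would miss.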

Here are some examples of my test scores (these are the lowest scoring models):

TV Power
Precision : 82.474227%
Recall : 91.954023%
F1 Score : 86.956522%
Cohen : 73.088069%

Bedroom Light
Precision : 83.783784%
Recall : 86.111111%
F1 Score : 84.931507%
Cohen : 81.203566%
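For anyone unfamiliar with these scores, here is a short sketch of how all four are derived from a binary confusion matrix. The post does not give the underlying counts; the ones used below (80 true positives, 17 false positives, 7 false negatives, 74 true negatives) are a hypothetical set that happens to reproduce the TV Power figures.

```python
def binary_metrics(tp, fp, fn, tn):
    """Precision, recall, F1 and Cohen's kappa from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)  # of the predicted ONs, how many were right
    recall = tp / (tp + fn)     # of the actual ONs, how many were found
    f1 = 2 * precision * recall / (precision + recall)
    # Kappa compares observed agreement with the agreement expected by chance.
    observed = (tp + tn) / total
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return {"precision": precision, "recall": recall, "f1": f1, "kappa": kappa}


# Hypothetical counts, chosen to match the TV Power scores above.
tv_power = binary_metrics(tp=80, fp=17, fn=7, tn=74)
for name, value in tv_power.items():
    print(f"{name:9s}: {value * 100:.6f}%")
```

Kappa is the lowest of the four because it discounts the agreement a model would reach by chance alone.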

I am going to continue testing for another month or so and will shortly be giving the neural network the ability to actuate a switch. For example if AI is more than 85% sure a light should be on, it will switch that light on automatically.
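The actuation rule could look roughly like the sketch below. Only the "more than 85% sure it should be ON" case comes from the plan above; switching OFF symmetrically and skipping commands that would not change anything are assumptions added for the example, as are the function and parameter names.

```python
def decide_action(p_on, current_state, threshold=0.85):
    """Return 'ON', 'OFF', or None when the switch should be left alone.

    p_on is the model's confidence (0..1) that the item should be ON.
    """
    if p_on > threshold:
        target = "ON"
    elif (1.0 - p_on) > threshold:
        target = "OFF"  # assumption: mirror the ON rule for OFF
    else:
        return None     # not confident enough either way
    # Avoid sending a command that would not change anything.
    return target if target != current_state else None
```

With the dashboard example of 68.95% confidence for ON, this rule would do nothing, which is probably the safe default while the model is still being evaluated.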

When I am happy with the results I will post the code to GitHub. Currently the training should work for anything binary (i.e. switches) and I will be looking at more complex predictions down the track. Will try to post any major updates on the forums.


Have you tried to generate a decision tree instead of training a neural network?

No, I haven’t. Have you had a lot of success with a decision tree?

The interesting question for me would be: which use cases do you have in mind? What do you want to target?

It depends on what success means :)
Generally speaking, decision trees can be inspected and even manually tuned, which is more difficult (or impossible) with neural networks.
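To illustrate what "inspected and manually tuned" can mean in practice, here is a tiny sketch with a hand-written tree (a learned one would have the same shape). The features, thresholds and tuple layout are made up for the example.

```python
# An internal node is (feature, threshold, branch_if_below, branch_otherwise);
# a leaf is simply the predicted state. Hypothetical tree for a hall light.
TREE = ("hour", 18, "OFF", ("occupied", 1, "OFF", "ON"))


def predict(tree, sample):
    """Walk the tree until a leaf (a state string) is reached."""
    while not isinstance(tree, str):
        feature, threshold, below, otherwise = tree
        tree = below if sample[feature] < threshold else otherwise
    return tree


def describe(tree, indent=0):
    """Render the tree as readable if/else lines."""
    pad = "  " * indent
    if isinstance(tree, str):
        return [f"{pad}-> {tree}"]
    feature, threshold, below, otherwise = tree
    lines = [f"{pad}if {feature} < {threshold}:"]
    lines += describe(below, indent + 1)
    lines.append(f"{pad}else:")
    lines += describe(otherwise, indent + 1)
    return lines


print("\n".join(describe(TREE)))

# Manual tuning: the user wants the light an hour earlier, so edit one number.
TUNED = ("hour", 17, "OFF", ("occupied", 1, "OFF", "ON"))
```

Changing a single threshold like this has a predictable effect, whereas nudging weights in a neural network generally does not.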


I did the Google Assistant integration and my next focus will be ML. If you want, we can join forces. I am looking at Apache OpenNLP. I am also checking which framework/library to use instead of inventing things on my own.

@aus also my message to you: Let’s try to build something together instead of finding the „golden“ use case… let’s create the foundation for it.

@nodgesoft if you feel confident and want to collaborate, I can try to support your efforts.

I’ve some cool ideas, too.

I’m happy to share both my setup and ideas also. If you shoot me a message with your preferred contact details we can have a chat and see how we would help with each other’s work.

My binary neural net is mostly good to go, just making minor tweaks to increase the accuracy when adapting to new changes.


I wrote you an email, Matt. Did you ever receive it? Couldn’t reach you…

Would be great if you could share your findings here - others, including me, would be interested.
I’m especially looking for use cases.

Apologies, I have been away on work and family issues. Mehet has been sent my current iteration of the binary neural network cluster; currently it is in a very unportable format. Basically, the historical data needs to be normalised for the neural network - an easy thing to do if you don’t mind getting your hands dirty with database queries (MySQL in my case), but not exactly beginner friendly.

I am still not ready to let the neural network activate things by itself, but it is currently sending me push notifications when it thinks things should change and I would guess it is around 80% accurate.
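As a rough idea of what that normalisation might involve (this is a generic sketch, not the actual code): ON/OFF states become 1/0 and numeric readings are min-max scaled to the 0..1 range. The rows are assumed to arrive as dictionaries, as a database cursor could return them, and the column names are hypothetical.

```python
def encode_state(value):
    """Map an ON/OFF item state to 1.0/0.0."""
    return 1.0 if value == "ON" else 0.0


def min_max_scale(values):
    """Scale numbers linearly so the smallest is 0.0 and the largest 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


def normalise_rows(rows, numeric_columns, state_columns):
    """Return new rows with scaled numeric and encoded state columns."""
    scaled = {c: min_max_scale([row[c] for row in rows]) for c in numeric_columns}
    result = []
    for i, row in enumerate(rows):
        out = {c: scaled[c][i] for c in numeric_columns}
        out.update({c: encode_state(row[c]) for c in state_columns})
        result.append(out)
    return result


# Hypothetical rows as they might come back from a history query.
rows = [
    {"temperature": 18, "hall_light": "OFF"},
    {"temperature": 20, "hall_light": "ON"},
    {"temperature": 22, "hall_light": "ON"},
]
normalised = normalise_rows(rows, ["temperature"], ["hall_light"])
```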


Do you want to share the code for the project?
I am trying to find something like this, but don’t really know how to start.
I have been collecting all the changes in a MySQL database for the last year, so I have plenty of data for the learning.


I will share the code for the project when I feel I can provide enough instructions for people to be able to set it up easily. I have open sourced software in the past, and if it isn’t fairly complete then you end up spending all of your time supporting people instead of developing it properly.

Currently, you would have to write all of your own SQL queries for the binary ML code, and this isn’t feasible for a lot of people. I am, however, investigating ways around this.