Always-on whole-house voice control

So we are in the planning stage of building a new house. I’ve always assumed that if I ever did this, the new house would have to be smart. I’ve seen so many demos of ‘smart houses’ where you either carry your phone around or have an Alexa/Google Home in every room to make things happen.

In addition, Alexa/Google Home need a permanent internet connection, and effectively all commands go to the cloud and back. I’d rather keep everything local. So what I am thinking is wiring the house mainly with cat5/6 and putting an always-listening microphone in all the main rooms.

They would be connected back to OH and ‘listening’, so in theory I could say “lights on” and, if I am in the future living room, only the lights in the living room would come on, as the device would know which room it’s in. I was thinking of having something very discreet just poking out of the ceiling. Does anyone know a product that I could use? I’m thinking of something that will work with PoE, as I’ll wire the cat 5 everywhere.
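To make the room-aware idea concrete, here is a minimal sketch of how a fixed microphone could turn a generic phrase into a room-specific command. The microphone IDs, room names and the `Room_Lights` item naming are all hypothetical, and this is not an openHAB API, just plain Python showing the lookup.

```python
# Hypothetical mapping from a fixed microphone ID to the room it is installed in.
MIC_TO_ROOM = {"mic-01": "LivingRoom", "mic-02": "Kitchen"}

# Hypothetical spoken phrases mapped to (item suffix, command).
PHRASES = {"lights on": ("Lights", "ON"), "lights off": ("Lights", "OFF")}


def resolve_command(mic_id, phrase):
    """Return (item_name, command) for a phrase heard by a given mic, or None."""
    room = MIC_TO_ROOM.get(mic_id)
    action = PHRASES.get(phrase.strip().lower())
    if room is None or action is None:
        return None  # unknown microphone or unsupported phrase
    suffix, command = action
    return (f"{room}_{suffix}", command)


print(resolve_command("mic-01", "Lights on"))
```

The resulting pair would then be handed to whatever actually talks to the home automation server; that part is left out here on purpose.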

The other thought I had is having speakers in the ceilings of the main rooms so I could listen to music. But I was wondering how that would work with the microphone: would my voice commands be fighting with the music playing in the same room?

All thoughts greatly appreciated. Cheers.

There is no open source voice recognition that can compete with Amazon’s, Google’s or Microsoft’s neural-network-based cloud approaches.

You would also need microphone arrays instead of just one microphone to filter out noises and more easily distinguish background music and humans.

Cheers, david

So something possibly like this:

http://www.clearone.com/products_ceiling_microphone_array

I realise that the Google/Alexa tech is far ahead of what is currently available as open source. But I’m thinking that I should be able to give rudimentary commands to OH now, and in the future, when something more sophisticated is available, I can switch to that to proxy the commands to OH. Hope that makes sense.

But at least the physical infrastructure will be in place to take advantage.

Cheers

Well the OP wants everything (non-cloud, flexible, for free). To put it straight: such a thing does not exist.
You always have to pay a price in terms of quality and work you need to put in.
But while non-cloud DIY solutions may not be fully on par with Echo and Alexa, it does not mean they aren’t good.

Check out Mycroft. I don’t have experience with the device they offer, but the software it runs is available as open source, ready to work on a RPi3, and it comes with an openHAB skill.
It works with any microphone, but I’d recommend the ReSpeaker 4-mic array, which is ~25 USD.

Thanks - I’ve seen Mycroft before, but it looks like it’s come a long way since I last saw it. Might be worth building a prototype and having a play around. These speakers look potentially like what I am looking for; I’d need a way of making them discreet though.

Instead of having a bunch of Pis around the house, I’m wondering if they could all connect back to a single server. I was thinking of having a full-sized machine running OH to manage all the home automation.

Not related to the topic, but I think you should wire your new house with CAT 7 cables, not CAT 5 :thinking:

Thanks for the comment… CAT 5 just popped into my head. But I’ll make sure it’ll be the fanciest CAT I can lay my hands on :wink:

Making mic arrays ‘discreet’ so they can just plug into some analogue audio jack/extension cable towards a server does not work.
The crux is that you need HW to locally (pre)process those multiple input channels; that’s what Pis and Echos do. In the end, that’s why there is no centralized solution for voice control (let alone one that even looks nice).
That’s why even Amazon & Google want you to put up one device per room.
You might try replacing the 40-pin (or the like) connector with a cable so you can put the Pi elsewhere, and if you have room for ceiling speakers, you should also be able to hide a Pi in there.
But you mustn’t artificially restrict yourself by selecting a location/mic/cabling HW setup at this stage.
Get that to work with some ugly-open-Pi+speaker on your table first and don’t think about the look for now.
You will end up with a decentralized solution and with moving devices around anyway.

Fair point

Off topic again.
But there is no need to spend money where it’s not needed.
If you are linking 2 servers or switches with heavy multimedia traffic, then go for CAT 7.
If you are just wiring sensors, switches, actuators… then CAT 5 is MORE than enough.
CAT 5 is good enough for your CCTV cameras, OH server, MQTT broker…

Evaluate how much traffic you will get and only buy what you need…
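As a rough way to do that evaluation for the voice part, the arithmetic below estimates the raw bitrate of uncompressed audio. The figures (16 kHz, 16-bit samples, four channels per room, ten rooms) are illustrative assumptions, not numbers from this thread, and a device that pre-processes the array locally would send far less.

```python
def pcm_bitrate_mbps(sample_rate_hz, bits_per_sample, channels):
    """Raw (uncompressed) PCM bitrate in megabits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1_000_000


# Assumed worst case: every room streams all four raw mic channels.
per_room = pcm_bitrate_mbps(16_000, 16, 4)
rooms = 10  # assumed number of rooms
total = per_room * rooms

print(f"per room: {per_room:.3f} Mbit/s, all rooms: {total:.2f} Mbit/s")
```

Even under these pessimistic assumptions the total is a small fraction of a 100 Mbit/s link, so the audio streams alone are unlikely to decide the cable category.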

Then again, I wouldn’t be sure it works well to have the mics in the ceiling (people “speak out” with their mouth facing forwards rather than upwards). That’s a major reason why Echos and Homes and the Mycroft MkII are built the way they are.

I just read about the Mozilla voice recognition Initiative. That could be the software part.

The mic array signals need to be processed locally in real time (100th nanosecond resolution, I guess?). I’m not aware of an open source software solution.

Cheers David

I don’t get what you want to tell us… that’s what all mic array chips do locally: they properly combine the multiple mic input channels into one, apply echo cancellation etc., so you have a single good audio stream available for further analysis. No SW is needed for that other than the driver, of course.
Semantic and syntactic analysis is what’s next, but that does not require anywhere near that nanosecond resolution. That’s what Google/Amazon do in the cloud, and Mycroft does it locally on the RPi. There are some open source engines available, such as Pocketsphinx, which is what Mycroft uses for wake word recognition.
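For anyone wondering what “combining the channels into one” can look like, below is a minimal sketch of delay-and-sum, one of the simplest textbook ways to do it. This is a technique picked purely for illustration, not necessarily what any particular mic array chip implements, and the integer per-channel delays are assumed to be known already. For scale: at a 16 kHz sample rate one sample lasts 62.5 microseconds, so alignment happens at sample resolution rather than nanoseconds.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay, then average them.

    channels: list of sample lists, one per microphone.
    delays:   samples by which each channel lags the reference (assumed known).
    """
    # Only keep the span where every shifted channel still has samples.
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[i + d] for ch, d in zip(channels, delays)) / len(channels)
        for i in range(length)
    ]


signal = [0, 1, 0, -1, 0, 1, 0, -1]
mic_a = signal            # reference microphone
mic_b = [0] + signal      # same sound arriving one sample later
combined = delay_and_sum([mic_a, mic_b], [0, 1])
print(combined)
```

Sound arriving from the steered direction lines up and adds coherently, while sound from elsewhere (for example a ceiling speaker off to the side) lines up less well and is attenuated by the averaging.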

I was curious so I looked. It appears that the Google AIY Voice HAT will work with Mycroft. See https://community.mycroft.ai/t/mycroft-with-google-aiy-voice-kit-disk-image/2607

I do know that the new AIY Kits now come with a RPi Zero W instead of requiring the purchase of a RPi 3, and I suspect that Mycroft will need the power of an RPi 3. But I’ve been incredibly impressed with the quality of the microphone array on the HAT. When the house is quiet, it can pick up my voice from anywhere on the same floor when speaking normally. When I’m in the same room, it can pick up the wake word and most commands even when music is playing. How much of that is from the HAT and how much from processing on Google’s servers I can’t say, but it’s worth exploring as an option.

Have you already thought about Snips.ai? They claim it runs locally on a Pi3, and satellites seem to be possible. I have not tried it yet, but it’s on my list, once MySensors is running properly.

Sweet - that’s exactly what I was looking for… lots to look into. The more time I spend thinking about the possibilities, the more mammoth a project this is becoming. But I’m trying to be realistic about how much time I have to work on the various aspects, with a full-time job and family to deal with too. I think doing some forward thinking on the infrastructure will set me up for a bunch of late evenings hacking this stuff to work together.

Thank you

It appears that the new Amazon Echo Plus does not need a full time Internet connection for local device control:

This may or may not be useful in this application - just pointing out this update from Amazon.

Have you looked at Mycroft?

@Ruric already mentioned Snips.ai. I think it would be a good match, as you can have distributed audio satellites on lower-end hardware; their docs have an example with a Pi Zero:
https://snips.gitbook.io/documentation/installing-snips/multi-device-setup-satellites

It does local speech processing on a Raspberry Pi, so it doesn’t work with open-ended speech, but it worked fine with the limited domain of home automation. (You can even react to your own wake words, so you can make “lights on” and “lights off” wake words that directly turn the lights on/off without the need to say things like “OK Google” first.)
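To illustrate why a limited domain is so much easier than open-ended speech, here is a small sketch that matches already-transcribed text against a tiny fixed grammar. This is not the Snips API; the pattern, the device names and the returned dictionary shape are all made up for the example.

```python
import re

# Hypothetical grammar: optional verb, optional "the", a device, then a state.
PATTERN = re.compile(
    r"^(?:turn |switch )?(?:the )?(?P<device>lights|heating) (?P<state>on|off)$"
)


def parse(utterance):
    """Return {'device': ..., 'state': ...} for a supported phrase, else None."""
    match = PATTERN.match(utterance.strip().lower())
    if match is None:
        return None  # outside the limited domain
    return {"device": match.group("device"), "state": match.group("state").upper()}


print(parse("Lights on"))
print(parse("turn the heating off"))
print(parse("what is the weather like"))
```

Anything the grammar does not cover is simply rejected, which is the trade-off described above: it works well for a handful of home automation phrases and not at all for free-form questions.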

Cool, I’ll check this out too, thanks.