REST API, multiple users, voice control

Hi everyone,

I have a program that I use as a front end to openHAB that is fully voice controlled. I currently have to enter each command into its interface and then use a curl command to “tell” openHAB what to do. I basically update Items via the REST API and can control openHAB any way I want using voice. This works really well.

For example:

Voice Command: Turn on the bedroom light.
Voice Response: Turning on bedroom light
Action: curl.exe --header "Content-Type: text/plain" --request POST --data "ON" http://ipaddr:port/rest/items/slBedroomlight

So this works really well and really fast. The light is usually on before the response is done. But now I want to ask openHAB a question. For example, “What is the thermostat set to?”

The voice program accepts a simple web command to speak a line of text back to me. To do this I have an openHAB Switch that I turn “on” through the REST API. The rule for that Switch generates a string to send to the voice program to speak back to me.

For example:
Voice Command: What is the thermostat set to?
Voice Response: Checking
Action: curl.exe --header "Content-Type: text/plain" --request POST --data "ON" http://ipaddr:port/rest/items/speakThermostat

The rule for speakThermostat generates a custom response and uses sendHttpGetRequest to send the text back to be spoken:

var strResponse = "the thermostat is set to 76 degrees"
sendHttpGetRequest("http://ipaddress:port/?action=[Speak('" + strResponse + "')]&key=xxxx")
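For reference, a fuller sketch of what that rule might look like (the thermostat Item name, thermostatSetpoint, is a placeholder, and the URL and key are the same placeholders as above):

rule "Speak thermostat setpoint"
when
    Item speakThermostat changed to ON
then
    // build the response text from the current Item state
    val strResponse = "the thermostat is set to " + thermostatSetpoint.state.toString + " degrees"
    // send it to the voice program to be spoken
    sendHttpGetRequest("http://ipaddress:port/?action=[Speak('" + strResponse + "')]&key=xxxx")
    // reset the trigger Switch so the next question works
    postUpdate(speakThermostat, OFF)
end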

OK, so this is working great. Lots of Switches and setup to do, but it works great. If anyone has better ideas on parsing data on the front end vs. the back end of this, I am open to suggestions.

Here’s the problem (finally):

I want to have multiple voice controls throughout the house. They are built into “magic mirrors” in different rooms of the house. But if I have more than one, how do I know which computer initiated the request? What if two people in different rooms ask a question at the same time? I would need some type of unique instance of a variable for each request. How does openHAB handle this? Am I missing something simple?

Another idea is to send a simple string to one Item in openHAB, sort of like VoiceCommand, and then parse out the Item. I could combine the machine identifier and the switch name, pass it to openHAB, perform a lock, parse the string, do the commands, then unlock. What happens if another command comes in at the same time? Does openHAB cache the string until I’m done with the first one, or is it overwritten? Can I access a switch based on a passed string variable without massive case statements etc. to get to the switch? Is this a proper use of locks? I don’t know enough about the capabilities of the REST API; is it perhaps easy to send two data items at once?

Since the voice program requires a different command for each action, having openHAB parse out the action is redundant. I already know the intent of the command when passing it to openHAB, so I don’t think I want a whole “AI mother of all parsing” rule.

Anyone have any thoughts or ideas? I was looking at the Echo as I see the actions binding is coming along really well, but I want to have everything local and not have Amazon know everything about me. lol. I know that my setup is limited by the commands I enter, but in reality, how many things do you really want to ask your house? The voice front end will soon do voice searches on Google and speak the results back to me (Siri or Cortana capabilities). This is only for specific house things.

I did some programming in the ’80s and taught myself some C++ in the early ’90s, but all my coding experience is pre-internet, so please excuse me if this is novice stuff. I’m pretty rusty at coding, but it’s coming back to me slowly, lol. Any help would be appreciated.

Kimberly F

Personally I would put the creation of the text to speak in your front end and just get the state from OH through the REST API, rather than creating rules on the OH side to construct the text response. I would do it this way because it reduces the coupling between your interface and OH (e.g. you could swap out OH for something else and all you have to change is how your front end queries for the current state, instead of rewriting a bunch of stuff in the new platform). It also solves your problem of figuring out which smart mirror requested the data, because all of the text generation is now local and, instead of OH pushing the text, it synchronously replies with the current state to whoever requested it.
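For example, the mirror could fetch an Item’s current state with a single REST call and drop it into its own sentence (a rough sketch; thermostatSetpoint is a placeholder Item name):

curl.exe --header "Accept: text/plain" --request GET http://ipaddr:port/rest/items/thermostatSetpoint/state

The response body is just the raw state (e.g. 76), which the mirror can wrap in whatever spoken text it likes.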

If you do not move all the text generation for responses off of OH then you will need to move off of using Switches and instead use a String or Number and send the name or address of which smart mirror is requesting the text to speak so the rule knows where to send the HTTP GET response.
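A rough sketch of that variation (the Item name, the mirror address carried in the command, and the speak URL/key are all placeholders):

rule "Speak thermostat setpoint to the requesting mirror"
when
    Item speakThermostatRequest received command
then
    // the command string carries the requesting mirror's address, e.g. "192.168.1.21:8080"
    val mirror = receivedCommand.toString
    val strResponse = "the thermostat is set to " + thermostatSetpoint.state.toString + " degrees"
    sendHttpGetRequest("http://" + mirror + "/?action=[Speak('" + strResponse + "')]&key=xxxx")
end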

But I want to reiterate, this approach is really highly coupled and feels wrong to me. I still think moving the creation of the text to speak to the smart mirror and just using the REST API to get the current state of the Item(s) is a better approach.

Shouldn’t be a problem. OH will process the requests independently in parallel in separate threads. No need for extra variables or doing anything special to keep them separate.

Why not just figure out the Item name and directly get its value through the REST API, and drop all these complicated rules? You already need to get the name of the Switch you are talking to now, so just get the name of the Item with the value you want.

Each command will be processed separately and in parallel.

If you put your Items into Groups you can get a reference to an Item by name through something like:

myGroup.members.filter[ i | i.name == "MyItem" ].head

This requires some initial thought and design up front to organize your Items and Groups, but it is a very powerful design pattern.
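For example, if the request string carries the Item name, a rule could look the Item up by name and command it without a big case statement (the Group name gVoiceItems and the Item name are placeholders):

val itemName = "slBedroomlight"   // parsed out of the incoming request string
val targetItem = gVoiceItems.members.filter[ i | i.name == itemName ].head
if (targetItem != null) {
    sendCommand(targetItem, "ON")
}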

Not really, unless you want to prevent two commands received at the same time from processing at the same time. You are not changing state or anything like that, so I see no reason why you would need to force them to process one at a time.

Not in the same HTTP call. However, once again I’ll recommend using the REST API to just get the value of the Item you want and build the text on the mirror side rather than using these complicated rules.

Agreed and this is another argument for moving all this logic out of OH and into your voice program.

Thanks for the quick response. I was hoping I wouldn’t have to learn something new and write another app. LOL. Having it all done on the openHAB side makes changing out the client side easier. This way I can have different types of voice clients, apps, etc. and just use simple REST calls to openHAB and have it do it all. I guess it’s back to square one, writing a new front end that integrates with openHAB and the voice program. I will have to wait until the voice app can have custom plugins. I appreciate the response and kinda figured that was the answer.

Thanks again.

Kim

After looking around a bit and thinking over your response, I have decided to go the original route and have openHAB do the heavy lifting.

I agree; for simple Item on/off things I just send a direct ON/OFF command to the Switch or Item.

For more complicated stuff I just send a string in the form “1-voicerequest”. The 1 indicates the computer requesting the info. I parse the voicerequest and create a string which sends a speak command back to the computer that requested it.

Using lambdas with the machine number and text string, I can have openHAB say anything to any machine with one command. This also allows me to broadcast a message to all machines with a simple loop in the lambda, changing just the IP address in the string. They won’t be synchronized, but that is a whole other issue as I don’t know yet if it will even be a problem.
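As a very rough sketch of that kind of lambda and a rule that calls it (the IP addresses, key, and Item names here are made up; the lambda would live at the top of the rules file):

import org.eclipse.xtext.xbase.lib.Functions

// speak a line of text on a given mirror; machine number 0 means "broadcast to all"
val Functions$Function2<Number, String, Boolean> speakTo = [ machineNum, text |
    val ips = newArrayList("192.168.1.21:8080", "192.168.1.22:8080")
    if (machineNum.intValue == 0) {
        ips.forEach[ ip | sendHttpGetRequest("http://" + ip + "/?action=[Speak('" + text + "')]&key=xxxx") ]
    } else {
        sendHttpGetRequest("http://" + ips.get(machineNum.intValue - 1) + "/?action=[Speak('" + text + "')]&key=xxxx")
    }
    true
]

rule "Handle voice request"
when
    Item strVoiceRequest received command
then
    // e.g. "1-thermostat" means machine 1 asked about the thermostat
    val parts = receivedCommand.toString.split("-")
    val machineNum = Integer::parseInt(parts.get(0))
    if (parts.get(1) == "thermostat") {
        speakTo.apply(machineNum, "the thermostat is set to " + thermostatSetpoint.state.toString + " degrees")
    }
end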

I do think keeping all the processing inside openHAB is the way to go. It is the backbone of the system. Having command and control parts distributed to other machines takes away from the central command hub. For example, if a water leak sensor turns on, I can broadcast a message to all machines pretty easily if all the processing is done on the openHAB side. If I want to change the reply or response of a given action, I don’t have to change it in multiple places. I change it once and all machines instantly get the “update”.

So far it works pretty well. I’m running openHAB on a Pi 2. The response to any question is instantly spoken back to me.
I think using the lambda was key in keeping the code manageable, as it’s all just putting strings together now. The answer to my “Status report please” inquiry takes a couple of minutes of speech, but the response is instantaneous and gives status updates on over 30 different items.

The only tricky part was trying to interrupt the status report by saying “stop report”. I broke the status report down into tiny pieces, i.e. one for the status of the garage door, one for the A/C system, etc. This way I could make individual status requests for the A/C, security, etc. separately.

So asking “what is the A/C status”, openHAB would speak back to me the current set point, whether it was on or off, etc.
The “garage status” would just tell me if the garage door was open or closed. So when I ask for a status report, it just works by turning each specific status inquiry on or off with a simple boolean. Saying “stop report” now gives interruption points where I can abort the report. It seems to work really well. I can literally ask openHAB any question I want just by adding a few simple lines of code.
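As a rough illustration of one of those interruption points (the Item names are placeholders, and speakTo stands for the kind of lambda sketched earlier):

rule "Status report - garage"
when
    Item sStatusGarage changed to ON
then
    // only speak this piece if the report has not been aborted
    if (sStopReport.state != ON) {
        speakTo.apply(1, "the garage door is " + contactGarageDoor.state.toString)
        // chain to the next piece of the full status report
        sendCommand(sStatusAC, ON)
    }
    postUpdate(sStatusGarage, OFF)
end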

I really like the Group idea that you spoke of. I am currently redoing my Items and Groups. Just being able to send a command to a single Item and have that cascade over the entire Group with a simple voice command is pretty cool! Thanks for the suggestion.

I’m pretty happy with how it works. I don’t see any limitations at the moment. Obviously it can’t think and will only respond to what I program, but I’m running out of things to ask it. lol.

Thanks

Kim

The foundations for a lot of what you’re trying to do are being worked on in ESH. If you can wait, that would be my recommendation, and it’s what I’m doing personally.

Anyone know the timetable for this? It would be nice to have it now. Do we know what voice front ends ESH or openHAB will work with? I see that most people seem to be working on the Echo action and the Amazon family of devices. I want to avoid sending everything to the cloud. I haven’t seen any posts on other voice front ends for openHAB that are ready. Did I miss some?

Thanks

Kim

I can’t comment on timescales as I’m not actively involved in the development, but here are a few of the issues raised regarding voice control that you can read up on:

Speech-to-text
Text-to-speech
Command interpretation
Key-word detection

I vaguely recall an issue being raised regarding the concept of user accounts so that OH would know which user was issuing the command but I don’t have time to look for it now.

Daniel,

Thanks again for the response and the info. Interesting stuff. I had read some of that already, but not all. I think there are a few takeaways from the linked material. Number one is that I think it’s going to be a very long time before this is ready. If I did not already have most of it put together, I might wait a while as well. And when I say put together, I just mean I cobbled together bits and pieces here and there and wrote a few scripts. So I am in no way saying my solution is better than theirs.

With that in mind, what they are trying to do is make a machine/OS-independent solution. That is one of their basic tenets. This means that they are pretty much starting from scratch. They can’t leverage Alexa, Cortana, or any other existing service, as each has its own platform to run on exclusively. Since I don’t care that my solution is Windows-only, I can leverage the mighty resources of Microsoft and Cortana and have a pre-built TTS at my disposal. All the TTS stuff is already done and done very well. I am using the beta version of Links’ Jarvis program. You should check it out. I can use it now to communicate by voice very well with openHAB. It works very fast and does everything that was discussed in the linked threads except intent. All commands have to be entered manually. A pain for sure, but not a deal breaker for me, as my house and needs are small. Links Jarvis is here: www.mega-voice-command.com. It can be a bit cheesy during startup, but I like the modern look of it. lol.

The only drawback to using this, or any existing TTS service available today (Cortana, Alexa), is that it does not do any command interpretation as far as intent goes. There is a reason for this, as it’s the most difficult part of any natural language interface. I think it will be years before any AI or any TTS speech service will truly understand intent. From the links you provided, they really don’t know yet how to even start working on intent. That is not a put-down but rather reflects the complexity of getting intent from the spoken word. Intent, in my opinion, is the holy grail of TTS. In the meantime we are left to explicitly enter our intent into commands one by one. Would I love to just let an AI get intent from just a voice command? Sure… but that’s not happening today or any time soon.

I personally think that they are biting off more than they can chew and they will end up outsourcing the TTS portion to a web service. I like the fact that the Links Jarvis program and Windows do all the TTS locally and don’t send it to the cloud. I know that Windows 10 and most software is phoning home all the time, and it’s not a perfect solution. Like they say, if you’re going to do a web search, you have to use the web. lol.

I hope they are able to come up with a natural language interface that can understand intent. But I think what they will really end up making is another Alexa or Cortana, and we already have that today. I hope I’m wrong. Thoughts?

Kimberly F

Kimberly,

Like you, I started by cobbling together bits and pieces to get something working, and I completely understand that users want something that just works, regardless of the means.

My current view is that the work going on at the moment is to lay the groundwork for these concepts. My expectation, or at least hope, is that once the interfaces and initial implementations are complete we can extend or create additional implementations, like perhaps using Alexa or Cortana. There are also services out there that do speech/text-to-intent, so maybe we’ll see integration of those at some point.

Personally, for now at least, I’m happy to sit back and see how things pan out. My needs are also quite simple so not having voice commands now isn’t the end of the world but I’d definitely like to have a good, slick implementation that’s reliable and robust in the future.

Daniel

I completely agree with what you said. For me personally, I hate having to use a phone or computer to get things done. Lol, I’m pretty lazy, and accessing my phone every time I want to turn on a light gets old really fast (god forbid I get up and use the switch!). Since I’m building this into a voice-only appliance that is part of the house, I can’t wait. This fills the bill for now. I can still pull out the phone and do everything if the voice command fails. It’s the reliable and robust part I think we will be waiting on for some time to come. I hope your wait is short.

Regards,

Kimberly F

It depends on how you look at it. Another perspective is that the mirror is the user interface and openHAB has the model (i.e. the data model) and the controller (i.e. the rules logic and interfaces with devices), so if you were to treat your deployment like an MVC design, the voice command is a user interface thing and belongs in the View.

I’m not saying you made the wrong decision or trying to talk you out of it, just pointing out that there is more than one way to look at a complicated system like this.

I’m not sure how doing the processing on the OH side or the mirror side makes much difference here. If OH sends out a broadcast of some sort that the mirrors read in, and the mirrors then translate that broadcast into human-recognizable speech, you have the same result.

If you have multiple instances of the mirror software running, you should be using some sort of source control and deployment mechanism. You should not have to make changes to the mirror software, no matter the nature of the change, multiple times. You should change it once, and the mirrors either pull the updates from a central repository or your central repository pushes the changes out to the mirrors.

I would have to agree with you on this, but at the moment I am treating the voice end as just a “dumb terminal” that speaks text. I will probably end up writing a new program to integrate with it. I just don’t know where to start yet, as I don’t know how they are implementing their plugins. The openHAB side is currently generating text and sending it to the mirrors. You’re saying I should have the mirrors poll openHAB to see what to say instead of openHAB telling them what to say? It would just come down to which end should do the logic. You are correct that this puts the front end on the openHAB side and is not as clean a design as it should be. But it’s sooo much easier for me to use openHAB right now!

Lol, my OCD has kicked in and I’m afraid I will have to start writing a new app. haha, damn you! It’s working so well now in openHAB that maybe I can talk myself into waiting a while. My voice rules file is growing at an alarming rate though. lol.

As an aside, since you seem to be the go-to guy on here for programming, where would I find a good read about available commands, syntax, etc. for openHAB? The Xtend library? It’s my understanding that openHAB uses a subset of Xtend? Best to just keep searching for examples? Is there a manual or anything on the openHAB Designer error codes? It took me a while to finally figure out that a variable inside a timer must be declared in the timer code, as it doesn’t exist after the rule executes. Things like that drive me nuts. But now I have an openHAB alarm clock that speaks to me in the morning. I can even tell it to snooze!

It’s always nice to see a response from you on the forums, as the responses always seem to work :slight_smile: Thanks for the input.

Kimberly F

Sort of. The mirrors construct the text of what to say but ask OH for the current state of Items to complete the text. We might be saying the same thing here. I wouldn’t quite say “polls”, because it is an on-demand request rather than a periodic check.

Treat this as a chance to take your time with it. You can run both approaches in parallel and experiment with your new approach while keeping your working system. I’m actually envious. Right now I want to add an RFM69HW transceiver to one of my Raspberry Pis but I’ve already got things hooked up to some of the pins the library uses. So I need to rewire it and change up all my code to move to the other GPIO pins, and figure out what the heck I did a year or so ago to keep it from triggering my garage door every time the Raspberry Pi reboots. So I have to break things to move forward.

Rules are written in a Domain Specific Language (DSL) based on Xbase, and it bears the most resemblance to Xtend. So the first place to look is the Rules wiki page. This will give you the overall structure of Rules and specifics about how they interact and what is made available to you (e.g. you get a reference to each Item automatically). For specifics on how to code smaller constructs like for loops and switch statements, the Xtend documentation is the place to look. Finally, look at examples. Unfortunately there is no one good reference source (I hope to write one for OH 2).

Unfortunately not, at least not something like you would want.

This is odd behavior, as I routinely reference local rule variables from inside my timers. A timer is supposed to get a copy of the context/state of what was around when it was created, so any variable that was valid when the Timer was created should still be valid within the timer. Of course, if the Timer tries to reassign it there might be an error, or, if it isn’t a global variable, nothing may happen because the Rule has exited.

You must not be trying them out yourself. Almost every one of my code examples has a typo or something wrong with it. :wink:

Thanks!

I was having trouble with Designer giving me a hard time with the timer reschedule function. In my snooze alarm rule I have to declare a variable inside the timer code for it to reschedule. Here is the snooze rule for the alarm clock for mirror 1. I can’t use intSnoozeMin in the reschedule command or I get a Designer error. This is the kind of stuff that makes me scream. lol. It’s all good though, as it only slows me down a bit.

Cut and paste of my snooze alarm:

// turns to ON when the voice snooze command is given
rule "do the snooze"
when
    Item sDoSnooze1 changed to ON
then
    // get the snooze minutes for the alarm clock
    var intSnoozeMin = (numAlarmSnooze1.state as DecimalType).intValue
    if (timerSnooze1 != null) {
        timerSnooze1.cancel
        timerSnooze1 = null
    }
    sendCommand(sRecurringAlarm1, OFF)
    // if the alarm clock has gone off is the only time the snooze should work
    if (sAlarm1Active.state == ON) {
        timerSnooze1 = createTimer(now.plusMinutes(intSnoozeMin)) [|
            // recurring alarm speaks the wake up message on the alarm1 mirror
            sendCommand(sRecurringAlarm1, ON)
            var intSnoozeMinTemp = (numAlarmSnooze1.state as DecimalType).intValue
            timerSnooze1.reschedule(now.plusMinutes(intSnoozeMinTemp))
        ]
    }
    postUpdate(sDoSnooze1, OFF)
end

[edit] format of sample code

Kimberly F

There are several ways. If you want inline code formatting, just surround the text with backticks (`).
If you have one or two lines, you can just indent them by four spaces.
If you have a lot of code, you can put three backticks on the line before and three on the line after:

```
your code
```

Furthermore, with this last option you can get some additional formatting if you tell it what language the code is written in. NOTE: for Rules DSL code, use java for the best results.

```java
rules code
```

What is the error?

Do you still get the error if you make intSnoozeMin a global var instead of a rule local var? Come to think of it I don’t think I’ve actually tried to use a rule local var in a Timer and now that I think about it some I can see all sorts of potential problems with it.

Honestly, what you are doing now is IMHO the more appropriate approach anyway. As a rule of thumb, I recommend putting as much state as possible into Items and retrieving values from the Items’ states instead of from temporary variables. Therefore, pulling the value from numAlarmSnooze1 again within the Timer is what I would recommend doing even had the other way worked.

While it can seem awkward in your code to always be pulling stuff like numbers and boolean flags from Items, the ability to take advantage of persistence, Group operations (forEach, filter, sortBy, etc.), and so on far outweighs the awkwardness. It also results in shorter, simpler, and more generic Rules code, though you pay for it with slightly more complicated Items files.

I figured that since it was just retrieving a value from an Item that wasn’t going to change, using a local var would be OK and would be preferred.

Thanks for the format help.

So the code below would give an error in Designer if intSnoozeMin was not declared within the timer code:

timerSnooze1.reschedule(now.plusMinutes(intSnoozeMin))  	

The error in Designer is:

Cannot refer to non-final variable intSnoozeMin from within a closure

My guess is that, based on the syntax of the timer, timers are actually a form of lambda. Therefore I need to declare the var inside the timer code.

[edit] Forgot to mention that I completely agree with you about using Items. For me, Items are my global variables, and I actually have very few globals declared in my rules file. Persistence is golden! lol

Thanks again.

Kimberly F

Change the above to val intSnoozeMin to fix the error in Designer.
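That is, in the rule above:

val intSnoozeMin = (numAlarmSnooze1.state as DecimalType).intValue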

Well, that was easy, lol.

Explanation please? Does it make it a constant and not a var, or is it just a weird syntax thing?

Good call. Thanks. Never would have thought of that.

Kimberly F

Using “val” would be the same as declaring a variable as final in Java or const in C++. It basically means that the variable cannot be reassigned. NOTE: This only applies to the variable itself; you can still change the contents of, for example, a val-declared HashMap. For example:

var canChange = 5
val cantChange = 5

canChange = canChange + 1 // valid
cantChange = cantChange + 1 // error

val Map<String, String> map = newHashMap

map.put("New", "Value") // valid
map = newHashMap // error

If you are big into defensive coding (I tend to be) it is a good idea to declare everything as a val except for those variables that actually do get reassigned.

It is documented on the Rules wiki page in the Syntax section under Variable Declarations.


Hey all,

Great discussion here! I wasn’t familiar with Links until Kimberly pointed it out, cool stuff… I have yet to build a magic mirror, let alone multiples! One day… :smile:

In the meantime I really like the feature set of the Amazon Echos, and have put a lot of effort into building an Alexa Skill for openHAB. This does a lot of what has been discussed here, such as identifying and handling intents from the spoken request, getting states, and much more… all outside of OH Rules / Voice Commands (although it does support OH rules too)…

This isn’t intended to be a shameless ‘plug’ for my project, but rather an ask to everyone who owns an Echo - please help us refine and improve Alexa-HA! In the future I intend to expand the project to support Alexa Voice Services directly, so you won’t necessarily need an Amazon Echo to use Alexa-HA (i.e. an RPi and Jasper could suffice instead).