A workaround for lastvoicecommand on AmazonEchoControl

Hello,

Context
My OH setup strongly relies on AmazonEchoControl LastVoiceCommand (LVC) to locate where voice commands were issued and process them smartly depending on the location. This should be handled by Alexa server side but is so erratic and time-consuming to implement in the Alexa App that I preferred to implement my own logic.

I use 2 levels of proxy items which receive commands and forward them to items (there is also a 3rd level of functional proxies, but that is not the topic here):

  • first level of proxies has 2 functions :
  1. to “keep control over Alexa”: when I am out of the house, commands sent to Alexa items are not forwarded to devices and Alexa just returns a TTS “I am not allowed to control this device at the moment”,
  2. to wait for the LVC update so a command can be tagged (using tags in item metadata) with information about where and who triggered it. The “who” only gets filled when the command is sent from a mobile phone (until the binding is able to return the user’s voiceprint information), the “where” gets filled using the location of the LVC item.
  • second level of proxies are attached to the topmost "house location" item for each function such as light, TV, media activity, thermostat, shutter or whatever “functional” item we can think of. When one of these virtual items receives a command, it forwards the command to the right device in the right room.

I also run a “smart notification logic” to return relevant feedback to the issuer of the command through the “notification channels” available in the room where the command was issued (TTS, color light, OSD, push notification or whatever is available).

Use cases

if I request Alexa to turn on my TV with a voice command sent to the Echo device located in the living room:

  1. I will be located in the living room (through the Echo LVC item),
  2. the proxied “generic TV item” will look for a TV item in the living room and turn it on if there is one, otherwise it will fall back to the “default TV item” if one exists,
  3. it will NOT notify me, as I am located in the room where I triggered the command (I can see the TV turning on, so there is no need for feedback).

if I request the same command from my phone while outside:

    • I will be located out of the house (I use Tasker and items for each mobile phone for that),
    • will look for the “default” or “master” device of the house and turn it on,
    • will send a push notification to the phone to confirm the TV of the living-room was turned on
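
The two use cases above boil down to a lookup with a fallback plus a rule for choosing the feedback channel. Here is a minimal sketch of that idea in plain Python. It is not the actual rule code: the item names, the location labels and the "TTS for another room" choice are assumptions made up for illustration.

```python
# Hypothetical item names and locations, for illustration only.
ROOM_DEVICES = {("tv", "living_room"): "TV_LivingRoom"}
DEFAULT_DEVICES = {"tv": "TV_LivingRoom"}
DEVICE_LOCATION = {"TV_LivingRoom": "living_room"}


def resolve_target(function, issuer_location):
    """Prefer a device in the issuer's room, else the house default (or None)."""
    return ROOM_DEVICES.get((function, issuer_location), DEFAULT_DEVICES.get(function))


def pick_feedback(issuer_location, target_item):
    """Return a notification channel, or None when no feedback is needed."""
    if target_item is None:
        return None
    if DEVICE_LOCATION.get(target_item) == issuer_location:
        return None      # same room: the issuer can see the result
    if issuer_location == "away":
        return "push"    # outside the house: push notification to the phone
    return "tts"         # assumption: another room in the house gets TTS
```

With these tables, a command issued in the living room resolves to `TV_LivingRoom` with no feedback, while the same command issued with location `"away"` resolves to the default TV and selects a push notification.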

So the loss of LVC functionality in AmazonEchoControl messed up the whole system, and I worked hard to find a workaround whose principle is described here.

Principle of LastVoiceCommand workaround
I have set up a basic packet sniffer for the Echo devices on my LAN.

  1. the outbound traffic from echo devices is mirrored to a monitoring device
    in my case, I could run the following command on my Asus router to create a mirror of Echo packets and route them to my OH server (a Windows PC: might be simpler or more efficient on a Linux setup)

  2. the “monitoring device” detects an increase of outbound traffic from an Echo and updates the “packet monitor item” (PKT item) related to that Echo device in OH
    I wrote a short Python routine using the Scapy library for packet sniffing and the requests library to send a postUpdate() using the OH API.

  3. When the PKT item gets updated and an Alexa proxy item (AX) receives a command at “barely the same time”, I update the LVC item of the Echo which triggered the network traffic
    I store LVC updates and AX commands in Jython queues and when the time difference is low (was 2 secs when LVC was working but with my implementation, 0.5 secs is enough!), I trigger a postUpdate() on the LVC item
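
Step 3 can be sketched roughly as follows. This is a simplified illustration in plain Python, not the Jython rule itself: the class name, the method names and the history size are hypothetical, and only the 0.5 sec window comes from the description above. It assumes the traffic burst is always seen shortly before the AX command.

```python
from collections import deque


class LvcCorrelator:
    """Match an AX command to the Echo whose traffic burst just preceded it."""

    def __init__(self, max_delay=0.5, history=20):
        self.max_delay = max_delay                # seconds between burst and command
        self._pkt_events = deque(maxlen=history)  # (timestamp, echo_name), oldest first

    def on_pkt_update(self, echo, ts):
        """Record that the PKT item of `echo` was updated at time `ts`."""
        self._pkt_events.append((ts, echo))

    def on_ax_command(self, ts):
        """Return the Echo to write into the LVC item, or None if nothing matches."""
        for pkt_ts, echo in reversed(self._pkt_events):  # newest burst first
            if 0 <= ts - pkt_ts <= self.max_delay:
                return echo
        return None
```

In a rule, `on_pkt_update` would be called from the PKT item trigger and `on_ax_command` from the AX item trigger; a non-None result is the Echo whose LVC item should receive the postUpdate().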

Advantages: Notifications and commands are dramatically faster: the LVC always gets updated before the command (when the AmazonEchoControl LVC was available, the LVC item was sometimes updated after my AX item received the command, so I had to implement a tricky set of rules to make them “wait for each other”). It no longer depends on the Alexa API.
Drawbacks (but this was already the case before): if 2 persons send different voice commands simultaneously to 2 Echos in 2 rooms, things may not run as expected :rofl: :rofl:


How are you getting the packet payload? Everything I’m seeing is encrypted (https)

I just count packet KBytes without dealing with the content (as you say, it is encrypted). When the count reaches a limit (60KB), I trigger an LVC update. If an AX item command occurs at the same time, then I assume it was triggered by that Echo. Not 100% reliable but it works quite efficiently.
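
A minimal sketch of this byte-counting idea, for readers who want to try it. It is not the original script. Only the 60KB limit comes from the post; the 2 sec window, the class and function names, the IP-to-item mapping and the openHAB URL are assumptions. It uses Scapy's `sniff()` for capture and, instead of the requests library, the standard library `urllib` to PUT a plain-text state to the openHAB REST endpoint `/rest/items/<item>/state`.

```python
import time
from collections import defaultdict, deque

THRESHOLD_BYTES = 60 * 1024   # the 60KB limit mentioned above
WINDOW_SECONDS = 2.0          # assumption: length of the burst window


class BurstDetector:
    """Sum packet sizes per source over a sliding window, payload is never read."""

    def __init__(self, threshold=THRESHOLD_BYTES, window=WINDOW_SECONDS):
        self.threshold = threshold
        self.window = window
        self._samples = defaultdict(deque)   # src -> deque of (timestamp, size)
        self._totals = defaultdict(int)      # src -> bytes currently in window

    def add(self, src, size, now=None):
        """Record one packet; return True when `src` crosses the threshold."""
        now = time.time() if now is None else now
        samples = self._samples[src]
        samples.append((now, size))
        self._totals[src] += size
        while samples and now - samples[0][0] > self.window:
            _, old_size = samples.popleft()  # forget packets outside the window
            self._totals[src] -= old_size
        if self._totals[src] >= self.threshold:
            samples.clear()                  # reset so one burst fires only once
            self._totals[src] = 0
            return True
        return False


def run_sniffer(echo_items, oh_url="http://localhost:8080"):
    """echo_items maps an Echo IP to its PKT item name (hypothetical names)."""
    import urllib.request
    from scapy.all import IP, sniff          # third-party: pip install scapy

    detector = BurstDetector()

    def on_packet(pkt):
        if IP not in pkt:
            return
        item = echo_items.get(pkt[IP].src)
        if item and detector.add(pkt[IP].src, len(pkt)):
            req = urllib.request.Request(
                f"{oh_url}/rest/items/{item}/state",
                data=str(time.time()).encode(),   # assumption: PKT is a String item
                headers={"Content-Type": "text/plain"},
                method="PUT",
            )
            urllib.request.urlopen(req, timeout=2)

    sniff(prn=on_packet, store=False)        # blocks; usually needs admin rights
```

It would be started with something like `run_sniffer({"192.168.1.50": "PKT_Echo_LivingRoom"})`, where both the address and the item name are placeholders.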

but how do you get the command (e.g. “turn on lights”) that was issued? None of my traces have anything “in English” because of the encryption.

He does not get the command as text.
I think rules are being executed. The rule sets the AX proxy item. If the AX proxy item is then set in the same time period in which the packets are detected, a command/rule updates the lastvoicecommand. This logic does not require reading encrypted network traffic.

Wow! Innovative approach.

Quick question for clarity, as I too am suffering from the loss of lastVoiceCommand. Is the following understanding correct?

  • Before, lastVoiceCommand was updated in real time without the need to do anything.
  • Due to a change on Amazon’s side (I would be surprised if the timing in parallel to HR cuts in Amazon’s Alexa team was coincidental) lastVoiceCommand is not updated in real time anymore.
  • The workaround presented here has the downside of a delay of 10-ish seconds (and not being part of the official OH release)

  • and polling the item constantly every 1 second is not a good alternative either
  • so your approach consists of polling it on demand after you notice, via your sniffing approach, that something has been said?

Yes, that’s it. I update an unlinked lastvoicecommand item related to my Echo device (not the one from the amazonechocontrol binding) when some traffic from an Echo to the Amazon servers is detected. And if an item controlled by Alexa gets updated “at the same time” (roughly within the following 2 sec), I assume it was triggered by the Echo which generated the traffic and updated the lastvoicecommand item.

As said before, this is not fully reliable, especially if you have several Echo devices which get triggered simultaneously. But it is better than nothing, and even faster than the initial lastvoicecommand from the binding (the lastvoicecommand update always happens before the actuator gets triggered, which was not always the case with the original lastvoicecommand from the binding).