Xiaomi Gateway audio sink for text to speech (TTS)

Following this post:
I’m able to use the Xiaomi Gateway as an audio sink for text to speech. You’ll need ffmpeg and miio (https://github.com/aholstenson/miio).
First, you need to download the speech (abusing Google Translate, sorry), convert it to AAC and place it on a web server that the gateway can reach (I use openHAB’s). “Just testing” is the text to say, and en-GB is the locale:

ffmpeg -i 'https://translate.google.com/translate_tts?ie=UTF-8&tl=en-GB&client=tw-ob&q=Just+testing' -b:a 64k /etc/openhab2/html/tts.aac
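If the text to speak is not fixed, the query string has to be URL-encoded before it is handed to ffmpeg. The following is only a sketch of how that could be done in Python; the helper names `build_tts_url` and `build_ffmpeg_args` are made up for illustration, and the `-y` flag (overwrite the output file) is my addition, not part of the command above.

```python
from urllib.parse import urlencode

TTS_ENDPOINT = "https://translate.google.com/translate_tts"


def build_tts_url(text, locale="en-GB"):
    """Return the translate_tts URL for `text` with a percent-encoded query."""
    # urlencode() encodes spaces as '+' and escapes non-ASCII characters.
    query = urlencode({"ie": "UTF-8", "tl": locale, "client": "tw-ob", "q": text})
    return f"{TTS_ENDPOINT}?{query}"


def build_ffmpeg_args(text, out_path="/etc/openhab2/html/tts.aac", locale="en-GB"):
    """Return the ffmpeg argument list (hypothetical helper, no shell quoting needed)."""
    # -y overwrites an existing tts.aac; this flag is an assumption of the sketch.
    return ["ffmpeg", "-y", "-i", build_tts_url(text, locale), "-b:a", "64k", out_path]
```

The list returned by `build_ffmpeg_args("Just testing")` could then be passed to `subprocess.run(..., check=True)`, which avoids quoting the `&` characters for a shell.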

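Then you need to upload it to the gateway (5001 is a code it uses to reference the stored sound, <GATEWAY_IP> is the gateway’s IP address and <OPENHAB_IP> is openHAB’s IP address):

miio protocol call <GATEWAY_IP> download_user_music [\"5001\",\"http://<OPENHAB_IP>:8080/static/tts.aac\"]

Then you play it (15 is the volume):

miio protocol call <GATEWAY_IP> play_music_new [\"5001\",15]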
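The backslash-escaped quotes in the two miio calls are easy to get wrong by hand. As a rough sketch, the parameters can be generated with a JSON encoder and passed as a single argument. This assumes the miio CLI takes the device address right after `call`; `miio_call_args` is a hypothetical helper, and both IP addresses below are placeholders.

```python
import json


def miio_call_args(gateway, method, params):
    """Build the argument list for `miio protocol call` (hypothetical helper)."""
    # Compact separators give the same form as the hand-written ["5001",15].
    payload = json.dumps(params, separators=(",", ":"))
    return ["miio", "protocol", "call", gateway, method, payload]


# Placeholder addresses: replace with your gateway and openHAB IPs.
download = miio_call_args(
    "192.0.2.10", "download_user_music",
    ["5001", "http://192.0.2.20:8080/static/tts.aac"],
)
play = miio_call_args("192.0.2.10", "play_music_new", ["5001", 15])
```

Passing these lists to `subprocess.run` without a shell means no escaping of the quotes is needed.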

I think it would be great to integrate this into the binding so that it can be used as an audio sink, at least for text-to-speech.


nice work, yeah adding this to the binding would be great…

sounds great - but not working for me :frowning:

Could not connect to device, token needs to be specified

Any ideas how to do that?

I am assuming you have not added the token for your gateway to the miio cmd

try this (must use the short form of the token not the 96 character version):

miio tokens update <GW IP@> --token <SHORT_TOKEN>
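A quick way to tell the two token forms apart before running the command: the short token is 32 hexadecimal characters (16 bytes), while the long version has 96. A minimal sketch of that check, with a made-up function name:

```python
import re


def is_short_token(token):
    """Return True if `token` looks like the short 32-hex-character form."""
    return re.fullmatch(r"[0-9a-fA-F]{32}", token) is not None
```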



I didn’t do that.
When investigating I did (<GATEWAY_IP> is the gateway):

miio discover
miio inspect <GATEWAY_IP>
miio discover --sync

I don’t know if any of these commands were necessary.

I have never used that one myself, but it is supposed to automatically sync any token it learns into the miio command’s token store. So it does the same thing as what I suggested, except that in my case the user provides the token rather than it being discovered.
Not all tokens are discoverable through handshaking with the device.



I’ve realized that you can send raw commands to the Xiaomi Gateway using the Xiaomi Mi IO binding, so you don’t need to install miio or node.js to use it for TTS.
After you install the binding, add the gateway thing:
You can add it from Paper UI or from the things file. To add it from the things file you will need the deviceId and the token. To find them (I don’t know if there is a simpler way), you’ll need to add it from Paper UI (you could leave it like this but I prefer configuration in files):
In the Inbox you will find the gateway as a “Xiaomi Mi Device with token”. Add it, go to Configuration/Things and click the pencil to edit it. Click again on the pencil and on “Show more”. There you have the device id and the token.
Once you have the device id and the token you can delete the thing from Paper UI and create it in a .things file (if you want):

Thing miio:unsupported:gateway "Xiaomi Mi Smart Home Gateway Mi IO" [ host="YOUR_GATEWAY_IP_ADDRESS", token="YOUR_GATEWAY_TOKEN", deviceId="YOUR_GATEWAY_ID" ]

Then, create an item to send commands to the gateway:

String XiaomiGatewayCommand "Command [%s]" <none> {channel="miio:unsupported:gateway:actions#commands"}

Now, you can send commands to the gateway:

// Remove any previous file, then fetch the speech and convert it to AAC
executeCommandLine("rm /etc/openhab2/html/tts.aac")
executeCommandLine("ffmpeg -i https://translate.google.com/translate_tts?ie=UTF-8&tl=en-GB&client=tw-ob&q=Just+testing -b:a 64k /etc/openhab2/html/tts.aac",20000)
// Ask the gateway to download the file and store it under code 5001
XiaomiGatewayCommand.sendCommand('download_user_music["5001", "http://YOUR_OPENHAB_IP_ADDRESS:8080/static/tts.aac"]')
if (!XiaomiGatewayCommand.state.toString().contains("ok"))
    logInfo("Casa","Error uploading")
// "5001:100" in the reply to a download-progress query means sound 5001 is fully downloaded
if (!XiaomiGatewayCommand.state.toString().contains("5001:100"))
    logInfo("Casa","Not finished")
//15 is the volume:
XiaomiGatewayCommand.sendCommand('play_music_new["5001", 15]')
if (!XiaomiGatewayCommand.state.toString().contains("ok"))
    logInfo("Casa","Error playing")
// Check the reply after deleting the stored sound
if (!XiaomiGatewayCommand.state.toString().contains("ok"))
    logInfo("Casa","Error deleting")
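The `contains("5001:100")` check only tells you whether the download has finished. If you want the actual percentage, the reply can be parsed instead. The sketch below assumes the reply contains an `id:progress` pair somewhere in its text, which is inferred only from the check above; the function name and the sample replies in the usage note are made up.

```python
import re


def download_progress(response, sound_id="5001"):
    """Return the progress reported for `sound_id`, or None if it is absent."""
    # Assumed reply shape: "<id>:<number>" anywhere in the text.
    # A leading minus sign is accepted and the value is returned unchanged.
    match = re.search(rf"\b{re.escape(sound_id)}:(-?\d+)\b", response)
    return int(match.group(1)) if match else None
```

For example, `download_progress('{"result":["5001:100"]}')` gives 100, while a reply that does not mention the code gives `None`.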



sorry for reviving this old thread. A few weeks ago I started using my Xiaomi Gateway V3 as an audio interface in a very similar way. All worked fine until a few days ago when it suddenly stopped working. Whenever I try to execute:

XiaomiGatewayCommand.sendCommand('download_user_music["5001", "http://YOUR_OPENHAB_IP_ADDRESS:8080/static/tts.aac"]')

The response from the gateway will always be:


I have absolutely no idea what the -9 should tell me. Normally I would expect a number from 0 to 100. Does anybody know?

I already tried longer Thread::sleep times and different music ids, and I even reset my gateway. I am pretty much out of ideas right now.

Sorry, but I have no idea what the -9 means. Also, do you get “___” as the id or have you removed the number?
Have you checked that the mp3 file from Google Translate is correctly generated? Some months ago it stopped working for me and I had to switch to Google Text-to-Speech.

thanks for the quick response!
I don’t get “___”; I just removed the id because I thought it was not necessary, as it seems to be just counting up.
I generated the aac files some weeks ago and reused them every time because it basically just has 5 sentences to say. So they were working just fine all the time and I am able to download and play them on my computer.

Have you tried to upload a sound via the Xiaomi Home app?
Besides that, I’m out of ideas.

No, I haven’t tried that. Could you explain how to do that? I can’t find this option in the app (on iOS).

I would assume that the gateway is out of storage space. Mi Home sucks on iOS for this particular use, as iOS does not allow you to upload custom sounds, though you can upload voice recordings.

Mi Home → Tap on Gateway → 3 Dots Menu → Doorbell → Scroll to Bottom and Select Ring Doorbell → Doorbell → + Button

It will error if you have too much on there when trying to click the plus button.

You will not be able to use this method if you do not have a child device connected to the gateway. In that case, use an old Android version.

Mi Home keeps drastically changing so the above instructions may not apply in the future.

In my case I have made pre-recorded messages via Google’s TTS online demo and some notification tones in mp3 that are uploaded with an older version of the Mi Home app via Bluestacks on PC (Android). Quirky, but it works for my needs. It doesn’t help if you have dynamic sentences. But if it says the same 5 things, just upload 5 versions of the sentence (that’s effectively what I did).

I was just looking for it.
If that is the case, I’d try sending the command:


for every code you’ve used and re-uploading the sounds again.

Thank you for this detailed instruction. I will try when I’m back home.

I’m already doing this and it doesn’t throw any errors.

I also tried:


and it always returns:


So to me it doesn’t seem like a storage problem.