Amazon Echo hacking

Posted on
Thu Feb 04, 2016 1:24 pm
mundmc
Posts: 1060
Joined: Sep 14, 2012

Amazon Echo hacking

Hey all,
I plan to use the new Indigo Amazon Echo plugin soon (the one that mimics a Hue bridge), but the functionality I REALLY want is to capture the actual text that Echo receives, even if Amazon doesn't know what to do with it.

I'm blown away by Echo's speech recognition, really blown away. I just want access to the raw text so I can parse it on my own and not rely on somebody else's language processing algorithms.

Anybody with thoughts on how best to do this? I've seen several neat but not really elegant hacks that do this:

http://forum.universal-devices.com/topi ... -variable/
The above captures the graphics from the browser of a Windows machine, then performs OCR and sends the result via an Android device. Not preferable, as I don't want a dedicated Windows machine or Android device.

https://hackaday.com/tag/amazon-echo/
Reliant on IFTTT, Adafruit, and an Arduino. Too many jumps, and too many services that get access to things spoken aloud in my house (Amazon alone is bad enough).

viewtopic.php?f=65&t=15374
Jay's plugin, though it doesn't (to my knowledge) grant access to the raw command

http://blog.zfeldman.com/2014-12-28-usi ... mperature/
The most elegant, but pretty high level. Involves having a machine open a browser and capture the text from the raw HTML. A little over my head given its reliance on Ruby (though I did have a Siri Proxy up), Linux, and Sinatra. Also involves having to say "Stop" after every command, for reasons unclear.

My current thinking is to use Python (or even AppleScript) to log me in to Amazon's Alexa page, then monitor the portion of the HTML that holds the last received command (it is always in the same section). I've looked into basic web-scraping tools and coding, and I am trying to figure out a way to have this run (ideally in the background) on my host Mac, which runs Indigo Server.

Once I can get text into my own variable, I can code things based on my typical needs.
Example: "Alexa, Netflix"
- Amazon doesn't do anything with this command, but stores it in a variable
- Netflix is only associated with Apple TV in my setup, so it would trigger the following macro (currently assigned to a button on my Control Page)
- turn on receiver, TV
- change input on receiver to one used for Apple TV
- execute "up" "down" "left" "right" commands on the Apple TV, selecting Netflix
- execute the "enter" command on Apple TV
(I use an IP2IR and Perry the Cynic's amazing plugin for this)

- my current language processing, written by me in Python, strips irrelevant words, infers context, and executes Indigo actions via REST calls
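
A minimal sketch of the "strip irrelevant words, then pick an action" step described above. The filler-word list and the action group names are hypothetical placeholders, not the actual setup, and the REST call to Indigo is left out so the snippet runs on its own:

```python
import re

# Hypothetical filler words to drop before matching; tune to taste.
FILLER_WORDS = {"alexa", "please", "the", "a", "an", "to", "on", "my", "turn", "launch", "play"}

# Hypothetical mapping from a keyword to the name of an Indigo action group.
KEYWORD_ACTIONS = {
    "netflix": "Watch Netflix on Apple TV",
    "forecast": "Speak Weather Forecast",
}


def keywords(raw_text):
    """Lowercase the utterance, split it into words, and drop filler words."""
    words = re.findall(r"[a-z0-9']+", raw_text.lower())
    return [w for w in words if w not in FILLER_WORDS]


def choose_action(raw_text):
    """Return the action group name for the first recognized keyword, or None."""
    for word in keywords(raw_text):
        if word in KEYWORD_ACTIONS:
            return KEYWORD_ACTIONS[word]
    return None


print(choose_action("Alexa, Netflix"))
print(choose_action("Alexa, order a pizza"))
```

Whatever `choose_action` returns would then be handed to the code that makes the REST call; a `None` result means the utterance was not recognized and nothing should run.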


So, any thoughts? Is the functionality already IN the hue plugin?

Posted on
Thu Feb 04, 2016 7:10 pm
RogueProeliator
Posts: 2501
Joined: Nov 13, 2012
Location: Baton Rouge, LA

Re: Amazon Echo hacking

Would you be opposed to saying "Alexa tell Indigo to launch Netflix"? That kind of skill would be possible and I am thinking about creating one, along with a plugin to receive the data... I don't think it would be that widely used, as the Hue method / plugin is FAR easier to set up and configure. Still, for those of us that want a bit more it might be cool.

I've been experimenting with Echo and I agree: its speech recognition is top notch! I set up the existing plugin and I think I would still use it for lights even with a new skill set up, since it would be faster.

Posted on
Thu Feb 04, 2016 9:50 pm
mundmc
Posts: 1060
Joined: Sep 14, 2012

Re: Amazon Echo hacking

Totally would be okay with "Ask Indigo", especially because, as an Alexa "Skill" (which I assume you are looking into), I wouldn't have to deal with Alexa's default response of not understanding, or worse, doing the wrong command.

If you ARE going the skill route, does the API even allow the raw text to be accessed?

Posted on
Thu Feb 04, 2016 10:14 pm
RogueProeliator
Posts: 2501
Joined: Nov 13, 2012
Location: Baton Rouge, LA

Re: Amazon Echo hacking

mundmc wrote:
If you ARE going the skill route, does the API even allow the raw text to be accessed?

I'm not really sure yet... all the Amazon-provided skill examples include some default handling if the request isn't one they recognize... which leads me to believe that Alexa can pass along more than the defined intents. However, it could just be defensive programming. Worst case, though, you could easily define every possible thing you wanted to handle? Can't be THAT many, huh?
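
A rough sketch of what "define every intent, plus default handling" could look like on the receiving end. The intent names and spoken replies are hypothetical, and the request/response layout follows my understanding of the Alexa Skills Kit JSON format, so treat the field names as assumptions to check against Amazon's documentation:

```python
def speech_response(text):
    """Build a minimal Alexa-style response that speaks `text` and ends the session."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": True,
        },
    }


# Hypothetical intents: each name would also have to be declared in the skill's intent schema.
INTENT_REPLIES = {
    "LaunchNetflixIntent": "Launching Netflix",
    "LockDoorIntent": "Locking the front door",
}


def handle_request(event):
    """Dispatch on the intent name; anything unrecognized falls through to a default reply."""
    request = event.get("request", {})
    if request.get("type") == "IntentRequest":
        name = request.get("intent", {}).get("name")
        if name in INTENT_REPLIES:
            # A real handler would trigger the matching Indigo action here.
            return speech_response(INTENT_REPLIES[name])
    # Default handling for requests the skill does not recognize.
    return speech_response("Sorry, Indigo does not know how to do that")


event = {"request": {"type": "IntentRequest", "intent": {"name": "LaunchNetflixIntent"}}}
print(handle_request(event)["response"]["outputSpeech"]["text"])
```

In this sketch the handler only ever looks at an intent name, so every phrase to be handled needs its own entry in the table.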

Posted on
Fri Feb 05, 2016 10:41 am
jay (support)
Site Admin
Posts: 18199
Joined: Mar 19, 2008
Location: Austin, Texas

Re: Amazon Echo hacking

I am by no means an ASK expert - however, as I recall from my initial investigations, I don't believe a skill ever actually gets the raw words.

Jay (Indigo Support)

Posted on
Sun Feb 07, 2016 12:13 pm
mundmc
Posts: 1060
Joined: Sep 14, 2012

Re: Amazon Echo hacking

jay (support) wrote:
I am by no means an ASK expert - however, as I recall from my initial investigations, I don't believe a skill ever actually gets the raw words.


That's my sense. It appears that adding a "To Do" qualifier allows stuff to be done with the raw text, but I'm still playing with this.

The next step is getting the actual command to Indigo from the Amazon server, and I'm curious what people are implementing. My hacky workaround is having Alexa add an item to "Reminders" via IFTTT, then having an AppleScript applet run on my host machine that checks for new items and sends them to Indigo. Main problem: even though the IFTTT recipe triggers quickly, it takes a while for the info to sync with other machines running Reminders.

Will play around with the Maker channel, though I suspect the digest authentication will be an issue. I'll check back in.

Appreciate everybody's thoughts and interest.

Posted on
Mon Feb 08, 2016 11:16 am
mundmc
Posts: 1060
Joined: Sep 14, 2012

Re: Amazon Echo hacking

Okay, so I can log in to Amazon Alexa using the following Python:

Code:
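import requests
from bs4 import BeautifulSoup  # Beautiful Soup 4; the old "BeautifulSoup" module is Python 2 only
from lxml import html
import re

# Form fields posted to the sign-in page; replace the placeholders with real credentials
payload = {
    "username": "MY_EMAIL",
    "password": "MY_PASSWORD",
    "appActionToken": "T52q5AP8j2Bo1z5Oaj2FrcpcSOof3wIj3D"
}
# The token was grabbed from the login page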

session_requests = requests.session()

login_url = "https://www.amazon.com/ap/signin?showRmrMe=1&openid.return_to=https%3A%2F%2Fpitangui.amazon.com&openid.identity=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&openid.assoc_handle=amzn_dp_project_dee&openid.mode=checkid_setup&openid.claimed_id=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&openid.ns=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0&"
result = session_requests.get(login_url)

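# Parse the login form and pull out the current appActionToken value
tree = html.fromstring(result.text)
authenticity_token = list(set(tree.xpath("//input[@name='appActionToken']/@value")))[0]
# Post the freshly scraped token rather than the hard-coded one in the payload
payload["appActionToken"] = authenticity_token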

result = session_requests.post(
    login_url,
    data = payload,
    headers = dict(referer=login_url)
)

url = 'http://alexa.amazon.com/spa/index.html#settings/dialogs'


Now I'm trying to figure out how to grab the most recent statement heard by Alexa (Under Settings->History):
It's stored as follows:
Code:
<span class="dd-title d-dialog-title">alexa what's the forecast</span>

With the following xpath:
Code:
//*[@id="i1454945637622"]/div/span[1]

And the following selector:
Code:
#i1454945637622 > div > span.dd-title.d-dialog-title


Any web programming gurus who can instruct me on how to grab this text using Beautiful Soup or lxml (both HTML parsing libraries used for web scraping)?

For whatever reason, if I "view source" for the entire page, it contains JavaScript voodoo that doesn't contain the actual data I want.
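
A minimal sketch of pulling that text out by class name instead of by id, on the assumption that the id (i1454945637622) is generated per entry and will not stay the same. It uses only the standard library's html.parser so it runs without extra installs; the same class-based match could be written for lxml or Beautiful Soup. It only works on HTML that already contains the span, so it would need the rendered page rather than the raw "view source" output:

```python
from html.parser import HTMLParser


class DialogTitleParser(HTMLParser):
    """Collects the text of every <span> whose class list includes d-dialog-title."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and "d-dialog-title" in classes:
            self._in_title = True
            self.titles.append("")

    def handle_data(self, data):
        # Accumulate text only while inside a matching span.
        if self._in_title:
            self.titles[-1] += data

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_title = False


def extract_dialog_titles(page_html):
    """Return the stripped text of each dialog-title span, in document order."""
    parser = DialogTitleParser()
    parser.feed(page_html)
    parser.close()
    return [title.strip() for title in parser.titles]


# Sample markup built from the span quoted above; the wrapping divs are illustrative.
sample = (
    '<div id="i1454945637622"><div>'
    '<span class="dd-title d-dialog-title">alexa what\'s the forecast</span>'
    "</div></div>"
)
print(extract_dialog_titles(sample))
```

Which entry counts as the "most recent" depends on how the history page orders its items, so picking the first or last element of the returned list is left to whoever checks the real page.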

Posted on
Fri Mar 03, 2017 8:20 am
whmoorejr
Posts: 762
Joined: Jan 15, 2013
Location: Houston, TX

Re: Amazon Echo hacking

mundmc wrote:
So, any thoughts? Is the functionality already IN the hue plugin?


If you are not opposed to saying, "Alexa, turn on Netflix"... then you can create a virtual device to run the actions you require to turn on the TV, navigate to Netflix, etc. Add that virtual device to Alexa via the Hue bridge.

Saying "Turn on" is a little weird for some devices, but it works.
"Turn on front door lock" locks my Z-Wave door
"Turn on security system" arms my DSC panel
"Turn on living room TV" turns on my TV (via the Roku TV plugin)

Sometimes I have to remember if turning "on" my door lock is how I lock it or unlock it.

Using "lock my door" would be great and helpful for locks and other things.... but at least I only have to say "Alexa, turn on ___" versus "Alexa, ask ____ to ____"

I absolutely love the idea of getting the raw text from the website. Not so much for lights and stuff, but for other randomness. Like updating an iCal event (versus Alexa's built-in capability to update a Google Calendar event). Or a shopping list that I can access without the Alexa companion app.

Last tidbit: you can now change the Echo wake word to "computer" instead of "Alexa", if you want to feel like you are on the bridge of a starship.

Bill
My Plugin: My People

Posted on
Fri Mar 03, 2017 8:52 am
mundmc
Posts: 1060
Joined: Sep 14, 2012

Re: Amazon Echo hacking

That is exactly what I am currently doing. I just want the raw speech-to-text so I can parse it on my own.


