When you play this song backwards, you can hear Satan. Play it forwards, and it hijacks Siri, Alexa

Computer science boffins affiliated with IBM and universities in China and the United States have devised a way to issue covert commands to voice-based AI software – like Apple Siri, Amazon Alexa, Google Assistant and Microsoft Cortana – by encoding them in popular songs. They refer to these tweaked tunes, which issue mostly …

  1. redpawn

    Alexa, purchase a Bit Coin

    is the new stealth hit song. Wish I knew how to set it to repeat.

  2. Danny 14

    so you have a device listening in all the time and you are surprised when it hears things? oh my. next people will be using photographs to fool facial recognition locks...

    on a second note, we try to set my mate's alexa off all the time if he has discord running on speaker. sometimes he does forget to switch alexa off.

  3. Neil Barnes Silver badge
    Big Brother

    How well does this attack work

    against an Alexa device that isn't there?

    1. Charles 9

      Re: How well does this attack work

      Simple. There probably will just be one there without your knowledge.

    2. The Man Who Fell To Earth Silver badge
      FAIL

      Re: How well does this attack work

      The worst part is most of these voice recognition devices are always on & listening even when set to "off". We have a smart thermostat whose settings are set with voice commands "off", yet about once a month it just belts out a "I didn't understand that". When I check to see if its voice command feature got turned on by someone, the settings always show it "off". Be afraid, very afraid.

      Or better yet, get rid of the thing.

      1. HieronymusBloggs

        Re: How well does this attack work

        "The worst part is most of these voice recognition devices are always on & listening even when set to "off""

        No, the worst part is that they are there.

      2. Anonymous Coward
        Anonymous Coward

        Re: How well does this attack work

        My phone used to have an occasional habit of suddenly saying "Sorry, I did not understand that question" ... normally during the R4 Today program. Turned out that Android skips the "Ok Google" step and is always listening when being charged unless a setting to stop this is set. Hence, while charging on my bedside table beside my radio it had been spending several weeks/months listening to John Humphrys et al!

    3. bazza Silver badge

      Re: How well does this attack work

      "Hey Siri, buy me an Alexa"

      Or Any Windows 10 PC, soon, apparently.

  4. The Alphabet

    I can't wait for someone to sneak this into a Justin Bieber song and watch the fallout.

    1. Simon Harris
      Mushroom

      I thought any Justin Bieber song already caused the listener to self-destruct.

      El Reg - can we have a 'Scanners' icon please ---------->

    2. jelabarre59

      I can't wait for someone to sneak this into a Justin Bieber song and watch the fallout.

      I thought Kenny-G or Michael Bolton. It'll serve them right for listening to such crap.

  5. Rich 11

    Enough to drive anyone to murder

    She played the fiddle in an Irish band

    But she fell in love with an English man

    Kissed her on the neck and then I took her by the hand

    Said, "Alexa, kill them all"

    With my pretty little Galway Girl

    You're my pretty little Galway Girl

  6. Hanry_M

    On the topic of ultrasound...

    Kudos to the Register for getting 90% of a story across in very few paragraphs.

    I would not normally grinch over the sentence “a microphone capable of suppressing ultrasound” but it leads to a couple of observations:

    1) Microphones can be designed to suppress ultrasound, but that either makes them bulky or expensive. More likely the suppression is happening in the amplifier stage.

    Interestingly, even this is an expense that scale-producers avoid, as we found to our astonishment when we ran a hack-day around nature exploration: trying to use off the shelf gadgets to listen to bats. We found that out of several laptops and mobile phones about half had no hardware limiting the frequency range to the human-audible spectrum! Not surprising, given that hugely oversampling and filtering in the digital domain is working fine, but I digress. Anyway, the observation was that the cheaper the laptop or phone the easier it was to hack it for bat-listening.

    So, for money reasons alone, how much would you like to bet that even in three years’ time plenty of connected microphones will still be responding to ultrasound input?

    2) Ultrasound is actually wanted by gadgets: digital fingerprinting of music, advertisements etc. has been reported aplenty recently, even in the Register iirc. Another reason to doubt that ultrasound will be ignored by connected mics.

    More likely that gadgets will increase their listening range (“to better listen out for your protection” says grandma and we all feel comforted) and countermeasures will have to evolve in the signal processing. Where the success-rate will be unknown, and thus it ever bumbles on... :)

    1. diodesign (Written by Reg staff) Silver badge

      Re: On the topic of ultrasound...

      "More likely the suppression is happening in the amplifier stage."

      OK, I've slightly tweaked that.

      C.

    2. Muscleguy
      Boffin

      Re: On the topic of ultrasound...

      If you can make the speakers play ultrasound then you should set it to play in the autumn when nonfleider maus tend to come inside the house. They don't like ultrasound noise. I recall a research mouse facility whose mice were not breeding, turned out their aircon system was screaming in the ultrasonic. They fixed it and they bred like, small rabbits.

      Random Biology: male mice sing like canaries in the ultrasonic to attract the ladiez and also scream in the ultrasonic as they climax. So you can imagine how ultrasonic noise spoils the vibe.

      1. Evil Auditor Silver badge

        Re random biology

        Maybe the females lost interest in the boring males and tried to hump the aircon instead?

    3. HieronymusBloggs

      Re: On the topic of ultrasound...

      "Microphones can be designed to suppress ultrasound, but that either makes them bulky or expensive"

      In many cases just using a heavier diaphragm in the microphone capsule would do the job (extra material cost close to zero). I suspect it's more a case of apathy than expense.

      1. paulll

        Re: On the topic of ultrasound...

        ... at the expense of sensitivity. If you set about upping the mass of the diaphragm to the point where high frequencies you don't want are significantly rejected, your signal's going to be mangled.

        1. Anonymous Coward
          Anonymous Coward

          Re: On the topic of ultrasound...

          noise cancelling microphone?

          What comes out of the speaker really shouldn't be accepted as commands, so they could fix it in software or they could filter frequencies outside of normal voice range. Either way would not stop it PWNing someone else's phone
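For anyone wondering what "filter frequencies outside of normal voice range" might look like in software, here is a minimal sketch, assuming a single-pole low-pass filter, a 96 kHz sample rate and a 4 kHz cutoff. Those numbers and the function names are illustrative assumptions, not anything a real assistant is known to use. Note that a filter like this only removes energy that is still above the cutoff when it reaches the filter.

```python
import math

def low_pass(samples, sample_rate_hz, cutoff_hz):
    """Single-pole IIR low-pass filter (the discrete form of an RC filter)."""
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)          # smoothing factor between 0 and 1
    out = []
    y = 0.0
    for x in samples:
        y += alpha * (x - y)        # move the output a fraction towards the input
        out.append(y)
    return out

def tone(freq_hz, sample_rate_hz, n):
    """Generate n samples of a unit-amplitude sine wave."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate_hz) for i in range(n)]

def rms(samples):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

# Assumed values, chosen only for illustration.
SAMPLE_RATE = 96_000
CUTOFF = 4_000

voice = tone(1_000, SAMPLE_RATE, 9_600)    # stand-in for speech-band content
ultra = tone(25_000, SAMPLE_RATE, 9_600)   # stand-in for ultrasonic content

voice_gain = rms(low_pass(voice, SAMPLE_RATE, CUTOFF)) / rms(voice)
ultra_gain = rms(low_pass(ultra, SAMPLE_RATE, CUTOFF)) / rms(ultra)

print(f"1 kHz tone keeps {voice_gain:.0%} of its level")
print(f"25 kHz tone keeps {ultra_gain:.0%} of its level")
```

A single pole is a gentle filter; a real design would likely use a steeper one, but the idea is the same.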

  7. tokai

    Until voice recognition can reliably identify individuals, it would be cool if you could name each device locally.

    I.e. if I say “Alexa” or “Siri” it only gives me very basic guest type access.

    However, if I say “AmazonMcSpyFace” instead it sort of salts my following voice command, giving me higher privileges.

    Disclaimer: I don’t own any of these, they possibly already allow this.

    1. Flocke Kroes Silver badge

      Not that easy

      These devices have a low power chip that listens continuously for anything resembling "Alexa" or "OK Google". When it hears something that matches it sends a recording to the cloud for speech recognition. Putting proper speech recognition into a low power chip would be difficult.

      The strange thing is that early attempts at speech recognition (what you say) turned out to be voice recognition (who is speaking) devices. The down side is that antique tech requires training. Say "Siri recognise my voice" a hundred times and a low power chip probably could (but it would also respond to you saying "OK Google" or "OW! Who spread drawing pins on the floor?"). The problem is to find customers with enough brains to understand the problem, enough patience to actually train the device and sufficient courage/gullibility to let such a device in their home.

  8. Anonymous Coward
    Anonymous Coward

    Pub sport

    Say "Hey Siri, set my alarm for 3am" as you wander about...

  9. Aladdin Sane

    Obligatory xkcd.

    1. aliceklaar?

      Meanwhile, back in 1994, Dilbert http://dilbert.com/strip/1994-04-24

      1. Unep Eurobats
        Windows

        My, that Dilbert is old: he still has a CRT. And Wally has yet to become so addicted to coffee that he's surgically attached to his cafetiere.

  10. tony2heads

    Passive ultrasound absorption

    If I recall, some foam rubbers and some curtain materials absorb ultrasound. Can anyone confirm this?

    If so, wrap your home 'assistant' (alias spy device) in curtains.

    1. Anonymous Coward
      Anonymous Coward

      Re: Passive ultrasound absorption

      Or, how about simply not having one. Seems the best way to "defeat" this.

      1. Anonymous Coward
        Anonymous Coward

        Re: Passive ultrasound absorption

        And if your significant other insists?

        1. Aladdin Sane

          Re: And if your significant other insists?

          Wrap them in curtains.

          1. John Brown (no body) Silver badge

            Re: And if your significant other insists?

            "And if your significant other insists?"

            "Wrap them in curtains."

            I believe the preferred method is carpet. Not, of course, forgetting the shovel and the quicklime.

        2. Anonymous Coward
          Anonymous Coward

          Re: Significant Other.....

          Time to find a new one...........

        3. iron Silver badge

          Re: Passive ultrasound absorption

          Divorce them.

          (I'm not joking.)

        4. Julian 8 Silver badge

          Re: Passive ultrasound absorption

          Get one, then order all the random items and good toys you want.

          When they complain, say their poor choice in music or TV programmes (if your other half is like mine) must be ordering this via the hidden messages

          1. Charles 9

            Re: Passive ultrasound absorption

            And if they counter by going ballistic in the shopping and then pinning the blame on YOUR hidden messages...oh AND knows the number to a good divorce lawyer (meaning divorce runs the risk of losing more than half your stuff)?

            1. JohnFen

              Re: Passive ultrasound absorption

              If doing that seriously runs the risk of divorce, then you already have much deeper relationship problems that you might want to attend to.

  11. Charles 9

    Who wants to bet some stunt like this gets inserted into the Super Bowl halftime show? By the time the fallout clears, someone's bound to have made out like a bandit.

    1. rmason

      @Charles 9

      As with many things in life....

      South Park already did it.

      1. Charles 9

        Re: @Charles 9

        I don't care about South Park. I care about an event that gets seen by millions of Americans throughout the country. I just picked it for being one of the most-watched events in the US. In Australia, it would probably be the AFL Grand Final, or anything else of similar caliber.

  12. Mystic Megabyte
    Unhappy

    Very costly

    Alexa, dial 118 118 and ask to be connected to the Azerbaijan police headquarters..

  13. Aladdin Sane

    If you play New Kids on the Block albums backwards they sound better.

    Bill Hicks, 1961-1994

  14. Steve Graham

    Well...

    So, to summarise: they tuned audio to be recognized by a specific analysis engine, and then tested it by having that specific engine recognize it.

    And this won't work on any existing products until they reverse-engineer their recognition. That could be difficult, since it's based on machine learning and is likely to be obscure.

    1. Cuddles

      Re: Well...

      "And this won't work on any existing products until they reverse-engineer their recognition."

      Not true at all. It's easy to create adversarial images by treating the recognition system as a black box, and there's no reason sound would be any different. If anything, it actually makes things much easier - you don't need to know anything about how it works, you just try over and over again making small changes to the input until you get the output you want. No understanding or thinking required at all, which ironically makes machine learning the perfect tool to screw with systems that rely on machine learning - if you use machine learning for a recognition system, I can use machine learning to learn how to break it, without ever needing to know what your machine learning system has actually learned.
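A minimal sketch of the "try over and over again making small changes" idea, under loud assumptions: the "recognition system" is a toy linear classifier invented here, the attacker is allowed to see a confidence score as well as the label, and the names `query` and `push_towards_cat` are hypothetical. It says nothing about how hard this is against a real speech engine.

```python
import math
import random

# Hypothetical black box: the attacker may call query() but never reads _W or _B.
_W = [0.5, -0.3, 0.8, -0.6, 0.2, 0.9, -0.7, 0.4]
_B = 0.0

def query(x):
    """Return (label, confidence that the input is a 'cat')."""
    z = sum(w * v for w, v in zip(_W, x)) + _B
    score = 1.0 / (1.0 + math.exp(-z))
    return ("cat" if score > 0.5 else "dog"), score

def push_towards_cat(x0, step=0.02, eps=0.2, budget=2000, seed=0):
    """Random search: nudge one coordinate at a time, keep nudges that raise the score.

    Every coordinate stays within eps of its original value, so the change is small.
    Returns the final input and the number of queries spent.
    """
    rng = random.Random(seed)
    x = list(x0)
    label, best = query(x)
    queries = 1
    while label != "cat" and queries < budget:
        i = rng.randrange(len(x))
        delta = rng.choice((-step, step))
        candidate = list(x)
        # Clip so the perturbation never exceeds eps on any coordinate.
        candidate[i] = min(x0[i] + eps, max(x0[i] - eps, x[i] + delta))
        cand_label, cand_score = query(candidate)
        queries += 1
        if cand_score > best:
            x, label, best = candidate, cand_label, cand_score
    return x, queries

original = [0.1, 0.5, 0.1, 0.5, 0.2, 0.1, 0.5, 0.1]
adversarial, used = push_towards_cat(original)
print(query(original)[0], "->", query(adversarial)[0], "after", used, "queries")
```

If the black box returned only a label and no score, this particular search would have nothing to climb, and a different strategy would be needed.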

      1. Anonymous Coward
        Anonymous Coward

        Re: Well...

        This is equivalent to saying "It doesn't matter that your password is long, I can just keep trying and trying until I find a pattern that fools you!"

        Technically correct, but we all know why that won't actually work.

        This is an interesting theoretical attack, but the fact they haven't extended their open box, clean room, in the lab theory work into a practical proof-of-concept speaks volumes.

        1. Cuddles

          Re: Well...

          "This is equivalent to saying "It doesn't matter that your password is long, I can just keep trying and trying until I find a pattern that fools you!"

          Technically correct, but we all know why that won't actually work."

          No we don't. Mainly because it absolutely does work. That's exactly what is meant when we say, for example, that MD5 is broken because of the possibility of collision attacks. It's literally the exact same thing - an attacker tries lots of different inputs until they find one that happens to give the desired output. The only real difference is that with cryptographic functions that's a big problem that we try to avoid, while with machine learning systems it's a design feature; the whole point of image recognition is to feed in lots of different pictures and get a limited set of outputs - dog, cat, car, etc. - so attacking it is just a matter of making small changes until the output switches from one to another.

          Basically, both systems simply convert an input to an output. Cryptographic functions would ideally be one-to-one, but in practice are always many-to-one and therefore at least theoretically attackable; they simply rely on making such attacks unfeasible given the current level of technology. Machine learning systems are many-to-one by design, and are therefore inherently vulnerable.
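The many-to-one point can be shown with a toy example, assuming a deliberately crippled hash: SHA-256 cut down to two bytes. The helper names `tiny_hash` and `find_collision` are made up for this sketch. It illustrates only the counting argument (many inputs, few outputs); it is not how collisions in real hash functions such as MD5 are found.

```python
import hashlib
from itertools import count

def tiny_hash(data: bytes, n_bytes: int = 2) -> bytes:
    """A deliberately weak hash: the first n_bytes of SHA-256."""
    return hashlib.sha256(data).digest()[:n_bytes]

def find_collision(n_bytes: int = 2):
    """Try inputs "0", "1", "2", ... until two of them share a truncated digest.

    With n_bytes = 2 there are only 65536 possible outputs, so a repeat is
    guaranteed within 65537 distinct inputs.
    """
    seen = {}
    for i in count():
        msg = str(i).encode()
        digest = tiny_hash(msg, n_bytes)
        if digest in seen:
            return seen[digest], msg
        seen[digest] = msg

first, second = find_collision()
print(first, second, tiny_hash(first).hex())
```

Each extra output byte multiplies the number of possible outputs by 256, which is why this kind of brute force stops being practical long before a full-length digest.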

  15. Anonymous Coward
    Anonymous Coward

    I love the Alexa devices

    Spend my time setting up routines I can activate remotely to freak my wife out. Yesterday's was one where it started meowing and purring randomly.

    Today's is fart noises.

    Technology is great.

  16. Teiwaz
    Coat

    Perfect Solution

    I've designed a digital assistant that is proof against these sorts of attacks.

    I'm calling it Helen K.

    It's not very assisting though...

    ...stop throwing things....

    1. David 132 Silver badge
      Happy

      Re: Perfect Solution

      That dumb joke is deafinitely tasteless and has sent me into a blind rage!

      1. DJV Silver badge

        Re: Perfect Solution

        I struggle to see what you did there...

        1. This post has been deleted by its author

          1. Alistair

            Re: Perfect Solution

            @Coen :

            *Whoooooooooooooooosh*

  17. Captain Scarlet
    Paris Hilton

    Echo, open the front door.

    What, people install "smart" locks now as well? I assume all you need to do is yell loud enough so said device can hear?

  18. EveryTime

    We need a checkbox response for "security hacks".

    This one would be "You construct your own (flawed) system that you claim is similar to the original, and hack that, never proving the hack works on the readily-available original."

    Neural networks are inherently untrustworthy. It's trivial to train a bad one that superficially appears to work. There are many stories of NNs that were later found to be deeply flawed. One was a tank / APC image recognition that was dramatically 'better' than humans, spotting tanks that were expertly concealed. It turned out that it was classifying road ruts as positives, not armored vehicles.

    There are now tools that help visualize intermediate node responses on specific types of Tensorflow networks. But that's a tiny fraction of the systems, you need to be an expert to understand what you are seeing, and it only works for images. It's actually more directly useful for figuring out the system is flawed than improving it (although one can lead to the other). But note that it requires access to the intermediate nodes -- which isn't known to the end user with the cloud processing of Alexa and the like.

  19. The Bionic Man

    Now Justin Bieber can do to Alexa what he's been doing to my poor ears.

  20. wayne 8

    Not just Alexa

    Hijack a vehicle's satellite radio receiver and deliver a song with a payload to a self driving vehicle's processor.

  21. Oengus

    I wonder

    I wonder what playing Billy Joel's song "Downeaster Alexa" would do...

    or what would happen in the Joel household if Billy was calling his daughter...

  22. ronspencer314

    Miranda

    https://www.youtube.com/watch?v=hXCaF68sDPU

  23. Andy 97

    An ad exec’s dream!

    There was a time in Blighty where HMG had to ban subliminal messaging in cinema and broadcast media.

    All one of our ad execs would need to do is to hide some nonsense about a product/service at the end of the ad and hey presto, your slimline plastic friend will ask you if you want to know more.

    Ofcom had better get their regulation head on...

  24. Phukov Andigh Bronze badge

    Finally!

    a technological reason to tell the youth to "turn that sh*te down!"

    I wonder how long until "pirated" audio or video tracks start "phoning home"?

    Phishing with music files "traded" amongst associates?

    with some of the craptacular techno modified stuff out there, would one ever know till it was too late?

  25. fidodogbreath

    It's only a matter of time before there are web sites where you can select an audio file, type in your desired command, select a target device type, and then download your custom attack. Then the real fun begins.

    Alexa, set the temperature to 40 C

    OK Google, open the garage door. OK Google, close the garage door.

    Siri, show all the pics in the folder named private on the TV

    Etc.
