How we fooled Google's AI into thinking a 3D-printed turtle was a gun: MIT bods talk to El Reg

Students at MIT in the US claim they have developed an algorithm for creating 3D objects and pictures that trick image-recognition systems into severely misidentifying them. Think toy turtles labeled rifles, and baseballs as cups of coffee. It’s well known that machine-learning software can be easily hoodwinked: Google's AI-in …

  1. Dave Bell

    There are two distinct problems here. A false positive can be handled by using human staff, though the system design and training need to be better. Today it often seems that the human staff at airport security are the big problem, and some people see the AI tech as a solution to that. To the people working on it, it isn't a simple answer, but at the more political levels of decision-making it comes close to Mencken's "simple, plausible, and wrong."

    The false negatives are where it gets dangerous. I can't see any way of avoiding those without maintaining the existing human-based monitoring. So the AI-based system is something that maybe can be added in parallel, but it's not something that will save money. It likely will also need continuing professional development, just like a human-based system.

    It's about a hundred years since Mencken wrote his line. And maybe that is an example of a deeper trend: in any field, the simple answers that work get identified early in its history. Is the marker of a mature field that new, working, simple answers are rare?

    I tried looking for a quote on that. It isn't simple.

    1. jmch Silver badge

      " A false positive can be handled by using human staff"

      One, or a few, false positives, yes. Massive amounts, no. If everyone is triggering automatic security systems and the human searchers cannot find anything suspicious, at some point the humans will assume the system is wrong and ignore it even when it does flag a real positive.

      It's the Mr Bean shoplifting scenario :)

      1. ITS Retired

        "One, or a few, false positives, yes. Massive amounts, no. If everyone is triggering automatic security systems and the human searchers cannot find anything suspicious, at some point the humans will assume the system is wrong and ignore it even when it does flag a real positive."

        It doesn't matter as long as the proper people make a profit on the machines.

    2. Muscleguy

      That is because reality is dirty, noisy and fuzzy. So the better we need things to work, the more complex they have to become.

      In biology, because so much is still unknown, we are still falling over the simple. Like the recent revelation that the signal which causes poison frogs in the Amazon to carry froglets to pools is simply the presence of a froglet/tadpole on the back. This works for female frogs even though in nature they hardly ever move froglets/tadpoles. Males, on the other hand, will move to water any froglets/tadpoles in their territories.

      They also asked how the frogs find the pools; it turns out they just remember where the pools they know about are. Simple.

      But delve deeper and the complexity gets messy and, because biological systems are subject to historical contingency, unpredictable. When investigating you can posit ideas, but you have to go and look. Hypotheses in biology crash and burn constantly. Pity we can't extract energy from the process.

    3. Voland's right hand Silver badge

      A false positive can be handled by using human staff

      Sorta.

      It will be mortuary staff handling a false positive in a car's image-recognition system if the car, moving at 75mph, misidentifies the signs and barriers on a sharp turn. Those make a pretty good place to sticker the images over, too.

      1. Bronek Kozicki

        That's a good point. A naive assumption would be that road signs are an absolute; however, for any reasonable driver the actual road situation (e.g. approaching a sharp turn, weather conditions, cars nearby) obviously takes precedence. The question is, how much "reasoning" can an AI driver actually perform?

  2. Rebel Science

    This just shows that deep learning is not the answer to AGI. In fact, it's a hindrance. Wake up, people. AGI research money must be applied elsewhere.

    1. John Brown (no body) Silver badge

      "This just shows that deep learning is not the answer to AGI. In fact, it's a hindrance. Wake up, people. AGI research money must be applied elsewhere."

      It does seem as if an entirely wrong approach is being taken to the problem. Show a human a car once and they will likely recognise almost any image of a car as also being a car. A computer gets trained on "millions" of images and can then still be fooled by even the tiniest of changes to an image.

      I don't know what approach should be taken, but it clearly demonstrates we have no idea how human (or animal, in general) image recognition works.

      1. Yet Another Anonymous coward Silver badge

        On the other hand, a doctor 'trained' on a few images of a cancerous cell from a textbook 20 years earlier does a much worse job of screening a million tumour scans than a machine trained on millions of images.

        1. EnviableOne

          There are as many different cancer cells as there are cells in the body, and the trained doctor doesn't have the persistence of recognition: the human brain will have re-classified the textbook image as irrelevant if it's not used regularly, so chances are it will be overwritten by the cost-improvement formula the DoH has decided he is responsible for achieving.

    2. Aitor 1

      It is ok

      Have three different deep learning systems, and allow them to vote. You will get the right answer almost always.
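The voting idea can be sketched in a few lines. This is only an illustration, not a proven defence (adversarial examples are known to transfer between similarly trained models); the function and the toy classifiers below are invented names standing in for three independently trained systems.

```python
import numpy as np

def ensemble_vote(classifiers, image):
    """Majority vote across several independently trained models."""
    votes = [clf(image) for clf in classifiers]
    labels, counts = np.unique(votes, return_counts=True)
    return labels[np.argmax(counts)]

# Toy stand-ins for three deep-learning systems: one has been spoofed.
spoofed = lambda img: "rifle"
sane_a = lambda img: "turtle"
sane_b = lambda img: "turtle"

winner = ensemble_vote([spoofed, sane_a, sane_b], None)  # "turtle"
```

The spoofed model is simply outvoted, as long as the attack does not fool a majority of the ensemble at once.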

  3. redpawn

    I think I might have AI

    I often misidentify objects on first sight and only after closer inspection realize my mistake. For example, I was driving and saw a large dead black bird on the road with a wing flopping around in the wind created by passing traffic, a sad but not uncommon sight. As I passed close to the corpse of this poor bird it morphed into an umbrella. Daily misidentifications of objects are so common I seldom notice the mental correction later on.

    1. Pompous Git Silver badge

      Re: I think I might have AI

      "Daily misidentifications of objects are so common I seldom notice the mental correction later on."
      And you know you have a problem when the guacamole scratches your face!

      1. Anonymous Coward
        Anonymous Coward

        Re: I think I might have AI

        That's turtlely ridiculous

        1. John Brown (no body) Silver badge
          Thumb Up

          Re: I think I might have AI

          "That's turtlely ridiculous"

          There are parts of the Kingdom of Geordieland, the province of Gateshead in particular, where "totally" actually is pronounced "turtely". It did make me smile the first time I heard a native pronounce the culinary dish "turd in the hurl".

      2. Tim Seventh
        Joke

        Re: I think I might have AI

        "Daily misidentifications of objects are so common I seldom notice the mental correction later on."

        And you know you have a problem when the guacamole scratches your face!

        Disregard the scratching, I think the bigger problem is him/her trying to eat the guacamole.

    2. jmch Silver badge

      Re: I think I might have AI

      "Daily misidentifications of objects are so common I seldom notice the mental correction later on."

      Thing is, you look a few times and eventually arrive at the correct result. The article makes it clear that most AI spoofings depend highly on spoofing a single evaluation in a very specific way, and any further sampling of the object by the AI would quickly arrive at the correct result.

      Being able to repeatedly spoof the AI over different orientations, different light conditions etc. is a huge step.

      1. Eddy Ito

        Re: I think I might have AI

        Did you watch the video? The assault turtle can be reoriented at will and still gets classed as a rifle.

    3. Florida1920

      Re: I think I might have AI

      I once mistook the back of a partly-submerged alligator for an old truck tire. That was early in my experiences with alligators. Training is important! In the meantime, stay out of the swamp.

      1. Yet Another Anonymous coward Silver badge

        Re: I think I might have AI

        Presumably all the other generations that didn't make the mistake will survive to reproduce.

        We need to feed more people to partly submerged alligators if we are going to achieve a statistically significant result.

    4. a_yank_lurker

      Re: I think I might have AI

      The key is context and lighting. Something flapping around on the road is a hazard to avoid. Some hazards are more dangerous than others. Even if one initially misidentifies the item, the key point is that it is likely a hazard. However, in the context of automating and speeding up screenings, false reports will lead to either annoying an innocent person or letting someone get a weapon through.

      'AI' lacks one key attribute of true intelligence: the ability to infer something based on previous experience and knowledge when confronted with a new situation. I am now off to pet my meowing guacamole.

  4. Anonymous Coward
    Anonymous Coward

    As I read it

    There is not much "I" in the "AI"; it's still broadly algorithmic pattern matching based on samples. The images don't look that noisy to me either.

    As far as I can see this is simply an exercise in popping the hype bubble a bit and bringing "AI" back to reality.

    1. Evil Auditor Silver badge

      Re: As I read it

      There's certainly no "I" in this "AI". I don't quite agree with "algorithmic" in your comment, though. It's probably some sort of artificial neural network trained to recognise what it has recognised before. Pretty stupid, but... since it's hardly understood what happens within this neural network, they call it intelligent.

  5. fajensen
    Big Brother

    The Real Problem Here

    Is that "they" are happily using exactly the same kind of AI systems in "Signature Strikes" and for spotting "Terrorists" on the internet!

    Although mistaking a turtle for a gun could be a "Working as Designed" feature, to better ensure the unimpeded and continuous flow of million-dollar missiles onto brown people's mud huts, hospitals and weddings.

  6. Baldrickk

    It would certainly be interesting

    ... if this were in use in an airport or other secure location to aid in detecting threats, and someone walked in with something like this that set off the system: what would then happen, given that the person in question certainly wouldn't be carrying anything so obvious as a rifle?

    1. katrinab Silver badge

      Re: It would certainly be interesting

      Well, the turtle would require further attention anyway, given the laws on transporting animals across borders. But what if you wore a t-shirt with a picture of the turtle on it?

  7. Anonymous Coward
    Anonymous Coward

    Typical hipsters

    It had to be something like "guacamole" instead of something mundane. These are the types that will order a "double sweet soy latte with a twist of vanilla" instead of a coffee-from-the-jug*.

    :)

    * Once brewed, still the fastest way to serve less fussy customers

    1. fidodogbreath

      Re: Typical hipsters

      It had to be something like "guacamole"

      No indication in the article as to whether the guac is an extra charge. Journalism is dead...

      1. Anonymous Coward
        Anonymous Coward

        Re: Typical hipsters

        Journalism is dead...

        Judging by the downvotes, so is a sense of humour :)

  8. Anonymous Coward
    Anonymous Coward

    No worries, once these are installed as autonav...

    in Cars, everything will be fine. By the way...

    My bonus is the one without the clawback!

  9. JimmyPage Silver badge
    FAIL

    Pattern matching ... keyword matching ...

    An *awful* lot of what is laughingly called "AI" at the moment is no such thing. It's keyword matching on steroids. Or, as evinced in this instance, pattern matching (on steroids).

    It was pretty whizzy when you only had a few million targets, and fairly simple selection criteria. But as we are seeing, scale things up, and add in more complex (i.e. real-world) criteria, and you get distinctly unintelligent results.

    The most impressive display of machine learning I have seen was in IBM's Hursley labs, where a neural network was run over a video of a scene in a park. It managed to spot when a skateboarder (which it highlighted in red as "human using wheels") jumped off and walked, at which point it highlighted him in blue (as "human on foot"). However, even it still struggled to spot "human not moving", so it missed the people sitting around the fountain unless they moved. Not sure if that was a good economy of processing, or a mistake.

    Human/mammalian cognition is a much more distributed and subtle process. I suspect it works in a very intermeshed manner, simultaneously looking at shape, colour and symmetry in the first instance, and creating a matrix (or matrices) of probabilities which is then compared to learned objects and how they would be expected to behave if they are what the guess says they are.

    If we're going to get anywhere near, I would expect to see a wall of smartphones looking at the scene, each one doing a single specific job, but (and this is the bit we *can't* do yet) communicating in real time to refine its little bit of the universe.

    1. Anonymous Coward
      Anonymous Coward

      Re: I would expect to see a wall of smartphones

      On reflection (that's a little artificial-vision joke) you'd have one camera *feeding* hundreds of smartphones.

      One smartphone would be dedicated to going "PING !" when a vertical line came into vision, while another would do the same for a horizontal. AT THE SAME TIME, others would be looking to "PING !" when a certain colour, or sequence was spotted.

      The real trick (a la mammalian vision) is to be able to work the "PING !" symphony into a probable match. Which is probably a function of the brain trying out various possibilities based on previous experience. So if (for example) you are looking at the sea, the background suggestions coming FROM the brain (to meet the suggestions coming FROM the eyes) are going to be "things you'd expect to see on the sea".

      In real time.

      Just wait until you realise how humans hear speech.
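The "PING symphony meets prior expectations" idea reads a lot like naive-Bayes fusion: treat each detector's output as an independent likelihood and multiply it with a contextual prior. A minimal sketch, with the independence assumption, function name and probabilities all invented for illustration:

```python
import numpy as np

def combine_pings(detector_likelihoods, prior):
    """Fuse independent detector outputs with a contextual prior
    (naive-Bayes style): posterior ~ prior * product of likelihoods."""
    posterior = prior * np.prod(detector_likelihoods, axis=0)
    return posterior / posterior.sum()

# Looking at the sea: the prior strongly favours "boat" over "car".
prior = np.array([0.9, 0.1])           # [boat, car]
pings = np.array([[0.6, 0.4],          # vertical-line detector
                  [0.7, 0.3]])         # colour detector
posterior = combine_pings(pings, prior)
```

With the toy numbers above, the weakly boat-ish detector evidence combined with the "things you'd expect on the sea" prior leaves "boat" the overwhelming winner.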

      1. Kiwi
        Boffin

        Re: I would expect to see a wall of smartphones

        Just wait until you realise how humans hear speech.

        Ah yes, fun that. Few people realise, when listening to their native language, that the spacing between words is no greater than between syllables. When you're struggling to learn a language, even when you know a lot of the words, sorting this stream of syllables into separate words takes some doing.

        And as you get older, and little bits of your hearing go, it gets so much harder still. Unless you've had a lot of experience at learning languages.

        With people who have English as a 2nd (or 6th) language, I try to take this into account and add a longer pause between words. Still really hard to do at times.

  10. Mage Silver badge

    AI, Neural Nets and Machine Learning

    It's all marketing. AI, Neural Nets and Machine Learning have a special essentially dishonest meaning in IT.

    It's just programs designed by humans, running on machines designed by humans (which are the same principle as a computer made of relays), using big databases, with human curated data.

    It's inevitable that it's fragile. A cat or small child will never confuse a cat with guacamole. Your average crow (British rook) is better than any computer program, though it isn't motivated to work as an operative in Passport Control.

    1. AdamWill

      Re: AI, Neural Nets and Machine Learning

      "A cat or small child will never confuse a cat and guacamole."

      My cat, on the other hand, constantly confuses "her own freaking reflection" and "an actual cat".

  11. Nick Ryan Silver badge

    Context

    A large portion of the failure of AI image-recognition systems is that they lack appropriate context in their learning.

    A system that is trained in 2D space on 2D images will not work anywhere near as well as a system that has a grasp of 3D space and has been trained on 3D "images". A system that confuses a 2D picture of something with the real thing is going to fail nastily.

  12. John Smith 19 Gold badge
    Unhappy

    If the image is CGI, why is it "fragile"? You can fine-tune it as much as you like

    Creating the perfect shot to f**k up that oh-so-clever multi-layer NN system (which does not sound as mysterious as "machine" or "deep" learning, but is a f**k load more accurate a description).

  13. ThatOne Silver badge

    Shocking

    How dare LabSix make fun of the emperor's new clothes?

    That's lèse-majesté.

  14. M7S
    Joke

    I've occasionally wondered how much fun could be had

    if I had a template of something like a handgun or grenade and could spray through it something that was invisible to the naked eye but opaque to x-rays onto the suitcases of those travelling by air. The boss's carry-on case comes to mind, if he's come into the office before heading to the airport; or just wandering around the lobby of an hotel, randomly picking on luggage waiting to be moved somewhere.

    Now if just changing a few pixels will fool the AI, is there a possibility that some very small stickers would in effect do the same sort of thing? At times of major holidays this could be as effective a way of bringing an airport to a standstill as any more violent act.
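The "very small stickers" intuition matches how gradient-based attacks actually work: for a linear scorer, nudging every pixel slightly in the direction of the weight signs shifts the score by eps times the sum of the absolute weights, which can flip the decision even though each pixel barely changes. A toy sketch on a hypothetical linear classifier (the weights, image and eps are all invented for illustration; real attacks use a tiny eps spread over many pixels):

```python
import numpy as np

def fgsm_perturb(weights, image, eps):
    """Fast-gradient-sign step for a linear score w.x: move every
    pixel by +/-eps in the direction that raises the target score."""
    return image + eps * np.sign(weights)

w = np.array([1.0, -2.0, 3.0])       # toy "rifle vs not-rifle" weights
x = np.array([-1.0, 0.0, -1.0])      # honest image: w.x = -4, not a rifle
x_adv = fgsm_perturb(w, x, eps=2.0)  # perturbed image: w.x_adv = 8, "rifle"
```

The per-pixel change is uniform and small relative to the image, yet the score crosses the decision boundary, which is exactly the sticker scenario in miniature.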

    1. Yet Another Anonymous coward Silver badge

      Re: I've occasionally wondered how much fun could be had

      Modern half-decent x-ray machines image at different angles so they get a 3d-ish view of the object.

      This is mostly to stop you hiding a suspiciously shaped object behind a dense, reasonably shaped one.

      1. Anonymous Coward
        Devil

        Re: I've occasionally wondered how much fun could be had

        Just use lead paint and paint in the words "I am watching you"... then on the next bag "Stop leaning into the CCTV x-ray screen like that, it's bad for your back." and on the last bag "That's better, sit up straight now!"

  15. Anonymous C0ward

    I'm not asking them to make me a burrito

    That is all.

  16. Anonymous Coward
    Anonymous Coward

    Stupid question

    If rotating an image can result in it being correctly identified, why doesn't the system rotate it 360 times and see what results it comes up with? Surely that should throw off the false positives?
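This is essentially test-time augmentation: average the classifier's probabilities over several rotated views so a spoof tuned to one specific view gets diluted. A minimal sketch using the four 90-degree rotations; the toy `predict_fn` below is invented, spoofed only in the original orientation. (Note the article's 3D-printed turtle was built to survive reorientation, so this only catches the naive single-view spoofs.)

```python
import numpy as np

def predict_over_rotations(predict_fn, image):
    """Average class probabilities over the four 90-degree rotations
    and pick the winner, diluting a single-orientation spoof."""
    probs = np.mean([predict_fn(np.rot90(image, k)) for k in range(4)], axis=0)
    return int(np.argmax(probs))

# Toy model: fooled (class 1, "rifle") only when the marker pixel sits
# at the top-left corner, i.e. in the adversarial orientation.
def toy_predict(img):
    return np.array([0.1, 0.9]) if img[0, 0] == 1 else np.array([0.8, 0.2])

image = np.zeros((3, 3))
image[0, 0] = 1
single_view = int(np.argmax(toy_predict(image)))       # spoofed: class 1
averaged = predict_over_rotations(toy_predict, image)  # corrected: class 0
```

Three of the four views answer correctly, so the averaged probabilities recover the right label.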

  17. the Jim bloke

    The cat's name was Guacamole

    Why is there a problem?

  18. Kiwi
    Coat

    Lead paint may be in widespread use..

    That would explain the thickness of many in the TSA.

  19. RoboticRabbit

    Nice pets

    I've always wanted a pet guacamole. Hard to train them though.

    1. hplasm
      Coat

      Re: Nice pets

      I find pet guacamoles hard to hit with the mallet.

  20. JJKing
    FAIL

    Oh dear!

    The TSA scanner operators are going to get really confused when the Terry Wrists get scanned with turtles hanging from their shoulders.

    1. Kiwi
      Trollface

      Re: Oh dear!

      The TSA scanner operators are going to get really confused when the Terry Wrists get scanned with turtles hanging from their shoulders.

      Not sure something like that could happen. I don't believe they're capable of any further confusion.

  21. tomcouchman

    Couldn't you use a variation of a GAN setup, where you get the adversarial network to generate examples that fool the discriminator, while a human supervisor ensures the GAN still produces images clearly recognizable as a cat?
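The human supervisor here plays the role of a constraint: keep fooling the discriminator, but stay recognisably a cat. Automated attacks approximate that constraint by projecting the candidate image back into a small L-infinity ball around the original after each generation step; a toy sketch of just the projection (the function name, images and eps are invented for illustration):

```python
import numpy as np

def project_to_ball(x_adv, x_orig, eps):
    """Clip each pixel of the candidate image back to within eps of the
    original -- a crude automatic stand-in for 'still clearly a cat'."""
    return np.clip(x_adv, x_orig - eps, x_orig + eps)

cat = np.zeros(4)                             # toy original image
candidate = np.array([0.5, -0.5, 0.05, 0.0])  # proposed adversarial image
constrained = project_to_ball(candidate, cat, eps=0.1)
```

Any pixel the generator pushed too far from the original gets pulled back, so the adversarial search stays inside the "still a cat" region; a human in the loop is the (much slower) gold standard for that check.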

    1. hplasm
      Unhappy

      "...images clearly recognizable as a cat"

      What if the cat is covered in guacamole? What then???!!??
