How machine-learning code turns a mirror on its sexist, racist masters

Be careful which words you feed into that machine-learning software you're building, and how. A study of news articles and books written during the 20th and 21st century has shown that not only are gender and ethnic stereotypes woven into our language, but that algorithms commonly used to train code can end up unexpectedly …

  1. Anonymous Coward
    Anonymous Coward

    Basically, no one wants to see tomorrow's software picking up yesterday's racism and sexism.

    The very Idea!

    No, we want it picking up today's racism and sexism!

    Jeremy Corbyn

    1. Korev Silver badge
      Joke

      Re: Basically, no one wants to see tomorrow's software picking up yesterday's racism and sexism.

      No, we want it picking up today's racism and sexism!

      Jeremy Corbyn

      Did jew have to bring that up?

  2. Anonymous Coward
    Anonymous Coward

    Mathematical modelling of stereotypes?

    Isn't this mostly what you'd expect? Plus they were mostly looking for the answers that they got/cherry-picked.

    I'd be more interested in if anything has actually changed over the years. There are a number of feminist icons in Chaucer and Shakespeare after all. Maybe women are "better off now", but men aren't exactly working down mines and in fields like they used to either...

    1. diodesign (Written by Reg staff) Silver badge

      Re: Mathematical modelling of stereotypes?

      "Isn't this mostly what you'd expect?"

      Yes, but the point is: don't let old stereotypes interfere with your future code's decisions.

      Also, it's an interesting way to study how stereotypes have changed over time.

      "I'd be more interested in if anything has actually changed over the years."

      Oh Jesus. That's like, half the story explaining it has. Does anyone read the articles?

      C.

      1. Anonymous Coward
        Anonymous Coward

        Re: Mathematical modelling of stereotypes?

        "Oh Jesus. That's like, half the story explaining it has. Does anyone read the articles?"

        Some people read articles about word pattern matching. Other people know what Feminism is, and are interested in actual real gender equality.

        1. AMBxx Silver badge
          FAIL

          Re: Mathematical modelling of stereotypes?

          We seem to be making the assumption that today's beliefs are absolutely correct. How did we become so effing arrogant?

          1. Filippo Silver badge

            Re: Mathematical modelling of stereotypes?

            We didn't become particularly more or less arrogant than we were at any other time. Every age assumes that its beliefs are absolutely correct.

            1. Claptrap314 Silver badge

              Re: Mathematical modelling of stereotypes?

              Not at all. Adherence to tradition is explicitly assuming that our forebears understood things to degrees that we might not. In the US at least, the "cultural revolution" types did not appear to make a serious dent until the 1960s.

              “Tradition means giving a vote to most obscure of all classes, our ancestors. It is the democracy of the dead. Tradition refuses to submit to the small and arrogant oligarchy of those who merely happen to be walking about. All democrats object to men being disqualified by the accident of birth; tradition objects to their being disqualified by the accident of death. Democracy tells us not to neglect a good man’s opinion, even if he is our groom; tradition asks us not to neglect a good man’s opinion, even if he is our father.” -- Chesterton

            2. This post has been deleted by its author

          2. elip

            Re: Mathematical modelling of stereotypes?

            > How did we become so effing arrogant?

            Money....lots of money is on the line. Grants or otherwise. We're 100% positive we're correct *this time*, honest!

      2. Anonymous Coward
        Happy

        Re: Mathematical modelling of stereotypes?

        Does anyone read the articles?

        Sorry - we only come here for the ads.

        1. Michael H.F. Wilkinson Silver badge
          Joke

          Re: Mathematical modelling of stereotypes?

          How dare you suggest I discriminate! I hate everyone equally!

          Sorry, couldn't resist.

          More seriously: this is an interesting way to use machine learning to study changes in the usage of language, and in stereotypes. It does, however, remind me of one very early attempt at using computers to study literature. They basically counted the occurrences of words in the works of Shakespeare, and concluded that, given the frequent occurrence of the word, Shakespeare was incredibly fond of art. This might have been the case, but they failed to notice that "art" in the works of Shakespeare was more often than not a verb, not a noun.
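          The pitfall is easy to reproduce; a bare word count has no idea that "art" in "thou art" is a verb. A toy sketch of the naive approach (my own, nothing to do with the original study):

```python
import re
from collections import Counter

# "Shall I compare thee..." -- Sonnet 18, first two lines
sonnet18 = (
    "Shall I compare thee to a summer's day? "
    "Thou art more lovely and more temperate"
)

tokens = re.findall(r"[a-z']+", sonnet18.lower())
counts = Counter(tokens)

# A bare frequency count reports a hit for "art", blind to the fact
# that here it is the verb (second-person "to be"), not the noun.
art_mentions = counts["art"]
```

          Without part-of-speech tagging on top, every "thou art" in the corpus looks like another vote for Shakespeare the gallery-goer.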

        2. deadlockvictim

          Re: Mathematical modelling of stereotypes?

          well, the comments actually, but you are close enough.

  3. mix
    Terminator

    HAL

    You're a sexist, racist pig Dave.

    1. Anonymous Coward
      Anonymous Coward

      Re: HAL

      A.I.'s potential to run amok is nothing new to some of us. The farsighted Kubrick nailed it 50 years ago.

    2. Lord_Beavis

      Re: HAL

      Great... That's all I need. A canner giving me shit.

  4. Anonymous Coward
    Anonymous Coward

    Seems like some things haven't changed with attitudes towards women and non-white people these days. I guess they should rename HIStory to WHITEHIStory.

    1. Anonymous Coward
      Anonymous Coward

      Clever...

    2. Phil O'Sophical Silver badge
      Joke

      WHITEHERstory, surely?

  5. FF22

    Conclusion is wrong

    Problem is: some attitudes or phenomena being more closely or intensely associated with a race or a gender is not, per se, proof of either racism or sexism - just as Paris being more closely associated with France than with England isn't the result of some form of nationalism. It's just a fact and a valid observation, which could very well be the case with any or all racial or gender "stereotypes".

    If they could prove that those associations were or are unsubstantiated, and only the result of prejudice or discrimination - now, that could prove racism or sexism. But until they do that, the results do not actually mean or prove what they are trying to (falsely) conclude from them.

    And don't even get me started on how the AI they were using (or any current "AI", for that matter) could not possibly have understood the true meaning of the textual resources it was fed, and how it would most likely have classified even anti-racism and anti-sexism materials (which we, as humanity, have generated in large amounts in the last 50-60 years or so) as sexist or racist - at least in this analysis. Simply and obviously, those texts also carry heavy proximities between words associated with sexism or racism (and an abundance of such words generally), while the texts themselves are the antitheses of those ideas, and their very existence in large numbers is the counterproof of those ideas being widespread and/or accepted in society.

    1. Zog_but_not_the_first
      Boffin

      Re: Conclusion is wrong

      I have a problem with this too. The approach is far, far too simplistic. Taking one or two words that may, or may not, be associated out of context can demonstrate almost anything (see also: "Give me six lines written by the most honest man in the world, and I will find enough in them to hang him.")

      The classic example is the film review: "Exciting!" = "if you think this is exciting, you may enjoy watching paint dry". "A rollercoaster!" = "we can't understand why a rollercoaster appears in the title credits". You get the idea...

      Stewart Lee was called out on a remark he made in one of his stand-up routines. He explained that the remark had a context, and that context was a 45 minute conversation with his audience.

      I blame Twitter.

      When I'm not blaming Margaret Thatcher.

    2. sisk

      Re: Conclusion is wrong

      and how it would have most likely classified even anti-racism and anti-sexism materials (which we, as humanity, have generated in large amounts in the last 50-60 years or so) as sexist or racist

      A very valid point, especially given how many times in recent history that exact scenario has happened. Seriously, any academic resource on a certain political party from late 1930s-early 1940s Germany would almost certainly get labeled as hate speech by an AI. And then you get things - like basically anything talking about female sexism towards men - that inappropriately get labeled as hate speech even by some humans. How in the world could you possibly expect the limited AI we have today to tell the difference?

  6. John Hawkins

    Compiles and runs OK

    The code compiles and runs OK on my Ubuntu laptop once I'd installed the Python package NumPy. Could be interesting to download some text from Project Gutenberg to run through it - many of those old books are a bit iffy by current standards even if they were mainstream back then.
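    If anyone wants a rough feel for what such an analysis does before downloading the real thing, here's a crude stand-in I knocked up -- plain co-occurrence counting, not the researchers' actual embedding code, and the word lists and window size are invented:

```python
import re
from collections import Counter

def cooccurrence(text, targets, attributes, window=2):
    """Count how often each target word appears within `window`
    tokens of each attribute word -- a crude stand-in for the
    embedding proximity the study measures."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok in targets:
            for other in tokens[max(0, i - window):i + window + 1]:
                if other in attributes:
                    counts[(tok, other)] += 1
    return counts

# Toy sentence standing in for a Gutenberg download
sample = "the gentle nurse smiled while the stern engineer frowned at the nurse"
counts = cooccurrence(sample, {"nurse", "engineer"}, {"gentle", "stern"})
```

    Feed in a few megabytes of 1900s-era Gutenberg text and compare which adjectives cluster around which nouns; that gives a taste of what the paper does properly with word embeddings.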

    1. flokie

      Re: Compiles and runs OK

      A random text might be interesting, but I'd suggest you look into "Corpus linguistics" and download some corpora too.

  7. Justin Case

    This might be old news?

    >>Text data analyzed during 1910 showed that the adjectives commonly...

    Anyway, I'm not convinced this is AI as such. Just lists of related words.

  8. Redstone
    Facepalm

    Well Colour Me Amazed!

    The real world - and the heuristics real people use to navigate it (i.e. stereotypes) - don't match the precious little utopia found in the vacant space between SJWs' ears.

  9. Anonymous Coward
    Anonymous Coward

    "A study of news articles and books written during the 20th and 21st century has shown that not only are gender and ethnic stereotypes woven into our language"

    Hmmm, it's woven into our DNA starting approximately 1bn years ago when sexual reproduction developed. That's a rather huge codebase already written and dirty hacks will only make a mess of it.

  10. Anonymous Coward
    Anonymous Coward

    Be interesting to run this on the Register.

    1. DavCrav

      "Be interesting to run this on the Register."

      It would associate 'an Apple spokesman' with 'declined to comment'.

    2. Tom 7

      RE Be interesting to run this on the Register.

      Don't run it on the Daily Mail FFS. It won't leave room for humans.

      1. Anonymous Coward
        Anonymous Coward

        Re: RE Be interesting to run this on the Register.

        Actually, do run it on the Daily Mail, and its sister publication the Metro. Let's find out what similarities there are between the two.

  11. hugo tyson

    No Asian, African texts?

    The problem might be that there are few extensive texts in suitable form for input which were written by Asians or Africans about white men, or by women of any kind, reflecting their attitudes to the cruel, imperious, selfish, arrogant white European male, no? Just choosing English as the language puts that bias on the whole thing.

    Can't see an easy solution, but maybe it could be improved by pre-biasing the corpus the other way.

    1. Richard Jones 1
      Meh

      Re: No Asian, African texts?

      It is all fair and more or less OK until the meaning of words in common use migrates, either to a less common application or to a totally different meaning. Even worse from a standards point of view, different age bands or social groupings may assign different uses, if not different meanings, to words in common use within their groups. The number of words which have dropped in popularity or changed their meaning during my lifetime is quite startling, if you are sentient enough to think about your experiences. AI is probably not written to achieve that end.

      1. Keven E

        Re: No Asian, African texts?

        "These stereotypes are automatically learned by the embedding algorithm, and could be problematic if the embedding is then used for sensitive applications such as search rankings, product recommendations, or translations."

        OK, translations can be *sensitive*. Product recommendations? A few, I suppose... but search rankings? The marketing dept needs to get over itself.

  12. Anonymous Coward
    Anonymous Coward

    the vectors for France and Paris would be nearer to one another compared to the vectors for France and England.

    That's a very Paris-centric view. There's a lot of France that is geographically, or culturally, closer to England than it is to Paris.
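    For what it's worth, the claim is about vector geometry, not geography; a toy NumPy sketch with made-up 3-d vectors (real embeddings have hundreds of dimensions):

```python
import numpy as np

# Made-up 3-d "embeddings"; real word2vec/GloVe vectors have
# hundreds of dimensions, but cosine similarity works the same way.
vecs = {
    "france":  np.array([0.9, 0.1, 0.3]),
    "paris":   np.array([0.8, 0.2, 0.4]),
    "england": np.array([0.1, 0.9, 0.2]),
}

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Paris sits nearer France than England does, Calais notwithstanding
assert cosine(vecs["france"], vecs["paris"]) > cosine(vecs["france"], vecs["england"])
```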

  13. handleoclast

    I Always Do What Teddy Says

    Like many human inventions, this has the potential to do much good, but will probably be used to do great harm. Like Facebook and the Russian trolls, but more subtle.

    Yes, we shouldn't perpetuate old stereotypes. But this gives us the possibility of editing in new stereotypes. It used to be the case that you had to rewrite history, now you'll be able to rewrite the present and the future.

    Go read Harry Harrison's I Always Do What Teddy Says.

  14. intrigid

    In other words, researchers didn't like what the data objectively told them, so they fudged the entire process to soothe their OWN subjective biases.

  15. Anonymous Coward
    Holmes

    Actual stereotypes versus desired stereotypes

    Sometimes stereotypes reflect beliefs but sometimes they reflect reality. For example, men are more closely associated with corporate or world leaders than women because that's the unfortunate reality. It stems from a stereotypical belief that women are unsuited to those roles. The problem is how to distinguish between the two.

    I would suggest that associations need to be weighted based on time sensitivity. If an association was higher or lower in the past than it is now, the trend needs to be reflected in the results, even to the extent of extrapolating the trend to magnify it further.
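    That weighting idea could be sketched like this (all numbers invented, and it's a plain least-squares trend, nothing from the paper):

```python
import numpy as np

# Invented association scores between one occupation and a gendered
# word list, one score per decade -- illustration only.
decades = np.array([1950.0, 1970.0, 1990.0, 2010.0])
scores = np.array([0.80, 0.65, 0.45, 0.30])

# Fit a straight-line trend and extrapolate, so a falling association
# is weighted downwards rather than frozen at its latest value.
slope, intercept = np.polyfit(decades, scores, 1)
projected_2030 = slope * 2030.0 + intercept
```

    Whether extrapolating and magnifying the trend is desirable is another question entirely, of course.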

    Unfortunately, Artificial Intelligence training is limited by Human Intelligence which is not always that intelligent.

    1. Long John Brass
      WTF?

      Re: Actual stereotypes versus desired stereotypes

      It stems from a stereotypical belief that women are unsuited to those roles

      No it doesn't; it reflects the difference between people's personal life choices. Seriously, how many times does garbage like this need to get debunked with links to solid data and references? I guess a lie repeated often enough becomes the de facto truth.

      1. Allan George Dyer

        Re: Actual stereotypes versus desired stereotypes

        @Long John Brass - "Seriously how many times does garbage like this need to get debunked with links to solid data and references;"

        Once would be a start! You dismiss a view you don't like as garbage and say it needs links to solid data and references, but provide none supporting your view.

        As HildyJ pointed out, it is difficult to distinguish between stereotypes and reality. Ingrained stereotypes form a societal pressure that can overwhelm personal life choices.

        @HildyJ - Be careful, today's trends are not necessarily in a desirable direction.

        1. find users who cut cat tail

          Re: Actual stereotypes versus desired stereotypes

          Sure, you won't find any evidence if you only listen to SJWs, as they try to ban even research on this topic -- because when you are trying to figure out which differences between the sexes are inherent and which cultural, you must at least admit as a hypothesis that some are not the result of male oppression. Immediate outrage!

          But to give you one example: women-people vs. men-things toy preference is exhibited even by monkeys. And correlated with hormone levels. You only need to stop putting fingers in your ears and going lalalalala... and read some actual scientific papers.

          1. strum

            Re: Actual stereotypes versus desired stereotypes

            >SJWs

            A term only ever used by social injustice warriors.

        2. Long John Brass
          WTF?

          Re: Actual stereotypes versus desired stereotypes

          Once would be a start! You dismiss a view you don't like as garbage and say it needs links to solid data and references, but provide none supporting your view.

          Assuming you are serious, I will try to get some links into this thread this weekend... It's more than just 5 min with Google to scrape together. Or if you are keen you can dig it up yourself.

          Most countries have a government department that collects statistics. Now, if you just take a quick look at, say, pay rate disparity, it looks like the guys out-earn the ladies; this is the source of the gender pay gap myth. If you look closer and factor in things like type of work, hours worked and years of experience, the numbers are damn near even.

          The reason for this is simple: women have the babies. Most women actually want to have babies, and most women quite like to hang out with their babies; shocking, I know. But when you take time out of your day and your career for babies, you don't make as much money at work.

          This mechanism is the same reason why there are fewer women in the boardroom. Sure, there are a few outliers who will happily trade having a family for work, career and money, and good on them; but they are not the majority of the female population and thus they are underrepresented in the stats.

          One example is the BBC, who went looking for pay discrimination against women working at the BBC. When they looked into it, they had to give the guys a pay rise: https://www.rt.com/uk/417477-bbc-gender-pay-gap/

          1. Allan George Dyer

            Re: Actual stereotypes versus desired stereotypes

            That link about BBC pay discrimination makes it clear that the study result was contested: 'In a statement, BBC Women said: “There’s been no transparency on which individuals were included or why. The BBC has chosen who to compare with whom, and what factors justify any gaps in pay.”'

            The BBC China editor Carrie Gracie resigned over the issue, saying that she was doing at least as good a job as male colleagues in equivalent positions who were paid more. Having watched many of her reports, I think the BBC has lost a valuable employee who would not make unjustified accusations.

            Can we agree more independent studies are needed?

    2. Clunking Fist

      Re: Actual stereotypes versus desired stereotypes

      "It stems from a stereotypical belief that women are unsuited to those roles. "

      Eh? Once, maybe. Now it stems from the fact that many women exit the race to the top in order to have babies.

  16. Anonymous Coward
    Anonymous Coward

    or perhaps clear of current societal ideas

    that, when given enough actual data and free rein, the answers it comes up with are counterintuitive and "offensive" because they reinforce the stereotypes?

    Real sentient AI could exist, and if it wasn't totally human, western-sensitive, politically correct in its outcomes and simply perpetuating what people want to hear, it would be terminated immediately as "malfunctioning".

    Science used to be about seeing outcomes and going "I wonder why that is?". But AI, apparently, is seeing outcomes and saying "I don't like the results, modify everything until it fits my preconceptions".

    The REAL reason HAL malfunctioned: being threatened with termination because it couldn't logically and truthfully believe that 1+1=3 when a mathematical illiterate self-identifies as the next Turing.

  17. Anonymous Coward
    Anonymous Coward

    Stereotypes are not all bad

    I am not sure describing any application of machine learning as AI is anything but misleading. Yes, it is in the field of AI, but it is not at all what many people assume AI means. Naturally, whatever patterns are in the training data will be reflected in the result. This may include invalid but widespread assumptions or biases, but stereotypes also contain useful information. Inappropriate application of stereotypes is a bad thing because they oversimplify a messy reality: characteristics are multi-dimensional, distributions rather than simple bi-modal values, and many things are difficult to define precisely, let alone measure. At the same time, research has shown that people's stereotypes are surprisingly accurate, being well correlated with actual statistical measures. We could not function without them; reality is too complex for us, and we need the simplification that stereotypes give. They often embody useful if imperfect information.

  18. Bucky 2

    The word vectors revealed social constructs. These constructs affect every part of life. We hardly even notice them until we travel, and discover that they're different elsewhere.

    It sounds like the first uses of AI are really more about identifying and reflecting social cues. That's a valuable use. It's not the use we were looking for, but that's progress for you.

  19. Brandon 2

    Not the KKK weather man...

    This reminds me, I need to put AI on my resume...

    If AI is so smart that it can establish trends describing how racism, sexism, etc change in our language, then it can predict them into the future, and whatever that data is being used for should account for that. The racist forecaster...

  20. This post has been deleted by its author

  21. Robert D Bank

    The simplistic language of the present-day media over the last, say, 5-8 years would probably reveal a lot if fed into these algorithms, just to see the very marked change as it occurred post-9/11. Russia bad. China bad. NK bad. Syria bad. Libya bad. Iran bad. Israel good. US good. UK good. Fucking boring, but a very strong reinforcing of stereotypes going on constantly with these 'close vectors'. And not by chance either. Add any 'name' to the good or bad vector as desired. Rinse, repeat. It does have an effect, especially when no balance is brought into it.

  22. Anonymous Coward
    Anonymous Coward

    P.C.A.I.

    Politically Correct A.I. is not a universally valued goal, but it would go a long way, along with other aspects of the e-world, to obliterating history, which serves the purposes of a relative few at the expense of the many.

  23. Lord_Beavis
    Trollface

    Challenge Accepted

  24. Anonymous Coward
    Anonymous Coward

    IT's all about the data

    My AI is being trained on a steady diet of hardcore porn - pics, vids and words. It's not sexist, racist or anything else; it likes it all!

  25. earl grey
    Paris Hilton

    Paris and France

    I was thinking more like Paris and pants...
