fMRI bugs could upend years of research

A whole pile of “this is what your brain looks like” fMRI-based science has been potentially invalidated because someone finally got around to checking the data. The problem is simple: to get from a high-resolution magnetic resonance imaging scan of the brain to a scientific conclusion, the brain is divided into tiny “voxels”. …

          1. Pompous Git Silver badge

            Re: "Cargo cult science is a bit harsh for most."

            Could I have checked that when I bought it? Yes, of course, but you assume that the company selling right angles sells right angles, and not sort-of right angles. Do I "deserve [my] five minutes of shame"? I am not sure.

            No, you don't deserve shame, but you have learnt a valuable lesson: a careful workman never assumes; he checks. On the first building site I worked on I was taught two things:

            Measure twice and cut once.

            If you've got time to do it twice, you've got time to do it right the first time.

            An easy way to check a carpenter's square is to draw a line while holding it against a straight edge, then flip it and draw another line a smidgeon* away from the first. If the lines are not parallel, the square is useless.

            * Technically, a red cunt hair, but if there's no red cunt available, a smidgeon is near enough.
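
            A quick note (my addition, not the commenter's) on why the flip test is so sensitive: flipping the square reverses the sign of any error, so the two pencil lines diverge at twice the defect angle.

            ```latex
            % If the square is out of true by a small angle \epsilon, the first
            % line sits at 90 + \epsilon degrees to the straight edge and the
            % flipped line at 90 - \epsilon degrees. The angle between the two
            % pencil lines is therefore
            \[
            (90^{\circ} + \epsilon) - (90^{\circ} - \epsilon) = 2\epsilon,
            \]
            % so the test doubles the error, which is what makes even a small
            % defect visible over the length of the lines.
            ```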

        1. a_yank_lurker

          Re: "Cargo cult science is a bit harsh for most."

          Granted, someone deserves to be named and shamed for this fiasco. But there is a point at which one has to trust someone else to do their job correctly. The vendors apparently did not bother to test properly or, more telling, did not publish their test data. To me, the principal blame falls on the vendors, because they claimed they did their job correctly and obviously failed. The researchers get a much smaller portion for not asking the correct questions up front - questions which may be obvious only in hindsight.

          I remember a metallurgy professor commenting about crack-detection equipment. Vendors tout the smallest crack it can find, but he said one should ask what is the biggest crack it will miss. The size of the crack is critical when trying to determine whether a fatigue failure will occur, and I wonder how many fatigue failures happen because this is overlooked. This situation sounds similar: the researchers may not have known the right questions to ask until now.

      1. Voland's right hand Silver badge

        Re: Cargo Cult Science

        Cargo cult science is a bit harsh for most.

        Sorta. The underlying root cause is that statistics is not taught properly to biology/med science majors anywhere in the world.

        You need to learn probability theory properly, including its mathematical dependencies such as calculus, to be able to make sense of the modern statistics used in Biology and Medicine. I do not know of a country which has that in the biology major curriculum. Chemistry majors usually take a basic course, which is clearly insufficient by today's standards. Biology and medical sciences - not a chance.

        As a result, they cannot make sense of a false result even if it slaps them in the face like a Monty Python wet fish.

        1. Primus Secundus Tertius

          Re: Cargo Cult Science

          While I was a physics undergraduate, I had courses on statistics. I concluded that it was a difficult subject to teach.

          To prove the results that applied stats uses needs very advanced maths, which only a mathematical specialist will master. So those results are one-off recipes that have to be memorised individually. Contrast this with the classical geometry of Euclid, where the proofs are relatively straightforward and can easily be checked and re-proved if one needs to do that.

          We therefore reach the situation in statistics where in the physical sciences they are usually reasonable, in the biosciences they are wobbly, and in the social "sciences" they are non-existent. That last is particularly grievous because a lot of public expenditure is at stake there.

          1. Voland's right hand Silver badge

            Re: Cargo Cult Science

            To prove the results that applied stats uses needs very advanced maths,

            Yes

            To prove the results that applied stats uses needs very advanced maths, which only a mathematical specialist will master

            No.

            Based on my personal suffering with the same subject (I am not a math graduate either), I can conclude that, like the majority of non-math graduates, you have suffered from Idioticus Statisticus. I am translating the species name from a good friend of mine who ran the probability and stats department at a major EU university for several decades; that is how he referred to those of his colleagues who sucked so badly that they had to go and teach non-math majors. It is curable.

            Go read Feller, "An Introduction to Probability Theory and Its Applications". Amazon finally has it in paperback so you do not have to shell out the relatively obscene 100 or so quid for the hardcover (I have both the original and the English translation). Once you have read it, you are likely to change your opinion about "only advanced can grok it". It still holds a premium spot on my bookshelf to this day, many years after I last did anything in proper math or sciences. Most of it reads like a novel, by the way - you can enjoy reading it, which says a lot about a math book. I will probably add the English version of Gantmacher next to the original at some point, so I can tell the kids to sod off and go read it when they ask math questions.

            1. Pompous Git Silver badge

              Re: Cargo Cult Science

              Go read Feller, "An Introduction to Probability Theory and Its Applications". Amazon finally has it in paperback so you do not have to shell out the relatively obscene 100 or so quid for the hardcover

              A lot less than 100 quid from Abebooks. Thanks for the pointer :-)

            2. Anonymous Coward

              Re: Cargo Cult Science

              Go read Feller, "An Introduction to Probability Theory and Its Applications". Amazon finally has it in paperback so you do not have to shell out the relatively obscene 100 or so quid for the hardcover (I have both the original and the English translation).

              It's also on archive.org for free:

              https://archive.org/details/AnIntroductionToProbabilityTheoryAndItsApplicationsVolume1

              1. Pompous Git Silver badge

                Re: Cargo Cult Science

                It's also on archive.org for free:

                An Introduction To Probability Theory And Its Applications Volume 1

                Yer blood's worth bottlin' as we say in these parts :-)

          2. LionelB Silver badge

            Re: Cargo Cult Science

            Well said. Statistics is difficult and the maths is hard and subtle. But the hardest part of applied statistics is figuring out which tests to use, whether your data satisfies the underlying assumptions behind the tests, how to apply them correctly and how to interpret what the results are (and are not) telling you.
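
            To make the "check the assumptions" point concrete, here is a minimal sketch (my illustration, not the commenter's) in Python with SciPy: test the normality assumption before reaching for a t-test, and fall back to a rank-based test when it fails.

            ```python
            import numpy as np
            from scipy import stats

            rng = np.random.default_rng(1)
            a = rng.lognormal(size=50)  # skewed samples: the t-test's
            b = rng.lognormal(size=50)  # normality assumption fails here

            # Shapiro-Wilk tests the normality assumption itself.
            _, p_a = stats.shapiro(a)
            _, p_b = stats.shapiro(b)

            if p_a > 0.05 and p_b > 0.05:
                _, p = stats.ttest_ind(a, b)     # assumptions plausible
            else:
                _, p = stats.mannwhitneyu(a, b)  # non-parametric fallback

            print(p)  # the p-value from whichever test was appropriate
            ```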

        2. Doctor Syntax Silver badge

          Re: Cargo Cult Science

          "They underlying root cause is that statistics is not taught to biology/med science majors properly around the world."

          In my day, if we had a statistical problem we went and asked a statistician. I recall one such visit where a very simple test gave a 1 in N chance of a random pattern match. When we left he was still shaking his head and muttering about N: "That's a very big number".

          In this case it's a matter not so much of understanding the statistics as of understanding the code of a function. It's not clear from the report whether the function was open source or, even if it was, whether users would necessarily know it was incorporated into proprietary software.

      2. PNGuinn
        FAIL

        Re: Cargo Cult Science

        "I doubt very few of the researchers would or should have a clue about validating third party proprietary software."

        If you don't have a clear audit trail of your methodology, measurements and analysis, it's not science - it's bad magic at best. Go get a real job annoying chickens.

        If that sounds a little harsh, think of the damage false or fake results cause, especially when the wider world makes vital decisions based on them.

        If that proprietary software is not FULLY open source and third-party audited (note that I didn't say free and open source), it MUST be ASSUMED TO BE DODGY and no results based upon it can be taken as anything more than "interesting".

        The whole point of science is to be able to question EVERYTHING.

        1. DavCrav

          Re: Cargo Cult Science

          "The whole point of science is to be able to question EVERYTHING."

          Except you don't. You assume that the scales and thermometers are accurate. You are heading towards Descartes's evil demon with your thinking. It's 'standing on the shoulders of giants', not 'standing behind them constantly checking their shoes'.

          Science works by construction and destruction. You have to assume that most people are doing their jobs properly, because you cannot question absolutely everything. If I had to check every piece of research before using it I wouldn't be able to do any research. Sometimes errors are discovered, generally because new research contradicts old research or experiments don't match theory, and we have a big bonfire and throw a load of research on it. But I claim that this is by far the most effective method, rather than everyone being paralyzed through the requirement to replicate every single experiment.

          1. Pompous Git Silver badge

            Re: Cargo Cult Science

            "The whole point of science is to be able to question EVERYTHING."

            Except you don't. You assume that the scales and thermometers are accurate. You are heading towards Descartes's evil demon with your thinking. It's 'standing on the shoulders of giants', not 'standing behind them constantly checking their shoes'.

            I think you missed the word ABLE.

            You can assume that "thermometers are accurate" all you want, but that doesn't mean you are correct. Different sensors may well agree that water freezes at 0 °C and boils at 100 °C at a pressure of 1 atmosphere, but they will disagree on air temperature when placed in a screen.

            The energy balance of small temperature sensors was modelled to illustrate the effects of sensor characteristics, particularly size, on the accuracy of readings in the presence of strong shortwave or longwave radiant loads. For all but extremely small sensors, radiant exchange may lead to unacceptable errors. The common practice of using passively ventilated instrument screens was evaluated in a series of comparative measurements. The differences resulting from the use of different models of shields may be an order of magnitude greater than the error resulting from sensor calibration. In the absence of technological innovation capable of reducing the error due to radiant exchange to negligible proportions, it is suggested that a standard methodology for calibrating and labelling the error resulting from the characteristics of the screens be adopted, to allow comparison of new data with long-established records.

            http://www.fau.usp.br/aut5823/Medicoes/Erell_Leal_Maldonado_2005_Abrigos_Radiacao.pdf

            assume = to make an 'ass' out of 'u' and 'me'.

          2. Alan Brown Silver badge

            Re: Cargo Cult Science

            "Except you don't. You assume that the scales and thermometers are accurate. "

            I don't. I was brought up with the notion that you CHECK their calibration against known standards, which are in turn crosschecked against other standards and you periodically RECHECK the devices to ensure they have not gone wonky. You also need to know their limitations - thermometers being one example where they may be accurate in liquids but not in air.

            I was also taught the difference between RESOLUTION and ACCURACY.

            What that means is that an uncalibrated device can often tell you that there is a difference between two measurements to a high degree of resolution, but you can't tell what the absolute values really are. Both are useful traits. A calibrated device with low resolution might give an accurate reading, while a higher-resolution uncalibrated device lets you tell the difference between readings. (For this reason, a digital multimeter might give 3-4 digits of resolution, but the reading should never be considered accurate to more than one significant place even after calibration; nonetheless, those extra digits are still useful when tuning for peaks or minima.)
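
            A tiny sketch (my illustration) of that distinction: an unknown calibration offset ruins absolute readings (accuracy) but leaves the differences between readings intact (resolution).

            ```python
            true_values = [1.000, 1.003]                  # what we are measuring
            offset = 0.050                                # unknown calibration error
            readings = [v + offset for v in true_values]  # what the device shows

            print(readings[0])                # 1.05 -- wrong in absolute terms
            print(readings[1] - readings[0])  # ~0.003 -- the difference survives
            ```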

          3. Adrian Midgley 1

            Assuming scales???? No

            When I was taught to weigh things I was also taught to check the scales - balance beams with brass pans - by swapping the sides round.

        2. Anonymous Coward

          Non open source "must be assumed to be dodgy"

          I sure hope you don't mean to imply that open source software can be assumed to be not dodgy. That didn't work out too well for people relying on e.g. OpenSSL, which in the past year or two was found to have some bugs that were just as old as the bugs in this research software - despite being used by FAR more people and being FAR more critical to get right. Yeah, the OpenSSL code was rather messy, but research software is generally the ugliest code you've ever seen.

          There are only two ways to validate the output of research software like this. One, have a second version of it developed completely independently to act as a check - but who wants to "waste resources" on writing and maintaining it? Two, have someone occasionally work out the results "by hand" to check. I realize some may object and say "what if it needs a million calculations, it would take someone a lifetime to do that", but you can still do it with computers: you just need to break the problem down into its components and do the calculations separately, piece by piece, without referring to the standard research software at all.
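
          A minimal sketch of that second check (my example, not the commenter's): recompute a result from first principles, piece by piece, and compare it with the package's answer.

          ```python
          import numpy as np

          data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

          # The package's answers...
          pkg_mean = np.mean(data)
          pkg_var = np.var(data)

          # ...versus the definitions worked out step by step, with no
          # reference to the package at all.
          hand_mean = sum(data) / len(data)
          hand_var = sum((x - hand_mean) ** 2 for x in data) / len(data)

          assert abs(pkg_mean - hand_mean) < 1e-12  # agree to rounding
          assert abs(pkg_var - hand_var) < 1e-12
          ```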

          Even these aren't foolproof - what if the method is based on a paper someone wrote 20 years ago and it turns out there was an error in the paper that no one ever caught? In such a case the software would be "right" in terms of following the formula exactly, and separately developed software or calculations performed manually would show a matching - but wrong - result. Science is hard.

          1. Adrian Midgley 1

            Not automatically reliable, but

            very difficult to prevent anyone from discovering the faults.

            Proprietary software often seems designed to cover up faults.

    1. Arthur the cat Silver badge
      Headmaster

      Re: Cargo Cult Science

      Can also be anthropological studies of millennial belief systems in post-WW2 Melanesia.

  1. oldtaku Silver badge

    I'd have hoped most people twigged to this when that 2010 fMRI study found statistically significant activation in the brain of a dead salmon.

    1. Anonymous Coward

      Dead Salmon

      "brain of a dead salmon"

      Doesn't count - that was the control group.

      1. Mark 85
        Trollface

        Re: Dead Salmon

        I guess they couldn't get a politician for the control group then?

        1. mosw

          Re: Dead Salmon

          "I guess they couldn't get a politician for the control group then?"

          Actually it was a politician. It was just the salmon coloured comb-over that fooled them.

    2. David 132 Silver badge

      I'd have hoped most people twigged to this when that 2010 fMRI study found statistically significant activation in the brain of a dead salmon.

      Now that you mention it, I did think it was a bit fishy...

      1. Anonymous Coward

        Might have been a red herring.

        1. Code For Broke

          The whole series of experiments were nothing more than a fishing expedition.

          1. Captain Badmouth
            Happy

            They were just trawling for data, obviously, but the netting didn't quite balance things.

    3. Petrea Mitchell
      Boffin

      Salmon's off

      My first thought as well. (Though the study was actually in 2009.)

      http://blogs.discovermagazine.com/neuroskeptic/2009/09/16/fmri-gets-slap-in-the-face-with-a-dead-fish/

  2. allthecoolshortnamesweretaken

    I really like the "This is what your brain looks like on bad data" line.

    Question: any chance the raw data is still stored somewhere and can be processed again using revised software? (I guess not, but I've never worked with MRI machines.)

    BTW, why all the downvotes for the cargo cult reference? It may be a bit harsh in this context, but the basic point is valid. If you use something in your work that is a "black box" to you, you take a risk.

    1. You aint sin me, roit

      If you use something in your work that is a "black box" to you, you take a risk

      I imagine the whole MRI scanner is a black box to most biologists.

      Even if you wave your hands and say "nuclear magnetic resonance" you're using the black box of quantum mechanics - so do the biological researchers now need to be physicists? And those physicists... few fully understand the rigorous derivation of all of the black-box mathematical methods they use. So do the biologists now need to be mathematicians too?

      In the case of this research, would the researchers be expected to have reviewed the software? Would the software developers be expected to review the compiler, the operating system, the computer hardware, the software used to design chips?

      However, it is a bit worrying that the software was faulty... I wonder how many other researchers, heavily dependent on third party analysis packages, are quaking in their boots.

      1. EddieD

        Re: If you use something in your work that is a "black box" to you, you take a risk

        "I imagine the whole MRI scanner is a black box to most biologists."

        I don't think so - even back in the 80s when I was studying towards a Biology based degree, and NMR was a fairly new phenomenon, we were taught very carefully the underlying principles of both NMR and ESR spectroscopy. I don't think that that will have changed.

        And the University insisted that all science subjects had Maths and Stats courses.

        1. a_yank_lurker

          Re: If you use something in your work that is a "black box" to you, you take a risk

          You were taught enough to have a fair understanding of how it worked, commensurate with the amount of maths and quantum mechanics you had studied beforehand. This is true of all areas of science and engineering: you learn your own field reasonably deeply, but you are relatively weak in related fields.

          The problem is when one must rely on "black boxes" that one did not design, build, and program. One will always be at risk of either misusing the black box or, as in this case, discovering the black box has a serious flaw that corrupts the data and invalidates the experiments. I seriously doubt any of these researchers could design, build, and program an MRI machine.

      2. Alan Brown Silver badge

        Re: If you use something in your work that is a "black box" to you, you take a risk

        "However, it is a bit worrying that the software was faulty..."

        Software is ALWAYS faulty.

        One shining example is the way that floating-point calculations are handled. With the best will in the world there are always rounding errors introduced, and the way you write your software can actively seek to minimise them, or ignore them and compound the errors.

        We ran into real-world examples of this with IDL programs giving different answers in 32-bit and 64-bit environments. The assumption was that, as the 32-bit software had been around for years, the 64-bit environment was wrong.

        It turned out that BOTH were wrong, just in differing degrees. The resulting insights meant that more accurate (and consistent) results were achieved in both environments(*) AND the old raw data got reprocessed to give "better" answers.

        (*) Hint: Don't take the answer from a single calculation and use it as input for the next calculation, ad nauseam - that compounds the rounding errors.(**) Always recalculate from your base point.

        (**) It's akin to using pi = 22/7 or 3.142 and then iterating over several million subsidiary results. Whilst both are useful approximations, over a few million cycles you won't get circles any more. ALWAYS do all the integer stuff _first_ when calculating on a computer. Real numbers need to go into the calculation mix last.
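
        A minimal sketch (my example) of the compounding effect and the "recalculate from your base point" fix:

        ```python
        step = 0.1    # not exactly representable in binary floating point
        n = 1_000_000

        # Compounded: each value is built from the previous one, so a
        # million tiny rounding errors accumulate.
        x = 0.0
        for _ in range(n):
            x += step

        # Recomputed from the base point: a single rounding step.
        y = n * step

        print(x)           # 100000.00000133288 -- accumulated drift
        print(y)           # 100000.0
        print(abs(x - y))  # the compounded error, roughly 1.3e-6
        ```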

  3. Destroy All Monsters Silver badge
    Unhappy

    Only 20% of a software development team's brains are used at any moment!

    What would happen if they used 100% (as Einstein famously never said)? Click "HERE" (site related to Scientology) to find out!!!

  4. Anonymous Coward

    Any decent scientist checks the tools for accuracy

    Just because the tool is software does not absolve them of that responsibility.

    1. Anonymous Coward
      Thumb Down

      Re: Any decent scientist checks the tools for accuracy

      Just because the tool is software does not absolve them of that responsibility.

      This is not 1 + 1 = 3. This is millions of lines of software carrying out complex calculations, and widely used by a large community. It's like saying that any decent driver tests the software in their car for bugs.

    2. Doctor Syntax Silver badge

      Re: Any decent scientist checks the tools for accuracy

      To a very limited extent I agree with you. I used to be a Tutor Counsellor back in the early days of the OU. The kit sent out to foundation science students included a simple beam balance. Two of my students were teachers and they objected to this because they had nice digital balances at school - the lure of the numerical readout. I pointed out that with a beam balance they at least could see what their standards were. When my lab acquired its first digital balance (we'd lost our original kit in a fire) I took advantage of the fact that the local Weights and Measures folk were next door and went to borrow a standard mass from them to check it.

      Apparatus that you put together yourself (which you might well do if you're in a field where COTS kit isn't available) can and should be carefully tested and calibrated.

      But given the complexity and quantity of commercial stuff in any lab these days it's not feasible and the manufacturer's reputation has to be relied on.

    3. DavCrav

      Re: Any decent scientist checks the tools for accuracy

      How? This isn't a slide rule. This is a piece of software that turns a couple of terabytes of data into pictures. What do you test it with, and against?

      1. Anonymous Coward

        Re: Any decent scientist checks the tools for accuracy

        Trust but verify.

        Science based on unvalidated software is not science.

  5. John Smith 19 Gold badge
    Unhappy

    "“lamentable archiving and data-sharing practices”"

    Yes, that did put me in mind of the Climate "Research" Unit and the HARRY_READ_ME file.

    I've heard people comment that functional MRI has been used by drinks companies to identify brain activation in response to various drinks. Yes, that really is your brain on Coke.

    That research has been very quietly done and is obviously worth billions in a global market.

    "Know your tools" is a good moto but of course with tools this complex there is a reason for "we have to trust the developers knew what they were doing."

    This looks like the situation in CFD, but without that field's validation approach. In CFD no software (or major new release) gets accepted without multiple test runs against known test cases (many from live wind-tunnel tests). First they test, then they trust.
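
    A minimal sketch (my illustration) of that discipline: before trusting a numerical routine, run it against cases whose exact answer is known - the computational analogue of checking a CFD code against wind-tunnel data.

    ```python
    import math
    from scipy import integrate

    # Known test case: the integral of sin(x) over [0, pi] is exactly 2.
    result, _ = integrate.quad(math.sin, 0, math.pi)
    assert abs(result - 2.0) < 1e-9, "routine fails a known test case"
    ```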

    BTW, doing it with dead samples brought up the interesting factoid that about 500 genes were still firing after death, presumably as part of the organism's last-ditch attempts to self-heal.

    1. Vic

      Re: "“lamentable archiving and data-sharing practices”"

      with tools this complex there is a reason for "we have to trust the developers knew what they were doing."

      And yet - most of us here have worked in complex development environments, and we know the shite that goes out the door on ship day...

      In that field no software (or new major release) gets accepted without multiple test runs amongst known test cases

      The trouble with that sort of test-driven development is that you can easily end up in the situation where you're developing a product to pass those tests and do essentially nothing else; if the tests are not renewed frequently, your solution ends up touching the problem surface at a few discrete points only.

      Vic.

      1. Alan Brown Silver badge

        Re: "“lamentable archiving and data-sharing practices”"

        "The trouble with that sort of test-driven development is that you can easily end up in the situation where you're developing a product to pass those tests and do essentially nothing else;"

        I've seen this too: products getting fudge factors inserted by the programmers so that they pass the tests (that's what happens in the real world), rather than the programmers notifying the project initiator so the root cause of the discrepancy can be found and incorporated into the original hypothesis.

        People can be and have been sacked for this kind of shenanigans, and it's the kind of attitude from staff that results in things like accusations of deliberate climate-model fudging for political ends, etc.

        The fundamental problem is that the slacker attitude you see in many movies is alive and well even in well-funded science environments (and thriving in badly funded ones). No one wants to admit that they can't trust _all_ their staff to produce the right results (worse still, you don't know which staff are unreliable until long after the horse has left the barn), in the same way that no one really wants to admit that the biggest threat to any business is from within.

  6. Steve Crook

    Wee 'P' values

    There's an idea that a small 'P' value always means something, and also an idea that statistics isn't something that needs a specialist - that your graduate course in stats and a few Excel functions are enough to see you through...
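
    A minimal sketch (my example) of why a small 'P' value on its own means little: run enough tests on pure noise and some will come out "significant" by chance alone - essentially the dead-salmon problem mentioned earlier in the thread.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n_tests = 10_000

    # 10,000 t-tests on two groups drawn from the SAME distribution:
    # every null hypothesis is true, so every "hit" is a false positive.
    false_positives = 0
    for _ in range(n_tests):
        a = rng.normal(size=20)
        b = rng.normal(size=20)
        _, p = stats.ttest_ind(a, b)
        if p < 0.05:
            false_positives += 1

    print(false_positives)  # ~500, i.e. about 5% of the tests, as expected
    ```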

    I can't help feeling that every science team needs a statistician or three to analyse their data and produce results. Sort of like system/acceptance testing: members of the team, but kept apart so they can at least make a show of independence.

    As for publishing code and data to allow reproducibility, why would anyone want to do that unless they're forced to? IMO, for peer review to be worth anything, every published paper should have had its code and data analysed by at least one independent reviewer, and both should be published so that anyone can reproduce the results.

    Remember the paper published last year that claimed to show a correlation between right-wing views and psychotic tendencies? Someone got hold of the data and showed that a bug in the researchers' Excel spreadsheets meant they'd got the 'results' the wrong way round.

    http://retractionwatch.com/2016/06/07/conservative-political-beliefs-not-linked-to-psychotic-traits/

    If forcing reproducibility during review means that guff like this never gets published, job done.

    1. Alan Brown Silver badge

      Re: Wee 'P' values

      "Someone got hold of the data and showed that a bug in the researchers Excel spreadsheets "

      I think you mean a bug in the researchers. Using Excel for detailed analysis should be a sacking offence.

  7. Doctor Syntax Silver badge

    Who'd 'a'thunk it?

    Clustering software produces artefacts.

  8. demat
    Boffin

    Where's the problem?

    Provided the proper controls/baseline scans are in place, any systematic errors in the analyses will be accounted for.

    1. Uffish

      Re: Where's the problem?

      I agree - but what is a 'proper' baseline scan (or set of scans)? Totally MRI-ignorant nerds would like to know.

      1. demat

        Re: Where's the problem?

        The design of appropriate controls will completely depend on what the experiment is. The point is that any systematic errors will be present in both datasets.
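
        To illustrate the cancellation claim (my sketch, and it holds only for additive systematic errors): an error that is the same in both the condition and the control drops out of the difference.

        ```python
        true_signal = 1.2
        bias = 0.7                      # same systematic error in both scans

        condition = true_signal + bias  # scan with the effect present
        control = 0.0 + bias            # baseline scan, no effect

        print(condition - control)      # 1.2 -- the bias cancels
        ```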

  9. cantankerous swineherd

    Loads of neuro"science" goes down the drain, and a good thing too.

    Side note: statistics is inherently hand wavy, involving approximations and estimates. Add a lot of prerequisites for the theorems you're using but don't understand, and any statistical argument is prima facie dodgy.

    1. DavCrav

      "statistics is inherently hand wavy"

      You are wrong. Statistics is not inherently hand wavy. People who don't understand statistics think it's hand wavy.

      "involving approximations and estimates. add a lot of prerequisites for the theorems you're using but don't understand and any statistical argument is prima facie dodgy."

      I can tell you are not a mathematician or statistician by that argument, which boils down to: "I don't understand stats. Therefore nobody does".

      1. EddieD
        Joke

        "You are wrong. Statistics is not inherently hand wavy. People who don't understand statistics think it's hand wavy."

        Oh I dunno. Having something called "Tukey's honestly significant difference test" implies that there are dishonestly significant difference tests out there somewhere...
