A Rowhammer ban-hammer for all, and it's all in software

A group of German researchers reckon they've cracked a pretty hard nut indeed: how to protect all x86 architectures from the “Rowhammer” memory bug. It's been 18 months since “Rowhammer” first emerged, and responses have largely come from individual vendors working out how to block the “bit-flipping” attacks in their own …

  1. Anonymous Coward

    Memory controller feature

    This is a good approach, though detecting repeated accesses to a single DRAM row should be done by the memory controller. It could then force a refresh or slow the processor. Also use ECC RAM wherever affordable.
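To make the controller-side idea above concrete, here is a minimal toy sketch in Python. It is not how any real memory controller is implemented: the class name `RowActivationMonitor`, the threshold of 1000 activations and the "refresh both neighbours" response are all assumptions chosen for illustration.

```python
from collections import Counter


class RowActivationMonitor:
    """Toy model of a controller counting activations per DRAM row (hypothetical)."""

    def __init__(self, threshold, num_rows):
        self.threshold = threshold  # assumed activations tolerated per refresh window
        self.num_rows = num_rows
        self.counts = Counter()
        self.refreshed = []         # log of neighbour rows that were refreshed early

    def activate(self, row):
        """Record one activation; refresh the neighbours if the row looks hammered."""
        self.counts[row] += 1
        if self.counts[row] >= self.threshold:
            for victim in (row - 1, row + 1):
                if 0 <= victim < self.num_rows:
                    self.refreshed.append(victim)
            self.counts[row] = 0    # start counting again after the targeted refresh

    def end_refresh_window(self):
        """A regular refresh recharges every row, so all counters reset."""
        self.counts.clear()


monitor = RowActivationMonitor(threshold=1000, num_rows=8)
for _ in range(2500):
    monitor.activate(3)
print(monitor.refreshed)  # → [2, 4, 2, 4]
```

A busy database row and a hammering loop look the same to this counter, which is arguably fine: refreshing the neighbours early costs a little bandwidth in both cases and does no harm.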

    1. Lee D Silver badge

      Re: Memory controller feature

      I am, in fact, quite surprised that there is a low-enough level of control to exhaust a particular capacitor, certainly in any controlled way whatsoever.

      Software shouldn't have to deal with stuff like this as anything more than a stopgap. Like DEP etc. it should be using the hardware's inherent capabilities to manage this kind of thing, not doing the software "bouncer pushing certain groups back" method.

      1. Paul Crawford Silver badge

        Re: Memory controller feature

        Comes down to money eventually - people want cheaper/faster DRAM and so design margins are inevitably pushed down and refresh arrangements made more 'optimistic' so they don't block I/O too much, etc.

        ECC should trap this of course, but again few will pay the ~15% more for ECC DRAM and sadly most AMD motherboards don't support it even though AMD do in the CPU! For Intel you have to pay extra for the 'server' CPUs to use it (except I think for a few embedded CPUs where they grudgingly enable the feature).

        Still, this approach makes sense as it has little performance hit, and the general idea of identifying and separating physical RAM regions that are at risk of coupling in a rowhammer attack could be applied to other OSes as well. Assuming they care...

        1. Down not across

          Re: Memory controller feature

          "ECC should trap this of course, but again few will pay the ~15% more for ECC DRAM and sadly most AMD motherboards don't support it even though AMD do in the CPU!"

          I think you may find that at least on some motherboards ECC memory works fine even though the motherboard documentation doesn't say anything about supporting it. Just as an example, I have a Gigabyte 990XA-UD3 with an Athlon II X2 and it is running ECC memory with CentOS just fine.

          Nov 4 20:31:58 centos-test kernel: AMD64 EDAC driver v3.4.0

          Nov 4 20:31:58 centos-test kernel: EDAC amd64: DRAM ECC enabled.

          Nov 4 20:31:58 centos-test kernel: EDAC amd64: F10h detected (node 0).

          Nov 4 20:31:58 centos-test kernel: EDAC amd64: MC: 0: 0MB 1: 0MB

          Nov 4 20:31:58 centos-test kernel: EDAC amd64: MC: 2: 2048MB 3: 2048MB

          Nov 4 20:31:58 centos-test kernel: EDAC amd64: MC: 4: 0MB 5: 0MB

          Nov 4 20:31:58 centos-test kernel: EDAC amd64: MC: 6: 0MB 7: 0MB

          Nov 4 20:31:58 centos-test kernel: EDAC amd64: MC: 0: 0MB 1: 0MB

          Nov 4 20:31:58 centos-test kernel: EDAC amd64: MC: 2: 2048MB 3: 2048MB

          Nov 4 20:31:58 centos-test kernel: EDAC amd64: MC: 4: 0MB 5: 0MB

          Nov 4 20:31:58 centos-test kernel: EDAC amd64: MC: 6: 0MB 7: 0MB

          Nov 4 20:31:58 centos-test kernel: EDAC amd64: using x4 syndromes.

          Nov 4 20:31:58 centos-test kernel: EDAC amd64: MCT channel count: 2

          Nov 4 20:31:58 centos-test kernel: EDAC amd64: CS2: Unbuffered DDR3 RAM

          Nov 4 20:31:58 centos-test kernel: EDAC amd64: CS3: Unbuffered DDR3 RAM
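Once the EDAC driver reports "DRAM ECC enabled" as in the log above, the Linux kernel also exposes running error counters through sysfs. The following is a small sketch, assuming the usual `/sys/devices/system/edac/mc/mcN/ce_count` and `ue_count` files are present on the machine; the function name `read_edac_counts` is made up for this example.

```python
from pathlib import Path


def read_edac_counts(base="/sys/devices/system/edac/mc"):
    """Return {controller: (corrected, uncorrected)} read from EDAC sysfs files."""
    root = Path(base)
    if not root.is_dir():
        return {}  # no EDAC driver loaded, or not a Linux system
    counts = {}
    for mc in sorted(root.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text())
            ue = int((mc / "ue_count").read_text())
        except (OSError, ValueError):
            continue  # skip controllers whose counters cannot be read
        counts[mc.name] = (ce, ue)
    return counts


if __name__ == "__main__":
    for name, (ce, ue) in read_edac_counts().items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

A corrected-error count that climbs while a rowhammer test is running would be a fairly strong hint that ECC is doing real work on that board.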

  2. John Smith 19 Gold badge

    Shouldn't be possible.

    But is.

    The question of course is when is a lot of activity on a row malware and when is it a very busy record in a database?

    Still <1% performance hit for a software solution is pretty impressive.

    1. Paul Crawford Silver badge

      Re: Shouldn't be possible.

      I suspect most servers used for serious database work would have ECC DRAM and would probably have been tested (often called "qualified") to confirm that they work without crashing.

      My Asus Chromebook, now running Linux, hangs occasionally. When I tried the rowhammer example it hung the same way. Also it hangs on memtest86 unless you use the 'safe' mode, so guess who has crappy RAM?

      1. Michael H

        Re: Shouldn't be possible.

        Apparently, ECC is not a totally effective mitigation for rowhammer, since ECC is only guaranteed to detect single-bit errors, whereas rowhammer can flip multiple bits.

        1. Paul Crawford Silver badge

          Re: Shouldn't be possible.

          Yes, but if ECC can't correct an error (it will often detect multi-bit errors, but can't fix them), your machine will normally reboot.

          Not ideal, but then you *know* that something is wrong and it is better than silently being backdoored.

          1. Anonymous Coward

            Re: Shouldn't be possible.

            Would be nicer if ECC handled things more gracefully. e.g. send a signal/interrupt to the kernel indicating corruption in that region, and allow the kernel to decide what to do. Within the mapped area of a userland process? Kill the process. Within a file backed area? No big deal, remap the file somewhere else in memory. Within the kernel itself? Panic.
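The policy described above is essentially a small decision table. A rough sketch of it in Python follows; the `Region` names and the returned action strings are invented for illustration, and the sketch assumes a file-backed page is clean (unmodified), since only then is an intact copy still available on disk.

```python
from enum import Enum


class Region(Enum):
    """Hypothetical classification of the memory hit by an uncorrectable error."""
    USER_ANON = "anonymous userland memory"
    FILE_BACKED_CLEAN = "clean file-backed page"
    KERNEL = "kernel memory"


def handle_uncorrectable_error(region):
    """Pick the least drastic response that still avoids using corrupt data."""
    if region is Region.FILE_BACKED_CLEAN:
        return "discard page and re-read it from the file"
    if region is Region.USER_ANON:
        return "kill the owning process"
    return "panic"  # corruption inside the kernel itself cannot be trusted away


for region in Region:
    print(f"{region.value}: {handle_uncorrectable_error(region)}")
```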

            1. Anonymous Coward

              Re: handled things more gracefully.

              "Would be nicer if ECC handled things more gracefully. e.g. send a signal/interrupt to the kernel indicating corruption in that region, and allow the kernel to decide what to do."

              Been happening for years. Of course you may have to look outside the weird and wonderful world of x86 hardware and software to do that, but it's definitely not rocket science.

              E.g. this snippet from 2003

              http://h41379.www4.hpe.com/wizard/wiz_8771.html

        2. Solmyr ibn Wali Barad

          Re: Shouldn't be possible.

          No, ECC is pretty much guaranteed to detect multi-bit errors. It calculates a CRC-like checksum of every data word and stores it in a separate space. Most commonly it's 8 checksum bits for every 64 data bits. Checksums are used on every read and write operation. If there is an uncorrectable error, the memory controller has to issue an NMI signal and reboot the machine.

          As for correction - normal ECC has sufficient checksums to correct one wrong bit, but there are implementations in the wild that can correct up to 4 bits. 8-bit versions exist in research papers.
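For anyone wanting to see what "8 checksum bits for every 64 data bits" can and cannot do, below is a minimal sketch of a textbook extended Hamming (SECDED) code over a 64-bit word. It is an assumption that this resembles what a given DIMM and controller do; real hardware typically uses a different but equivalent-strength code, and the bit layout and function names here are purely illustrative.

```python
DATA_BITS = 64
PARITY_POSITIONS = [1, 2, 4, 8, 16, 32, 64]   # 7 Hamming check bits
CODE_LEN = DATA_BITS + len(PARITY_POSITIONS)  # positions 1..71; index 0 = overall parity
DATA_POSITIONS = [p for p in range(1, CODE_LEN + 1) if p not in PARITY_POSITIONS]


def _syndrome(bits):
    """XOR of the positions of all set bits; zero for a valid Hamming codeword."""
    s = 0
    for pos in range(1, CODE_LEN + 1):
        if bits[pos]:
            s ^= pos
    return s


def encode(word):
    """Spread a 64-bit word over 72 bits: 64 data + 7 check + 1 overall parity."""
    bits = [0] * (CODE_LEN + 1)
    for i, pos in enumerate(DATA_POSITIONS):
        bits[pos] = (word >> i) & 1
    s = _syndrome(bits)
    for p in PARITY_POSITIONS:
        if s & p:
            bits[p] = 1               # choose check bits so the syndrome becomes zero
    bits[0] = sum(bits[1:]) % 2       # make the total number of set bits even
    return bits


def decode(bits):
    """Return (status, word); status is 'ok', 'corrected' or 'uncorrectable'."""
    bits = list(bits)
    s = _syndrome(bits)
    odd = sum(bits) % 2 == 1
    if s == 0 and not odd:
        status = "ok"
    elif odd and s <= CODE_LEN:
        bits[s] ^= 1                  # odd parity: assume one flip, at position s
        status = "corrected"
    else:
        status = "uncorrectable"      # even parity with non-zero syndrome: two flips
    word = sum(bits[pos] << i for i, pos in enumerate(DATA_POSITIONS))
    return status, word


def flip(bits, *positions):
    """Return a copy of the codeword with the given bit positions inverted."""
    out = list(bits)
    for p in positions:
        out[p] ^= 1
    return out


original = 0xDEADBEEFCAFEF00D
codeword = encode(original)
print(decode(codeword)[0])                      # → ok
print(decode(flip(codeword, 10))[0])            # → corrected
print(decode(flip(codeword, 10, 20))[0])        # → uncorrectable
status, word = decode(flip(codeword, 3, 5, 6))
print(status, word == original)                 # → corrected False
```

The last line is the awkward case for this particular code: one and two flips are handled as advertised, but three flips in one word can look like a single-bit error and be "corrected" into the wrong data. Stronger codes push that limit further out, as noted above.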

  3. David Roberts

    Still does my head in

    1) Hammer row

    2) Random memory corruption

    3) ???

    4) Profit!

    I can visualise how buggering up memory can cause other programs to mis-behave but still struggle to visualise how you can force such a specific mis-behaviour that you can take over control of the machine.

    1. Dave 126 Silver badge

      Re: Still does my head in

      >I can visualise how buggering up memory can cause other programs to mis-behave but still struggle to visualise how you can force such a specific mis-behaviour that you can take over control of the machine.

      The Google Project Zero work the article refers to is outlined here. It should answer your question better than I can!

      https://googleprojectzero.blogspot.co.uk/2015/03/exploiting-dram-rowhammer-bug-to-gain.html

      I think the rough idea is that by hammering the memory bits you have permission to access, you can flip a bit in adjacent memory that otherwise would be off limits to you. Part of the exploit method is to deliberately fragment the machine's memory before the hammering, so that there is a greater chance of accessible memory being adjacent to memory reserved for the kernel.

  4. Loud Speaker

    Solution

    If chips are vulnerable to rowhammer, they are defective, and should have been rejected during manufacture (although standard practice appears to be to sell them in 3rd world countries).

    Please can we have the Linux memory test upgraded to detect rowhammer so we can check our own memories? (Quickly - I have just ordered a bunch of cheap memory from eBay).

    1. Olius

      Re: Solution

      You could argue it is a design defect, but it isn't a manufacturing defect.

    2. Dwarf

      Re: Solution

      Cheap memory is a false economy. You will pay from the curse of random crashes and data corruption.

      1. DropBear

        Re: Solution

        "Cheap memory is a false economy."

        ...and expensive memory is money spent on unneeded heatsinks and flashing LEDs or on a brand name that sells the same OEM hardware with a flashy name on it. I'm _not_ saying there is no such thing as better, more reliable memory - I'm saying good luck figuring out when your money pays for an actual difference in quality...

        1. Brewster's Angle Grinder Silver badge

          Re: Solution

          "...good luck figuring out when your money pays for an actual difference in quality.."

          If only there was some way a reviewer or end user could test it. Hmmm.

        2. allthecoolshortnamesweretaken

          Re: Solution

          What? No blinkenlights? That's heresy, that is!

      2. Anonymous Coward

        Re: Solution

        "You will pay from the curse of random crashes and data corruption."

        Isn't that the curse of running an OS with untrustworthy code and untrustworthy memory management? Don't even need dodgy hardware...

    3. Mr Flibble

      Re: Solution

      Memtest86 has Rowhammer tests these days. I advise running that for several hours, at least; enough time to let it complete a few runs.

    4. Anonymous Coward

      Re: Solution

      I see it that way too: if rowhammer works, the memory chip will not operate correctly when running some algorithms, so it is faulty, not fit for purpose, and should be sent back, with potentially a class action taken out against the manufacturer. However, the fact that the manufacturers are not being sued suggests that this errant behavior is declared somewhere in the depths of the datasheets. Is this true? Do they declare that the memory will corrupt if it is used intensively? And finally, is there any conceivable legitimate computational algorithm that is doomed to fail due to the rowhammer errors it will trigger?

  5. FlippingGerman

    The most obvious way to defend against this that presents itself to me is to have a physical buffer of RAM around each process's memory. Just some unused space. It would waste some RAM, though I wouldn't have thought it would be too much.
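A back-of-the-envelope sketch of that buffer idea, in Python. It assumes a deliberately simplified model: memory is a flat sequence of DRAM rows, each owner gets a contiguous run, and only immediately adjacent rows can disturb each other. Real physical-address-to-row mappings are more complicated, and the function name and owner labels are made up for the example.

```python
def allocate_with_guard_rows(requests, total_rows):
    """Place each (owner, rows_needed) request, leaving one unused row between owners."""
    layout = {}
    next_row = 0
    for owner, rows_needed in requests:
        end = next_row + rows_needed
        if end > total_rows:
            raise MemoryError(f"not enough rows left for {owner}")
        layout[owner] = range(next_row, end)
        next_row = end + 1  # skip one row as a guard buffer
    return layout


layout = allocate_with_guard_rows(
    [("kernel", 4), ("proc_a", 3), ("proc_b", 2)], total_rows=16
)
for owner, rows in layout.items():
    print(owner, list(rows))

used = sum(len(rows) for rows in layout.values())
guards = len(layout) - 1
print(f"guard rows: {guards} for {used} rows in use")  # → guard rows: 2 for 9 rows in use
```

In this model the waste is one row per boundary between owners, so it stays small as long as each owner's memory is kept in a few large contiguous runs rather than scattered across many fragments.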
