Place your bets: How long will 1TFLOPS HPE box last in space without proper rad hardening

SpaceX and HPE will put a modest little supercomputer into space next week to test how computer systems operate in extreme conditions. On Monday, August 14, HPE’s Spaceborne Computer will blast off to the International Space Station aboard a SpaceX CRS-12 rocket. It’s part of an experiment to examine if commercial off-the- …

  1. s. pam Silver badge
    FAIL

    The predecessor to HaL 9000?

    I'm sorry Dave but no HaL 9000 has ever made a mistake. We have a perfect record...bzzt...feh....zap...

  2. Anonymous Coward
    Anonymous Coward

    Lucky it wasn't a Microsoft Surface...

    According to ConsumerReports, it wouldn't survive the takeoff let alone a year in Space.

    1. Anonymous Coward
      Joke

      Re: Lucky it wasn't a Microsoft Surface...

      That's why they are called Surface - Earth surface only use...

  3. Suricou Raven

    Not quite how I'd have gone about things, but I assume the experts know what they are doing.

    I'd have sent a box packed with many different sorts of chips running software to detect and quantify errors, with the aim of figuring out exactly how much ECC and self-validating software it takes to make sure a computer can operate reliably even with the radiation. Perhaps eventually involving having two less-than-reliable conventional processors operating in lock-step, with a third rad-hardened, very minimal chip constantly comparing them and initiating a reset every time they do something different. It'd probably weigh less than packing everything behind a big sandwich of plastic, lead and boron.

    1. TRT Silver badge

      I'm sure they've already done that. This is to see how a computer that is as close as practicable to off-the-shelf fares, knowing already how the chips themselves do.

      1. Ken Hagan Gold badge

        It is also possible to irradiate things down here on Earth, and I'm sure they've done that too. This exercise sounds more like the "all up" test that proves something you are already pretty damn sure of.

    2. Anonymous Coward
      Anonymous Coward

      Lockstep RIP

      This isn't rocket science, even if it's being used in a rocket.

      The lockstep+comparator idea doesn't really work for modern processors where cache plays a significant part in performance. E.g. there will be correctable cache errors ('soft' errors) from time to time, but they can't be guaranteed to be the same place and same time on each member of the set.

      If the comparators are simple, e.g. relying on instruction-level and/or memory-access level comparison, a soft error that occurs on only one member of the set will cause different behaviour from the ones without the soft error, and therefore (in your proposal) cause a reset.

      In the real world, fault tolerant systems such as Tandem NonStop (now of course HPE NonStop) nowadays use cache-based commodity processors, same as everyone else. For error detection they use rather specialised comparators that do the comparisons on the eventual IO accesses caused by the instructions. If the IO accesses differ, something went wrong.

      There's more to it than that, but more than I have time to describe right now.

      1. Random Q Hacker

        Re: Lockstep RIP

        VMware does this too. In fact, running everything in VM where it can be snapshotted, rolled back, etc probably makes a lot of sense for these missions...

        1. Flocke Kroes Silver badge

          Re: Lockstep RIP

          What to do when radiation bit flips the hypervisor? Obvious: run three hypervisors in virtual machines and compare their output...

    3. Anonymous Coward
      Anonymous Coward

      Lockstep on conventional processors doesn't work on Earth anymore, which is why it was dropped by the fault tolerant computing HPE NonStop team back during the transition between MIPS and Itanium. Modern processors just do too much internal error correction and have too much pulled into the SoC to give a boundary that can be checkpointed and subject to a majority vote. Itanium featured radiation hardened latches in its pipeline and a lockstep mode, but at the cost of running at a fraction of the speed of normal mode and you're paying for at least two if not three processors.

      The NonStop team started with an Itanium design that moved the checkpoint voting to the memory subsystem with a hardware checker and memory replication between boxes over optical links, but eventually figured out how to do the whole thing in software, which in turn has allowed them to transition to standard x86 blades.

    4. Lusty

      "Perhaps eventually involving having two less-than-reliable conventional processors operating in lock-step"

      Can't vote with two systems, you'd never know which had an error because you wouldn't know the right answer. Use three and the two matching answers can be used. It's different to a normal cluster where you're only detecting failure since here you're also detecting subtle errors.
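A minimal sketch of the voting point above, assuming each replica simply returns a comparable result value. The function name `vote` and the sample values are made up for illustration and do not come from any real fault-tolerance library.

```python
from collections import Counter

def vote(results):
    """Return (winner, dissenters); winner is None when no strict majority exists."""
    counts = Counter(results)
    value, n = counts.most_common(1)[0]
    if n * 2 <= len(results):
        # A tie: an error is detected, but nobody can say which replica is right.
        return None, list(range(len(results)))
    dissenters = [i for i, r in enumerate(results) if r != value]
    return value, dissenters

# Two replicas: the mismatch is noticed, but there is no way to pick a winner.
print(vote([42, 43]))      # (None, [0, 1])
# Three replicas: the two matching answers outvote the corrupted one.
print(vote([42, 43, 42]))  # (42, [1])
```

With two replicas the only safe reaction to a mismatch is a reset or retry; with three, the odd one out can be identified and the computation can carry on.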

    5. Hero Protagonist

      "It'd probably weigh less than packing everything behind a big sandwich of plastic, lead and boron."

      Am I the only one who initially read that last word as "bacon"?

      1. Ken 16 Silver badge
        Trollface

        Bacon?

        Pigs! In! SPAAAAACE!

  4. TRT Silver badge

    For what the adverts tell us...

    off-the-shelf iPhones and American wrestlers are already being space tested.

    1. SkippyBing

      Re: For what the adverts tell us...

      Where do I get an off-the-shelf American wrestler?

      Asking for a friend...

  5. MachDiamond Silver badge

    Sol help us

    Solar activity is pretty low right now so things might be just fine. Not rad-hardening electronics is a road SpaceX has been down a couple of times. They ruined some returned experiments when the fridge on board a Dragon capsule got zapped and there was at least one other incident that I can't recall the details of off hand.

    Been there, flown that, won the NASA prize.

    1. Anonymous Coward
      Anonymous Coward

      Re: Sol help us

      Why does a fridge need a microprocessor?

      1. Francis Boyle Silver badge

        well

        I assume they weren't just using it to store yesterday's leftovers and a two-year old jar of Branston pickle that "still should be good - it's been kept in the fridge all the time".

      2. Suricou Raven

        Re: Sol help us

        Simple thermostat without a microprocessor: Temperature sensor, power MOSFET, comparator circuit, trim pot to set temperature, two resistors (inc. hysteresis)

        Simple thermostat with a microprocessor: Temperature sensor, power MOSFET, PIC microcontroller. Plus it can do PWM with PID feedback, and soft start.

        It's common in electronics to use microcontrollers for absolutely everything now because they are of near-negligible cost and can usually do the task of several more basic components.
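As a rough illustration of what the microcontroller version buys you, here is a sketch of the hysteresis behaviour that the comparator circuit gets from its two resistors, written as a few lines of logic instead. The setpoint, the dead band width and the name `thermostat_step` are assumptions for the example, not taken from any real fridge firmware.

```python
def thermostat_step(temp_c, cooling_on, setpoint_c=4.0, hysteresis_c=1.0):
    """Bang-bang control with a dead band: return the new compressor state."""
    if temp_c >= setpoint_c + hysteresis_c:
        return True           # too warm: switch cooling on
    if temp_c <= setpoint_c - hysteresis_c:
        return False          # cold enough: switch cooling off
    return cooling_on         # inside the dead band: keep the current state

state = False
history = []
for reading in [3.5, 4.5, 5.2, 4.8, 3.9, 3.1, 2.9, 3.6]:
    state = thermostat_step(reading, state)
    history.append(state)
print(history)
```

The dead band stops the compressor chattering on and off around the setpoint. Changing it in software is a one-line edit, whereas in the analogue version it means swapping resistors.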

      3. annodomini2

        Re: Sol help us

        "Why does a fridge need a microprocessor?"

        Cost, it's usually cheaper to put a micro + code, than design/test an analogue circuit.

        With the Micro usually working out cheaper component wise.

        Plus they can potentially add features with a firmware update.

    2. Gene Cash Silver badge

      Re: Sol help us

      No, the fridge got ruined when seawater got into it on landing.

  6. Charles 9

    What about computers at rest?

    Now, yes, you're going to need some well-built stuff to use while in transit between Earth and Mars, but is this also true for computers at rest, powered down, and packed up? Can ionizing radiation have deleterious effects for data or even hardware that isn't operating yet but will be? I would think this to be an interesting question as well as most of the computing power one would take to Mars wouldn't be in use during the trip, only once one arrives.

    1. Anonymous Coward
      Anonymous Coward

      Re: What about computers at rest?

      It takes a lot more energy to damage electronics that are powered off. If a powered off computer is permanently damaged on the trip to Mars, your astronauts are probably dead. I'd worry more about NAND, since it needs to preserve state, but error correction would presumably handle it. Probably you're going to mirror everything anyway, so that should account for the (perhaps unlikely?) case where a single energetic particle is traveling at just the right angle to upset more bits in the same word than ECC can correct.

      1. Anonymous Coward
        Anonymous Coward

        Re: What about computers at rest?

        Surely the problem is that a high energy particle is going to create a shower of particles as it makes its way through matter so bits in the same block are very likely to be affected, hence bit striping (single bit wide memory in parallel).
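A toy model of the bit striping idea, assuming a burst of flips hits physically adjacent cells and that the ECC can correct one flipped bit per word. The layout, the sizes and the name `flips_per_word` are illustrative assumptions, not a description of any particular memory part.

```python
def flips_per_word(burst_start, burst_len, words, bits_per_word, interleaved):
    """Count how many flipped bits of a contiguous burst land in each logical word."""
    counts = [0] * words
    for cell in range(burst_start, burst_start + burst_len):
        if interleaved:
            word = cell % words            # neighbouring cells belong to different words
        else:
            word = cell // bits_per_word   # neighbouring cells belong to the same word
        counts[word] += 1
    return counts

# A burst flipping 4 adjacent cells in an 8-word by 8-bit array.
plain = flips_per_word(10, 4, words=8, bits_per_word=8, interleaved=False)
striped = flips_per_word(10, 4, words=8, bits_per_word=8, interleaved=True)
print(plain)    # [0, 4, 0, 0, 0, 0, 0, 0]
print(striped)  # [0, 0, 1, 1, 1, 1, 0, 0]
```

In the plain layout one word takes all four flips, which is more than single-bit correction can repair. In the striped layout four different words take one flip each, and each of those is correctable.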

  7. Anomalous Cowshed

    Test about exposure to radiation in space

    Doesn't the Earth's magnetic field provide quite a lot of shielding against space-borne radiation in low earth orbit? Isn't the real problem working out the survivability of electronics in deep space, where there truly isn't any protection?

    1. I ain't Spartacus Gold badge

      Re: Test about exposure to radiation in space

      Yes, otherwise the ISS astronauts would be unlikely to live too long. But this is space research. So you do everything with baby steps.

  8. MNGrrrl
    Coffee/keyboard

    Not a good test

    Radiation inside the Van Allen belts is very low; Except for solar flares it's a non-issue. About 99% of the total solar radiation is deflected by Earth's magnetic field. For Mars, it's another story -- it weighs in at nearly .7 sievert per week. For comparison, the ISS receives about 150 mSv **per year**. It's not a valid test because the environment isn't anything like it would be out there. Regular PCs are already on the ISS, with no real ill effect other than a few extra reboots here and there.

    1. Suricou Raven

      Re: Not a good test

      Yet another reason I don't think a manned mars mission is going to happen.

      I'd like it to happen. I think it should happen. I wish it would happen. But in the end, it won't - because some group of national leaders is eventually going to have to look at the bill and realise that's a hell of a lot of money even by government standards. Especially as the public is going to insist upon bringing the astronauts back again afterwards.

      1. Rich 11

        Re: Not a good test

        Yet another reason I don't think a manned mars mission is going to happen.

        I think it'll happen, but not for several decades yet, and not at anything like the scale those suicidal would-be colonist idiots imagine.

        Since any interplanetary vessel is going to have to have a storm shelter for its crew (which most spaceship designers envisage being inside the ship's water tank), it makes sense for that to also be the location for the core computing systems (the water will be a handy coolant too).

        And who among us hasn't ever dived into the server room to escape unwelcome visitors?

        1. Alistair
          Windows

          Re: Not a good test

          @ Rich 11

          If your server room is at the bottom of the pool, I'm coming to work with you.

          @ElReg, need a scuba icon!

    2. John Smith 19 Gold badge
      Unhappy

      "Except for solar flares it's a non-issue. "

      3 little words.

      South Atlantic Anomaly.*

      *Thanks to Henry Spencer for that

    3. Brangdon

      Re: Not a good test

      I'm not sure where you are getting your figures from. On the surface of Mars you are protected by the atmosphere, and Mars itself, so the radiation is similar to that on the ISS. You would also build a shelter from regolith, or situate it in a lava tube, which would give excellent shielding when you didn't need to be working on the surface or during solar events.

      The journey there and back are another matter, but hopefully the trip will be under 3 months and the ship itself will provide some shielding. Overall we're talking risk of death by cancer increased by a few percent. If that bothers you, don't go.

      1. John Smith 19 Gold badge
        Unhappy

        "I'm not sure where you are getting your figures from. "

        Umm. Mars has no magnetic field and an atmospheric pressure about 1/160 that of Earth. The ISS "storm shelter" is about 0.5% of the Earth's atmosphere equivalent.

        To get the equivalent protection of the Earth's atmosphere at ground level on Mars takes a layer of regolith about 3m thick.

        As for where I got my information: this guy, who should be quite well informed on the subject.

        1. Brangdon

          Re: "I'm not sure where you are getting your figures from. "

          I'm not going to watch a video. Curiosity has a device to measure radiation. It gives 0.67 mSv per day, which is about double the ISS exposure rate. Your figure of .7 Sv per week, or 100 mSv per day, is out by several orders of magnitude. Maybe you have your units wrong, or are confusing exposure during the journey with exposure once arrived. (If you missed a "milli" and confused weekly with daily, that would do it.)

          http://www.sci-news.com/space/science-mars-radiation-measurements-surface-01629.html
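The unit arithmetic in this exchange can be checked in a few lines. This sketch uses only the figures quoted in the thread (0.7 Sv per week, 0.67 mSv per day, 150 mSv per year for the ISS); the variable names are made up for the example.

```python
MSV_PER_SV = 1000.0

claimed_sv_per_week = 0.7      # figure quoted for Mars earlier in the thread
curiosity_msv_per_day = 0.67   # Curiosity surface measurement quoted above
iss_msv_per_year = 150.0       # ISS figure quoted earlier in the thread

# Put everything into mSv per day so the numbers are comparable.
claimed_msv_per_day = claimed_sv_per_week * MSV_PER_SV / 7
iss_msv_per_day = iss_msv_per_year / 365

print(round(claimed_msv_per_day, 1))                          # 100.0
print(round(iss_msv_per_day, 2))                              # 0.41
print(round(curiosity_msv_per_day / iss_msv_per_day, 1))      # 1.6
print(round(claimed_msv_per_day / curiosity_msv_per_day))     # 149
```

On these numbers the Curiosity rate is roughly 1.6 times the quoted ISS rate, and the 0.7 Sv per week figure is about 150 times the Curiosity rate. Reading it as 0.7 mSv per day instead would bring it in line with the measurement.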

  9. BobC

    It's really Cosmic, Ray.

    On Earth we routinely simulate much of the space environment with one massively significant exception: Cosmic Rays, relativistic particles with extreme theory-breaking energies and unknown origin. We have some reasonable approximations that are a PITA to use at all, and impossible to use on whole systems, as they require de-lidding chips and exposing the naked silicon to heavy ion beams.

    Cosmic Rays don't care about the van Allen Belts or Earth's magnetic field. But, thankfully, they are filtered quite nicely by Earth's atmosphere, converting into cascades of other relativistic particles that include muons and pions. These particles themselves have vanishingly short lifetimes when observed in the Lab, yet when coming from a Cosmic Ray cascade, they manage to live long enough to reach the Earth's surface, all due to their startlingly high relativistic speeds.

    Cosmic Rays are The Hulk of radiation, and since we have no clue how to make them on Earth, if you want to expose your equipment to Cosmic Rays, you need to send that equipment above the Earth's atmosphere.

    And not far above it either! LEO does just fine.

    1. Rich 11

      Re: It's really Cosmic, Ray.

      And for anyone who is a fan of big numbers: https://en.wikipedia.org/wiki/Oh-My-God_particle.

  10. razorfishsl

    They did this in the 70's

    The stuff would not even operate correctly up a mountain.

    Main issue is soft errors and corruption due to Alpha particles striking the silicon, then there was the bad batch of ceramics used to house chips, where the ceramic was giving off particles.

  11. Anonymous Coward
    Anonymous Coward

    A single machine here on earth is the control group? So if that machine has a problem that's a zillion space dollars wasted? I hope they have a couple of machines at least, preferably isolated.

    1. The First Dave

      @ Codysydney

      But that applies to the flown hardware too - only a little extra bump on take-off and the whole machine fails...

  12. John Smith 19 Gold badge
    IT Angle

    A few notes on chips and radiation.

    That's probably more computing power than the entire processing power of all the GNC systems of all LV's to date. The processors on Apollo were pocket calculator power IE 32KIPS, Shuttle GPC's started at 0.4MIPS and upgraded to 1MIPS each. The ISS runs (IIRC) 40MHz 386s. The bigger Mars rovers run Power PC's at around 200MIPS (and $100K a board, hence the interest in OTS processing).

    As for radiation, RAM started using on-chip ECC because of radioisotopes in the packaging material decades ago. They don't report statistics because a) it would tie up valuable pins and b) who cares as long as the state read out is the same as the state read in.

    Servers should have ECC for ram as standard, and logging processes as standard for SNMP (obviously the packet delay will be a bit of an issue).

    Likewise "spinning rust" is AFAIK a lot more rad hard but it induces motion in the structure, unless you have pairs of contra-rotating disks to cancel those forces out. Sounds crazy but despite its size the ISS is not actually attached to anything.

    Obviously HPE are hoping a good result out of this will make them the goto supplier for HPC systems, but getting hardware NASA certified is another matter, and you can bet it will have to be NASA certified if any kind of software is running that's mission critical and the mission is NASA funded.

    IOW upgrading to new processors is usually a massive PITA, which is why space runs with hardware generations behind the SoA in processing power. SX accepts the systems will reset and is OK with that, but getting that accepted by NASA for ISS docking must have been a nightmare.

    1. Korev Silver badge

      Re: A few notes on chips and radiation.

      What about using systems like these to process data on satellites or spaceships which could mean only the processed and probably much smaller data would need to be transmitted.

  13. Dave 126 Silver badge

    What they currently use on the ISS:

    https://www.quora.com/What-are-computers-used-for-on-the-ISS

    So there's three main US computers - of which one is considered Primary, one Backup and one Standby at any one time - and three main Russian computers which work simultaneously. These are accessed using laptops, seven US, seven Russian, running Linux. These systems govern the stuff you really don't want to go wrong.

    Less critical stuff - inventory control, note taking, on board experiments, email etc - is handled by some Windows laptops, mainly Thinkpads as can be seen from photographs from onboard the ISS.

    1. maffski

      Re: What they currently use on the ISS:

      Exactly, the ISS is full of laptops (something like 60) so what does this 'test' do that can't already be done on the ISS laptops (and probably has been time and time again). Smacks more of PR than experimentation.

      1. asphytxtc

        Re: What they currently use on the ISS:

        You're right, the ISS is full of laptops.. Lenovo Thinkpads mainly. However these are FAR from considered "off the shelf". In one way or another, specialised Thinkpads have been flying to space since 1993 aboard STS-61. They're quite significantly modified however to meet stringent NASA requirements.

        Here's an interesting story from a few years ago, posted to nasaspaceflight.com by one of the IBM project managers responsible for initially putting the Thinkpads on the shuttle.

        https://forum.nasaspaceflight.com/index.php?topic=27043.0

        TL;DR - The laptops on the ISS aren't "off the shelf" at all.

        1. Dave 126 Silver badge

          Re: What they currently use on the ISS:

          That's a very interesting link you've posted, thank you. However, I didn't spot any mention of the Thinkpads being modified, bar for a different power supply on one model. They were, especially the earlier ones, subjected to a little testing.

        2. Dave 126 Silver badge

          Re: What they currently use on the ISS:

          The IBM project manager 'Jim' in the above link specifically states that the Thinkpads were off the shelf. The experiments that used them were designed to tolerate a reboot every so often.

  14. Destroy All Monsters Silver badge
    Paris Hilton

    Phew Phew!

    it causes bits to randomly flip thus changing information and crashing programs.

    AFAIK cosmic rays (i.e. light-hugging fat nuclei) can also cause the circuitry to trip or even blow a transistor here and there right up the epoxy.

  15. Anonymous Coward
    Anonymous Coward

    > “Future phases of this experiment will eventually involve sending other new technologies and advanced computing systems, like Memory-Driven Computing, to the ISS

    Well at least they'll burn less fuel in future sending up marketing fluff rather than real servers

  16. Old Used Programmer

    And then there are....

    In counting up all the more or less off the shelf computers on the ISS, don't forget the two Raspberry Pis that went up late last year.

  17. E 2

    Won't the electrons get confused and float away?

  18. Hans 1
    Happy

    Yeah ...

    I am sure the astrobuffins in space just want to take part in the Overwatch Summer Games ... well done!

  19. Anonymous Coward
    Joke

    I suppose if it starts overheating...

    ...they can always open a door.

    1. TRT Silver badge

      Re: In space...

      no-one can hear your fans scream.

      1. TRT Silver badge

        Re: In space...

        Which sounds like the ideal environment for Justin Bieber fans, actually.

  20. Anonymous Coward
    Anonymous Coward

    Warranty

    Is space covered by the manufacturer? They're taking a big risk here.

    Also best make sure Teamviewer is installed. Onsite response might be crap. Especially if the support guy is going via London Waterloo.

  21. Anonymous Coward
    Anonymous Coward

    windows space edition

    they have been going to that space station for flippin decades and must have a wealth of information from various ibm thinkpads and advent computers from dixons unless all they have been doing is taking golf clubs and tennis rackets for some sort of human interest angle to get on the news.

  22. Anonymous South African Coward Bronze badge
    Trollface

    Hope they have kept a twin to this spacegoing PC, and will compare the two's outputs.

    Use of DOS 3.30's FC is highly recommended :p

  23. Anonymous South African Coward Bronze badge

    I'll wager about four days, then the first errors will start to pop out.

  24. CheesyTheClown

    Shouldn't they try a machine reliable on earth?

    HPE has now owned SGI for long enough that all their best engineers will have left and the remaining ones will have been eliminated as redundancies. Therefore, all that's left is HPE engineers... which got rid of all their useful people throughout the dictatorship of the past 3 suits in charge.

    I have some serious questions though.

    1) If an HPE computer produces the wrong results due to random behavior... Is this considered a success or a failure?

    2) If an HPE computer fails in space and support is needed, is the call routed through mission control first or does it go directly to India?

    3) How will the cooling system impact the ISS? HPE last I checked only uses one model of fan and it's REALLY REALLY loud... on purpose because they think that if Ferraris sound faster because of how loud they are, then computers should too.

    4) 56Gb/s interconnect? Wasn't this supposed to be a supercomputer? I buy old used 56Gb/s InfiniBand equipment for pennies on the dollar these days. Supercomputers should be running 10 times that by now. Or is this the HPE version where we sell yesterday's technology today?

  25. allthecoolshortnamesweretaken

    I'll just leave this here:

    Ejecting a floppy disk in space

  26. T283ta

    Musk and HPE are very good at marketing.

    But when (if) their products work and deliver what was promised, they're heavily overpriced...
