NASA ditches Itanic for new Xeon-based SGI giant

NASA has once again turned to SGI for a massive supercomputer. The two organizations announced today that SGI will build a whopping 20,480-core system for NASA Ames in Mountain View, California. The giant will run on four-core versions of Intel's Xeon chip and should reach peak performance of 245 Teraflops, which would make it …

COMMENTS

This topic is closed for new posts.
  1. Slipgate
    Happy

    Now that might be able to run Vista!

    Well, may need a bit more memory....

  2. Don Mitchell

    Amdahl's Law

    Now when we say this machine is running Linux, we don't mean the kernel is running on 20,000 cores I'm sure. You'd be spending 99 percent of your cycles in resource contention. Look up "Amdahl's Law" on wikipedia to see why.
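
    A back-of-envelope sketch of what Amdahl's Law predicts for a fixed-size job on 20,480 cores; the parallel fractions below are illustrative assumptions, not measurements of anything NASA runs:

        # Amdahl's Law: speedup = 1 / ((1 - p) + p/N) for parallel fraction p on N cores.
        # The p values are assumptions for illustration only.
        def amdahl_speedup(p, n):
            return 1.0 / ((1.0 - p) + p / n)

        for p in (0.90, 0.99, 0.999):
            print(f"p={p}: speedup on 20480 cores = {amdahl_speedup(p, 20480):.0f}x")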

    Linux started out as a single-processor system, and it still suffers from contention due to its basic design. NT had to run on those crazy old 32-proc Unisys machines, so it was designed to be multiproc from the beginning. The lore I've heard is that Linux is best at about 4 cores, and NT at about 8. But parallelizing the kernel is not the important issue; parallelizing applications is.

    Special-purpose applications like commercial database servers can really use 32 or 128 cores these days. With 20,000 cores, you could only be running very specialized numerical code that can be parallelized in a simple way. You couldn't afford to do anything that required shared resources or interlocking, or Amdahl's Law would completely slay you.

  3. Anonymous Coward
    Coat

    Now if we had a Beowulf cluster of these...

    Oh damn, I'm suffering Forum Misidentification Syndrome!

    Seriously though, NASA TV showed this off last week, as the camera slowly panned by rack after rack of kit with the big SGI jellybean on it. It was very impressive. And not an "Intel Inside" sticker on the lot. The intro was the usual vague, useless shite of "NASA unveils new supercomputer", and that's ALL they said about it. And they wonder why everybody votes "no money for NASA".

  4. Mark Hahn

    amdahl's law, for real

    do you really think these guys don't intimately understand parallelism? but look up Gustafson's law instead - that's the relevant one here, since the point of this cluster is to scale up the problem, not to solve a small problem really fast...
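
    For comparison with the Amdahl arithmetic above, a rough sketch of Gustafson's scaled speedup, where the problem grows with the machine; the serial fraction here is an illustrative assumption:

        # Gustafson's Law: scaled speedup ~= s + (1 - s) * N, where s is the
        # fraction of the run that stays serial. s is an assumption.
        def gustafson_speedup(s, n):
            return s + (1.0 - s) * n

        print(f"{gustafson_speedup(0.01, 20480):.0f}x")   # ~20275x if only 1% stays serial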

  5. Henry Wertz Gold badge

    Re: Amdahl's Law

    Well, I did some googling, and it seems the Altix ICE is in fact a cluster machine: each blade carries two quad-core Xeons, with 16 blades per blade enclosure and 4 blade enclosures per rack, which works out at 512 cores per rack and about 40 racks for 20,480 cores. The whole thing would be connected with InfiniBand. However, it does appear each blade will be booting an individual copy of Linux with this setup (booting via InfiniBand), with all storage done to InfiniBand-connected storage. So it really will be running about 2,560 copies of Linux, each handling just the eight cores and memory on its own blade. OTHER machines in the Altix line are large shared-system-image systems though, with 512 cores running under a single kernel not being too unusual.
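
    A quick sanity check on that layout, using the per-blade figures above (my reading of SGI's ICE specs, not something from the article):

        # Rack arithmetic for the layout described above (assumed figures).
        cores_per_blade = 8                       # two quad-core Xeons per blade
        blades_per_enclosure, enclosures_per_rack = 16, 4
        cores_per_rack = cores_per_blade * blades_per_enclosure * enclosures_per_rack
        print(cores_per_rack, 20480 // cores_per_rack)    # 512 cores/rack, 40 racks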

    Linux doesn't really suffer from its historical roots. I'd say Linux scaled to 4 CPUs through roughly the 2.2 kernel. It seems 2.4 would scale to 128 or so pretty well. SGI did research and kernel improvements to improve scalability. The 2.6 kernel has an improved scheduler that keeps a given thread on the same CPU (to maximize cache hit rate). NUMA (non-uniform memory access) is handled intelligently so the system doesn't bog down from having some memory faster than the rest. 2.4 had one "big kernel lock"; 2.6 has the minimum locking required to avoid any general cockups. And most importantly, the scheduler in 2.6 is O(1), so scheduler overhead doesn't increase with the number of CPUs or processes. SGI scaled 2.6 to 512 cores fine. For 1024 cores, they just had to enlarge an in-kernel table or two. The largest single-system-image machine in the world is also at NASA, with 1024 dual-core Itanium 2s (a 2048-core single system!). This is also an SGI Altix, but a shared-memory model instead of the InfiniBand cluster model. It seems that to go from 1024 to 2048 cores, SGI just had to enlarge some kernel tables again, and it ran fine.

    Amdahl's law definitely applies, and applies even more with a cluster. But I think the kind of code NASA runs will in fact be heavily parallelizable. Apparently each InfiniBand channel does 20Gbit/s, so hopefully that's enough to keep those processors well fed.
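
    A rough back-of-envelope on whether that bandwidth keeps the cores fed, using the article's 245 Teraflops figure and the per-blade assumptions above; none of these numbers are measured:

        # Network bandwidth per core vs peak flops per core (all assumed figures).
        link_bytes_per_s = 20e9 / 8            # 20 Gbit/s signalling ~= 2.5 GB/s per blade
        cores_per_link = 8                     # assuming a blade's cores share one link
        flops_per_core = 245e12 / 20480        # ~12 GFLOPS peak per core, from the article

        bytes_per_core = link_bytes_per_s / cores_per_link
        print(f"{bytes_per_core / 1e6:.0f} MB/s per core, "
              f"{bytes_per_core / flops_per_core:.3f} bytes per peak flop")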

  6. Gerhard Mack
    Stop

    @Don Mitchell

    That simply hasn't been true since 2002 or so. IBM spent the early part of this decade optimizing Linux to run on its 32- and 64-processor systems. And SGI has been dedicating developers to the multiprocessing cause as well.

    Apache and MySQL have been doing some serious work in this field too.

  7. Anonymous Coward
    Flame

    Linux? You're joking!

    What the hell is wrong with NASA? Where have they been? Haven't they seen all those ads on tech sites for M$ products that dish up the "facts" on why M$ is always faster, more secure and better than Linux in all situations? These ads give "facts" about all the top companies deserting Linux, the sinking ship it is, to embrace the comforting bosom of the beast. Get with the times, NASA: Linux was doomed to failure right from the start and you will find yourself in a dead-end street soon enough.

  8. James O'Brien
    Coat

    Another verse different from the first

    They are gonna be screwed

    /someone stole my coat looks like I am screwed

  9. SpitefulGOD
    Gates Halo

    Would have been easier

    If the secret services had lent some of their petaflop processing time, although officially that doesn't exist.

  10. Richard
    Pirate

    but...

    will it run doom?

  11. storng.bare.durid
    Flame

    wow, I wonder how much their electricity bill is...

    How much does one Xeon draw? 50-100W or around that magnitude? Let's say 100W.

    20480 / 4 = 5120 Xeons x 100W = 5.12 x 10^5 watts... or about 0.5 megawatts...

    And that's not including the other bits...
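
    Turning that into a bill, with an assumed (made-up) electricity price and ignoring cooling overhead:

        # Rough annual electricity cost (the price per kWh is a made-up assumption).
        sockets = 20480 // 4
        watts_per_socket = 100.0
        power_kw = sockets * watts_per_socket / 1000.0        # ~512 kW
        kwh_per_year = power_kw * 24 * 365
        print(f"{power_kw:.0f} kW, ~${kwh_per_year * 0.10:,.0f}/year at $0.10/kWh")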

  12. Nick
    Linux

    :> it's so funny

    I don't think we have enough storage!

  13. druck Silver badge
    Coat

    High Performance Fiddling

    All that processing power should allow them to fiddle the historic temperature data even quicker.

  14. vincent himpe
    Coat

    oh no ...

    I just figured out what causes global warming: the computers used to model the weather patterns and check for global warming!!!

    Where is Cray when you need them... A Cray is the only computer known to run an infinite loop in just under 4 hours... then again, SGI used to own Cray...

    Mine's the one soaked in Fluorinert

  15. Nick Woodson
    Gates Horns

    Whiz - bang - thud

    ...and then Steve Ballmer skulks in and loads Server 2k8 on it and performance increases exponentially...

  16. Don Mitchell

    @Henry Wertz

    Thanks, that makes sense. So it is basically a collection of PCs ganged together on a fast net, not some kind of ILLIAC-style matrix cruncher.

    Gerhard: Windows and Linux can both run on 128-core machines. But that doesn't mean the kernel code runs 128 times faster; it might only run 4 or 8 times faster. That's because an OS kernel has a certain number of global resources that have to be interlocked. High-performance database systems (and I'm sure SQL Server and Oracle are well ahead of MySQL on this) are designed to minimize resource contention, and they go out of their way not to invoke the kernel any more than they have to. For example, on Windows you can do asynchronous disk I/O, and often a database system will poll for completion instead of asking the kernel to alert it or wait on an event.
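
    A rough, language-agnostic sketch of that polling pattern; this uses Python's concurrent.futures rather than the Windows overlapped I/O being described, and the file path and sizes are just placeholders:

        # Submit a read in the background, then poll for completion instead of
        # blocking in the kernel. Path and size below are placeholder values.
        from concurrent.futures import ThreadPoolExecutor

        def read_block(path, offset, size):
            with open(path, "rb") as f:
                f.seek(offset)
                return f.read(size)

        with ThreadPoolExecutor() as pool:
            future = pool.submit(read_block, "/etc/hostname", 0, 4096)
            other_work = 0
            while not future.done():        # poll instead of waiting on an event
                other_work += 1             # stand-in for doing useful work meanwhile
            data = future.result()
            print(len(data), "bytes read;", other_work, "polls while waiting")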

This topic is closed for new posts.
