Simple - the same ideas underpin their implementations of the same thing. Speculative execution and so forth are fairly industry-standard techniques, and even if they aren't implemented in exactly the same way, it's possible that people have made similar mistakes in their implementations - especially if there's an open "reference implementation" that includes the bug.
Spectre/Meltdown fixes in HPC: Want the bad news or the bad news? It's slower, say boffins
HPC admin? Feeling slighted that all the good Spectre/Meltdown mitigation benchmarks ignore big iron? Fear not, a bunch of MIT boffins are on your side. Unfortunately, what they found is that network connections, disk accesses, and computational workloads can all be affected by the fixes, whether in the operating system or the …
COMMENTS
-
This post has been deleted by its author
-
Thursday 26th July 2018 11:22 GMT Ken Hagan
The good news...
...from Intel's marketing department will be that their next chips are several times faster than their immediate predecessors on most workloads. In fact, they've never seen such a performance leap in a single generation and you simply must rush out and give them all your money.
-
-
Thursday 26th July 2018 11:45 GMT Aitor 1
Re: The good news...
My guess is that there are several reasons for this.
First, yeah, security implications... but also benchmarks. Removing MT allows you to have better single-core performance, and with Zen2 on the market Intel is starting to lag behind in SC benchmarks.
Either Intel gets its manufacturing processes sorted or it is at high risk of losing plenty of the market to AMD.
In this age they cannot simply tell Google, Amazon etc. that they won't supply products unless they stop buying from AMD, as these companies do not sell consumer products and are therefore mostly not affected by processor branding.
-
Friday 27th July 2018 20:35 GMT Michael Wojcik
Meltdown is Intel-specific
Sigh. This is incorrect. Non-Intel Meltdown vulnerabilities were published months ago, and have been mentioned in Reg comments, as well as all the usual places.
Specifically:
- At least three ARM designs have Meltdown (spec-ex across privilege boundary) vulnerabilities: A15, A57, and A72.
- Some IBM POWER CPUs are vulnerable. It's not easy to find specific information on this from IBM, but it was widely reported (see for example here), and the Wikipedia page on Meltdown includes POWER among affected architectures; it's not likely IBM would let that slide if it were incorrect.
-
-
Thursday 26th July 2018 18:58 GMT thames
Meltdown seems to be Intel specific, while Spectre is a more general problem relating to speculative execution.
For ARM, it will depend on the specific model. Some ARM chips are affected and some are not. There is a list somewhere of what models are affected.
For example, the Raspberry Pi Foundation have said that all models of Raspberry Pi are immune to both Meltdown and Spectre due to the model of CPU they use.
Generally, some of the top end ARM models are affected by Spectre, while the rest are not. What this has generally meant in practice is that the most expensive premium model mobile phones have a potential problem while the medium to low priced Android phones are largely immune. The bulk of embedded applications using ARM are also probably immune.
-
Thursday 26th July 2018 15:21 GMT DeathByDenim
Code optimization
So, the linked article states that these tests were for unoptimized code and seems to imply that optimization reduces the slowdown.
Are these optimizations specific for Spectre/Meltdown or are they just the normal stuff such as compiler flags and sequential memory reading and stuff like that?
-
Friday 27th July 2018 16:36 GMT Claptrap314
Re: Code optimization
I've generally been on the other side of the house, so no. However, something like retpoline only affects performance if you have a return. No return, no retpoline. So structuring your code to have fewer returns would be one way to optimize (today) for performance in the presence of Spectre fixes.
For instance.
-
Thursday 26th July 2018 15:41 GMT Yet Another Anonymous coward
Stop me if I'm wrong here but...
If you are worried about malware exploiting security flaws in your on-premises Top500 supercomputer - aren't you perhaps missing a bigger problem?
It's a few years since I worked on this sort of stuff, but we used to have men with guns stopping you even going into the building with all the Crays - we certainly weren't concerned with a user clicking on a dodgy website.
-
Thursday 26th July 2018 16:12 GMT Claptrap314
Re: Stop me if I'm wrong here but...
This. As I've previously mentioned, I believe that the realization of these attacks means that the market will bifurcate. If you can trust the users to play nice, you don't need these mitigations and existing architectures perform much, much better than what is coming.
But for now, a lot of HPC is probably in environments where there has historically been a lot of trust that might not be warranted. I view this work as a step towards getting this issue in front of the right people.
-
Thursday 26th July 2018 16:27 GMT Yet Another Anonymous coward
Re: Stop me if I'm wrong here but...
I assume this will apply to cloud.
I can buy cloud time with all the protections enabled - but I take a 20% loss of speed.
Or my company's tasks are guaranteed to be the only user on the physical CPU and I pay a 10% security premium.
Or I could decide I don't care. The attacker can slowly try to extract sensitive information with cache-miss tricks. But if I'm doing CFD or seismograph modelling, and the attacker would need to get hold of the entire TB data set to make sense of it, and still wouldn't know where the data was coming from - do I care? Especially if I can conceal who I am when I'm renting the machine.
-
-
Thursday 26th July 2018 16:37 GMT Anonymous Coward
Re: Stop me if I'm wrong here but...
"It's a few years since I worked on this sort of stuff, but we used to have men with guns stopping you even going into the building with all the Crays - we certainly weren't concerned with a user clicking on a dodgy website."
But what about moles? Or the Edward Snowdens of the world?
-
Thursday 26th July 2018 18:25 GMT Yet Another Anonymous coward
Re: Stop me if I'm wrong here but...
But what about moles? Or the Edward Snowdens of the world?
They have access to the information anyway - they aren't stealing the supercomputer memory a bit at a time through cache-miss side channels.
My point is that the jobs running on these machines are restricted, controlled and billed!
Back in the Cray days, and even now on the biggest machines, you don't log in to your shell account and start running stuff. The builds and verifications are typically done on front-end servers and then submitted to some queue.
Worrying about Spectre on a top500 is like fitting a bike lock to the wheels of a moon buggy.
-
-
Thursday 26th July 2018 19:42 GMT c1ue
Re: Stop me if I'm wrong here but...
The issue is that attackers don't have to have admin access to attack, they just need to have access to something running on the same physical machine.
Do the HR and payroll people have the same security practices and standards as the R & D folk?
Worse, a lot of companies use cloud - there can be literally dozens of different companies running on a given big iron.
Very short-sighted to just talk about single specific big iron installs.
-
-
Friday 27th July 2018 08:19 GMT _LC_
Refund, anyone?
I find it hard to believe that they have been "speculating into the blue", vastly ignoring the multi-user aspect (and thus the MMU) – by accident. Does anyone still remember "the little mishaps" of our car manufacturers?
Furthermore, the performance penalties are not the only problem (they hurt, especially when VMs are involved). Speculative execution consumes a lot of power. This is why ARM came up with the big.LITTLE concept to begin with. Now we have systems that "keep speculating into the Orcus". That is, you'll not only get less performance, but they will still consume the same power.
Just because Intel is in bed with certain secret services doesn't mean they should get off free. Just my two cents...
-
Sunday 29th July 2018 00:24 GMT gnasher729
I suppose you could reduce reliance on speculative execution. There are conditional operations (which test a condition but are executed unconditionally), and there are things like the POWER "count" register where loop execution isn't actually speculative - the processor _knows_ that the loop is executed or not. With hyperthreading, you could move execution to the other thread instead of doing speculative memory reads / writes although that might open another side channel.
-
Sunday 29th July 2018 15:16 GMT T. F. M. Reader
Massively parallel workloads?
Have they tested massively parallel workloads, such as solving partial differential equations? This is a typical HPC problem, and it seems different from the "computationally intensive code" they analysed. Pure number-crunching indeed does not use much I/O or many system calls. For parallelised derivative computation you will need to exchange information between the nodes working in parallel, though - this is why low-latency networks are so important in HPC systems. And that means I/O.
That does not necessarily mean system calls, especially if the information exchange is facilitated by RDMA-capable HW - the kernel may be bypassed.
Enquiring minds want to know, etc., etc. ...