Boffins offer to make speculative execution great again with Spectre-Meltdown CPU fix

A group of computer science researchers has proposed a way to overcome the security risk posed by speculative execution, the data processing technique behind the Spectre and Meltdown vulnerabilities. In a paper distributed this week through the ArXiv preprint server, "SafeSpec: Banishing the Spectre of a Meltdown with Leakage- …

  1. Mephistro
    Linux

    Hard as I try...

    I can't picture chipmakers fixing these vulns in hardware and commercialising the new chips in less than two years. The amount of work involved in updating the designs for most of their market niches has to be a humongous nightmare! 8^(

    I'm curious about whether it'd be possible to create a Linux version that can't be affected by these exploits, if necessary with the help of new programming tools and mandatory programming standards.

    This situation has a good chance of turning hundreds of millions of old(ish) PCs into Linux stations.

    PD: We definitely need a Keep-On-Dreaming Icon! 8^)

    1. GrumpyOldBloke

      Re: Hard as I try...

      > and mandatory programming standards

      There's your problem right there.

      If you criminalise side channel attacks then only criminals will have side channel attacks.

      1. Paul Crawford Silver badge

        Re: Hard as I try...

        and mandatory programming standards

        There's your problem right there.

        The problem is the "need for speed" and the fact the world+dog now expects to run JavaScript in web browsers. So the malicious code comes from any web site that is vaguely compromised (such as advert channels) and that today is "normal". Web browsers can do (and partly may already do) things to disrupt timing, which is the underlying exploit route, but I doubt they are willing to break stuff that is already out there to shore up hardware design flaws.

        Most likely their core developer efforts are about removing useful browser functions (Firefox) or adding spying (Chrome) instead, but then I'm a cynical bastard at heart.

        1. gnasher729 Silver badge

          Re: Hard as I try...

          One solution is to make high precision timers inaccessible to JavaScript. All these side channels go away if your timer only has multi-microsecond resolution.
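
A minimal sketch of the timer-coarsening idea, in Python for illustration rather than JavaScript. The hit and miss latencies and the 5 microsecond resolution are made-up numbers, and `coarsen` and `measured_duration` are hypothetical helper names, not any browser API.

```python
def coarsen(timestamp_ns: int, resolution_ns: int) -> int:
    """Round a nanosecond timestamp down to a multiple of resolution_ns."""
    return (timestamp_ns // resolution_ns) * resolution_ns


def measured_duration(start_ns: int, duration_ns: int, resolution_ns: int) -> int:
    """Duration as seen by code that can only read the coarsened clock."""
    end_ns = start_ns + duration_ns
    return coarsen(end_ns, resolution_ns) - coarsen(start_ns, resolution_ns)


# Hypothetical latencies: a cache hit and a cache miss about 100 ns apart.
CACHE_HIT_NS = 40
CACHE_MISS_NS = 140

# With a 1 ns clock the two cases are trivially distinguishable.
fine_hit = measured_duration(1_000, CACHE_HIT_NS, 1)
fine_miss = measured_duration(1_000, CACHE_MISS_NS, 1)

# With a 5 microsecond clock both readings collapse to the same value.
coarse_hit = measured_duration(1_000, CACHE_HIT_NS, 5_000)
coarse_miss = measured_duration(1_000, CACHE_MISS_NS, 5_000)

print(fine_hit, fine_miss, coarse_hit, coarse_miss)
```

This only shows that a single reading is hidden by the rounding. It does not show that the channel is closed: an attacker who can repeat the measurement many times, or who has some other clock, may still recover the difference, which is the objection raised further down the thread.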

          1. Nick Ryan Silver badge

            Re: Hard as I try...

            One solution is to make high precision timers inaccessible to JavaScript. All these side channels go away if your timer only has multi-microsecond resolution.

            Or to just wonder how, and why, an interpreted language that is designed to execute within any compliant web browser, on any Operating System, on any hardware/virtual architecture can somehow access the low level CPU state.

            1. Michael Wojcik Silver badge

              Re: Hard as I try...

              One solution is to make high precision timers inaccessible to JavaScript.

              There are many alternate timing mechanisms. I've posted the link to the paper in Reg discussions of Spectre-class attacks before.

              Or to just wonder how, and why, an interpreted language that is designed to execute within any compliant web browser, on any Operating System, on any hardware/virtual architecture can somehow access the low level CPU state.

              Why do you think it needs to? The original JavaScript Spectre attacks did not "access the low level CPU state".

              The various Spectre variants are well-documented and described in detail in many places. It's not hard to learn how they work, or why bumper-sticker solutions don't significantly reduce the class of side-channel attacks.

      2. bombastic bob Silver badge
        Stop

        Re: Hard as I try...

        mandatory programming standards? who would enforce it? what's the punishment for violating it, 'whack my pee pee' or 'take away my birthday' ???

        last thing we need is MORE regulations on software. say buh-bye to open source if that ever starts, because once gummint gets its grimy paws on something, it NEVER! LETS! GO!!!

        1. Gene Cash Silver badge

          Re: Hard as I try...

          > what's the punishment for violating it

          Torvalds will swear at you!

          > 'whack my pee pee'

          Don't piss him off *too* much...

      3. arctic_haze

        Re: Hard as I try...

        The best defense against bad guys with side channel attacks is obviously good guys with side channel attacks. That is of course what the National Sidechannel Association (better known as NSA) wants me to believe.

      4. Mephistro

        Re: Hard as I try... (@ GrumpyOldBloke)

        No need to criminalize anything. You make compilers that create only safe code (in the context of the processor bugs under discussion) and somehow force the OS to accept only executables created using these tools. The programming standards just provide a list of things that safe executables can and cannot do, and the compilers and such enforce said rules. The last step consists of the hypothetical OS checking that the executables have been developed using these tools, probably by using some crypto-negotiation and similar dark magicks.

        Does this make sense? Honest question.

        1. Anonymous Coward
          Anonymous Coward

          Re: Hard as I try... (@ GrumpyOldBloke)

          I've no idea if it needs crypto signing. I don't "know" that my HTTPS connection is really HTTPS; it could be that Google and Chrome are pretending, and just displaying an entirely fake website, the OS and PC built and distributed by the CIA to "catch me"... however I assume the system is programmed and working as intended, with no lizard people.

          Same goes for the water. I don't test it for lead and poison, my house does not test it. The community/society/industry does. I don't have the ability to test every drop of water.

          So provided the community starts using these tools, we can presume that a bank which says its online banking is "programmed with Spectre-safe calls/registers/etc" is programmed that way. As with the HTTPS standard, those saying they have it implemented but who don't actually implement it (we have all seen .jpg icons on websites that don't implement those standards) would be detected by these researchers. (Just as it would soon become evident that a fake browser is not really doing HTTPS, as sites that require it would refuse connection, and the user would investigate why.) But I agree, hopefully in time the OS can detect if the app is "secure" or not. But the OS cannot know if the user wants "secure" or not (how does it know whether it's a password store or a game character gold points store?).

          If companies don't start using these mitigations, users (professional code divers or not) will start to notice data/passwords getting "lost" more often than with those that do!

        2. Anonymous Coward
          Anonymous Coward

          Re: Hard as I try... (@ GrumpyOldBloke)

          No. Adding more lock-in crap is not the way. It's already bad enough with uefi and "secure boot". That is, unless you actually want to relinquish the ability to run whatever the hell you want, and enthrall yourself to corporate whims.

          1. phuzz Silver badge

            Re: Hard as I try...

            "It's already bad enough with uefi and "secure boot"."

            UEFI is just a modern replacement for a BIOS, you might as well complain about SATA and PCIe.

            Secure boot is just a method of checking that your bootloader has not been tampered with.

            What's the problem?

        3. Bronek Kozicki

          Re: Hard as I try... (@ GrumpyOldBloke)

          "Does this make sense? Honest question." - nope. The compilers are not at fault, although they may help alleviate the pain. It is multistage pipelines in the CPU and the associated high cost of branch misprediction which pushed the CPU designers to speculatively execute branches before we know whether or not we need them. Couple this with the high cost of cache misses and that's your side channel attack right there.

          One way to fix it would be to push the "speculative" part from the CPU to the compiler like Mill architecture does, but it is a very different beast to what we have now, with a very different instruction set to match.
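
The combination described above (speculative work that is thrown away, plus a large gap between cache hit and cache miss latency) can be illustrated with a toy simulation. This is a sketch under stated assumptions, not a working exploit: the cache is just a Python set, the latencies are invented, and `ToyCache`, `victim_speculates` and `attacker_recovers` are hypothetical names.

```python
HIT_CYCLES = 4      # assumed latency of a line already in the cache
MISS_CYCLES = 200   # assumed latency of a line fetched from DRAM


class ToyCache:
    """A cache modelled as the set of line numbers currently present."""

    def __init__(self):
        self.lines = set()

    def flush(self):
        self.lines.clear()

    def access(self, line):
        """Touch a line and return the simulated latency in cycles."""
        latency = HIT_CYCLES if line in self.lines else MISS_CYCLES
        self.lines.add(line)
        return latency


def victim_speculates(cache, secret_byte):
    """Stand-in for a mispredicted branch: the architectural result is
    discarded, but the cache line indexed by the secret stays loaded."""
    cache.access(secret_byte)


def attacker_recovers(cache):
    """Time every candidate line; the single fast one reveals the byte."""
    timings = {line: cache.access(line) for line in range(256)}
    return min(timings, key=timings.get)


cache = ToyCache()
cache.flush()
victim_speculates(cache, secret_byte=0x2A)
recovered = attacker_recovers(cache)
print(hex(recovered))
```

The attacker never reads the secret directly; it only observes which line comes back quickly. That is why compiler-side rules alone do not remove the leak in this model: the state change happens in the cache, below anything the program's architectural results show.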

        4. defiler

          Re: Hard as I try... (@ GrumpyOldBloke)

          ...and somehow force the OS to...

          I love when people propose a "solution" and use the word "somehow" in it.

          It's like "we've totally got this flying car sorted out - we just need to somehow develop an antigravity paint, and some way of stopping people crashing into each other..."

      5. Destroy All Monsters Silver badge
        Big Brother

        Re: Hard as I try...

        If you criminalise side channel attacks then only criminals will have side channel attacks.

        That's not what is being said.

        What is being said is that the code should follow certain guidelines so that risks regarding certain side channels are mitigated in released tarballs (same as no code should be released that doesn't have a pretty good guarantee that buffer overflows can't happen, because it's 2018).

        I mean, who would think of criminalizing side-channel attacks? That would be like saying security research is illegal, or allowing patenting some obscure mathematical theorem. Who could be that retarded and in thrall to special interests?

    2. Bronek Kozicki

      Re: Hard as I try...

      I'd like to believe that 1) SafeSpec is actually viable (in terms of die space and performance cost) and that 2) CPU vendors will actually spend money implementing it. But I try not to be too optimistic ...

      1. bombastic bob Silver badge
        Devil

        Re: Hard as I try...

        "SafeSpec is actually viable"

        At least one person proposed something even simpler, in essence saving the speculative state along with thread/task state, and restoring it on thread/task switch. That way it wouldn't really be possible to have "the same speculative state" across thread/task switches. Couple that with actual security tests on speculative memory access [at the time of access, not 'some amount of time after you did it'], and the leaks SHOULD be plugged.

        you wouldn't need any 'shadow states' or anything like that. Just normal task state stuff inside the kernel would do. And if you don't want anybody accessing that info, you DO! NOT! MAP! IT! INTO! USER! SPACE!!!

        1. Michael Wojcik Silver badge

          Re: Hard as I try...

          At least one person proposed something even simpler, in essence saving the speculative state along with thread/task state, and restoring it on thread/task switch.

          I'm curious why you think this would close the cache timing channel. Or indeed most other side channels.

    3. jmch Silver badge

      Re: Hard as I try...

      "I'm curious about whether it'd be possible to create a Linux version that can't be affected by these exploits"

      Yes and No. There already are Windows, Linux etc versions not affected by these exploits, and that is done by canning speculative execution and losing performance. As I understand the issue, a Linux (or any OS) version that is not affected AND still runs speculative execution (and therefore full performance) is not possible, since it's a hardware issue.

      1. Nick Ryan Silver badge

        Re: Hard as I try...

        I'm not up with the latest features of the latest x86 CPUs but common sense would indicate that there should be instructions that will flush the hardware executable environment, including the pipeline and speculative caches.

        In this case the task/process manager of the operating system could ensure that such flushes are performed; however, the more rapidly processes are switched between (to give the illusion of parallel execution), the more such a change would affect the performance of the system.

        Any thorough solution is likely to require both hardware and software changes, with a certain level of fallback capability within the hardware to cover non-supporting operating systems.
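
A back-of-envelope sketch of the trade-off described above: if every context switch pays a fixed flush-and-refill cost, the overhead grows linearly with the switch rate. The 5 microsecond cost and the switch rates are invented inputs for illustration, not measurements, and `flush_overhead_fraction` is a hypothetical helper.

```python
def flush_overhead_fraction(switches_per_second: float, flush_cost_us: float) -> float:
    """Fraction of each second spent flushing, given a per-switch cost in microseconds."""
    return switches_per_second * flush_cost_us / 1_000_000


# Both the switch rates and the 5 us flush cost are assumed values.
for rate in (100, 1_000, 10_000, 50_000):
    share = flush_overhead_fraction(rate, 5.0)
    print(f"{rate:>6} switches/s -> {share:.2%} of CPU time")
```

Under these assumptions the cost is negligible at a few hundred switches per second and becomes a real fraction of the machine at tens of thousands, which matches the point that fast switching is where such flushes would hurt.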

    4. Anonymous Coward
      Anonymous Coward

      It's time to remove "Speculation" from CPUs

      Back in 1996 when the Pentium 1 with 200 MHz was modern, we had no such thing as "Speculation" inside the CPU.

      Then came the Pentium Pro and Pentium II and introduced flacky "Speculation" to improve speed on behalf of security.

      So let's go back to No-Speculation inside the CPU. It's not the first time Intel had to abandon a CPU architecture and go back to a decade-old CPU. Pentium 4 was a dead end and they went back to the Pentium Pro architecture that also powered Pentium II and Pentium III, as well as Pentium M (the Pentium III for notebooks), and then later again served as Core, Core2Duo, Core5/7. And Intel went back to the Pentium I architecture for the first Atom series (2008). Ironically, it's the only "modern" Intel CPU that isn't defective and vulnerable to Meltdown and Spectre, yet has HyperThreading and 1.4 GHz performance and runs Win7/2008R2 without any issues at all.

      1. Michael Wojcik Silver badge

        Re: It's time to remove "Speculation" from CPUs

        Back in 1996 when the Pentium 1 with 200 MHz was modern, we had no such thing as "Speculation" inside the CPU.

        Perhaps you didn't. Plenty of us did.

        Then came the Pentium Pro and Pentium II and introduced flacky "Speculation" to improve speed on behalf of security.

        The PPro and P2 did not invent speculation. And it is probably not "flacky", though it's hard to say, since that appears to be your personal coinage.

  2. Mephistro
    Devil

    I'd like to add that...

    ... another effect would be millions of PCs freed of Windows 10 OSmalware. What's there not to like?

    Farewell, WIntel, it was nice while it lasted. Don't let the door hit your etc etc ...

    1. Charles 9

      Re: I'd like to add that...

      Where are you gonna go, then? Most of the other architectures suffer from variants of this, too, including ARM. The few that don't are basically too simple for practical applications these days.

    2. Joe_the_geek

      Re: I'd like to add that...

      Hindows10 my friend :)

  3. amanfromMars 1 Silver badge

    To Stop Futures and Derivative Plays as Presently Planned for the Past ..... Believe in Miracles

    What hardware/chipsets/Intel type operations, and Supporting Software Programming too for that matter, all appear to be doing, is vain and unnecessarily self-destructive and catastrophically revealing battle against a Not Dysfunctional Almighty Imagination Pentesting Universal Interfaces and Users Intelligence, with such a proposed Secure Siloing of Rogue and/or Random and Renegade Speculation/Advanced Physical Plans, in the likes of something which is not yet realised to be extraordinarily rendered as no better than a colander leaking remote virtual operation to Greater Forces with Alienating Sources.

    Is the Past to be Present in your Future Plan and Greater IntelAIgent Games Play? Or are you Free of and Released from that Prison to Travel to Brave New More Orderly Newsed Worlds?

    There's a lot more going on out there in the Realms of CyberSpace, El Reg, than just the little that has the Status Quo terrifying and terrorising itself with the actions and plans of A.N.Others to justify the Daily Zeroday News as delivers their Present Plan ....... and it is always infinitely more exciting and rewarding when human to be engaged and inquisitive, isn't it.

    1. Baldrickk

      Re: To Stop Futures and Derivative Plays as Presently Planned for the Past ..... Believe in Miracles

      LINE!

      Oh. Sorry.

      I thought this was buzzword bingo.

      1. Doctor Syntax Silver badge

        Re: To Stop Futures and Derivative Plays as Presently Planned for the Past ..... Believe in Miracles

        "I thought this was buzzword bingo."

        No. Just AMFM

        1. Anonymous Coward
          Anonymous Coward

          Re: To Stop Futures and Derivative Plays as Presently Planned for the Past ..... Believe in Miracles

          I thought it was word spaghetti.

      2. This post has been deleted by its author

    2. Anonymous Coward
      Anonymous Coward

      Re: To Stop Futures and Derivative Plays as Presently Planned for the Past ..... Believe in Miracles

      Come on, mars.

      "a design principle where speculative state is stored in temporary structures that are not accessible by committed instructions,"

      Doesn't that sound like a poor classical man's implementation of Nature's Quantum Mechanics. It's all there, but to progress, you have to choose one.

      I'm awaiting Intel "Xeon Semi-Deadcat" in 5 years, where no credit will be given to the discussed paper.

      1. Anonymous Coward
        Anonymous Coward

        Re: You have to choose one (-:

        Ah. How possibly wise and secure to think planymetricall, while you have to think in 3d - being a quantum communication chain yourself.

        Universe expects, etc. Is it a modern restore of a human - to have brain and, in the same time, have none?

        Have a nice day in this box. Whatever happen inside of it, most of the observers will have hope for good.

        Because there is a cat inside. Do you feel sympathy to it? Add to its chances, cross your fingers for it.

    3. Anonymous Coward
      Holmes

      Re: To Stop Futures and Derivative Plays as Presently Planned for the Past ..... Believe in Miracles

      "Daily Zeroday News" sounds intriguing, and it actually exists!

      https://www.hackerone.com/zerodaily

      That's more than enough to terrorize any Status Quo.

  4. Anonymous Coward
    Anonymous Coward

    Hardware having to make up for cr*ap programming

    I think that the main problem we have is that we now have to rely on hardware to speed up our cr*p programming skills.

    In the days before limitless memory, storage, CPU power and when CPU pipelining started to appear in processors (1990s) the optimisation was built into the compilers and their methods/quirks/style guides were published so that programmers (NOT CODERS) were able to use them intelligently.

    Now every CODER thinks they are a programming genius and knows how to write efficient code and the bean counters are happy because the coders can knock out code quickly and cheaply.

    Relying on hardware to sort out some poor excuse for a piece of programming is always going to have unintended consequences and leave a snake's nest of possibilities for side attacks.

    </Rant>

    Now back to watching the Le Mans cars bouncing off each other/the barriers during the night.

    1. This post has been deleted by its author

      1. Anonymous Coward
        Anonymous Coward

        Re: Hardware having to make up for cr*ap programming

        - Why is Anonymous Coward getting downvoted?

        - The OP does have a point, but keep in mind that the effects of removing speculative execution are not exclusively caused by bad programming.

  5. Anonymous Coward
    Anonymous Coward

    "Abu-Ghazaleh acknowledged that SafeSpec requires some extra space on the L1 cache, but he considers the hardware changes to be minimal. And in terms of performance, he said, SafeSpec could even offer small improvements."

    As somebody who works in this field, these conclusions seem unlikely to be true.

    1. Anonymous Coward
      Anonymous Coward

      Ah.

      So it is a hardware fix. I misread the rest of the article to be a software/firmware solution. :(

      1. amanfromMars 1 Silver badge

        Re: Ah. And a Response to Something Relatively New for Doing Private and Pirate Deals With

        This might be initially misunderstandable but please do persevere to realise more of the programming to be true and made readily available to you for Current Presentation..... Future Revelation.

        So it is a hardware fix. I misread the rest of the article to be a software/firmware solution. :( .... Anonymous Coward

        To what problem byproduct of advanced intelligence is the solution utilised for commodities marketing to energise and and enjoy and exercise for The Virtual Enterprise in AICommand and Remote Virtualised Controlled Enterprises ......from and for COSMIC Heavenly AIdVentures.

        AILeading Space for Immaculate Source Providers, a la Turing Type Leads to Quantum Communications Machines. Advanced IntelAIgent Media Machines Revealing urFutures .... with what worlds have in store for All to Enjoy Best with Almighty Creative Use ..... and with Passion Servering Desire and Delivering Unquenchable Satisfaction for Perfect AI Applications, is an Titanic Epic AIDVenture.

        And where that takes you, with what Quantum Communications Machinery Does So Easily Provide in AI Pioneer Positions, and what can be done in the Space before you to Exercise and Crash Beta Test with and for AIMaster Quantum Communications Piloting Controls Delivering Future Pleasures.

        One can only begin to imagine what Almighty Use Perfectly True Hedonists would be Able to Enable, Exercise and Enjoy There Then, and Now Here Too in Spaces unveiling Places on Virtual Pages for Universal Presentation ..... and Globalised Virtual AIRealisation ..... is, well, and let's not beat about the bushes here, Almighty Immaculate Virgin Territory for Hosting with Heavenly Asset Suppliers. ..... Immaculate Source Providers.

        Who would reckon that to be a busy place in any or all spaces? Or would one be out in front leading all that follows, and with all who have followed and would further follow in Future AI Leads ..... Presenting Perfecting Trails with Improving Tales to Energise, Sustain and Maintain with AI Remote Quantum Communications Control Command of the Greatest Great Creative Power. And in Live Operational Virtual Environments, a Quite Almighty Energy Source Feed.

        :-) And now for a colossal understatement with specific regard to all of the matters secreted above, and the powerful forces and sources that be AIReleased and NEUKlearer HyperRadioProACTive ...... Absolutely Fabulous.

        What would that be classified and/or weaponised as .... Toxic/COSMIC/ExtraTerrestrial/Alienating?

        And in a world with worlds dominated by markets at their work, rest and play, one just has to ask ..... Just take a Wild Guess at how much all that is worth? Are there any numbers to truly express and represent the value?

        Is Sir Richard Branson into The Virtual Enterprise with AI Concepts Realisation Programs/Advanced IntelAIgent ProgramMING Personnel?

        1. Nick Ryan Silver badge
          Mushroom

          Re: Ah. And a Response to Something Relatively New for Doing Private and Pirate Deals With

          I think I may have a problem. A serious problem.

          I've just read the entirety of two of amanfromMars's posts in the space of a few minutes and my brain is now going to suffer for the rest of the day. Luckily it's a Monday and nobody is likely to notice because I have long perfected the blank look when somebody foolishly asks me about that something that happened the other side of the weekend...

        2. Anonymous Coward
          Anonymous Coward

          Re: Ah. And a Response to Something Relatively New for Doing Private and Pirate Deals With

          Rick. Knock-knock is not a kick-kick. Boom a Tom.

    2. Michael Wojcik Silver badge

      As somebody who works in this field, these conclusions seem unlikely to be true.

      The performance claim is based on running a suite of benchmarks against an emulator. Hard to say what would happen with real silicon, but it's not a complete SWAG.

      The hardware overhead, on the other hand... the paper says 17% increase in area and 26% in power. That may well be acceptable for many applications, but it's a pretty big price to pay for others.

      There is, after all, no free lunch. Resources you take to create shadow structures can't be used for additional sets of primary structures. The paper describes some clever ways to optimize shadowing and reduce the overhead (a naive implementation would make much less efficient use of cache lines), but as you suggest, there's going to be a cost.

      And since this approach requires significant CPU redesign, it's not likely to produce a chip that lets the OS flip a switch - SafeSpec or use-all-resources-for-primary. So CPU manufacturers would have to decide whether they want to cater to security-focused or performance-focused customers. Or manufacture two design families.

      And, of course, we still have other side channels ("other structures would have to be hardened too").

      Don't break out the party hats yet.

  6. Anonymous Coward
    Boffin

    I have a simple plan...

    1. Remove all speculative and predictive nonsense.

    2. Treble the clock speeds.

    3. Make much faster memory.

    My fix is called 'SPECless' and the logo will be a stock photo of a young smiling woman in a low-cut top, staring happily at her mobile, and not wearing glasses!

    This one looks about right - I realise she's showing more teeth than the average horse, but she does look very happy: https://www.istockphoto.com/gb/photo/happy-message-gm694767632-128422759

    Who's with me?

    1. Waseem Alkurdi

      Re: I have a simple plan...

      Are you sure you understand the complications of removal of speculative execution? Increasing clock speeds doesn't solve the problem, nor does making "faster memory".

      Disabling "speculative and predictive nonsense" could easily set the world back to 2005 (or even before that) in terms of computing power.

      1. Anonymous Coward
        Anonymous Coward

        Re: I have a simple plan...

        Are you sure you understand the complications of removal of speculative execution?

        It depends, really...

        1. Waseem Alkurdi

          Re: I have a simple plan...

          It depends, really...

          On what?

    2. Charles 9

      Re: I have a simple plan...

      "2. Treble the clock speeds."

      Intel called. They want their P4 NetBurst CPUs back. If you'll look back, you soon realize you can't just run everything faster. It just doesn't scale. Why do you think CPUs weren't and even today aren't specced much higher than 4GHz in clock speeds? One word: HEAT! The problems Intel had with NetBurst were the reason they had to take a step back to the P3, work smarter instead, and the end result is their current CPU line, the Core series.

      "3. Make much faster memory."

      There isn't much you can do about faster memory anymore without side effects. The biggest obstacles at this point are the speed of electricity combined with cycle lengths. In one nanosecond, electricity can travel, at most, about a foot. And yet you need at least some spacing due to all the heat both the CPU and the RAM inevitably kick up.
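
The "about a foot per nanosecond" figure above can be checked with a few lines of arithmetic. The two constants are exact by definition; the `velocity_factor` parameter is an added assumption to allow for signals in real traces travelling slower than light in vacuum.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre
METRES_PER_FOOT = 0.3048              # exact, by definition of the foot


def max_distance_m(interval_s: float, velocity_factor: float = 1.0) -> float:
    """Farthest a signal can travel in interval_s at velocity_factor * c."""
    return SPEED_OF_LIGHT_M_PER_S * velocity_factor * interval_s


# Upper bound in vacuum for one nanosecond, expressed in feet.
one_ns_ft = max_distance_m(1e-9) / METRES_PER_FOOT

# Upper bound for one clock period at 4 GHz, expressed in centimetres.
one_cycle_4ghz_cm = max_distance_m(1 / 4e9) * 100

print(f"1 ns: {one_ns_ft:.3f} ft, one 4 GHz cycle: {one_cycle_4ghz_cm:.2f} cm")
```

So the vacuum limit is just under a foot per nanosecond, and roughly 7.5 cm per cycle at 4 GHz. A request and its reply have to cover the distance twice, and a velocity factor below 1 shrinks the budget further.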

      1. Bronek Kozicki

        Re: I have a simple plan...

        "3. Make much faster memory" - to be honest, we already have much faster memory, it is called SRAM. Sadly, the power it requires makes it impractical for anything beyond a small amount inside the CPU for caches. I agree that we should invest in research (and productisation) of new types of memory, optimised for very low latency.

      2. ridley

        Re: I have a simple plan...

        Actually in a nanosecond the electrons will have, for all intents and purposes, not moved. Electrons move very slowly in circuits; the effects move very fast.
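
A rough check of the drift-velocity point, using the standard relation v = I / (n * A * q). The 1 A current and 1 mm^2 copper cross-section are assumed example values, and the carrier density is an approximate textbook figure for copper.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19   # exact SI value
COPPER_CARRIER_DENSITY_PER_M3 = 8.5e28  # approximate textbook figure for copper


def drift_velocity_m_per_s(current_a: float, cross_section_m2: float) -> float:
    """Mean electron drift velocity in a copper conductor: v = I / (n * A * q)."""
    carriers_per_metre = COPPER_CARRIER_DENSITY_PER_M3 * cross_section_m2
    return current_a / (carriers_per_metre * ELEMENTARY_CHARGE_C)


# Assumed example: 1 A through a 1 mm^2 (1e-6 m^2) copper conductor.
v = drift_velocity_m_per_s(current_a=1.0, cross_section_m2=1e-6)
distance_in_1ns_m = v * 1e-9

print(f"drift velocity: {v:.2e} m/s, distance in 1 ns: {distance_in_1ns_m:.2e} m")
```

With these inputs the electrons drift at well under a millimetre per second, so in a nanosecond they cover far less than the width of an atom, while the electromagnetic effect propagates at a large fraction of the speed of light.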

    3. Nick Ryan Silver badge

      Re: I have a simple plan...

      With you on the pretty picture or the hardware wish list? :)

      Others have already made comments on the hardware wish list. However something that would help with performance mitigation is processor architectures that have considerably more executable cores, likely with different requirements and capabilities depending on task/process requirements. This would require potentially substantial changes to operating systems and many applications would have to be updated to "request" the appropriate processing requirements but it would allow a good deal of optimisation of use. For example, an audio decoding process would not need a huge amount of CPU power but does require very regular servicing (there is some support for this kind of thing already, but more would be better).

  7. Waseem Alkurdi

    Contradiction?

    From the article (quoting from the source):

    "SafeSpec requires a deep redesign of the CPU to separate out the speculative state from the permanent state," the paper explains.

    and

    Abu-Ghazaleh acknowledged that SafeSpec requires some extra space on the L1 cache, but he considers the hardware changes to be minimal.

    How can both statements above be true?

  8. a pressbutton

    Mandatory coding standards

    ... how is that compatible with 1st amendment rights?

  9. DenTheMan

    Fixed as intended?

    'Intel engineers have acknowledged that pending chip architecture changes will break Retpoline, Google's defense against the indirect branch target injection attacks known as Spectre variant 2'

    This fix is called a Winatack fix?

  10. Nuno trancoso

    Oh woe is me!

    People downvoting AMFM??? This be still El Reg right, not some stuck up hipster YT tech channel?

    1. Michael H.F. Wilkinson Silver badge

      Re: Oh woe is me!

      Could be some newbies

    2. Waseem Alkurdi

      Re: Oh woe is me!

      People downvoting AMFM??? This be still El Reg right, not some stuck up hipster YT tech channel?

      Tell me you are serious.

  11. Torben Mogensen

    Speculative versus parallel execution

    Speculative execution is basically a way to make sequential computation faster. When the processor has to wait for, say, a condition to be decided, it makes a guess as to the outcome and starts working from that guess. If it guesses right, you save time; if not, you both lose time (for clean-up) and waste heat (for doing wasted work). You can try to work on multiple different outcomes simultaneously, but that is more complicated, and you will definitely waste work (and heat).

    Speculative execution relies on very precise predictions, and these cost a lot in resources for gathering and storing statistics and analysing these. The bottom line is that speculative execution is very costly in terms of complexity and energy.

    Another solution is to pause execution until the outcome is known. While this pause lasts, you can have another thread use the execution units. This is called multi-threading, and is usually implemented by having one or more extra copies of all registers and scheduling instructions from two (or more) threads simultaneously. You only execute instructions that are guaranteed to be needed, so there is no speculation. You can even have both threads execute instructions simultaneously, if there are no resource conflicts. The scheduling unit is somewhat more costly, as it has to look at more instructions, but it is not as bad as the complexity of speculative execution. The downside is that each thread does not run faster than if it ran alone on a processor without speculative execution, but the throughput of instructions is likely higher than in that case. If the threads share cache, there is a risk of information spillage, so you generally limit this to threads from the same program.

    The next step is to make multiple cores, each with their own cache. If the memory is protected (and cleared when given to a new process), this can be made safe from leakage, it scales better than multi-threading, and the complexity is lower. This is part of the reason why the trend is towards more cores rather than faster single cores. In the extreme, we have graphics processors: A large number of very simple cores that do no speculation and no out-of-order execution and which even share the same instruction stream. Sequential execution on these is horribly slow, but the throughput is fantastic, as long as you can supply a suitable workload. It is nigh impossible to make C, Java, and similar languages run fast on graphics processors, so you either need specialised languages (https://futhark-lang.org/) or call from C or Java library routines written in very low-level languages and hand-optimised.

    In conclusion, the future belongs to parallel rather than speculative execution, so you should stop expecting your "dusty decks" of programs written in C, Java, Fortran, etc. to automagically run faster on the next generation of computers.

    1. Charles 9

      Re: Speculative versus parallel execution

      Catch is, there are some workloads for which parallelization will never be a solution. For example, there's a reason high-quality video encoding still takes place on the CPU (if not ASICs): the workload can't be run in parallel very well, primarily due to its highly chained and interdependent nature. In essence, the whole process runs in a specific sequence where it's hard to jump ahead because a comparison can easily send the process down a completely different track, with no reliable way to predict which way it'll go. Similarly, many types of emulation can be both very timing-sensitive and very interdependent, meaning things have to run in lockstep to avoid side effects.

    2. Michael Wojcik Silver badge

      Re: Speculative versus parallel execution

      "Another solution is to pause execution until the outcome is known. While this pause lasts, you can have another thread use the execution units."

      You want to have a context switch every time a branch causes a cache miss? That would be a Bad Thing.

      There's a reason why Tomasulo created his eponymous algorithm for OOO in the '60s, and hardware implementations followed shortly thereafter; and commercial spec-ex machines became available in the '80s. (The Stretch did it even earlier, in the '50s, but its misprediction recovery was so expensive that it would have been better off without prediction and speculative execution.)

      There's a reason why nearly 20 years of SpMT research hasn't done a hell of a lot to improve thread-level parallelism.

      And that reason is that parallelism is both difficult and expensive. It's expensive because independent parallel units eat up your hardware budget quickly, and keeping those units fed requires expensive context switches. You can play with the time/hardware trade-offs of context switches (using extra register sets and whatnot), but you're still paying one way or the other.

      There is no practical, cost-free, bumper-sticker solution to Spectre-class problems. It doesn't matter whether that word is "SafeSpec" or "parallelism" or "unicorns".

      1. Torben Mogensen

        Re: Speculative versus parallel execution

        "You want to have a context switch every time a branch causes a cache miss? That would be a Bad Thing."

        It would indeed. But that is not what I am saying. What I am saying is that there is a pipeline of instructions interleaved from two or more threads, each having their own registers. No state needs to be saved, and executing every second instruction from a different thread is no more expensive than executing instructions from a single thread. The advantage is that functional units can be shared, and since independent threads do not have fine-grained dependencies between each other, instructions from one thread can easily execute in parallel with instructions from another.

        This is not my idea -- it has been found in processors for decades (just look for "X threads per core" in specifications). IMO, it is a better approach than speculative execution since it does not waste work (all instructions that are executed will be needed by one thread or another) and it is not considerably more complex than having one thread per core. Note that out-of-order execution is not a problem: That also executes only instructions that are needed, it just does so out of sequence, which requires register renaming, but that is not a huge problem. The main cost is complex scheduling, which increases power use (OOO processors use more energy scheduling instructions than actually executing them).

        What speculation gives that these do not is (potentially) much faster execution of a single thread. But to do so, it uses resources that could have been used to execute instructions that are definitely needed. So it improves latency at the cost of throughput. OOO execution improves both at a cost in complexity and power use, and multi-threading improves only throughput, at a small cost in latency, because the two (or more) threads are given equal priority, so each thread may have to wait for others to stop using functional units.
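
        To put rough numbers on "latency at the cost of throughput", here is a back-of-envelope sketch. Everything in it is an assumption for illustration: the function name, the single-issue core, and the idea that each mispredicted branch costs a fixed stall and throws away a fixed number of executed instructions.

```python
def speculation_costs(n_instr, branch_every, miss_rate, stall, wasted_per_miss):
    """Rough cycle and work counts for stalling vs. speculating.

    All parameters are hypothetical knobs for a single-issue core:
    one branch every `branch_every` instructions, `stall` cycles to
    resolve a branch, `wasted_per_miss` instructions discarded per
    mispredicted branch.
    """
    branches = n_instr // branch_every
    # No speculation: every branch waits for its outcome.
    stall_cycles = n_instr + branches * stall
    # Speculation: only mispredicted branches pay the stall...
    misses = branches * miss_rate
    spec_cycles = n_instr + misses * stall
    # ...but each of them also discards work that was executed in vain.
    wasted_work = misses * wasted_per_miss
    return stall_cycles, spec_cycles, wasted_work


stall_c, spec_c, wasted = speculation_costs(1000, 5, 0.05, 10, 10)
```

        With these made-up numbers the single thread takes 3000 cycles without speculation and about 1100 with it, while roughly 100 executed instructions are thrown away. Those wasted slots are what a second thread could have used for instructions that are definitely needed.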

        1. Claptrap314 Silver badge

          Re: Speculative versus parallel execution

          Two threads per core is perhaps the most natural vector for Spectre-class snooping. More than two...not so much. Note that if the threads are independent, you are chopping the effective size of the caches in half. This can have substantial costs.

      2. Tail Up

        Re: Speculative versus parallel execution

        Michael, howdy,

        that would be British Telecom, if you wish.

        Shout it out of your Window, link a YT video in another msg to add some spicy flavour to the dish, and - voila.

  12. Serif

    I'm going to use up my stupid question of the day quota here...

    Am I correct in thinking that if speculative execution obeyed memory access restrictions (no user process peeking at lower ring address spaces like the kernel) regardless of whether or not that memory is cached, then these problems would go away?

    1. amanfromMars 1 Silver badge

      Re: I'm going to use up my stupid question of the day quota here...

      Am I correct in thinking that if speculative execution obeyed memory access restrictions (no user process peeking at lower ring address spaces like the kernel) regardless of whether or not that memory is cached, then these problems would go away? ... Serif

      Serif, Hi,

      Obey would suggest some sort of recognised and universally accepted absolute authority with inalienable rights to direct matters as they would choose/wish. Such is systemically problematical and therefore will never be a solution for consideration.

    2. Michael Wojcik Silver badge

      Re: I'm going to use up my stupid question of the day quota here...

      "Am I correct in thinking that if speculative execution obeyed memory access restrictions (no user process peeking at lower ring address spaces like the kernel) regardless of whether or not that memory is cached, then these problems would go away?"

      No.

      The Spectre vulnerabilities use side channels to extract information. They don't "peek[] at lower ring address spaces".

      Meltdown is a Spectre variant that leaks privileged memory, and the (obvious) fix for it was to prevent spec-ex from crossing privilege boundaries. But Meltdown is only one of many Spectre variants.
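
      A toy simulation of the side-channel idea, as a sketch only: the victim reads its own memory, so no privilege boundary is crossed, and the secret is recovered purely from which cache line ends up fast. `ToyVictim`, `leak_byte`, the `mispredict` flag and the 10/200-cycle latencies are all invented for illustration; a real attack trains a branch predictor and times real loads.

```python
class ToyVictim:
    """Victim whose public array and secret share one address space."""

    def __init__(self, public, secret):
        self.memory = list(public) + list(secret)   # byte values 0..255
        self.public_len = len(public)
        self.cached = set()    # probe-array lines currently in the cache

    def lookup(self, x, mispredict=False):
        """Bounds-checked read. `mispredict` models the predictor
        wrongly guessing that x is in bounds."""
        if x < self.public_len:
            value = self.memory[x]
            self.cached.add(value)     # dependent load touches line `value`
            return value
        if mispredict:
            # Speculated path: the same loads run and leave a footprint...
            self.cached.add(self.memory[x])
        # ...but the result is squashed once the bounds check resolves.
        return None

    def access_time(self, line):
        # Hypothetical latencies in cycles: cached is fast, uncached is slow.
        return 10 if line in self.cached else 200


def leak_byte(victim, offset):
    victim.cached.clear()                           # "flush" the probe array
    result = victim.lookup(victim.public_len + offset, mispredict=True)
    assert result is None                           # nothing returned architecturally
    timings = [victim.access_time(line) for line in range(256)]
    return timings.index(min(timings))              # the fast line is the byte


victim = ToyVictim([1, 2, 3], b"key")
leaked = bytes(leak_byte(victim, i) for i in range(3))
```

      The out-of-bounds call never returns the secret, yet `leaked` ends up holding it. Every access in this model is one the victim is allowed to make, so enforcing ring permissions would not change the outcome.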

  13. cliffe

    Strange Characters

    Can a longtime El Reg veteran please educate me about a few special characters here.

    What's the deal with amanfromMars 1 and bombastic bob?

    1. Anonymous Coward

      Re: Strange Characters

      MERE NEWCOMERS .... BRING BACK B1FF!

    2. amanfromMars 1 Silver badge

      Re: Strange Characters

      Can a longtime El Reg veteran please educate me about a few special characters here.

      What's the deal with amanfromMars 1 and bombastic bob? ... cliffe

      Is the Deal or No Deal Situation/Virtual Reality here and with you, cliffe .... Proof Positive of Advanced Virtualised Application with NEUKlearer HyperRadioProACTive Applications Presenting Guaranteed Futures for AIMaster Pilot ProgramMING.

      Beware in Future Territories though, and be also Aware that Live Operational Virtual Environments are never a Tool for Schooling Fools. LOVE Space Places are Heavenly HeadQuarters for True Kings and Queens with Bounty Aplenty and Interminable to Share ..... Invest and Spend Sagely/Right Royally :-)

      1. Anonymous Coward

        Re: Strange Characters - Some Subsonic ASCII

        "There is a Sound

        It's Under Ground"

    3. Michael Wojcik Silver badge

      Re: Strange Characters

      Just some of our local kooks. What has the deal ever been with online kooks? They are a species unto themselves.

      Personally, I'm glad to have them, as long as they don't become too disruptive. They give the dish some flavor.

      Hell, I even enjoyed Eadon. ("Overwhelming, am I not?")

      1. amanfromMars 1 Silver badge

        When Drowning in Choppy Waters ...... Call for Sublime Help and Surreal InterNetional Rescue

        Both local kooks and alien spooks are a stealthy lawless species whenever invasive is not too disruptive and right tasty, Michael Wojcik, and would you agree that they are a creative force and novel future intelligence source whenever their space places are targeted for military intervention and domination ....... Trump Orders Pentagon To "Immediately Establish A Space Force"

        Do you think UKGBNI MOD leaderships have skin and effective reins for reigns in that Particular Peculiar Great Game? Or are present chiefs of staff and the current minister in office all as at sea in a leaky boat without AI See/Learned Virtual Machine Vision, which you should note is not shared here as a question, for of course they are.

        1. Anonymous Coward

          Re: When Drowning in Choppy Waters ...... Call for Sublime Help and Surreal InterNetional Rescue

          Trump Orders Pentagon To """Immediately Establish A Space Force""" -

          Making Greenbuck Weak Again. Crazy Printers Order Thousands More Tons of Paper.

  14. Claptrap314 Silver badge

    These academics need some fresh air

    It only took them six months to publish a finding which is only a partial implementation (and solution) to what I commented here the same day that this information became public? (Nothing special about me--anyone with a few years or more of experience in that field would certainly have had the same ideas.)

    I stand by my original (same day) estimates: if shadowing is to be implemented without substantially hurting speed/power, it's going to take roughly 50% of the processor area, at least out to the L1s. Deep fully-associative pools of resources are really, really expensive & slow.
