Monday: Intel touts 28-core desktop CPU. Tuesday: AMD turns Threadripper up to 32

Re: Intel was fudging

Wow, they needed a chiller to (fail to) match a stock IBM mainframe CPU clockspeed!

10
0
Silver badge

Re: Intel was fudging

Yes, but even IBM has backed off from pushing the clock speed to add more parallelism.

The Power6 processor had examples being clocked at 4.75GHz, but the following Power7 clock speed was reduced to below 4GHz (but the number of SMT threads went from 2 to 4, and more cores were put on each die, again 2 to 4). Power8 kept the speed similar, but again increased both the SMT and cores per die.

In order to drive the high clock speeds in Power6, they had to make the processor perform in-order execution of instructions. For most workloads, putting in more execution units, reducing the clock speed, and putting out-of-order back into the equation allowed the processors to do more work, but could be slower for single-threaded processes.

The argument about compiler optimization really revolves around how well the compiler knows the target processor. Unfortunately, compilers generally produce generic code that will work on a range of processors in a particular family, rather than a specific model, and then rely on run-time hardware optimization (like OoO execution) to actually use the processor as well as it can be used.

In order to get the absolute maximum out of a processor, it is necessary to know how many and what type of execution units there are in the processor, and write code that will keep them all busy as much of the time as possible. Knowing the cache size(s) and keeping them primed is also important. SMT or hyperthreading is really an admission that generic code cannot keep all of the execution units busy, and you can get useful work by having more than one thread executing in a core at the same time.

I will admit that a very good compiler, targeting a specific processor model that it knows about in detail, is likely to be able to produce code that is a good fit. But often the compiler is not this good. You might expect the Intel compilers to reflect all Intel processor models, but my guess is that there is a lead time for the compiler to catch up to the latest members of a processor family.

I know a couple of organizations that write hand-crafted Fortran (which generates very deterministic machine code - which is examined) where the compiler optimizer rarely makes the code any faster, and is often turned off so that the code executes exactly as written. This level of hand optimization is only done on code that is executed millions of times, but the elimination of just one instruction in a loop run thousands of millions of times can provide useful savings in runtime.
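For a rough sense of scale, here is a back-of-envelope sketch in Python. The iteration count, the one-cycle cost per instruction and the 4GHz clock are illustrative assumptions, not figures from any real code, and a superscalar core will not map instructions to cycles this neatly.

```python
def seconds_saved(iterations: int, instructions_removed: int,
                  cycles_per_instruction: float, clock_hz: float) -> float:
    """Estimate run time saved by trimming instructions from a hot loop.

    Assumes every removed instruction costs a fixed number of cycles,
    which is a simplification on out-of-order hardware.
    """
    cycles_saved = iterations * instructions_removed * cycles_per_instruction
    return cycles_saved / clock_hz


# Hypothetical figures: 5 thousand million iterations, one instruction
# removed, one cycle per instruction, 4 GHz clock.
saved = seconds_saved(5_000_000_000, 1, 1.0, 4.0e9)
print(f"{saved:.2f} s saved per run")
```

Under those assumptions the saving is a little over a second per run, which only becomes interesting when the same code is run many times a day across many cores.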

As long as an organization believes that hand-written code delivers better executables, it may justify the expense of doing it. It's their choice, and a generalization about the efficiency of compiler-generated code is not a reason to stop when faced with empirical evidence. Sometimes, when pushing the absolute limits of a system, you have no choice but to make the code as efficient as possible using whatever means are available.

13
0

Re: Intel was fudging

Yeah, but I was talking about mainframe, not Power. Okay, only two threads per core (for speciality engines), but they're clocked at 5.2GHz. Yes, the z14 has built-in water coolers, but these are just that - water coolers - not chillers. The previous z13 ran at 5GHz and the one before that (zEC12) at a staggering 5.5GHz.

4
0
Silver badge

Re: Intel was fudging

I think you would be surprised about how closely related the Power and Mainframe processors are nowadays.

With the instruction set micro- and millicoded, the underlying execution engines rely on some very similar silicon.

Oh, and there have been relatively recent Power6 and Power7 water-cooled systems, the 9125-F2A and -F2C systems, but only a relatively small number of people either inside or outside of IBM will have worked on them (I am privileged to be one of them). These were water-cooled to increase the density of the components rather than to push the ultimate clock speed. The engineering was superb.

And... they were packaged and built by Poughkeepsie, next to the zSeries development teams, and used common components like the WCU and BPU from their zSeries cousins.

There was no Power8 system in the range, because of the radical change to the internal bus structures in the P8 processor. I don't know whether there will be a Power9 member of the family, because I'm no longer working in that market segment.

3
0
Bronze badge

Re: Intel was fudging

When I was at IBM, the compiler team was really proud that their code could outperform gcc by 15%. Nobody liked it when I pointed out that their code became available 18 months after silicon came out, and that intervening hardware performance improvements were more than that...

But, yes. Preservation of pain applies. Unless the compiler has the data to make a near-cycle-accurate simulation, there will be significant amounts of performance left on the table. As we are hitting the wall for MOSFET, however, expect compilers to start focusing more on this.

1
0
Anonymous Coward

Perhaps AMD will be kind enough

to swap out and replace my defective (insecure) 16-core ThreadRipper for a nice new shiny 32-core one without the security bugs?

4
21
Silver badge

Re: Perhaps AMD will be kind enough

Oh for fuck's sake. I'm getting tired of the bitching around here regarding "oh noes my CPU is not secure!!"

You know the stuff that you're talking about is (for the most part) ludicrously complex and obscure. The CPU designers have had a shake and realised that this is a valid attack vector. They will now proceed to figure out ways of plugging these holes, and we'll all be happy if they can do it without dropping speed. The milk has been spilt, and the tearful little boy has been told to be careful in future - there's no point in dragging the fucking thing on and on. I really hope you don't have kids before you learn this.

Also, I will bet pounds to pennies that when you went CPU shopping you had a pretty short list of requirements:

1) Will it run my software?

2) Is it cheap enough for me?

3) Is it the fastest I can get for the money?

4) Is it going to burn my house down or make the electricity meter spin like a top?

5) Will it work with all the other bits I've got?

I bet security did not even factor into it for even one second. But now there's a public hoo-ha? Give it a rest.

33
2
Anonymous Coward

Re: Perhaps AMD will be kind enough

to swap out and replace my defective (insecure) 16-core ThreadRipper for a nice new shiny 32-core one without the security bugs?

You should have actually bought Intel (which is actually more "insecure").

Then installed Windows 10 OEM.

From some vendor which delivers Superfish with it.

6
0

Maths co-processor?

Does it have one like my DX or do I need to fill the empty socket like my friend's SX?

Kids today don't appreciate the difficult choices we had to make.

42
0
Silver badge

Re: Maths co-processor?

Get yer mobility scooter off my lawn whipper-snapper.

You did not have to try to find a coprocessor that will survive the blazing 25MHz of main CPU provided by the Harris 286C on a VLSI with memory interleaving.

24
2
Silver badge

Re: Maths co-processor?

Harumph. Some of us had to pry out our ARM2 CPUs and put ARM3 daughterboards in to get the FPA socket before we could even think of floating point copros... With the speed of FPEm, is it any wonder I learned to use integers with a liberal sprinkling of LSL and LSR?
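For anyone who never had to do it, the integers-and-shifts trick looks roughly like this. It is a minimal Python sketch of 16.16 fixed-point arithmetic, not anything lifted from real ARM code: the format, the helper names and the non-negative-only restriction are all assumptions made for illustration, and Python integers never overflow the way a 32-bit register would.

```python
FRAC_BITS = 16            # assumed 16.16 format: 16 integer bits, 16 fraction bits
ONE = 1 << FRAC_BITS      # the value 1.0 in fixed point (a left shift, LSL)


def to_fixed(x: float) -> int:
    """Convert a non-negative float to 16.16 fixed point."""
    return int(round(x * ONE))


def to_float(a: int) -> float:
    """Convert 16.16 fixed point back to a float for display."""
    return a / ONE


def fixed_mul(a: int, b: int) -> int:
    """Multiply two fixed-point values, then shift right (LSR) to rescale."""
    return (a * b) >> FRAC_BITS


def fixed_div(a: int, b: int) -> int:
    """Shift the dividend left (LSL) first so the quotient keeps its fraction bits."""
    return (a << FRAC_BITS) // b


product = fixed_mul(to_fixed(1.5), to_fixed(2.25))
print(to_float(product))
```

Addition and subtraction need no shifts at all, which is much of the appeal when the alternative is a software floating point emulator.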

Oh, and can I borrow that mobility scooter? My UPS needs new batteries.

15
0
Silver badge
Trollface

Re: Maths co-processor?

There was no need. If it started to glow red hot, we could any time just hit the turbo button to scale it back down...

12
0
Silver badge

Re: Maths co-processor?

All this worrying about daughter boards when you should have just bought a computer built with expandability in mind, like my Amiga with its internal expansion port that lets me plug in a board with a faster CPU and an FPU on the same board!

(Or these days you'd whack in a Vampire accelerator, which uses an FPGA to emulate a 68060 faster than anything Motorola ever built, along with all the gubbins necessary for hi-res output via HDMI. Shame they're almost impossible to find in stock.)

11
0
Thumb Up

Re: Maths co-processor?

Thumbs-up for getting a reference to the Vampire accelerator series in this discussion. :)

6
0
Silver badge

Re: Maths co-processor?

a computer built with expandability in mind, like my Amiga

Don't get me wrong, the Amiga was indeed an awesome computer in its day. My Archimedes could do most of what the Amiga did in hardware, but in software. But that's only most. The thing that (to my understanding) broke the Amiga was that Workbench had some nasty bugs on memory allocation. It wouldn't check if the memory was available before allocating it (or something like that), so you had to verify it and constantly micromanage malloc(). So people just dropped Workbench and programmed to the metal.

So far so good, but when the metal changed (AGA, anyone?) half the bloody software stopped. The software worked perfectly on only one iteration of the hardware. So when better silicon became available (and the support chips in the Amiga had some fairly tight memory limits, for example), they just couldn't deploy it without breaking the user base. So whilst it may have been expandable in some directions, it wasn't easily upgradable.

At least when the MEMC1a allocated RAM it was your bloody RAM to use. :D

Shame really. I never had an Amiga myself, but sometimes I'd quite like one. And most of the best Archimedes games were Amiga ports :)

5
0
Anonymous Coward

Re: Maths co-processor?

Well, the Amiga was OK, I guess.

But the Atari ST had a higher clock speed.

3
3
Silver badge

Re: Maths co-processor?

"But the Atari ST had a higher clock speed"

Shame it lagged behind in every other respect...

What do you mean we should have stopped having these Amiga vs Atari arguments twenty years ago (or more)?

10
0
Silver badge
Windows

Re: Maths co-processor?

25MHz? Some of us had a blast with doubling up our throughput with a 2MHz to 2.48MHz replacement crystal.

3
0
Silver badge
Pint

Re: Maths co-processor?

>My Archimedes could do most of what the Amiga did in hardware, but in software.

Speaking of Acorn... The BBC micro had "Tubes", daughterboards for external CPUs like a Z80, ARM or even x86.

One for the Acorn folks whose computer started my career -->

8
0

Re: Maths co-processor?

"Shame really. I never had an Amiga myself, but sometimes I'd quite like one. And most of the best Archimedes games were Amiga ports :)"

There is a stand-alone Vampire V4 coming out some time... hopefully in 2018. Fully upgraded with 24 bit graphics, 512MB memory, 16 bit audio, AMMX, fast FPU, HDMI output.... 68080 processor (equivalent to ~250MHz 68060). Very cool indeed. You might want to keep an eye out.

1
0
Silver badge

Re: Maths co-processor?

Speaking of Acorn... The BBC micro had "Tubes"

The Tube was an external bus so expansions weren't really daughterboards (except in the Master, where there was an internal as well as an external connector), but it was a great system and the thing that really set the Acorns (6502 and ARM-based) apart from other computers of the day was what today we'd call the API. Software which correctly followed the API could be ported from machine to machine, from base configuration to co-processor and (aside from any actual assembler) would "just work".

As for FPU etc, the Archimedes did have the "podule" bus - wasn't it possible to fit an FPU in the first slot of that, or am I mis-remembering?

Then of course, the RiscPC. I still use one every day. Not just a podule bus, but a co-processor slot allowing the fitting of two processors. I had a '486SX in the slot next to my original ARM600, then I swapped the 600 for a StrongARM.

There was even a daughterboard which fitted into the processor slot and could hold five processor boards, and something called "Kinetic" which I coveted...

M.

2
0
Silver badge

Re: Maths co-processor?

The Tube wasn't even really a bus. It was a fast synchronous I/O port that kept the original BBC micro running, but as a dedicated I/O processor handling the screen, keyboard, attached storage and all the other I/O devices the BEEB was blessed with, while the processor plugged into the Tube did all of the computational work without having to really worry about any I/O. All of the documented OSCLI calls (which included storage, display and other control functions) worked correctly across the Tube, so if you wrote software to use the OSCLI vectors, it just worked.

When a 6502 second processor was used, it gave access to almost the whole 64KB of available memory, and increased the clockspeed from 2MHz to 3MHz(+?) IIRC. Elite was written correctly, and ran superbly in mode 1 without any of the screen jitter that was a result of the mid-display scan mode change (the top was mode 4 and the bottom was mode 5 on a normal BEEB, to keep the screen down to 10KB of memory). Worked really well, and even better with a BitStik as the controller.

I also used both the Acorn and Torch Z80 2nd processors, and I know that there were Intel 80186 running DOS, NS32016 running UNIX (used in the Acorn Business Computer range) and ARM 2nd processors built as well.

1
0

Great, but time for windoze to catch up.

How about lifting the Windows 10 32 MMCSS thread limit eh Microsoft? What use are all those cores to professional musicians if we can't use them?

10
0
Anonymous Coward

Re: Great, but time for windoze to catch up.

>How about lifting the Windows 10 32 MMCSS thread limit eh Microsoft?

As it's Windows 10 SAAS of course you can, we'll be making the in-OS purchase available through the Microsoft store soon.

Have a nice day, kerching.

14
1
Anonymous Coward

Re: Great, but time for windoze to catch up.

Oh, you want a new feature? Fine!

We'll do the dev up front for nothing and we'll sell the product through the store so only those that want it pay.

What an evil, evil thing to do.

1
5

Re: Great, but time for windoze to catch up.

"Oh, you want a new feature? Fine!

We'll do the dev up front for nothing and we'll sell the product through the store so only those that want it pay."

Er, the multimedia thread limit was INTRODUCED with Windows 10 - previous versions had no such limit - so it looks like Microsoft took something away without asking and now want to charge us to get the functionality back.

18
1
Anonymous Coward

Re: Great, but time for windoze to catch up.

>so it looks like Microsoft took something away without asking and now want to charge us to get the functionality back.

Game of Solitaire or Minesweeper anyone ?

8
0

Can anyone explain more about AMD's inter-die fabric? How fast is it, and what are the latencies between each die and between the level 1, 2 and 3 caches?

2
0

An article on Ryzen Gen 1 is available at Tom's Hardware.

https://www.tomshardware.com/reviews/amd-ryzen-threadripper-1950x-game-performance,5207-2.html

4
1

Depends upon your memory

The clock speed of the Infinity Fabric is set to that of the memory it's controlling. The fabric on VEGA will run to 1300 but I haven't tried higher yet as the HBM struggles to get much past 1250 stable, still looking at over 600GB/s at that speed.
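The "over 600GB/s" figure can be sanity-checked with a quick sum. This is only a sketch in Python, and it assumes a 2048-bit HBM2 interface (two stacks, as on a Vega 64) moving data twice per clock; neither number is stated in the comment, and the function name is made up for the example.

```python
def hbm_bandwidth_gb_s(clock_mhz: float,
                       bus_width_bits: int = 2048,
                       transfers_per_clock: int = 2) -> float:
    """Peak memory bandwidth in GB/s for a given memory clock.

    bus_width_bits and transfers_per_clock are assumptions
    (2048-bit HBM2, double data rate), not measured values.
    """
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = clock_mhz * 1e6 * transfers_per_clock
    return transfers_per_second * bytes_per_transfer / 1e9


for mhz in (945, 1250):
    print(mhz, "MHz ->", round(hbm_bandwidth_gb_s(mhz), 1), "GB/s")
```

On those assumptions a 1250MHz memory clock works out at 640GB/s of peak bandwidth, which fits the figure quoted.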

More on IF here

https://en.wikichip.org/wiki/amd/infinity_fabric

1
0
Silver badge
Coffee/keyboard

Xeon was great. So was coal.

Awesome, that made me laugh.

20
0
Silver badge

Xeon was great. So was coal.

I was wondering if that was also a slight dig at the Trumpster.

11
1
Silver badge
Windows

@Chris G.

I don't recall Xeons shipping in orange boxes. But that could just be an unknown unknown.

2
0

"Whether Intel was trying to spoil AMD's announcement, or vice versa, it doesn't really matter: what matters is that someone is putting pressure on monopoly giant Intel, forcing it to innovate in the desktop world rather than burn millions of dollars on trendy side projects it later abandons or neglects."

This here is why I love AMD, who cares what they produce* as long as they keep Intel (relatively) honest.

*I love AMD as much as the next person, no fanboi downvotes please ;)

10
0
Silver badge

Meh

20+ core CPUs have been around for a while now. But there's a reason hardly anyone uses them, and we've been stuck at 2-4 cores for consumer parts for so long - most regular tasks just don't benefit from being massively parallel. Indeed, many programs still only run on a single thread because there's so little benefit from trying to use more. Even for workstations there are very few cases where throwing as many cores as possible at a task is actually the best approach; you're almost always better off with fewer, faster cores, and will often end up limited by RAM or drive I/O anyway.
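The usual way to put a number on this is Amdahl's law. The Python sketch below is only an illustration: the 50% parallel fraction is an assumed figure, not a measurement of any real program, and it ignores the RAM and I/O limits mentioned above.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Best-case speedup when only part of the work can be spread over cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Assumed workload: half of the run time can be parallelised.
for cores in (4, 8, 32):
    print(cores, "cores:", round(amdahl_speedup(0.5, cores), 2), "x")
```

With half the work stuck on one thread, going from 4 cores to 32 moves the best-case speedup from 1.6x to about 1.94x, and no number of cores can take it past 2x.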

People complain about Intel not increasing cores until pushed by AMD, but there were 6 and 8 core i7 parts around nearly a decade ago - no-one actually wanted them so they stopped making them. Meanwhile Xeons have been happily in the teens and into the 20s, and even those were never anywhere near as popular as those in the 10-12 core range. AMD haven't pushed Intel to do anything useful, they've just started yet another pointless willy-waving exercise where everyone tries to boast about how big their number is with no regard for whether it's actually useful. Which can be clearly seen by the mention of gamers - there isn't a game on the market that will actually benefit from having a 32 core CPU (for the most part they're GPU-limited anyway and as long as you're not doing something stupid like pairing a budget mobile CPU with a 1080 Ti they won't even notice what CPU you have), it's just a big number for people who don't understand what they're doing but want to have a big number.

4
9
Silver badge

Re: Meh

Well, there's at least one game that has been, is, and will be for the indeterminate future* definitely CPU-limited - a certain space sim in development**. Nothing else you throw at it can make it run as anything other than a slide show. Of course, whether it would benefit from a 32-core CPU or not is anyone's guess - but I'm more than happy to test it if you sponsor me with a test rig...

* amazingly, it's scheduled to deliver some semblance of a solution to this issue exactly at the same time as El Reg switches to IPv6: "Soon". Or, reportedly, two releases from now which, considering their track record regarding deadlines and promises, is actually precisely equivalent to "Soon" down to at least twenty decimals.

** geologists reportedly found evidence somewhere below the Permian layer indicating that the game wasn't always in development; based on this, several cults - widely shunned by the scientific community at large generally agreeing that the Sun will go nova first - believe the game will likewise exit development stage some day, achieving what they call "Release". The subject remains poorly explored after several ethnographic expeditions setting out to study the particularly vicious attitude of these groups towards non-cult-members have gone missing to date, never to be heard about again.

3
0

Re: Meh

IO is the major strength of the Threadripper setup, giving 64 PCIe 3.0 lanes on the gen1. Not seen how many gen2 will give but it could be 128, as you get 16 per Ryzen chip and gen2 has 4 of the beauties.

5
0

Re: Meh

Love the BIG numbers and also love the smaller $ numbers this competition brings about.

1
0
Silver badge

Microsoft will come to the party sooner or later - they'll have a "basic" version of their product restricted to 10 cores.

Want access to an extra 10 cores? Buy an extra licence. etc etc etc...

5
1

So we need a 28+ core CPU in a desktop to make up for the reduction in performance due to Meltdown and Spectre mitigation? Nice!

5
0
Bronze badge
Facepalm

But ... I don't want more cores.

These days, I try to build silent PCs. I'm tired of my desk sounding like an airport with planes taking off every few minutes. I need to rock out to the rhythmic beats of my clackity-clacking mechanical keyboard. My priorities are electrical input, thermal output, and PCIe lanes first, GHz second, number of cores last. Which means that right now, both AMD and Intel have serious issues preventing me from having a preference.

This whole number of cores race is like the stupid GHz race of years past. In the end, no one wins because CPUs get "optimized" into becoming their own bottlenecks.

And whatever happened to adding more execution units and/or making complex executions require fewer cycles? Who cares about the number of sleeping cores and unused cycles you have?

10
2
Silver badge

Re: But ... I don't want more cores.

Heh, reminds me of something back in the OS/2 days - somebody soundproofed their PC - then put a resistor over the terminals of the CPU fan to keep it slow as OS/2 doesn't cook your CPU.

Was a really quiet PC.

Nowadays it is like McDonald's farm, a fan-fan here, a fan-fan there all fan-fan-fanning.

You get fans with LEDs and all sorts of blinkenlights. Why?

And some components (esp RAID cards) with passive heatsinks really get hot, I prefer to have these cooled with active cooling. More noise. Plus dust is the bane of fans.

In the end it may be better to just get a bar fridge, stick everything inside with a couple of fans and have a really quiet PC (until the fridge compressor kicks in). And have a filter inside which'll filter the air continuously.

4
0
Silver badge
Holmes

Re: But ... I don't want more cores.

> And whatever happened to adding more execution units

Isn't that about adding moar coares?

> and/or making complex executions require fewer cycles

How.

> Who cares about the number of sleeping cores and unused cycles you have

As with programming languages, "it depends on the problem".

2
0

Re: But ... I don't want more cores.

If you merely require a desktop computer for admin tasks, OpenOffice etc, just use a Raspberry Pi! No noise if you run off a large flashcard or SSD.

1
0
Anonymous Coward

Fuck Everything, We're Doing Thirty-Two Cores.

I like how Gillette showed themselves to be beyond parody by actually introducing a five-bladed razor a couple of years after that was written. "Make the blades so thin they're invisible. Put some on the handle. I don't care if they have to cram the fifth blade in perpendicular to the other four, just do it!" Stick it on the back? Yeah, that's good.

9
0
Anonymous Coward

Reminds me of the 1950s USA cars that tried to outdo each other in adding fins etc.

1
1

Intel because they run cooler and die less often.

0
11

This is another 10GHz Pentium 4!

I find it virtually impossible to believe that Intel will be able to make a 28 core CPU run at 5GHz on all cores within 6 months, they can't even do that with 6 cores today. And if it was being cooled by liquid nitrogen this would be exceptionally misleading. I see only 4 explanations for this announcement:

1) It is a misunderstanding or vapourware, maybe he meant by the end of the decade which I still wouldn't believe.

2) They use a totally different type of core which isn't very complex but can clock quite high, kind of like Larrabee.

3) The TDP is like 1000 Watts and you need specialist high flow water cooling system to run it which comes pre-bonded direct to the die.

4) Intel has been sandbagging us that 10nm wasn't working properly and it's actually a magical node that defies the known laws of thermodynamics.

My money is on this being vapourware, with Intel just trying to get people's attention in the hope that anyone waiting for this to come out is not buying AMD CPUs.

6
0
(Written by Reg staff) Silver badge

Re: This is another 10GHz Pentium 4!

The 28-core Core X-series part that's due out this year is 14nm, and will likely be a de-featured Xeon, and will not run at 5GHz unless massively overclocked.

Given Intel's other SKUs, 28 cores isn't ridiculous - but it smells like a stunt to steal the thunder from Threadripper 2.

C.

6
0

Should AMD have waited for 7nm?

I am loving the specs of this chip but I'm a little worried that firstly, do I have any programs that will get close to using all those cores and secondly, what will they clock at (once thermal equilibrium is reached) if they are all at 100% load?

Also, if they are not coming out until Q3 this means 7nm versions will not be out anytime soon. Given 7nm Epyc is almost ready to sample I would have estimated that 7nm Threadripper could have been released Q1 2019 at a push, but I can't see them coming out until Q3 2019 now with these 12nm parts being so late to the party, bearing in mind a Q3 release could mean they are still almost 4 months away from hitting the shelves.

Personally I would have preferred to wait a little while longer and get a much cooler, much higher clocked 7nm version. I guess if 7nm wouldn't be ready until Q2 / Q3 regardless then the 12nm stop gap is a good move, but if this chip has delayed 7nm TR by 6+ months then I would say that was the wrong move by AMD.

0
1


The Register - Independent news and views for the tech community. Part of Situation Publishing