Intel's Atom C2000 chips are bricking products – and it's not just Cisco hit

Intel's Atom C2000 processor family has a fault that effectively bricks devices, costing the company a significant amount of money to correct. But the semiconductor giant won't disclose precisely how many chips are affected nor which products are at risk. On its Q4 2016 earnings call earlier this month, chief financial officer …

COMMENTS

Monday 6th February 2017 22:40 GMT Andy Tunnah

I smell an oscar

Bang up detective and reporting work. All you need now is a script and a role for Mark Ruffalo

32 0 Reply
1. Monday 6th February 2017 23:00 GMT AmenFromMars
  
  Re: I smell an oscar
  
  Agreed, good work.
  
  4 0 Reply
2. Monday 6th February 2017 23:45 GMT Mephistro
  
  Re: I smell an oscar
  
  And another thumbs up for the article!
  
  5 1 Reply
Monday 6th February 2017 22:49 GMT Your alien overlord - fear me

"A board level workaround exists" that involves us paying each other millions as bonuses for not paying attention to Quality Control. The rest of the world can just sit there wondering if their kit will boot tomorrow.

29 0 Reply
1. Monday 6th February 2017 23:29 GMT Anonymous Coward
  
  :) I see what you did there. Boredroom level workaround, indeed!
  
  I think when they say "stepping" they mean the next iteration of the chip. Which means new chips and boards all 'round, unless you are handy doing super-fine, surface mount soldering with your trusty micro-iron! The workaround sounds like strapping the line to another clock source, and Low Pin Count sounds like the processor is not fully enabled yet so just provide a clock to get us out of POST and hand over control to the boot loader and everything will be fine. Also, no way to fix with microcode. Very nasty problem given the 18 month wait period before it manifests!
  
  Good on El Reg for putting the clues together!
  
  14 0 Reply
  1. Tuesday 7th February 2017 14:32 GMT Lennart Sorensen
    
    The work around means adding some resistors to the design, which on most boards is not something you can just do, since this is a clock line. You can't just add wires and resistors since that would mess with the clock signal. So it is either change the board design to add the resistors, or wait for the next version of the chip (which will probably take months to happen). Of course since the chips are soldered to the board (not a socket), they are not easy to replace either.
    
    3 0 Reply
    1. Tuesday 7th February 2017 15:47 GMT Anonymous Coward
      
      adding resistors
      
      "The work around means adding some resistors to the design"
      
      Apologies if I missed something obvious - but is this stated officially anywhere?
      
      [It's entirely plausible, it's just one of the many failure modes possible when digital designers forget they live in an analogue world]
      
      3 0 Reply
Monday 6th February 2017 22:57 GMT Spotswood

Uh oh

We own several Synology DS1815+ devices, each with about 24TB capacity and currently quite full of data. They use the Intel Atom C2538, which is listed as an SoC containing this fault. These are well over 12 months old and therefore approaching the 18-month danger zone.

This is obviously very concerning.

I hope Synology are ready to help us.

17 0 Reply
1. Tuesday 7th February 2017 00:24 GMT Chris King
  
  Re: Uh oh
  
  pfSense also sell boxes based on C2358/C2758 processors.
  
  2 0 Reply
2. Tuesday 7th February 2017 02:06 GMT Nate Amsden
  
  Re: Uh oh
  
  Looks like those have 3 year warranty so would be surprised if they didn't fix it..but maybe you have to wait for them to fail.
  
  1 0 Reply
3. Tuesday 7th February 2017 08:06 GMT John Smith 19
  
  "I hope Synology are ready to help us."
  
  Wouldn't it be an idea to call them first?
  
  8 0 Reply
4. Tuesday 7th February 2017 08:08 GMT Anonymous Coward
  
  Re: Uh oh
  
  Don't worry. Just put all your data in the cloud.
  
  9 1 Reply
5. Tuesday 7th February 2017 10:21 GMT Synology UK
  
  Re: Uh oh
  
  Hello,
  
  JP here from Synology UK. Unfortunately we didn't receive the contact here, so we apologise that no one has responded. Please contact our technical support team via www.synology.com/ticket and they will be able to advise you on the best course of action.
  
  Kind regards
  
  JP
  
  15 0 Reply
  1. Tuesday 7th February 2017 12:46 GMT Anonymous Coward
    
    Re: Uh oh
    
    Hello,
    
    AC here from Synology UK. We hold all our open tickets and customer details on our own award-winning DS1815+ devices, so will get back to you in 18 months and 1 day.
    
    Kind regards
    
    AC.
    
    9 3 Reply
  2. Tuesday 7th February 2017 15:30 GMT Chris King
    
    Re: Uh oh
    
    Dear JP
    
    Posting boilerplate messages to random threads doesn't inspire me with confidence in Synology's approach to this problem.
    
    Right now, I've got a shiny DS1815+ that's humming along very nicely, but I'd really like to know if this thing has one of the affected processors so I can plan accordingly. I understand that RAID Is Not Backup, so losing access to my storage won't hurt me, but it will make my life less convenient.
    
    This is an inherent fault in the manufacture of a component, identified by the manufacturer of that component. I'd really like to know what Synology (and other manufacturers who are reading this discussion - CK) are going to do to rectify it.
    
    If your attitude is "we'll wait for your kit to die, then we'll replace it" then your box is going back to where I bought it from - and I will make very sure that anyone who asks me for advice on buying a NAS knows that.
    
    Toodle Pip,
    
    CK
    
    5 1 Reply
6. Tuesday 7th February 2017 13:35 GMT Chris 244
  
  Re: Uh oh
  
  Probably not a good idea to have faith in Synology. They are still promoting said products on their website:
  
  https://www.synology.com/en-us/products/DS1815+#spec
  
  And the resellers the Synology website sends you to are still reselling:
  
  http://www.ncix.com/detail/synology-ds1815-diskstation-8-bay-diskless-71-103588.htm?promoid=1721
  
  1 1 Reply
7. Tuesday 7th February 2017 14:41 GMT Chris King
  
  Re: Uh oh
  
  Double jeopardy for me... "If my 1815+ crashes, no worries, stuff is available elsewhere in verified backups. I can always build myself a U-NAS box as a replacement and I've got just the motherbo-oh crap, it's a C2758 !"
  
  Seems 2016 was not a great year for hardware purchases for the home lab.
  
  7 0 Reply
8. Thursday 9th February 2017 01:06 GMT ryan_c
  
  Re: Uh oh
  
  I have a Synology 1815+ that we purchased about 10 months ago. In December I started having random reboots, but they weren't caught right away because notifications were not set up correctly and the unit boots so quickly. Towards the end of December I noticed the problem as the frequency had increased quite a bit. I did some searching around and it sounds like quite a few DS1815+ units have bad power supplies. I called up Synology and they confirmed that my issue was a bad power supply and swapped it out. The transfer to the replacement DS1815+ couldn't have been easier. My point with this long-winded comment is that the forum post referenced in this article most likely describes faulty power supplies, which is concerning but not the same issue.
  
  0 0 Reply
Monday 6th February 2017 23:04 GMT Planty

This will show up the good vendors vs the bad ones.

Cisco are top of the good list, and so far the only entry. Any other takers? Or does everyone else think nobody will notice?

Personally sick of companies that sweep it under the carpet and hope nobody notices. Don't they realise that in the internet era nothing can really be covered up? Mass product faults that the manufacturer hopes nobody will notice won't wash anymore... (Panasonic AllPlay, calling you out here...)

8 1 Reply
1. Tuesday 7th February 2017 12:00 GMT Anonymous Coward
  
  Re: This will show up the good vendors vs the bad ones.
  
  > Personally sick of companies that sweep it under the carpet and hope nobody notices.
  
  Recently had this with a camera. A Fuji randomly shutting down. Reading various forums suggested it was a common issue and that the lens (built-in) was the cause. After a struggle contacting Fuji UK, they said there was no problem with the camera or the lens, but to send it back. They replaced the lens, which resolved the issue. Which, of course, didn't exist.
  
  Why do corporates not take the high ground of admitting clearly to a problem and then resolving it? There's far more to gain and much less to lose that way.
  
  15 0 Reply
  1. Tuesday 7th February 2017 13:15 GMT Nick Ryan
    
    Re: This will show up the good vendors vs the bad ones.
    
    "Why do corporates not take the high ground of admitting clearly to a problem and then resolving it? There's far more to gain and much less to lose that way."
    
    Why? Litigation society, that's why. It's usually believed that if a company admits a problem then they are admitting liability and opening themselves up to litigation. Which can/will be expensive. Best to err on the side of caution and to never admit to anything. Ever.
    
    See also: "Dark ages" or "why nothing of great importance happened because much of Europe was concerned with pointless legal matters and why external input was required"
    
    7 0 Reply
    1. Wednesday 13th July 2022 01:34 GMT Auntie Dix
      
      Re: This will show up the good vendors vs the bad ones.
      
      Doing the right thing does not get a company sued, typically, but the right thing means putting into a remediation chest BIG MONEY.
      
      CEO scumbags are playing games ("We won't name the perp!"), because the law won't prosecute this hiding of information. Their mansions are safe, while your Intel-crippled Synology box croaks and you lose your money, time, and data.
      
      Companies ("persons," under insane U.S. law) will lie, unless severe penalties are at the ready. Think of automobile recalls. Those CEO scumbags have (some) regulation over their slimy heads.
      
      0 0 Reply
Monday 6th February 2017 23:07 GMT Adrian 4

Oh no, not again

Back in the dawn of PC time, I worked on an 8086 machine designed before everyone was expected to copy IBM. The 8088 and 8086 used a special intel clock driver, the 8284. We had loads of problems with them not oscillating properly ..

8 2 Reply
1. Tuesday 7th February 2017 08:31 GMT Dwarf
  
  Re: Oh no, not again
  
  Perhaps someone should have introduced Intel to OpAmps, they ALWAYS oscillate, even when you don't want them to !
  
  13 0 Reply
Monday 6th February 2017 23:12 GMT Aitor 1

Crap support

The problem with this is they will simply start dying.. and my guess as many other commentards say is that most vendors will do a la la la, and ignore clients. As they are set to lose plenty of money.. unless intel compensates them.

What SHOULD be done is vendors sending new devices with the corrected processor, so the old ones are returned and either scrapped or refurbished.

This is quite bad news.. and potentially crippling for Intel, not just for the money, but for the lack of confidence. People might just feel more confident putting in an Nvidia SoC than an Intel one!

9 0 Reply
1. Tuesday 7th February 2017 00:27 GMT a_yank_lurker
  
  Re: Crap support
  
  Actually the vendors might have a very strong civil suit against Chipzilla for delivering defective products. The vendors are caught in the middle as the ultimate miscreant, Chipzilla, is a direct supplier. So the customer harasses/sues the vendor, who in turn harasses/sues Chipzilla.
  
  Note, do not scrimp on QA/QC because the few bucks you save up front will eventually come out of your hide with a very serious multiplication factor.
  
  9 1 Reply
2. Tuesday 7th February 2017 08:16 GMT Anonymous Coward
  
  Re: Crap support
  
  You're right. Companies don't spend money on customer support or service any more; that cash is instead split as follows: 85% to the board, 14.5% to marketing and making the website pretty and 0.5% to an offshore team to run the customer twitter account ('can you be typing in your number of customer and bank account and identification of order with a quickness kind sir, and I or they or we will be back with you with a perfect answer in hours of plenty').
  
  We will find out that most tech companies don't understand what they sell at all, and are just a change of logo, a website and a hefty dose of BS. It won't be an easy lesson.
  
  13 1 Reply
Monday 6th February 2017 23:20 GMT Anonymous Coward

Now would be a good time to buy ARM shares......damit.

15 0 Reply
1. Tuesday 7th February 2017 10:47 GMT Peter Danckwerts
  
  Pity it's too late to buy ARM shares. It was bought by SoftBank last year.
  
  1 1 Reply
  1. Tuesday 7th February 2017 11:24 GMT Anonymous Coward
    
    >Pity it's too late to buy ARM shares. It was bought by SoftBank last year.
    
    Hence the Damit. Do keep up at the back, no offence intended.
    
    8 0 Reply
Monday 6th February 2017 23:23 GMT Anonymous Coward

I worked at NetApp when they encountered the PCI/NMI error, where a substandard adhesive caused controllers to throw up protection faults and panic. I have never seen so much effort go into Cover Up, Playing Down, Case Manage and Control Communications (inside as well as outside the organisation).

The Company went into full damage control mode, so concerned about reputation that the technical fault itself became a secondary issue. For NetApp only a few thousand systems were affected, yet they couldn't keep up with producing/refurbishing the number of fixed boards required. It took months to years to fix the last customers.

Now imagine Intel with millions of C2000s and most of them on SoCs.

I can tell you this:

If you are a large customer with a large vendor (e.g. a large Cisco customer) you get fixed first. Cisco say they would prioritise systems by operational age, but that's BS. Customers get prioritised by the size of impact and potential for negative press. Therefore large telcos will come first. Cisco wants to avoid negative press at all costs. "ISP or Mobile Carrier went down due to faulty Cisco gear" would affect a lot of people and generate a lot of negative press.

If you are a small'ish vendor of C2000 systems -or - you are a customer of those systems - you are screwed!

That hot potato will stay in your hands until the large vendors and customers are fixed. Next comes the medium businesses and finally the guys at home with their Synology NAS' come last.

The reason you don't hear a thing from your vendor - is not because they're unaware of the issue - it's because they're developing strategies to minimise their costs. And sorry - they don't give a shit about you (the customer) and the fact that your gear (or business) may fail at any time.

31 0 Reply
1. Monday 6th February 2017 23:49 GMT Anonymous Coward
  
  Been there done that.
  
  As a vendor there is only so much you can do.. and doing a Samsung means going broke.
  
  A BGA resolder properly done can go to 400$ a piece.. so it makes no sense to do it on synologys...and yet hey, there is your data.
  
  We have a Synology as a single point of failure in our company, just for internal use and replication. While we do have a backup of it (well, 2 to be precise), it will be a nuisance to say the least.
  
  1 0 Reply
  1. Tuesday 7th February 2017 15:52 GMT Doctor Syntax
    
    Re: Been there done that.
    
    "A BGA resolder properly done can go to 400$ a piece.. so it makes no sense to do it on synologys...and yet hey, there is your data."
    
    So just swap the whole processor board.
    
    0 0 Reply
    1. Tuesday 7th February 2017 16:50 GMT Anonymous Coward
      
      Re: So just swap the whole processor board.
      
      I'm not familiar with the NAS boxes in question, but as well as swapping the processor board, wouldn't another option be to swap the hard drive(s) to a similar-enough NAS box that wasn't implicated in this affair?
      
      The valuable-to-customers bit here is probably the data not the hardware, right?
      
      Just askin' (apologies if it's a daft question).
      
      2 0 Reply
      1. Tuesday 7th February 2017 23:44 GMT Doctor Syntax
        
        Re: So just swap the whole processor board.
        
        "Just askin' (apologies if it's a daft question)."
        
        Not a daft question. I'm not familiar with the product.
        
        If the drives are nothing but data and the whole thing is driven by firmware on the processor board then it would be a tad difficult. It would depend on being able to find an alternate device with sufficiently similar firmware which would be entirely down to the software being generic. Without going off & researching that I've no idea whether it is or whether it's proprietary.
        
        If the drives have an OS on them then it would depend on the OS including the right drivers. There's always a problem, even with general purpose OS's, of having support for newer or even older hardware.
        
        Short answer, "similar-enough" might not exist.
        
        0 0 Reply
2. Tuesday 7th February 2017 10:12 GMT Anonymous Coward
  
  Completely agree about the cover-up
  
  Just like the cases of flaming Ford Kuga's (check news in New Zealand and South Africa)
  
  0 0 Reply
  1. Tuesday 7th February 2017 13:37 GMT Anonymous Coward
    
    Re: Completely agree about the cover-up
    
    "Just like the cases of flaming Ford Kuga's (check news in New Zealand and South Africa)"
    
    The problem only occurs there because the Kuga was never designed to run upside down.
    
    9 0 Reply
    1. Tuesday 7th February 2017 14:11 GMT Paul Kinsler
      
      Re: The problem only occurs there because the Kuga was never designed to run upside down.
      
      And isn't reported in Australia because it gets blamed on bushfires instead. :-)
      
      4 0 Reply
  2. Tuesday 7th February 2017 23:57 GMT Anonymous Coward
    
    Re: Completely agree about the cover-up
    
    "Just like the cases of flaming Ford Kuga's (check news in New Zealand and South Africa)"
    
    That was the Voice Control System committing suicide after hearing the accent! :)
    
    2 0 Reply
3. Tuesday 7th February 2017 14:38 GMT Lennart Sorensen
  
  The only fix so far is to change your own board to add the workaround. New chips don't exist yet so no one is getting those until they exist. So everyone is at their own mercy about how long it takes to change the board design and get new boards made, or they can wait for the new chips and hope for the best in the mean time. Doesn't matter if you are Cisco or some tiny company. Of course I suspect Cisco might very well be able to get a new board revision design made a lot faster than the little guys.
  
  2 0 Reply
4. Tuesday 7th February 2017 17:47 GMT a_yank_lurker
  
  @AC
  
  Given the actual screwup is Chipzilla, the vendors in many cases do not have any real options until Chipzilla figures out how to fix their mess. Then Cisco can start fixing/replacing gear; they do not have any inventory of good chips. Right now there is no gear except for known defective gear to push out. Cisco has the luxury of nailing Chipzilla with a knockout punch and probably will go after them.
  
  1 1 Reply
Monday 6th February 2017 23:46 GMT Anonymous Coward

I remember the NetApp PCI/NMI error. Internally they called it the PCI/Enema and everybody had a good laugh.

When facing the customer the sales guys pretended not to know anything about it. Actually not just sales, but the entire leadership team, all the way up.

10 0 Reply
Tuesday 7th February 2017 01:01 GMT Herby

So when do I short CSCO/INTC stock??

Given that this seems to happen after 18 months, one might want to calculate the time of first failure, and watch the stock go down. It could get interesting.

Of course, one wonders WHY the failure manifests itself after 18 months. Is there some flash component that gets used to determine elapsed time? We know the symptoms of the failure, but not the actual root cause (other than a bad chip design, DUH!).

In any event, not an easy re-work. BGAs are almost impossible, Surface mounts can probably be done in the field, but I wouldn't. Time will tell how this is handled (good, bad, terrible).

Me? No, I don't own any INTC/CSCO stock.

2 0 Reply
1. Tuesday 7th February 2017 07:39 GMT Richard 12
  
  Re: So when do I short CSCO/INTC stock??
  
  Semiconductors of all types wear out over time, as the doping drifts - mostly due to thermal effects, so hotter parts fail faster.
  
  Package pins are connected to the silicon by really tiny wires that can snap, eg under the stress of warming up or cooling down.
  
  There are other failure modes such as insulation breakdown, overvoltages and many more.
  
  It only takes a small miscalculation or manufacturing error to turn a chip with a theoretical 50-year MTBF into a chip with an 18-month MTBF.
  
  It sounds like this failure may only matter at boot, if true then a device left running will keep going even after the failure - it just won't boot again.
  
  It is a shame that Intel is saying nothing about the failure rate. Could be 1%, or even 90%. Given the lack of info, it's probably quite high.
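
  To put rough numbers on the MTBF comparison above, here is a minimal sketch. It assumes a constant failure rate (the simple exponential lifetime model), which is a simplification: a wear-out fault like this one would not follow it exactly, and Intel has published no failure-rate figures. The function name is made up for illustration.

```python
import math


def fraction_failed(age_years: float, mtbf_years: float) -> float:
    """Expected fraction of units that have failed by a given age.

    Assumes a constant failure rate (exponential lifetime model).
    This is an illustrative assumption, not a vendor-published figure.
    """
    return 1.0 - math.exp(-age_years / mtbf_years)


# Compare a 50-year MTBF part with an 18-month (1.5-year) MTBF part,
# both evaluated at 18 months of service.
for mtbf in (50.0, 1.5):
    failed = fraction_failed(1.5, mtbf)
    print(f"MTBF {mtbf:>4} years: {failed:.1%} failed after 18 months")
```

  Under that assumption, roughly 3% of the 50-year parts would have failed at the 18-month mark, against roughly 63% of the 18-month parts. Note that MTBF is a mean, not a guaranteed lifetime, so even the "good" part loses a few units early.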
  
  9 1 Reply
2. Wednesday 8th February 2017 16:08 GMT bogd
  
  Re: So when do I short CSCO/INTC stock??
  
  Funny you should mention stock value - this is the actual title of an article published today: "Intel Is on a Roll After a Difficult Spell, So Buy the Stock Now"
  
  Unfortunately, I cannot post the link, but here is a nice quote:
  
  "...the quarter also solidified 2016 as a comeback year for the Silicon Valley company.
  
  For years, Intel has tried to break into the mobile-phone business. Last year, it finally secured a deal with Apple to provide chips for the iPhone 7."
  
  Quite funny in context, eh? :)
  
  1 0 Reply
Tuesday 7th February 2017 01:11 GMT Anonymous Coward

Cheating Software ?

Perhaps intel's planned obsolescence team has made a mistake and set the thresholds too low?

This should be investigated. Could be the next VW.

4 0 Reply
Tuesday 7th February 2017 02:13 GMT Anonymous Coward

2017 is the new Millennium Bug.. !

0 0 Reply
Tuesday 7th February 2017 03:33 GMT weekend

My NAS build uses an ASRock c2750di and mysteriously stopped working several months back. I was blaming ASRock as there are a lot of complaints about that motherboard failing.

Would there be any way to find out if it's because of Intel or if it's an unrelated fault?

I can't afford to have such an expensive board break again, and each time I try to come up with a new build that can handle as many HDDs I get carried away and things get expensive... So that machine is still not replaced.

1 0 Reply
Tuesday 7th February 2017 03:36 GMT abortnow

Aargh!

I have two potentially affected boxes:

iXsystems FreeNAS Mini

CPU: Intel(R) Atom(TM) CPU C2750 @ 2.40GHz (2400.06-MHz K8-class CPU)

Nothing yet in the FreeNAS forum.

Netgate pfSense SG-2220 firewall

CPU: Intel(R) Atom(TM) CPU C2338 @ 1.74GHz (1750.04-MHz K8-class CPU)

User comments and questions already present in pfSense forum. No response yet from Netgate.

Plus the FreeNAS Mini XL I have on order (8-(

Very annoying that this quite expensive kit should have such a problem. Thanks Intel. Some of us have not yet forgotten the Pentium FDIV saga.

4 0 Reply
Tuesday 7th February 2017 03:56 GMT ecofeco

Rut Roh

Well... smeg.

3 0 Reply

The Register Biting the hand that feeds IT

Situation Publishing

Copyright. All rights reserved © 1998–2024