HPE Again?
HPE == Huge Problems Everywhere/Everytime
Why do people still fall for this shell of a once-great company, either to work for or to do business with?
HPE + CSC == Customers Shafted Everytime
Not that the alternatives are much better.
HPE's crack repair squad has laboured for over four days to replace kit at Australia's Taxation Office, with no guarantee that the Office's online services would be back online come Monday. HPE and the Australian Taxation Office's (ATO's) troubles started in mid-December 2016, when some ATO services went offline. The ATO named …
You do understand what an array is, yes?
-----
A highly available set of hardware that centralises storage onto a single platform.
It may be clustered and HA, but it means you've traded isolation for efficiency all the same.
Each server having its own storage means that you won't get this type of systemic failure. You'll have other problems, sure, but certainly not this.
"Each server having its own storage means that you wont get this type of systemic failure. You'll have other problems, sure, but certainly not this."
This is a pipe dream! So you put storage in each node and lash it all together with software; what happens when the software fails?
There is an old saying: good, fast, cheap; you can pick two, and you won't get the third!
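To put rough numbers on the trade-off being argued about here, a back-of-the-envelope sketch in Python (the failure rates are made-up assumptions for illustration, not vendor figures):

```python
# Illustrative sketch: blast radius of one shared array vs per-server storage.
# The probabilities below are assumptions picked for the example.

servers = 10
p_array_outage = 0.001   # assumed yearly chance the shared array goes down
p_disk_outage = 0.01     # assumed yearly chance one server's own storage dies

# Shared array: a single failure event takes every service down at once.
p_all_down_shared = p_array_outage

# Per-server storage: failures are independent; everything is down only
# if every server's storage fails at the same time.
p_all_down_isolated = p_disk_outage ** servers

# ...but the chance that *something* is broken is much higher with
# per-server disks, because there are more things to fail.
p_any_down_isolated = 1 - (1 - p_disk_outage) ** servers

print(f"Shared array, everything down:  {p_all_down_shared:.6f}")
print(f"Per-server, everything down:    {p_all_down_isolated:.2e}")
print(f"Per-server, at least one down:  {p_any_down_isolated:.4f}")
```

Per-server storage trades one big systemic failure for lots of small, frequent ones; and the moment you lash those disks together with software, that software layer becomes the new shared failure domain anyway.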
http://www.computerweekly.com/news/450303913/Insurance-brokers-count-cost-of-lost-business-as-SSP-SaaS-platform-outage-enters-second-week
That one was originally said to have been caused by a power problem, but it went on for ages.
Is the HP kit or support actually the issue, or is it that there is so much of it about that there's more chance of a high-profile story cropping up? A bit like how, when a 'self-driving' electric car has a problem, it seems more often than not to be a Tesla?
That's funny. I hear that HPE is under a gag order from the ATO's lawyers and isn't allowed to say anything about what happened (but the ATO can say whatever they want), and I overheard that the ATO's implementation and DR plans were, let's say, lacking at best, and that is ultimately what caused the extended outage.
So perhaps you should check in with HPE and get their statement about what really happened and why there is a gag order.
The different private corporations that ran ATO's IT systems never had a "DR plan".
When I worked for one of them, I remember a paper-based "DR exercise" that consisted entirely of people sitting around a large round table shuffling "what if" scenarios to one another. Everything was theoretical. No one, including the ATO, had the guts to pull the power lever on a row of racks, because people were terrified their >10-year-old servers might not survive.
re: "I overheard that ATO's implementation and DR plans were - lets say lacking at best and that is ultimately what caused the extended outage."
Don't know if you posted before or after the article was updated, but the final quote directly supports what you heard:
"Our focus will now turn to building system resilience to best ensure the stability of our services to the community," the Office says.
Which would also suggest why they are having problems getting things back to normal: they are having to work on the production system with little or no fallback, with all that implies...
Everything I've heard about this case since it began in December suggests it's the ATO who didn't have any real DR, who had screwed up the architecture big time, and who didn't even have backups that could be restored in less than a month. And it's HPE who gets the continuous bashing for what are obviously end-customer faults. They have shown extreme patience: there haven't even been leaks to the media saying what this really looks like, and their only comment, that the "problem is not of a kind that has any probability to happen anywhere else", strongly suggests to me that the cause was on the customer/architecture side, not in the hardware. My internal translator tells me they wanted to say that no one else is stupid enough to do such things. I'm not going to say HPE are saints; it is, after all, their hardware that failed, but (sorry for stating the obvious) there's no hardware on Earth that never fails. If the ATO thought they had some, it only adds to their complete lack of competence.
The 3PARs are very good arrays, and, if you know what you're doing, you can just about guarantee five-nines (99.999%) uptime. However, the biggest threat to uptime is designing a solution to a price point rather than to a required capability. I have seen salesmen remove redundancy from SAN proposals (and that's all the major vendors, not just HPE) to win the deal, hiding a disclosure in the proposal along the lines of "there is a very small chance that, in the event of several unusual circumstances, the system will fail". A shelf dying on a properly configured 3PAR should be an issue, but it should not lose data. The failure to have a proper backup solution implies there is something very wrong in the ATO's IT department.
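To put "five nines" in concrete terms, here is the standard availability arithmetic (just a sketch, nothing 3PAR-specific):

```python
# Quick sketch: what an availability figure actually buys you in
# downtime per year. Pure arithmetic, no vendor-specific numbers.

MINUTES_PER_YEAR = 365.25 * 24 * 60

for nines, availability in [(3, 0.999), (4, 0.9999), (5, 0.99999)]:
    downtime_min = MINUTES_PER_YEAR * (1 - availability)
    print(f"{nines} nines ({availability:.5f}): "
          f"{downtime_min:8.2f} minutes of downtime per year")
```

Five nines gives you a budget of roughly five minutes of downtime a year; a multi-day outage like the ATO's burns through decades' worth of that budget in one go.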