Re: no backup of the schedule?
And in typical Microsoft programming, it does make you wonder why it doesn't check if it's a system partition FIRST before it even poses the question.
A serious error committed by an "inexperienced operative" caused the IT meltdown which crippled the RBS banks last week, a source familiar with the matter has told The Register. Job adverts show that at least some of the team responsible for the blunder were recruited earlier this year in India following IT job cuts at RBS in …
From AC 26th June 14:57 to AC 26th June 18:49
"Oh, really now? Quick Robin, to the CMD prompt!"
Hook, line and sinker....
Yes, that's right, Windows doesn't allow you to format the partition you booted from; in order to format the C: drive you need to boot from a different system partition. But then we all knew that, didn't we?
You've made an assumption: you assumed that what I described won't work, and you also assumed that I hadn't left out any lines like "boot from an alternative drive".
This is how system cock-ups happen. Don't assume things; check you are in possession of all the facts before you act. You made an assumption: FAIL, yours. You don't happen to work on CA-7, do you?
I don't think the scheduler itself was at fault. Most likely some poor sod omitted vital step(s) from the EOD runs, or the scheduling table got corrupted during the upgrade.
Have you ever worked in a complex environment where multiple systems feed data to each other? No, I thought not! Even once you have recovered from the changes to the scheduler, you would have to restore the missing input feeds, massage header files, etc.
This has to be a concerted effort for all production systems up/down stream from the host the EOD batch runs from.
Even under controlled DR environments with known input files it is usually 2 days effort with months of planning.
To make matters worse, if you realise you missed something 2 nights ago, you have to repeat this concerted effort for every EOD run since.
So please, before criticising the people who are recovering the systems and transactions, engage your brain.
Recovering a standalone box is so easy, I would not even shed a single bead of sweat over it.
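The point about upstream and downstream feeds can be roughly sketched in code. This is purely illustrative: the system names and the feed map below are invented for the example, not anything known about RBS's real estate, and it assumes Python 3.9+ for `graphlib`. It only shows why a missed night forces every system to be re-run, in dependency order, for each night since.

```python
from graphlib import TopologicalSorter

# Hypothetical feed map: each system lists the upstream systems whose
# output files it needs before its own EOD batch can run.
FEEDS = {
    "payments": [],
    "ledger": ["payments"],
    "statements": ["ledger"],
    "reporting": ["ledger", "payments"],
}

def recovery_plan(feeds, missed_nights):
    """Return (night, system) steps: oldest night first, upstream before downstream."""
    order = list(TopologicalSorter(feeds).static_order())
    return [(night, system) for night in missed_nights for system in order]

plan = recovery_plan(FEEDS, ["night-2", "night-1"])
for night, system in plan:
    print(night, system)
```

Even in this toy version, two missed nights across four systems means eight batch runs, each of which needs its input feeds restored first.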
"Why, precisely, does one mess-up by one employee in front of one computer put your ENTIRE BANKING SYSTEM out of action, nationwide?"
It's all about management controls.
Whilst it is sometimes not possible to build in management controls that prevent a single operator nuking a system (though many times it is possible, and they just can't be arsed to build them in), in this situation you use a checklist. Do not think, do not use initiative, follow the checklist to the letter.
There is a reason pilots use them.
They work just as well for operations and application support.
And if they can't follow a checklist, you shouldn't have hired them for that job.
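For what "follow the checklist to the letter" might look like when enforced by tooling rather than by trust, here is a minimal sketch. The `Checklist` class and the step names are made up for illustration; it is not how any particular scheduler or change-management product works.

```python
class Checklist:
    """Accepts sign-offs strictly in order; no skipping, no improvising."""

    def __init__(self, steps):
        self.steps = list(steps)  # step names, in the order they must be done
        self.done = []            # steps signed off so far

    def sign_off(self, step):
        if len(self.done) == len(self.steps):
            raise RuntimeError("checklist already complete")
        expected = self.steps[len(self.done)]
        if step != expected:
            raise RuntimeError(f"out of order: expected {expected!r}, got {step!r}")
        self.done.append(step)

    @property
    def complete(self):
        return self.done == self.steps


# Hypothetical upgrade checklist; the operator tries to skip a step.
upgrade = Checklist(["take backup", "verify backup", "apply upgrade", "run smoke test"])
upgrade.sign_off("take backup")
try:
    upgrade.sign_off("apply upgrade")
except RuntimeError as err:
    print(err)
```

The operator cannot get to "apply upgrade" without first signing off "verify backup", which is the whole point of the exercise.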
It's not about "Management Controls". In fact this kind of woolly-headed, high-level thinking is the reason why many investment banks are filled with "service delivery" departments while lacking the sort of teams that can actually deal on a technical level with this sort of crisis. Where I work we have a "service delivery" department, and their sole function appears to be to be obstructive whenever we have an overnight problem. The end result is people who know what they're doing hacking the sh!t out of stuff to make it work, while the "management" tier do their best to make it look like they made some contribution to the process - and then they take the credit.
Vendor or in-house, at 3 in the morning you need people available (IT, BAs and devs) who understand the systems and can just make it process the OVN. This RBS situation is the perfect example - technical people within RBS probably knew exactly what needed to be done, but a load of managers probably demanded "process" and spent at least 12 hours relying on scheduler ops teams. For this reason several thousand "payday" customers (people who need their payday money) were skint for several days. All because of crap managers who talk more about "service delivery" and "process". I hate these woolly c***s; what exactly do they do?
As a senior member of a team who runs a front office system that shifts literally billions per day in base and precious metals, I expect freedom where something needs fixing, and I've done nothing to demonstrate that I'm not capable of working in that capacity. If you run with outdated systems, give the poor sods who keep them running absolute authority. I'm not Jerome Kerviel, nor are any of my hard-pressed colleagues. The bank I work for pay me to run these systems, and I will, to the utmost of my ability - moreover, if processing fails my bosses have my mobile number, and at weekends I'll be the first to pitch in, on call or not. RBS didn't seem to have this escalation process, from what I've seen.
It's service delivery's job to get in the way until everyone is satisfied that all the information is processed. They are basically a fallback, because we all know the documentation is not up to date. It also turns out that "getting things working" is not worth risking other revenue/services for.
The problem is worse with consolidated systems. I suspect more money is lost long term on management of consolidated systems than is gained by saving a few million on additional hardware.
A more fundamental question to be asked here (speaking as someone who has provided Production support for several banks and seen most of the enterprise schedulers in action, excluding mainframes) is why on earth was the original CA-7 upgrade being performed during the working week?
Changes of this nature should be performed at the weekend so there's some breathing space if things do go pear-shaped.
Of course the fact that they went pear-shaped in the first place is likely down to inadequate testing/preparation for the upgrade itself combined with less than "expert" staff being used to actually perform the upgrade.
Well of course you won't find evidence if you don't go out looking for it!
This is worse than incompetence; this is 'bury your head in the sand' wilful ignorance. They made experienced staff redundant, and when they gave those responsibilities to someone with little experience of the system, they cocked it up.
Indeed. But if he won't accept that it is the fault of RBS' inept offshoring in pursuit of a few shekels, then presumably he still believes that exporting UK jobs is a good way of rewarding the country that bailed out him and his fellow charlatans and incompetents. Maybe the BBC could ask him this directly, although so far their coverage of this has been rather lightweight.
Anyway, let's take it forward that offshoring is a grand thing, in which case the obvious thing is to give Hester and his fat cat cronies an Indian salary, since they're so keen to run an Indian bank.
The statement is logically correct: there is no evidence this is connected to outsourcing.
However, as with all management bullshit statements, it is not a full and true picture of events. Had Uncle Fester Hester said "there is no evidence this is connected to outsourcing to a company that employs the cheapest inexperienced people available in a country where staff routinely change jobs for small pay rises at 1 day's notice", then I might have had a problem with what he said, not that he cares what I think anyway.
To quote Rockhound (Steve Buscemi) in the film Armageddon: "You know we're sitting on four million pounds of fuel, one nuclear weapon and a thing that has 270,000 moving parts built by the lowest bidder. Makes you feel good, doesn't it?"
Once is unlucky, twice is a coincidence. It's only after the third major crunch that someone would start asking questions. However, nobody in the bank would ever dare say anything in public - there's too much hysteria about racism to open that particular can of worms.
Though the important question would be: Can Natwest's customers survive another 2 outages?
"too much hysteria about racism"
This could/would have happened just as easily if it had been outsourced within the UK (EDS/HP, CSC, Atos, Logica, etc.). It is about the loss of experienced staff.
TUPE would move the experienced staff initially, but history shows they have little loyalty to either employer and find themselves new jobs relatively quickly. The supplier is then left recruiting inexperienced "cheaper" labour to fulfil the contract. THAT IS HOW OUTSOURCING MAKES A PROFIT FOR THE SUPPLIER. Outsourcing NEVER improves service.
Rockhound in Armageddon: "You realize we’re sitting on 45,000 pounds of fuel, one nuclear warhead and a thing that has 270,000 moving parts built by the lowest bidder? Makes you feel good doesn’t it?"
I wonder if the senior executives and accountants at RBS Group do their banking through Barclays?
> This could/would have happened just as easily if it had been outsourced within the UK .... It is about the loss of experienced staff.
Yes, that's what I mean. And if it had happened with UK outsourced staff, the topic of overall skill level would be openly discussed in the press. However, I get the impression that the silence on the question (article in The Daily Mash notwithstanding) is because people are too scared to broach the subject for fear of being labelled, even if they don't have a position on it one way or the other.
"I wonder if the senior executives and accountants at RBS Group do their banking through Barclays?"
I worked for Natwest in the 90s for a brief period and they would only pay your salary to a Natwest account (you may have realised that this effectively means they're not paying you at all until you remove the money as it never leaves the bank!). No matter what the rules now, I expect a lot of their own staff were screwed over by this too. Probably not at board level, I guess they'd have access to their ample offshore accounts...
(The HR woman at the time said there was a time when not only would they only pay into Natwest accounts but they wouldn't allow you to have any other account aside from a building society savings account - your line manager actually had access to your statements to check you weren't moving out money to a rival bank!)
Outsourcing is not the root cause of the problem. However, because outsourcing builds on the same flawed model already in place, it is often the last straw.
The underlying problem is that most enterprise IT works only because computers have been replaced by humans working like an insect colony, essentially an old-school organic computer.
In that sense you are correct, as the processes and knowledge exist only in aggregate at the level of the colony and only by continual crawling manually over the infrastructure and code base by the workers does the system keep working and the knowledge stay alive. Insert workers from another colony or hive off (pun intended) some of the work and the inevitable will follow.
IT management know nothing about IT, which is why manual processes have been substituted for the automated ones of previous decades. This reversion to Victorian practices is further compounded by the breaking of any meaningful academic foundation of enterprise IT in the 21st century.
This looks like it must be true; I've just seen a comment on the Guardian web site by an RBS bod: "The management and execution of the batch process is based in Edinburgh at Fettes Row, as is all the current work to resolve the problem."
All the work is now going on in Fettes Row in Scotland. That wouldn't be India at all, then?
Backbone? Guess they don't mean "having the guts to admit to making mistakes and having not properly understood the risks involved with outsourcing".
This situation really does sound like someone pretty high up in the executive chain responsible for operations needs to be fired. And I mean "fired" rather than helped into a taxi holding a massive payoff cheque.
In the unlikely event that anybody is fired, there's every chance that they will be (in relative terms) quite a junior scapegoat, and the real villains will continue to scrape large pay packets and larger bonuses.
To my mind the bank's head of risk needs to go, the head of IT needs to go, the head of the India operation needs to go (and quite a few people below him), and possibly somebody from operations. But what are the chances that they try to pin it on, say, the head of UK retail banking, who would probably have had little or no say in how the back office was designed, or in the decision to offshore it?
@AC12:28 The head of risk needs to go? I've worked in large banks all my career in settlements, risk, and trading. The head of risk would have had fuck all say in when the scheduler got upgraded and how it was done. In a normal bank with all its experienced staff grand plans would have been made with layers of sign-off. But even then the head of risk would have little say - business sign-off is often a case of "Will it break? Nah. Are you certain? Sure. Ok then". When the key men get booted out and the outsourcers arrive things tend to get handled a little differently.
This was about lack of experience and cost cutting. That something so integral to the running of the business could be fucked up so monumentally is a can that will need to be carried by the head of IT (minimum) and potentially someone above him (Head of operations? Hester?). You cannot bring a bank to its knees for a week and just walk around with an executive equivalent of "shit happens".
Fired is inadequate.
When an engineer wilfully neglects to design to the accepted standards of his profession and people are killed by the collapse of the resulting structure, he's likely to find himself facing manslaughter charges.
The manager responsible for this almighty F***-up ought to be personally liable for the losses. All of them. Bankruptcy is the least that should happen to him. Jail would be better.
Many Roman bridges and even buildings are still standing after two millennia of use including one of total neglect. This probably has something to do with the Roman approach to quality control. The architect was required to stand under the arches as the scaffolding was removed.
You know better, manglement will always escape punishment.
They will always find a new job.
To send the responsible damager on his way to that new position with a new employer, I suggest use of one of these:
http://en.wikipedia.org/wiki/Trebuchet
Put the mangler in where the rock would have normally been located.
PULL!!!!!!!!!!
So the long-term solution to a problem caused by replacing experienced people with inexperienced people, is to replace the current, experienced executives - who are now as full of information on the perils of employing inexperienced staff as the dog that peed on the third rail - with new, shiny, fresh out of their MBAs executives who will repeat the same errors made by their forebears?
Surely a better solution is to fire no-one, but to re-hire (at great expense) as many of the experienced staff they originally sacked as they can find, while slashing the remuneration of the responsible parties, but keeping them in-house now that they have learned this rather expensive lesson?
Good idea. Though perhaps re-examine their remuneration, as their evidenced value to the organisation isn't what it was thought to be.
It would also be worth following the chain and finding out why the outsourcing decisions were made in the first place, who applied the pressure, and why they pushed for something so obviously risky (we hear the cries every time there is outsourcing that something like this will happen; well, here we are). And definitely evaluate their value to the company and their remuneration.
I think many people are conflating two issues because they are so often interlinked, and that's offshoring and outsourcing. I can't speak from any experience of RBS back office, but although we know that RBS do do captive offshoring (ie own employees), they also make extensive use of outsourced deals, both on and offshore:
http://articles.economictimes.indiatimes.com/2009-10-07/news/27646604_1_rbs-chief-executive-royal-bank-technology
Where exactly the blame for this current fuck up truly lies I don't know, but the finger seems to point very strongly offshore. As soon as management start believing spotty faced expensive management consultants, they fall in love with the idea that every job (apart from their own) is a globally portable commodity, and we end up with these types of disaster.
And the "low wage" is a big part of the problem. If the offshore staff are any good they'll move on to better paid jobs (ie no experienced staff in your offshore centre), or they'll demand regular and generous pay rises (ie wave goodbye to the savings that McKinsey/BCG/Accenture/KPMG or whoever promised you). If they're really good and motivated, then they'll build their skills and up sticks to another country that pays better, like the US or Western Europe (a perennial problem in eastern Europe is retaining BPO or IT skills when people can just move across EU borders and quadruple their wages).
There is a simple answer to all of this. Stop offshoring jobs in pursuit of modest and fast eroding savings. Keep them onshore, automate where practicable and sensible, and refine processes and systems to keep costs under strong control. When it comes to outsourcing, that can make sense if you honestly believe that your supplier has some magic sauce that enables them to do the job cheaper and better than you can - in which case you need to ask why they are better than you are at employing people and telling them how to do what is often a transactional or near manual job.
It's not just about "offshore". I worked in an IT department that had a team in India, employed by the company and it worked great.
What consultancies have always done, right back to the early 90s (and maybe before), is to nickel-and-dime customers, even with staff in the UK. They'll start with good people during the implementation process, then once everything's settled in and running smoothly, and it's a bugger to change things back, you start to see experienced guys move off and you find you've got someone new at the other end of the line who doesn't have a clue. But, you're still paying the same for some college kids as for an experienced guy.
They're also horrible places to work for people who care about doing a good job. You'll rarely meet good managers or execs, or work with brilliant people. I've worked for 2 consultancies on contract, and both had people in charge who were clueless.
Spot on. It's nothing to do with offshore staff per se. It's to do with replacing long-term staff with proven experience, by cheaper staff with no experience. Staff who quite possibly lied on their CV to get a job, or paid someone else to sit their exam.
It could be worse. I wonder if they're offshoring the control rooms for nuke power stations yet?
While the Guardian seems to have passed this off as its own research, it looks like the Daily Mail not only picked up and ran with the offshoring aspect, but also credited El Reg as the source. As a long time grauniadista I'm disappointed in them, but pleased at least one part of the national press got it right.
Offshoring can be a false economy and in this case the taxpayer pays the price :(
[anon as I work for a company that is tangentially involved]