First-day-on-the-job dev: I accidentally nuked production database, was instantly fired

"How screwed am I?" a new starter asked Reddit after claiming they'd been marched out of their job by their employer's CTO after "destroying" the production DB – and had been told "legal" would soon get stuck in. The luckless junior software developer told the computer science career questions forum: I was basically given a …

  1. GarethWright.com

    So....restore from backup

    The issue is with the docs.

    CTO is a knob, and why is this an issue? They obviously have backups, don't they....

    1. SolidSquid

      Re: So....restore from backup

      From what I read in the Reddit thread, they tried to restore from backups and it failed. Seems they hadn't been testing the backups to make sure they could be restored, and at some point the backups stopped working.

      1. Anonymous Coward
        Headmaster

        Re: So....restore from backup

        If they hadn't been testing then they hadn't been backing up.

      2. Blank Reg

        Re: So....restore from backup

        This seems to be an all too common problem. Too many small companies say they can't afford backups and backup validation, I guess their data isn't all that important then.

        1. Anonymous Coward

          Re: So....restore from backup

          Only this shouldn't have been a backup issue...

          A common sense strategy would have been to use live backups (given that is what they were doing, I'm assuming that there's nothing in prod that would lead to regulatory/privacy/other issues that would dictate NOT using prod data) and read-only accounts to restore the production DB into test. Script the whole thing, run it daily with a few simple checks (restore size, timestamps, simple query results), e-mail the results to the DB team, and you save the company the hassle of discovering their restore process stopped working correctly months ago.

          Imagine - testing your backups AND creating sensible procedures AND showing new starters sensible ways to do things from day one. And if the backups failed to restore as expected, fix the issue...

          Compare this to "break it, discover backups aren't working and fire people" and, I would wager, make the same mistakes again in the future...
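The daily restore checks suggested in the comment above could be sketched roughly as follows. This is only an illustration, not what the company in the story ran: `RestoreReport`, `check_restore` and every threshold are hypothetical names and values, and it assumes a separate script has already restored last night's backup into a test database and gathered the three figures.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RestoreReport:
    """Figures gathered from the test database after a restore (hypothetical shape)."""
    size_bytes: int          # size of the restored database
    newest_row_at: datetime  # timestamp of the most recent row found
    row_count: int           # result of a simple sanity query


def check_restore(report, now,
                  min_size_bytes=1_000_000,
                  max_age=timedelta(days=1),
                  min_rows=1):
    """Return a list of problems; an empty list means the restore looks sane."""
    problems = []
    if report.size_bytes < min_size_bytes:
        problems.append("restore is smaller than expected")
    if now - report.newest_row_at > max_age:
        problems.append("newest restored data is too old")
    if report.row_count < min_rows:
        problems.append("sanity query returned too few rows")
    return problems
```

A scheduler could run this once a day and mail the returned list to the DB team whenever it is not empty, which is the point of the suggestion: the failure is noticed the next morning rather than months later.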

          1. Doctor Syntax Silver badge

            Re: So....restore from backup

            "i'm assuming that there's nothing in prod that would lead to regulatory/privacy/other issues"

            Given everything else about this it's possible your assumption might not be valid.

        2. Kiwi

          Re: So....restore from backup

          This seems to be an all too common problem. Too many small companies say they can't afford backups and backup validation, I guess their data isn't all that important then.

          No, nothing to do with how valuable data is. It's to do with how valuable time is.

          Take a small IT firm with 2 staff, both working 16 (and sometimes more) hours per day to get the business built up. Large amount of work starting to come in as well, but not enough work/income-over-costs to hire someone else.

          If you get jobs out, you eat. If you don't get jobs out, you don't eat. Or don't pay the rent, or power, or..

          Verifying backups can be restored is a pretty long process that ties up resources. Small firms have a hard time buying extra kit, let alone having a spare machine they can put on such un-important stuff to actually verify they can re-image from a recent copy. And if your backup is data only (plus licenses etc), then re-installing a machine from scratch can take a couple of hours (*nix) or several days (you-know-which-OS), especially where updates are involved. And in either case, there can be a lot of time involved in making sure that the task is completed properly and everything is back. Just comparing numbers of files and bytes can be a good start and only takes a few minutes, but they don't always match even in a perfect restore (say the work machine has an extra temp file the backup image doesn't).

          Backups take a lot of time to check properly. When I did them I did a system that worked on first test (and pulled some all-nighters getting stuff set up and testing it), worked on week-later test, and then we went to monthly checks. Not proper testing procedure I know, but I didn't have the time to be sure that everything worked perfectly. I could do that, or I could earn money. Small and new businesses have to put every moment possible into earning, as the moment you're forced to close the doors for the last time any backups or other data instantly becomes worthless. Unless you did some sellable dev stuff.

          PS El Reg - love that voting no longer requires a page reload! Thanks for changing this!

      3. Anonymous Coward
        Joke

        "at some point the backups stopped working"

        They didn't have the correct passwords to back up the DB...

        1. chivo243 Silver badge
          Trollface

          Re: "at some point the backups stopped working"

          Did they check the documentation the Dev had? It's probably there too!

      4. ecofeco Silver badge
        Facepalm

        Re: So....restore from backup

        The backups failed?!

        I'm not surprised they failed when they have documentation like that.

    2. Anonymous Coward

      Re: So....restore from backup

      "The issue is with the docs."

      Absolutely, and it was the responsibility of the CTO to ensure, if not personally, that real DB details were not included or used in an example. This was a cockup waiting to happen.

      1. Matt Bryant Silver badge
        Alert

        Re: LeeE Re: So....restore from backup

        ".....This was a cockup waiting to happen." First rule of process design - always assume at least one user will do something wrong if you give him/her the means to do so, do not assume they will be able to over-ride the impulse to follow an incorrect direction. The management were responsible for the process documentation, therefore, IMHO, they were responsible for the failure.

      2. Daniel 18

        Re: So....restore from backup

        For that matter, why is the production environment trivially reachable from the dev/test environment?

    3. Anonymous Coward

      Re: So....restore from backup

      > The issue is with the docs.

      .... sounds like the CTO needs to visit example.com

      1. Wensleydale Cheese

        Re: So....restore from backup

        ".... sounds like the CTO needs to visit example.com"

        That might not help as much as you think if documentation elsewhere for adding new users also uses "example.com". I.e. it might already exist...

        Rule #1 about documentation is that someone somewhere will blindly follow any code or command snippets that are included.

    4. Just Enough

      Re: So....restore from backup

      "The issue is with the docs."

      Absolutely. Firstly, who wrote the documentation and thought it was ok to include the production password?

      If it was a technical author, they could be excused for assuming it was not a real production password. In which case, who gave them it? This password should be handed out on a needs only basis.

      If it was one of the developers, what kind of moron are they? Even if they have received no training at all on basic security, you'd have to be an idiot to think it was ok to record a real password in clear text on documentation.

      Secondly, what are the production details doing on a document used to set up a dev environment anyway? A dev, particularly a novice one, shouldn't be anywhere near production, simply because they have no need to be.

      In fact, fire everyone. The whole affair suggests rampant sloppy practice everywhere. The fact their backup restores failed just confirms that.

      1. Rich 11

        Re: So....restore from backup

        you'd have to be an idiot to think it was ok to record a real password in clear text on documentation.

        You must be thinking of some of the companies I've had to work with.

        1. Anonymous Coward

          Re: So....restore from backup

          you'd have to be an idiot to think it was ok to record a real password in clear text on documentation.

          Grauniard on Wikileaks?

        2. Anonymous Coward

          I cannot tell you which company it was

          But I'm pretty sure which ones weren't.

          > you'd have to be an idiot to think it was ok to record a real password in clear text on documentation.

          It wasn't the large multinational where I cut my teeth, where no passwords were *ever* written or even spoken. And it wasn't the one where I am part of manglement now, where our code of conduct says clear and loud that none will be disciplined for making honest mistakes¹, and we do abide by that.

          ¹ Our mistakes have a very real potential to kill people. Going around pointing fingers is not going to help anyone. Making sure that mistakes are open and candidly reported, and fixed promptly, is a far better use of everyone's time, and a humbling experience.

      2. Anonymous Coward

        "and thought it was ok to include the production password?"

        Nobody but the DBAs in charge of the production database should have had them. And each of them should have had separate accounts (because of auditing, etc.). Database owner credentials should be used only for database/schema maintenance - after the scripts have been fully tested (and at least a backup available). They should never be used for other tasks. Any application accessing the database must have at least one separate account with only the required privileges on the required objects (which usually don't include DDL statements).

        All powerful accounts (i.e. Oracle's SYS) should be used only when their privileges are absolutely necessary, and only by DBAs knowing what they're doing. Very few developers, if any, should have privileged access to the production system, and they must coordinate with DBAs if they need to operate on a production system for maintenance or diagnostic needs. DBAs should review and supervise those tasks.

        But I guess they have followed the moronic script I've seen over and over (and stopping the bad habits usually encountered a lot of resistance):

        1) Developers are given a privileged account when the application is first created, so they can create stuff without bothering the DBA.

        2) Developers write an application that runs only with a privileged account, and doesn't allow separate database users (granting permissions to other users is boring stuff, as is writing stored procedures to control access to things). DBAs are happy because they have little work to do.

        3) The privileged account credentials are stored everywhere, including the application configuration files, unencrypted, so everybody knows them

        4) The development system becomes the production system. DBAs don't change anything for fear that something will stop working.

        5) Developers still retain access to the production system "to fix things quickly", and may even run tests in production "because replicating it is unfeasible".

        6) DBAs are happy developers can work on their own so they aren't bothered.

        7) Now that you have offloaded most tasks to developers, why care about backups? Developers will have a copy, for sure!

        8) Then comes a fat-fingered, careless new developer....

        1. I ain't Spartacus Gold badge
          Devil

          Re: "and thought it was ok to include the production password?"

          Let me bring my IT expertise into play here. You're all wrong of course!

          Obviously database passwords shouldn't be written in public documents. They should be kept on a post-it note on the screen of one of the shared PCs.

          What's wrong with you people for not knowing this basic piece of security best-practice!

        2. Doctor Syntax Silver badge

          Re: "and thought it was ok to include the production password?"

          @LDS

          I think you've just described DevOPs.

        3. Anonymous Coward

          Re: "and thought it was ok to include the production password?"

          Apart from using 'careless' instead of 'only human' new developer ...

          I can report similar (some years back now) - new member of the ops staff was instructed to setup a test database, mistyped the DB URL by one character, and because of the risible beyond belief setup of i) dev can see prod network, ii) dev hostnames differ by one character from prod hostnames, and iii) dev, test and prod DBs all shared the same passwords and user names, they wiped the entire production database by running the schema creation scripts.

          The impact - this was a bank (no names but North American and generally an IT train wreck at the time), the DB data loss took out the SWIFT gateway, the bank ceased to be a bank for about six hours, and they had to notify the regulator of a serious IT incident. The length of time was due to the backups being inaccessible and no-one having a clue how to restore.

          And FWIW, we'd already advised the bank that the one character difference in hostnames on an open, flat network was a suicide pact six months before.

          On the plus side, the irascible, voluble Operations Manager took one look at the root cause, said it wasn't the operator's fault and went off to shout at someone on the program management team. Much respect for that move.
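One cheap guard against the one-character hostname slip described above is to make destructive scripts refuse any target that is not on an explicit allowlist. The sketch below is an assumption-laden illustration, not the bank's tooling: `ALLOWED_DEV_HOSTS`, `require_dev_target` and the example hostnames are all made up. It is no substitute for separate networks and separate credentials, since it only catches the typo at the last moment.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: the only hosts a schema-creation script may touch.
ALLOWED_DEV_HOSTS = frozenset({"db01.dev.example.internal"})


def require_dev_target(db_url, allowed_hosts=ALLOWED_DEV_HOSTS):
    """Return the host if it is allowlisted; otherwise refuse to continue."""
    host = urlsplit(db_url).hostname
    if host not in allowed_hosts:
        # Fail closed: an unknown or mistyped host is treated as production.
        raise RuntimeError(f"refusing to run against non-dev host: {host!r}")
    return host
```

A schema-creation script would call `require_dev_target(url)` before opening any connection, so a mistyped URL stops with an error instead of reaching a live database.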

        4. andyL71

          Re: "and thought it was ok to include the production password?"

          Nobody bothers with DBAs or people who know what they are doing anymore. It's called DevOPs

      3. Anonymous Coward

        Re: So....restore from backup

        "In fact, fire everyone. "

        Ah, from the kill em all now, let God sort 'em out later school. Not many of us left.

        1. sandman

          Re: So....restore from backup

          That's because we ran out of Cathars some time ago ;-)

          1. CrazyOldCatMan Silver badge

            Re: So....restore from backup

            That's because we ran out of Cathars some time ago

            That's OK - still plenty of Gnostics out there!

          2. Eltonga
            Thumb Up

            Re: So....restore from backup

            That's because we ran out of Cathars some time ago ;-)

            Aaaah! But that should be no problem for the Spanish Inquisition!

      4. Eddy Ito

        Re: So....restore from backup

        Having worked in a place where the admin password was "Dilbert" it's possible the document writer was just trying to be funny and accidentally wound up using real credentials.

      5. Mark 65

        Re: So....restore from backup

        Where's the segregated VLAN? Anywhere with such important data and of such a size should be capable of setting up an environment where dev network logon credentials only work on the dev VLAN and so do not permit the crossing over into the production VLAN whether you know the prod db connection string or not. One account for troubleshooting prod environments (which they wouldn't have had in this case), and one for performing dev tasks. Not that difficult.

    5. Anonymous Coward

      Re: So....restore from backup

      This happened on my watch in Prod Support for a certain oil company.

      The dev/prod Oracle databases had the same internal password. Luckless dev connected to what they believed was their dev instance in Toad and then dropped the schema.

      They then called us to advise what had happened once they realised their mistake. We restored from backup (not before they bitched about how long it took to recover) but they never should have been able to do so in the first place.

      Rule Number 1: Restrict access properly to your prod environment. That means no recording random username/password combos in docs, scripts etc.

      Rule Number 2: Don't share credentials across environments.

      Anon for obvious reasons.

      1. Anonymous Coward

        Re: So....restore from backup

        Rule Number 1: Restrict access properly to your prod environment. That means no recording random username/password combos in docs, scripts etc.

        Rule Number 2: Don't share credentials across environments.

        Rule 3 - never lets Devs have access to stuff. Like small children, they *will* break stuff..

        1. Number6

          Re: So....restore from backup

          Rule 3 - never lets Devs have access to stuff. Like small children, they *will* break stuff..

          In a controlled environment this is good. If the Dev breaks something before the users do, it can be patched to prevent anyone else from doing the same. My unofficial CV has 'breaking things' as a skill, which dates back to using BASIC at school and being unable to resist entering '6' when asked to enter a number from 1 to 5, just to see what happened.

          1. Anonymous Coward

            Re: So....restore from backup

            My CV also includes breaking things, but I was one to always stick to the rules.

            > being unable to resist entering '6' when asked to enter a number from 1 to 5, just to see what happened.

            So when asked to enter a number from 1 to 5 I broke into the school in the dead of the night.

            1. akeane
              Headmaster

              enter a number from 1 to 5

              I would have entered the number "to"...

          2. Paul 129
            Devil

            Devs and Access

            "Rule 3 - never lets Devs have access to stuff. Like small children, they *will* break stuff.."

            I cannot upvote this enough. Stupid t**ts that I had the misfortune of dealing with would have had a breakdown dealing with keeping the systems running. I want you to deploy xxxx. Ok who to, oh everyone in the organization. Ok how? Oh this installation program that I have written, you just need to install a, uninstall a, then install b and c and remove b, that will leave a stable version running on their machines.

            So... I'm not going to implement it, until I get a firm written procedure for how to deploy this. We also need to test it in a small department. I could have fixed it in a couple of hours, but this wasn't the first time I had to put up with them having no idea about dependencies, and I had wasted four or five days fixing similar messes of theirs, they were on better money too. I never got any recognition, cause little ever went wrong.

            I was so happy when they chose the planning department to trial on. (This department had 'skilled' staff who like to setup their own 'uncontrolled' infrastructure) I had a good 3.5 months of peace, until guilt at all the wasted man hours finally got to me, and I sorted it that afternoon.

            Then there is the always fun "Yes we do backup database transactions, here is your database backup, and here are your transaction logs, now you buggered up the database, rolled back and forth until you have a confused mess, so which transaction sets do you want to roll forward with?" - (Gives me a warm fuzzy feeling of joy remembering those looks of dawning horror)

            Devs really are like small kids, they have NO idea of consequences, and will run away given a challenge.

            1. Terry 6 Silver badge

              Re: Devs and Access

              Surely it's the development staff's job to break things. Because if they don't do it on purpose someone else will do it by accident ( or malice).

              But if you're testing a vehicle you make bloody sure that you aren't carrying live passengers, if you see what I mean.

        2. Ian Michael Gumby
          Boffin

          @AC ... Re: So....restore from backup

          You are spot on...

          You need to separate prod from dev and not allow dev access to prod.

          So while the Reg asks who should be fired ... there are several people involved.

          1) CTO / CIO for not managing the infrastructure properly because there was no wall between dev and prod.

          2) The author and owner of the doc. You should never have actual passwords, system names, etc in a written doc that gets distributed around the company. The manager should also get a talking-to...

          3) The developer himself.

          Sure he's new to the job, however he should have been experienced enough not to cookbook the instructions and should have made sure that they were correct. He was the one who actually did the damage.

          As to getting legal involved.... if this happened in the US... it wouldn't go anywhere. As an employee, he's covered and the worst they could do was fire him. If he were a contractor... he could get sued.

          We know this wasn't Scotland or the UK. (Across country? Edinburgh to Glasgow ... 40 min drive. )

          I do have some sympathy for the guy... however, he should have known to ask questions if things weren't clear in the instructions.

          He should chalk this up to a life lesson and be glad that his mistake didn't cost someone their life.

          1. Anonymous Coward

            Re: @AC ... So....restore from backup

            "... Scotland or the UK..."

            Jesus they slipped that referendum through under the radar.

        3. Marshalltown

          Re: So....restore from backup

          "Rule 3 - never lets Devs have access to stuff. Like small children, they *will* break stuff.."

          That's kind of the job description isn't it?

        4. Alan Brown Silver badge

          Re: So....restore from backup

          Rule3.1 Give them things you don't _mind_ them breaking. (DBs, systems, management sanity)

      2. Anonymous Coward

        Re: So....restore from backup

        That's insane... why would the password on the test server be "production" too? It clearly should be "test".

    6. Anonymous Coward

      Re: So....restore from backup

      OMG. It's the CTO's responsibility to set policy that prevents this shit from happening. They set the technical policy and the idiot is covering their incompetent ass.

      If argued right, job claim could be set here?

    7. Jakester

      Re: So....restore from backup

      I maintain a small file server for a small company (about 45 computers). I use Ubuntu with Samba as the server. I have another desktop running Ubuntu with Samba on the ready and use rsync to provide a nightly copy of files from the server to this system which is a live copy of all the files served by the primary server. If the main server were to go down or hit with a virus (or ransomware), all I have to do is take the primary file server off-line, run a script on the system with the live data from the previous day, and change the IP address to make the backup system a temporary server.

      Another Ubuntu system uses back-in-time to make nightly backups for off-site storage on portable USB hard drives.

      My philosophy is you can never have too many backups, so the primary server also makes hourly backups during business hours that are retained for 2 days. The backups are then pared down to keeping one copy per day for 14 days, then one a week for a couple months, then one copy a month till drive space runs low. The drive is then removed from service to retain historical data and a new drive put in service. This set is for on-site.

      The system makes it easy to restore anywhere from one to all files in a short period of time. There is currently about 90GB of user data on the server. To restore one file takes only a couple minutes to find the file from the desired backup and restore it. A full restore of user data takes about an hour.

      The system is regularly tested and a full restore had to be performed once when the Cryptolocker ransomware encrypted files the victim had access to on her computer and the server. More time was spent ensuring the ransomware had been isolated and eliminated on all computers on the network than to get the server ready.

      While some may consider triple redundancy overkill, I like to be prepared in case one of the backup systems may have happened to fail the night before a restore might be needed on the server. There is always at least one backup drive stored off-site. In the case of catastrophic loss of the server (flood, fire, explosion, etc), server configuration files in the nightly backups make it easy to setup and configure a new server in about 2 hours, ready to have files then restored.

      Test and test often...
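The thinning schedule described in the comment above (everything for 2 days, then one per day up to 14 days, then one per week, then one per month) can be sketched as a small selection function. This is an illustrative reading of that description, not the commenter's actual script: `snapshots_to_keep` is a hypothetical name, and the 60-day cut-off for "a couple months" is an assumption.

```python
from datetime import datetime, timedelta


def snapshots_to_keep(snapshots, now, weekly_days=60):
    """Pick which snapshot timestamps survive thinning.

    weekly_days stands in for "a couple months" and is an assumed value.
    """
    keep = set()
    seen_days, seen_weeks, seen_months = set(), set(), set()
    for snap in sorted(snapshots, reverse=True):  # newest first
        age = now - snap
        if age <= timedelta(days=2):
            keep.add(snap)                  # recent: keep every hourly copy
            continue
        if age <= timedelta(days=14):
            bucket, key = seen_days, snap.date()
        elif age <= timedelta(days=weekly_days):
            bucket, key = seen_weeks, tuple(snap.isocalendar())[:2]
        else:
            bucket, key = seen_months, (snap.year, snap.month)
        if key not in bucket:               # newest copy in each bucket wins
            bucket.add(key)
            keep.add(snap)
    return keep
```

Anything not in the returned set is a candidate for deletion. Because the list is walked newest first, the copy kept for each day, week or month is the latest one taken in that period.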

    8. Anonymous Coward

      Re: So....restore from backup

      So where is their live RO replica?

  2. Anonymous Coward

    Now we know what happened with British Airways, the mystery country is India.

  3. wolfetone Silver badge

    I was hoping there'd be an option for "The dickhead who wrote the guide".

    1. Locky

      @wolfetone

      Indeed.

      So day 1, you are given a document and told to follow that. Yet it's your fault that guide is wrong?

      No wonder they were told not to come back, showing what a shower the current setup is....

      1. IsJustabloke
        Stop

        Re: @wolfetone

        "Yet it's your fault that guide is wrong?"

        Actually, the story quite clearly says he *didn't* follow the document. He should have used the credentials generated by the script; instead he copied the credentials from the document.

        The fact that the document contained the production credentials is an error but the implication is that he would have been ok with those.

        Of course, he should have been supervised, so that's also an error. Should he have been sacked? Probably not, and certainly not in the way he was.

        1. katrinab Silver badge

          Re: @wolfetone

          The document should have given "username" and "password" as the username and password. And those should not work on the live system.
