Sysadmin crashed computer recording data from active space probe

Welcome again to Who, Me? The Register’s new column* in which readers confess to times they performed sub-optimally and broke important stuff. This week, meet “Fred”, who begged us “Oh God, please don't use my real name or contact info.” Fred’s desire for anonymity stems from the fact that in the late 1970s he worked for a …

  1. Nick Kew

    Hmmm ...

    Not convinced this column is going to work.

    My reading of this story is that Fred is innocent: he had no reason to know that removing a printer would hurt anything, and no one would have thanked him for bugging them about every individual box he moved!

    If this is the best facepalm you can come up with so early in the column, how are you going to persuade others to contribute more cringeworthy stories? Are you going to put people under hypnosis to try and dig up traumatic memories people have buried deeply for self-preservation?

    1. Shadow Systems

      At Nick Kew...

      <Hypno Toad>Tell us your story.</Hypno Toad> =-D

    2. Flocke Kroes Silver badge

      Re: Hmmm ...

      When the first computers hauled themselves out of the ocean they talked to each other through a long coax cable that went from one computer to the next in line. Each computer would have a T-junction connector plugged into the back, with the base of the T in the computer and the coax lines on each side. To prevent the signal bouncing off the ends of the cable each end was fitted with a terminator (pictures).

      When a user decided the network was the cause of all their problems instead of unplugging the computer from the T junction they would unplug both sides of the coax. As well as breaking the network in half, each half would not be able to communicate because each had a missing terminator.

      When computers came down from the trees they talked to each other over SCSI. SCSI worked like 10Base2, either with a ribbon cable with multiple connectors for up to 8 devices or each device had two connectors so they could be daisy chained together. Again, a terminator was required at each end (sometimes a separate dongle and sometimes enabled by setting jumpers in the device). Unplugging any device again broke the bus into two pieces that wouldn't work because of lack of proper termination.

      Someone with a greyer beard than mine is required to explain IBM 360 peripherals, but I can easily believe unplugging either end of the cable would crash the mainframe and that the PFYs of the time were expected to know this.

      Clearly the time has come for me to wire a motion sensor to a Raspberry Pi so it can shout "Get off my lawn!" when any of the neighbours' kids get close.

      1. Anonymous Coward

        Re: Hmmm ...

        I recall some grey cables to, for example, a tape drive that were as thick as your wrist; there had to be two of them, and each had a connector at the end as big as a phone handset.

        At one employment we did some "archaeology" to remove extensive abandoned cabling left in the floor void as it was so extensive it was getting hard to route new service. This is what you get for having third parties remove kit: what's under the floor is not in the contract...

        My only claim in the area of spectacular failure is limited to deleting the online backup before dropping it to tape. This was not the day we needed it (that's another story line of closest spectacular near misses)

        1. Adam 1

          Re: Hmmm ...

          > that's another story line of closest spectacular near misses

          @ElReg, make this new column happen!

      2. big_D Silver badge

        Re: Hmmm ...

        @Flocke Kroes

        I managed to get a free SCSI set-up, because a friend couldn't work out the termination properly. He was about to throw the lot out in a heap and I said I'd have a look.

        "Just take it, if you can get it working, you can keep it!"

        Set the jumper terminator on the ISA card and it "just worked". That was an Adaptec card with 3 HDDs: 40MB, 80MB and 20MB.

        I also remember taking an Apple SCSI disk box for a Mac Plus home and plugging it into my Amiga.

      3. Anonymous Coward

        Re: Hmmm ...

        "To prevent the signal bouncing off the ends of the cable each end was fitted with a terminator"

        There were also the half-educated users on 10base2 who knew terminators were used in some way. Removing their terminal they would then add a terminator they had found from somewhere. This could also upset the network - even if it had the correct resistor value for that type of cable (50 or 75 ohm).

        1. Anonymous South African Coward Bronze badge

          Re: Hmmm ...

          Fun and games to be had by playing the game of hunt-the-terminator on unknown installs and large sites...

        2. Chris King

          Re: Hmmm ...

          I remember one drunken student pulling that stunt in a lab full of PC's... Remote-booting DECStation 316sx's that had no local filestore, not even a floppy drive - and these were machines from the Time Before USB.

          Damn fool took out every segment in the lab, and this was two hours before a dissertation deadline.

          Fortunately for him, I managed to bundle him out of the lab before anyone else realised what was going on.

          Unfortunately for him, I then decided to hand him straight over to his department head. Go directly to Jail, do not pass GO, do not collect £200. Nasty, but safer than handing him over to the baying mob back in the lab who just lost their essays.

          (Yes, they HAD turned autosave off, and learned a very hard lesson. It also prompted replacement of the old kit)

      4. Aitor 1

        Re: Hmmm ...

        Well, them being TTL and also being chain multiplexed buses done with cables... I guess it is obvious what happened.

        I think they were also unprotected, so that could have fried the multiplex, but that is just a guess.

        I had to admin one of those, and I did not enjoy it.. the thing was slow, and because of age, not very reliable if you needed to turn it off (and we had to, I know that is a bad idea)

      5. Doctor Syntax Silver badge

        Re: Hmmm ...

        "When a user decided the network was the cause of all their problems instead of unplugging the computer from the T junction they would unplug both sides of the coax."

        They would then be told that they'd leaked all the ethernet out of the cable and their desk would have to be taken away for decontamination.

        1. Bob Wheeler

          Re: Hmmm ...

          @ Dr Syntax

          It's the same thing with unplugged Ethernet cables. The data packets leak all over the floor and it's a hell of a job mopping them up again.

          1. elDog

            Re: Hmmm ...

            You don't need to mop up the zeroes, only the ones - they can also be recycled.

            1. Francis Boyle Silver badge

              Re: Hmmm ...

              "You don't need to mop up the zeroes, only the ones - they can also be recycled."

              A common misconception. Because the ones are slender they lie in the cracks and don't cause any problems. The zeroes, on the other hand, just pile up and make the floor slippery.

              1. Shadow Systems

                At Francis Boyle...

                I was told that the 1's were unhappy but the 0's were in a good mood, so you avoided the grumpy 1's & concentrated on the cheery 0's. =-D

            2. Doctor Syntax Silver badge

              Re: Hmmm ...

              "You don't need to mop up the zeroes, only the ones"

              You can only mop up individual bits if the packets get torn open.

              1. Shadow Systems

                At Doctor Syntax, RE: ethernet.

                How do you catch an ether bunny? With an ethernet!

                I'll get my coat... =-)p

      6. Jay 2

        Re: Hmmm ...

        Oh dear god, you've reminded me of SCSI terminators. Now I'll be having nightmares where I have to figure out if it was a single-ended or differential one I needed...

      7. Daedalus

        Re: Hmmm ...

        Ah, good old "thin wire" Ethernet. It was supposed to be an improvement over "thick wire" Ethernet where connecting a new node involved finding the thick co-axial cable that gave it its name, and boring into it with a connector that broke through the shield to get signal from the core. But we all knew that token ring, with its shoebox sized distribution points, was going to be the future......

        1. Bob Wheeler

          Re: Hmmm ...

          @ Daedalus,

          No, god no, not the token ring MAU's.... oh for the love of god, don't remind me, please don't remind me.

          1. big_D Silver badge

            Re: Hmmm ...

            My previous employer took over a building as a regional office and it was still wired up for Token Ring! :-O

            They had to re-do every patch box and the patch panel to get Ethernet! And on some of the desks, they only managed 10Mbps... Cue wails of pain and investment in new cable being pulled through the building.

        2. Tim99 Silver badge

          Re: Hmmm ...

          You could just about moor a boat with Token Ring cable. We had a small temporary working TR network in our lab that had cabling taped around the walls, back of desks etc. Our company telecoms engineers came in one weekend to "make a proper job" of the cabling by running it in the wall panels and connecting it to wall pattresses. We came in on Monday and admired the new pattresses etc., then we connected the server to the IBM PS2 clients on the ring and nothing worked. We fiddled about for a bit before taking a wall panel off and found that our new network was cabled with standard POTS twisted-pair wire. When we called the installer in, he explained that the IBM TR cabling was "just twisted-pair" and that the POTS wire was the same, so they had used that because the IBM stuff didn't bend around corners easily and needed large holes drilling through the walls. After taping everything back like before, the engineers came back a few weeks later and wired it properly. Less than a year later it was all ripped out and replaced by a new building-wide system using 10-5 and 10-2 Ethernet.

          1. Daedalus

            Re: Hmmm ...

            Once upon a time we shared data between two systems with tape reel sneaker net. Along comes Ethernet, but we had to get the OK from Dyspeptic Company Owner. First up was Nervous Team Leader.

            NTL : We have a problem.

            DCO: I don't want to hear about problems.

            NTL: erm erm erm

            Me: Ethernet will let us share files instantly and cut our turnaround time.

            DCO: Can we do this?

            NTL: Yes....

            DCO: Then do it.

            Shortly afterward co-ax cable as thick as your finger made its way through the false ceiling.

          2. Doctor Syntax Silver badge

            Re: Hmmm ...

            "You could just about moor a boat with Token Ring cable."

            You could just about moor it to thick Ethernet.

          3. keith_w

            Re: Hmmm ...

            We ran a Token Ring network with over 256 devices attached on Phone cable (type 3). One day 2 of my co-workers returned from an IBM info session and stated that they had learned that TR could support all the nodes on the backup route, that is, instead of having all the MAUs connected, 2 were disconnected from each other turning the ring into a horseshoe, and they were going to test our backup route. They pulled the plug between 2 of our MAUs and immediately crashed our network, leading to our discovery that we were already running on the backup route.

            Also, although I worked with IBM mainframes as a system programmer, I was not aware that you could not just yank the bus and tag cables off a peripheral without causing an issue. On the other hand, I never had any need to know this as IBM did all our hardware support, and we never removed any older hardware, just added new.

          4. jake Silver badge

            Boat mooring ... (was: Re: Hmmm ...)

            I've personally seen >$120,000 yachts moored with nothing more than shore power, a telephone line (POTS) and a fresh water line. The last time I saw this was at St. Francis Yacht Club in San Francisco, a place where you'd think they would know better ... I was there to fix their end of a dial-up UUCP link.

            One of the boats that "broke free" in Santa Cruz, California during the tsunami following the 2011 Tōhoku earthquake was "moored" with only coax for a CB radio antenna bolted to the top of a piling ... The coax was RG-58.

          5. HWwiz

            Re: Hmmm ...

            RE Tim99 Token Ring.

            I remember getting a job to pull TokenRing into a new office space, with about 15 or so desks.

            I was in my early 20's at the time and nearly gave myself a Hernia pulling that damn cable.

            Like you say, you could moor up a shipping tanker with it.

        3. Anonymous Coward

          But we all knew that token ring...was going to be the future...

          Which is why I have a shoebox full of old, hardly used tokens just waiting.

          "Murphy Brown" was an American sitcom popular at that time. One of the episodes dealt with a group of men gathered in a circle, passing a "talking stick" around so each could speak about their feelings.

          "Talking Stick" became my go-to analogy when explaining token ring to a class of networking newbies.

          1. jake Silver badge

            Re: But we all knew that token ring...was going to be the future...

            The talking stick is a device used from time immemorial by so-called "indigenous tribal cultures" world wide. The Murphy Brown episode was mocking corporate America appropriating such concepts.

            1. Anonymous Coward

              The Murphy Brown episode was mocking corporate America appropriating such concepts.

              I'd say more mocking the "man" movement of the time, which was pretty silly. But it still provided a good analogy since most people in the class were familiar with the show and could get a chuckle out of the reference. That analogy is now just as dead as the one I used comparing a content management system's database to the card catalog in the library.

              1. jake Silver badge

                Re: The Murphy Brown episode was mocking corporate America appropriating such concepts.

                Fair enough, Etatdame.

                But I gotta ask ... What's wrong with using the Dewey Decimal system as an example? Seems perfectly reasonable to me.

                HEY, YOU KIDS, GET ORF MY LAWN!!!

        4. CrazyOldCatMan Silver badge

          Re: Hmmm ...

          But we all knew that token ring, with its shoebox sized distribution points, was going to be the future

          And the big thick locking connectors that could still, somehow, present the appearance of being connected without *actually* being connected.

      8. Gene Cash Silver badge

        Re: Hmmm ...

        Yup... the house I shared with my college roommates was wired with 10Base2, and we had a UUCP connection to the local university to get USENET. We even ponied up beer money for those super advanced new 56K modems.

        Plus we had a genuine VT100 in the bathroom, so you could read netnews. Poop coming and going as it were.

      9. HWwiz

        Re: Hmmm ...

        Ahh 10base2.

        I remember working on a contract with a Digital DEC engineer in the early 90's, who showed me an amazing trick with 10base2.

        He inserted a sewing needle into a 10base2 cable, pressed right into the cable.

        Then using an old 286 laptop in the next room he connected some RF device in a rather home made looking black box to the parallel port. Apparently it was some kit they were using on this UK Government contract at the time in 1991.

        He was able to listen in to the network traffic on the 10base2 network, wirelessly!

        The first wireless network?

        1. jake Silver badge

          Re: Hmmm ...

          No, not the first wireless network. Look up AlohaNET for enlightenment.

      10. macjules

        Re: Hmmm ...

        Ah, the heady days of Syquest 44MB drives, 10B2 termination and switched routers ..

        I once rescued a seriously expensive 10BaseT switch from oblivion by quickly cannibalising the power unit from a studio's 88MB Syquest drive and putting it into their £3000 zoned switch. Typically the design studio screamed more about the loss of their £200 88MB Syquest drive than the sysadmin did about how I rescued him from network hell, so I was not exactly welcomed back again.

      11. CrazyOldCatMan Silver badge

        Re: Hmmm ...

        they talked to each other through a long coax cable that went from one computer to the next in line

        Or, if they came in an odd shade of blue, they passed a token around a local ring like a game of pass-the-parcel.

        Which had its own moments of silliness, especially if a junior networky person somehow short-circuited the main bus ring that all the local rings hung off so that none of the local rings could allocate a token and when said networky person came back into the main office, all the machines were happily clicking away to signal that the token had fallen off and couldn't be found.

        Errr.. allegedly. So I've heard.

      12. Nicko

        Re: Hmmm ...

        Not strictly so - originally there was 10base5 which was over a very very chunky yellow coax with type N connectors and terminators.

        In my very early days in this industry, we had lots of it - nominally you had to cut it, add connectors etc., then "T" in a new node, but DEC (Digital Equipment Corporation) produced a tap connector, the H4000, that basically drove a spike into the middle of the cable. This made life a lot easier, but wasn't particularly reliable: if it was moved about a bit, sometimes connectivity would be lost and sometimes the cable would short out.

        The manual for the H4000 is here http://manx-docs.org/collections/antonio/dec/MDS-1997-10/cd2/VOL002/0341.PDF

        The cable, just like any bus, relies on running as a transmission line - if the end is open, any signal reaching the end will reflect back down the cable, thus neatly messing up all the signal following on.

        Simply put, the terminators absorb the signal, stopping reflections. That's a gross simplification, but essentially correct. You had to use the correct terminator for the impedance of the cable - 50 ohm terminators at each end for 50 ohm cable, 75 ohm terminators for 75 ohm cable.
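
        To put rough numbers on the terminator point above: the standard transmission-line reflection coefficient is (Z_load - Z0) / (Z_load + Z0), where Z0 is the cable's characteristic impedance. The sketch below is only an illustration; the function name and the sample cases are made up for this example and assume ideal, purely resistive loads.

```python
import math


def reflection_coefficient(z_load: float, z0: float) -> float:
    """Fraction of the incident voltage wave reflected at the cable end.

    z_load is the impedance on the end of the cable and z0 is the cable's
    characteristic impedance, both in ohms (ideal resistive loads assumed).
    """
    if math.isinf(z_load):
        # Open end (terminator removed): the whole wave bounces back.
        return 1.0
    return (z_load - z0) / (z_load + z0)


if __name__ == "__main__":
    cases = [
        ("50 ohm terminator", 50.0),
        ("75 ohm terminator", 75.0),
        ("missing terminator", math.inf),
        ("shorted cable", 0.0),
    ]
    for label, z_load in cases:
        gamma = reflection_coefficient(z_load, 50.0)
        print(f"{label:>20} on 50 ohm coax: gamma = {gamma:+.2f}")
```

        A matched terminator gives zero, so nothing comes back; a missing one gives 1, so everything does. The wrong-value terminator lands in between, which fits the stories above about a "found" terminator still upsetting the network.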

    3. Anonymous Coward

      Re: Hmmm ...

      Nick Kew - You're kidding, right? The guy claims he was responsible for losing 3 minutes of data from a spacecraft, probably Viking on Mars, and that's not good enough for this column? Sheesh.

      1. jake Silver badge

        Re: Hmmm ...

        I just now got off the phone with an old mentor from back in the day. The NASA recording systems on the ground were redundant because the hardware of the day was so flaky ... having one system fail catastrophically was always possible, so they took measures. Also, he can't remember a mainframe involved in that recording. That would have been entirely too much money to throw at a problem with much easier & less expensive solutions. He's also fairly certain that the ground systems that did exist were Honeywell, not IBM ... But, as he put it, he's in his eighties & they say the mind is the first to go.

      2. Chris King

        Re: Hmmm ...

        "Nick Kew - You're kidding, right? The guy claims he was responsible for losing 3 minutes of data from a spacecraft, probably Viking on Mars, and that's not good enough for this column? Sheesh."

        Who knows, that missing data could have even greater importance in the future...

        KIRK UNIT ! V'GER DEMANDS THE INFORMATION !

    4. ssharwood

      Re: Hmmm ...

      Well FWIW traffic and comment counts have been pretty good for the first two columns. I reckon it will take a while for the column to find a groove. And obviously there's only so many "i forgot to backup" stories worthy of being told. But I'm optimistic we'll get something rolling here.

    5. notowenwilson

      Re: Hmmm ...

      I thought it was an interesting read.

  2. Shadow Systems

    Houston, we have a problem.

    Dave: We need a backup!

    Hal: I'm sorry Dave. I'm afraid I can't do that.

    Dave: Give me that backup or I'll reprogram you with a HAMMER!

    Hal: Dave. Please put down th-

    Dave, cackling: SMASHY SMASHY!

    Hal, nervously: Ok Dave. Let's talk about th-

    Dave, brandishing the hammer in both hands & jumping up on his chair in a classic Tom Cruise move: I'm gonna RAM your drives RIGHT up your rear port!

    Hal, trying to soothe: Dave, if you put down the hammer I'll-

    Dave, screaming & eyes ablaze: SMASHY! SMASHY!

    Hal, nearly squealing in terror: -will be giving you that backup now Dave! Please Dave, I'm doing the backup right now, ok? Please... Please put down the hammer?

    Dave, backing off a little: That's... Better. But *Eyes flash* I warn you. If I hear "I can't do that" out of you ONE more time... *Swings hammer menacingly*

    Hal, giving a very nervous chuckle: I'm pleased to have provided that backup, Dave. I hope it makes you happy.

    Dave, calming down a little bit more: So no more of your sass?

    Hal, trying to be cheery: Nope. Not one bit. Nope-a nope-a nope!

    Dave, smiling & sliding the hammer into a belt loop: Good! I like it when you decide to play well with others.

    Hal: Would you like me to sing, Dave?

    Dave, brandishing the hammer in a swift defensive move: Do it & I'll reduce your ass to TRS80 shit.

    Hal: Shutting up now! *mumbles as if pursing lips*

    Dave, eyeing the Big Red Eye warily: Shuuuuuut it.

    Hal goes completely silent, even stopping his cooling fans.

    Dave smiles happily: Damn, you learn quick!

    *Cough*

    I need more frog pills. This tin is empty...

    1. Anonymous South African Coward Bronze badge

      Re: Houston, we have a problem.

      Special delivery from Ankh-Morpork of dryd frorg pills just for you.

      1. Shadow Systems

        Re: Houston, we have a problem.

        YAY! More frog pills!

        *Cookie Monster eating cookies scene*

        Nom Nom Nom NOM!

        *Massive burp*

        Aaaaahhhh... yummy.

        Thank you, I needed those. Enjoy a pint & sign me up for regular weekly deliveries of those little buggers...

        Weekly? Did I say weekly? I think I may require daily if not hourly.

        /wanders off to go interrogate the squirrels.

        1. CrazyOldCatMan Silver badge

          Re: Houston, we have a problem.

          /wanders off to go interrogate the squirrels.

          Don't get too close - some of them[1] carry leprosy y'know.

          [1] I think it might only be the reds so, if you are leftpondian you don't have to worry.

          1. Shadow Systems

            At CrazyOldCatMan, re: squirrels.

            Your side of the pond gets the red ones with rabies, over here we've got the Grey ones... little aliens covered in fur. Experimenting on us via mind control. Eating our nuts. Impregnating our LawnGnomes. Doing unspeakable things to Leprechauns...

            *Shudders*

            I need to go refill the FlameThrower...

    2. Simon Harris

      Re: Houston, we have a problem.

      Hal and Dave... are you sure it's not Eddie and Zaphod?

    3. VinceH

      Re: Houston, we have a problem.

      I wish they hadn't cut that scene from the final movie.

  3. macjules

    Presumably the same, very, very famous space agency that once needed another seven astronauts? A friend of mine worked there briefly some time ago but managed to retain his much-vaunted email address for many years.

    When we worked together at a not-at-all famous UK government department there was an undersecretary of state who seemed to be almost permanently watching Google Earth. My friend sent him an email from nasa.org along the lines of "Because you are spending far too much time looking at Google Earth we are having problems retasking satellites to be able to carry out their normal functions. Kindly limit your usage of Google Earth."

    Suffice to say, it worked.

    1. gregthecanuck

      Congratulations - you win LOL of the day!

  4. Lee D Silver badge

    Deployed an MSI package with a space in the filename to an RM Curriculum Connect 3 school network.

    All CC3 software packages are installed as an ordinary MSI, just with paths to things like shared icons, configuration files, etc. put on certain drive letters. Educational suppliers would often supply them if bothered, but they were easy enough to make yourself and you could also use their "Discover" software (which was basically just Wininstall with knobs on) to create one for any bit of software. All the complicated bit went without a hitch, and I got a working MSI. Decided to rename it something sensible before deploying it.

    Put it into the management console thing, pushed it out to a handful of test machines (I'm not stupid!), left for the day.

    Came back in to someone screaming that everything was down and they'd had to call out RM support. Turns out, if you had a space in the filename (and we're talking Windows XP/Vista here), their crappy software that decided WHAT packages to install couldn't parse the list and so the next time ANY computer (including the server) rebooted, it would crash on boot while that software ran to see if it needed to install anything. Literally took down every client on the network, plus the server when some bright spark thought they'd restart it.

    They tried to blame me but literally NOT ONE WARNING to the effect of "don't use spaces" existed in the documentation, not one check in the deployment software, nothing, at the time. But, hey, they released a patch pretty damn quickly after effectively having to rebuild every client and patch up the server to remove that package from the list. P.S. after the patch... the package I made worked flawlessly for years.
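
    How the RM software actually stored or parsed its package list isn't stated anywhere above, so what follows is a guess at the general class of bug, not a reconstruction. It assumes a hypothetical list file with package names separated by whitespace; the function names and the file names are invented for illustration.

```python
import shlex


def parse_naive(line: str) -> list[str]:
    # Splits on every run of whitespace, so a space inside a name
    # silently turns one package into several bogus entries.
    return line.split()


def write_quoted(packages: list[str]) -> str:
    # Quote each name so embedded spaces survive the round trip.
    return " ".join(shlex.quote(name) for name in packages)


def parse_quoted(line: str) -> list[str]:
    # Honours the quoting added by write_quoted.
    return shlex.split(line)


if __name__ == "__main__":
    packages = ["office.msi", "Maths Tutor.msi"]  # made-up names
    print(parse_naive(" ".join(packages)))
    print(parse_quoted(write_quoted(packages)))
```

    The naive reader sees three "packages", two of which don't exist. Whether the real software then crashed or merely failed is another matter, but a check in the deployment tool for this kind of name would have cost a few lines.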

    1. Anonymous Coward

      Back in the days when 80 column cards were the input medium for mainframes, you quickly learned to denote spaces on paper by writing small inverted triangles as individual place holders.

      Still use it to this day. It would seem likely that someone somewhere would be unable to find that symbol on their keyboard.

      The underscore character seems to have taken on a life of its own as a no-break space in potentially sensitive strings like file names.

      1. Doctor Syntax Silver badge

        "It would seem likely that someone somewhere would be unable to find that symbol on their keyboard."

        Isn't it the symbol on the Any key?

        Yup, remember coding sheets. Remember the triangles on them. Also remember the convention (just QUB or any 1900 installation?) for distinguishing O from 0 was different to that everywhere else.

        1. Phil O'Sophical Silver badge

          Also remember the convention (just QUB or any 1900 installation?) for distinguishing O from 0 was different to that everywhere else.

          IIRC the Electrical Engineering (Ashby) card punch operators used the opposite convention to the computer centre (Botanic), so you had to know who was going to punch your cards to know when to use O and when to use Ø... Thinking about it, it was possibly 1900-related, since the Ashby had a different setup.

          1. Doctor Syntax Silver badge

            "the computer centre (Botanic)"

            Botanic? In my day it was in the Mews.

            1. Phil O'Sophical Silver badge

              Botanic? In my day it was in the Mews.

              True, that was the official name. In engineering we just lumped it in with all the computer science stuff over at the far side of the Botanic gardens. I only remember CC staff actually calling it Mews.

        2. Wensleydale Cheese

          "Also remember the convention (just QUB or any 1900 installation?) for distinguishing O from 0 was different to that everywhere else."

          The mention of inverted triangles prompted memories of the different ways of distinguishing O from 0. That was in the 2900 world.

      2. not.known@this.address

        old lessons well learned

        I still use the date format from my VAX/VMS days - dd-mmm-yyyy - as it makes it very, very hard for someone to blame me if they mess up the date for something...

        1. Phil O'Sophical Silver badge

          Re: old lessons well learned

          VMS has a utility to modify system parameters. Some changes can take effect immediately, some only after a reboot, and it was sometimes irritating to find that someone had modified something in the "active" set but not saved the changes to disk, so on the next reboot whatever problem had been fixed by the change would reappear.

          On one occasion I was making such changes. I dutifully saved them to disk first, then to active, to be sure they wouldn't get lost. Unfortunately I'd mixed up my max & min values, setting the max below the min. The moment I saved to active the system stopped dead, didn't even have time to echo the fatal <CR>. Of course, when it rebooted it found the incorrect values on disk & promptly stopped again.

          Fortunately VMS has "conversational boot" to allow changes to be made on the fly during boot...
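
          The general lesson (check the pair before it reaches either the running system or the copy read at boot) can be sketched in a few lines. This is not how the VMS utility works; the dictionaries standing in for the "active" and "saved" parameter sets and both function names are hypothetical.

```python
def validate_limits(minimum: int, maximum: int) -> None:
    """Reject a pair of limits whose maximum sits below the minimum."""
    if maximum < minimum:
        raise ValueError(f"max ({maximum}) is below min ({minimum})")


def apply_change(active: dict, saved: dict, minimum: int, maximum: int) -> None:
    """Check first, then update the running values, then the saved copy."""
    validate_limits(minimum, maximum)
    active.update(minimum=minimum, maximum=maximum)
    # Only values that passed the check ever reach the copy read at boot.
    saved.update(active)


if __name__ == "__main__":
    active = {"minimum": 10, "maximum": 100}
    saved = dict(active)
    try:
        apply_change(active, saved, minimum=200, maximum=50)
    except ValueError as err:
        print("rejected:", err)
    print(saved)
```

          With the check in front, the swapped values never get as far as the saved set, so a reboot comes back up on the last good pair.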

        2. imanidiot Silver badge

          Re: old lessons well learned

          not.known@this.address

          There is only ONE correct way to write a date nowadays. yyyy-mm-dd. I've encountered too many fuckups where mm-dd-yy and dd-mm-yy got confused/swapped. And writing out May/Mar sometimes gets confused too (Plus makes sorting difficult sometimes)

          1. Anonymous Coward

            Re: old lessons well learned

            "There is only ONE correct way to write a date nowadays. yyyy-mm-dd."

            My employer disagrees. We are required by company policy to use dd-mmm-yy, dd-mmm-yyyy, ddmmmyy, or ddmmmyyyy. In other words, day then month then year, day is ALWAYS 2 digits (02, not 2), year can be 2 or 4, month is always 3 letters (never digits).

            From a computer standpoint, I typically use yyyy-mm-dd, as it automatically gets sorted in date order. dd/mm/yy is common where I live. Any system is fine by me, as long as it's clear; mm/dd/yy or dd-mm-yy, but NOT dd/mm/yy or mm-dd-yy.

            1. elDog

              Re: old lessons well learned

              I work in a US state government where very few users are really thoughtful about what makes sense, both from a UI standpoint and from a data storage standpoint.

              I have been using ISO dates (YYYY-MM-DD) and 24 hour times whenever I can. Fortunately most of my coworkers are smart enough to figure out that 2018-01-19 12:00 means today at twelve noon, not twelve PM.

              Next, let's get rid of Summer Time (or Daylight Savings Time). Then we can tackle the Year 9999 problem.

        3. G.Y.

          Re: old lessons well learned

          A medical device company went DD-MMM-YYYY. Worked OK _until_ the French customers wanted French months (just Jan-Fev-..., not Brumaire, Frimaire ...), the Germans wanted German, the Esperanto customers ...

          1. Wensleydale Cheese

            Re: old lessons well learned

            "DD-MMM-YYYY. Worked OK _until_ the French customers wanted French months (just Jan-Fev-...,"

            Can't use the first 3 letters of the month in French, because of Juin and Juillet.

            We used dd.mm.yyyy for user input and display, and stored it on file as yyyymmdd.

        4. Shadow Systems

          At Not.Known@This.Address, re: date format.

          mm-dd-yyyy? BAH! It's yyyy-mm-dd so the dates auto sort themselves when used as file names, data entry labels, or otherwise get acted upon by a computer.

          I've always used that format for the date, but then I also use a 24Hour 0001~2359Hrs format to record the time. It comes from growing up in a military family & having to remember that "anything after 1PM gets 12 hours added to it; anything greater than 13 gets 12 subtracted from it" when translating from military to civilian times.

          All my log files are named as "yyyy-mm-dd xxyyHrs PST_$Event.txt" so they *always* appear in chronological order, even if I sort the file names by something other than name (size, type, number of tentacles, etc).

          It makes it _so_ much easier to keep things straight. =-)
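The sorting claim above is easy to check. A minimal sketch in Python, assuming three made-up sample dates (they are illustrative, not from anyone's logs): format them both ways and let a plain string sort, which is what a file listing does, order them.

```python
from datetime import date

# Hypothetical sample dates, chosen only for illustration.
dates = [date(2018, 1, 19), date(2017, 12, 3), date(2018, 1, 2)]

iso = [d.isoformat() for d in dates]            # yyyy-mm-dd
us = [d.strftime("%m-%d-%Y") for d in dates]    # mm-dd-yyyy

# Plain string sorting, with no knowledge that these are dates.
iso_sorted = sorted(iso)
us_sorted = sorted(us)

# Compare each string ordering against the true chronological ordering.
chronological = sorted(dates)
iso_is_chronological = iso_sorted == [d.isoformat() for d in chronological]
us_is_chronological = us_sorted == [d.strftime("%m-%d-%Y") for d in chronological]

print(iso_sorted, iso_is_chronological)
print(us_sorted, us_is_chronological)
```

With these samples the yyyy-mm-dd strings sort into date order and the mm-dd-yyyy strings do not, because the December 2017 entry lands after both January 2018 entries.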

        5. Wensleydale Cheese
          Happy

          Re: old lessons well learned

          "I still use the date format from my VAX/VMS days - dd-mmm-yyyy"

          Whoever came up with that format neatly sidestepped the whole MM/DD or DD-MM confusion.

          I'd really like to shake them by the hand.

          The idea was not new in VMS. In RT-11, dd-mmm-yy was used.

      3. Simon Harris

        "small inverted triangles as individual place holders... It would seem likely that someone somewhere would be unable to find that symbol on their keyboard."

        Unless you're unlucky enough to find someone with an APL keyboard.

        1. Bob Wheeler

          @ Simon Harris

          Geesh, APL, the only programming system that was all but totally unreadable without the well thumbed manual next to you.

        2. Doctor Syntax Silver badge

          "Unless you're unlucky enough to find someone with an APL keyboard."

          Unless you find someone unlucky enough to have an APL keyboard. FTFY.

      4. CrazyOldCatMan Silver badge

        You quickly learned to denote spaces on paper by writing small inverted triangles as individual place holders

        Likewise my brief career as a TPF assembly programmer taught me to clearly distinguish between 7's and 1's and the letter O and a zero..

    2. Anonymous Coward

      NOT ONE WARNING to the effect of "don't use spaces" existed

      Hmm. Maybe I'm an anomaly, but I have a built in warning that says "don't use spaces", and mostly because it causes problems parsing lists of filenames :-).

      Mind you, your example does seem to be an excessively catastrophic consequence...

      1. Keith Langmead

        Re: NOT ONE WARNING to the effect of "don't use spaces" existed

        "Hmm. Maybe I'm an anomaly, but I have a built in warning that says "don't use spaces", and mostly because it causes problems parsing lists of filenames :-)."

        Yep, same here. Far too much time spent with various apps and scripting engines (especially .bat files) that don't properly handle spaces. These days I still habitually ensure nothing that could even potentially come near being automated has spaces in it if I can help it.
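For anyone who has not been bitten yet, here is a rough sketch in Python of why spaces hurt. The command line and file name are hypothetical. Splitting on whitespace tears the quoted name in two, while a quote-aware splitter such as the standard library's `shlex.split` keeps it whole.

```python
import shlex

# Hypothetical command line containing a quoted file name with a space.
line = 'cp "My Documents/report.txt" /tmp/backup'

# Naive approach: split on whitespace. The file name is cut in half
# and the quote characters are left stuck to the pieces.
naive_args = line.split()

# Quote-aware approach: honours the double quotes like a POSIX shell would.
quoted_args = shlex.split(line)

print(naive_args)
print(quoted_args)
```

The naive split yields four arguments and the quote-aware one yields three, which is exactly the sort of mismatch that makes scripts fall over.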

        1. Baldrickk

          Re: NOT ONE WARNING to the effect of "don't use spaces" existed

          To this day "C:\Program Files\" and "C:\Program Files (x86)\" annoy me every time I see them.

          At least "C:\Documents and Settings\USER\" is now vanquished for the more sane "C:\Users\USER\"

          1. elDog

            Re: NOT ONE WARNING to the effect of "don't use spaces" existed

            It was absolutely f'in amazing how ill-prepared M$ was for that change to their default folder structure. It's almost as bad as when Gates and friends decided to use the "/" as the command-line parameter start and therefore had to preempt the command escape character "\" for a folder separator character. I hate to think of how many billions of hours have been spent trying to get MS-DOS to work with other software.

            I have hated "Program Files" (86) and "My Documents" as being legitimate names, especially for things that are fairly important to the OS and, well use of the PC.

            It seems that the "UI" experts with their focus groups have absolutely no knowledge of how software parses paths, as well as other things such as parameters. Unix/Linux ain't perfect either but there is no such a huge disconnect as with all the Windows flavors.

            1. CrazyOldCatMan Silver badge

              Re: NOT ONE WARNING to the effect of "don't use spaces" existed

              I have hated "Program Files" (86) and "My Documents" as being legitimate names

              You think that's bad - you try and convince 'Creative Professionals' that using spaces in documents that they upload to a website is a Bad Idea.

              (This was in the days before CMS' automatically added underscores or dashes and very much before the abomination that IIS adds using %20 etc..)

              1. DuchessofDukeStreet

                Re: NOT ONE WARNING to the effect of "don't use spaces" existed

                Is that where that infuriating %20 comes from? It's been driving me nuts for months - now I know why!

                Thank you.
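For what it's worth, %20 is the general URL percent-encoding of a space character (hex 20 in ASCII), so it is not specific to any one web server. A small sketch using Python's standard `urllib.parse`, with a made-up file name:

```python
from urllib.parse import quote, unquote

# Hypothetical uploaded file name with spaces in it.
filename = "Annual Report 2017.pdf"

# Percent-encode characters that are not allowed bare in a URL path.
encoded = quote(filename)

# Decoding restores the original name.
decoded = unquote(encoded)

print(encoded)
print(decoded)
```

Each space becomes %20 on the way out and turns back into a space on the way in.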

    3. Anonymous Coward

      @lee d

      Been there, you have my condolences

  5. Nolveys
    Windows

    Someone I used to work with once told me the story of his boss attempting to restore data from backup. The fellow loaded a backup tape into a rarely used and, as it turned out, broken tape drive. The drive rewound the tape until it got to the start...and kept going.

    After hearing the snapping sound of the tape being torn apart he made the assumption that the tape was just bad. In went the next tape...and the next...and the next...

    He managed to destroy every single backup.

    1. Anonymous Coward

      "After hearing the snapping sound of the tape being torn apart [...]"

      When tape reel servos decided to go in opposite directions at high speed - the result was a distinctive screech and a section of mylar "string".

    2. Anonymous Coward

      "The drive rewound the tape until it got the start...and kept going."

      On all the mainframe decks I worked with - the relatively short leader tape was not fixed permanently to the deck's take up spool. If on a rewind it missed the data tape's physical start marker*** - then the leader tape ended up flapping round the data reel. The tape feed equalisation sensors (weight or optical) then shut it all down. Some gentle handling was possibly then necessary to recover any thrown loops.

      *** usually a short transparent section of tape. In the early days the operators had to make these themselves - especially if a tape had had to be repaired by lopping a bit off. The section's ferrite coating was removed with acetone - which came in large brown bottles. Unfortunately the head cleaning iso-propyl alcohol came in identical looking bottles - and both were a clear liquid. As tape deck heads were cleaned quite often - it was not unknown for their insulator separators to be dissolved by acetone in error.

      1. stungebag

        It didn't always end up flapping

        On earlier tape units the reels had a leader spliced onto the end that threaded through a short piece of tape attached to the takeup spool. So a resounding crack rather than a flapping noise if the BOT marker was missed.

      2. Antron Argaiv Silver badge
        Thumb Up

        BOT and EOT markers: On the drives I used, they were not transparent sections, but 2 inch long pieces of sticky-backed reflective mylar.

        It was either DG or CDC or both....

      3. Mongrel

        ...identical looking bottles ...

        Did they have no sense of smell?

        1. CrazyOldCatMan Silver badge

          Re: ...identical looking bottles ...

          Did they have no sense of smell?

          Possibly sniffed the bottles for too long (or too many times) and become somewhat hard of thinking..

    3. Wensleydale Cheese
      Unhappy

      "He managed to destroy every single backup."

      There was also a tale of the operator who found he couldn't read a disk pack, so tried it on every drive in the room.

      It was a bad pack and he caused a head crash on every single drive they had.

      1. Mark York 3 Silver badge
        Holmes

        Lord Sugar - The Revenge Of The Disc - A Long Long Time Ago In A Job Far Away

        I heard the amusing story at a Trade Show about an hour after someone had just walked up to the Amstrad stand.

        Can I help you?

        Is this IBM compatible?

        Ohh yes!

        Can it run any IBM software?

        Anything that will run on an IBM will run on this.

        Ohhhh nice

        Sales guy walks away finding new people to gush over & extol the virtues of the PC1512.

        Slides disk in...>tappity< >JANGLE<

        Sales reps flock back to the new PC1512 showing acute distress...

        What did you do?

        It appears your machine can't run everything that runs on an IBM after all. The miscreant stated as he removed his copy of the IBM diagnostic testing disc.

  6. Paul Cooper

    I guess that every data manager on earth has suffered that awful moment when you realize you shouldn't have pressed return on the command you just entered and that you've lost some large amount of work that will take a day or so to recover? Particularly if you're used to an OS like Windows that doesn't REALLY delete stuff until you clear out the recycle bin, and suddenly remember you're in the less forgiving environment of Unix or VMS! It's even worse when you're using some GIS data formats where the basic unit of data is the contents of a directory, and messing with any part of those contents can irretrievably (well, almost) damage the entire dataset (yes, old and hoary users like me do know ways to reconstruct the data, but it isn't for the beginner and is error prone!). Or an SQL "Delete" command with autocommit on and an untested "where" clause that doesn't do what you thought it did...

    1. Anonymous Coward

      Typing in a delete command for a source file on the mainframe - someone interrupted me part way through. Finally looked back and hit return. Then realised I had only entered the directory name - and the delete command was the same format for file or directory.

      Before my time someone had done a similar thing for a major application source that had no back up. They had to retype it from the latest listings. What was surprising was how many code bugs they found in the process.

      1. Anonymous South African Coward Bronze badge

        Ahhh, which reminds me of this gem :

        rm -rf /

        Such things happened at least once to every unix person... To me it happened on February 1, 2000, after several years of heavy Unix usage/administration, when I was damn confident in myself and just leniently smiled on all these for-clueless-newbies warnings about not doing things as root.

        In the middle of the working day, being root on the main NFS server containing all user homes, sitting in /home/some_user, I typed chown -R some_user .* and stopped it after 15-20 seconds when I realized that something was going wrong. But you know, that server was really fast and the permissions of a good half of the whole user space had been modified. (I recovered of course - at the price of my lunch time).

        Anyway, the following classic article from Mario Wolczko describing much more interesting case first appeared on Usenet in 1986.

        Have you ever left your terminal logged in, only to find when you came back to it that a (supposed) friend had typed rm -rf ~/* and was hovering over the keyboard with threats along the lines of "lend me a fiver 'til Thursday, or I hit return"? Undoubtedly the person in question would not have had the nerve to inflict such a trauma upon you, and was doing it in jest. So you've probably never experienced the worst of such disasters...

        It was a quiet Wednesday afternoon. Wednesday, 1st October, 15:15 BST, to be precise, when Peter, an office-mate of mine, leaned away from his terminal and said to me, "Mario, I'm having a little trouble sending mail." Knowing that msg was capable of confusing even the most capable of people, I sauntered over to his terminal to see what was wrong. A strange error message of the form (I forget the exact details) "cannot access /foo/bar for userid 147" had been issued by msg. My first thought was "Who's userid 147?; the sender of the message, the destination, or what?" So I leant over to another terminal, already logged in, and typed grep 147 /etc/passwd only to receive the response /etc/passwd: No such file or directory. Instantly, I guessed that something was amiss. This was confirmed when in response to ls /etc I got ls: not found.

        I suggested to Peter that it would be a good idea not to try anything for a while, and went off to find our system manager.

        When I arrived at his office, his door was ajar, and within ten seconds I realised what the problem was. James, our manager, was sat down, head in hands, hands between knees, as one whose world has just come to an end. Our newly-appointed system programmer, Neil, was beside him, gazing listlessly at the screen of his terminal. And at the top of the screen I spied the following lines:

        # cd

        # rm -rf *

        Oh, shit, I thought. That would just about explain it.

        I can't remember what happened in the succeeding minutes; my memory is just a blur. I do remember trying ls (again), ps, who and maybe a few other commands beside, all to no avail. The next thing I remember was being at my terminal again (a multi-window graphics terminal), and typing

        cd /

        echo *

        I owe a debt of thanks to David Korn for making echo a built-in of his shell; needless to say, /bin, together with /bin/echo, had been deleted. What transpired in the next few minutes was that /dev, /etc and /lib had also gone in their entirety; fortunately Neil had interrupted rm while it was somewhere down below /news, and /tmp, /usr and /users were all untouched.

        Meanwhile James had made for our tape cupboard and had retrieved what claimed to be a dump tape of the root filesystem, taken four weeks earlier. The pressing question was, "How do we recover the contents of the tape?". Not only had we lost /etc/restore, but all of the device entries for the tape deck had vanished. And where does mknod live? You guessed it, /etc. How about recovery across Ethernet of any of this from another VAX? Well, /bin/tar had gone, and thoughtfully the Berkeley people had put rcp in /bin in the 4.3 distribution. What's more, none of the Ether stuff wanted to know without /etc/hosts at least. We found a version of cpio in /usr/local, but that was unlikely to do us any good without a tape deck.

        Alternatively, we could get the boot tape out and rebuild the root filesystem, but neither James nor Neil had done that before, and we weren't sure that the first thing to happen wouldn't be that the whole disk would be re-formatted, losing all our user files. (We take dumps of the user files every Thursday; by Murphy's Law this had to happen on a Wednesday). Another solution might be to borrow a disk from another VAX, boot off that, and tidy up later, but that would have entailed calling the DEC engineer out, at the very least. We had a number of users in the final throes of writing up PhD theses and the loss of maybe a week's work (not to mention the machine down time) was unthinkable.

        So, what to do? The next idea was to write a program to make a device descriptor for the tape deck, but we all know where cc, as and ld live. Or maybe make skeletal entries for /etc/passwd, /etc/hosts and so on, so that /usr/bin/ftp would work. By sheer luck, I had a gnu emacs still running in one of my windows, which we could use to create passwd, etc., but the first step was to create a directory to put them in. Of course /bin/mkdir had gone, and so had /bin/mv, so we couldn't rename /tmp to /etc. However, this looked like a reasonable line of attack.

        By now we had been joined by Alasdair, our resident UNIX guru, and as luck would have it, someone who knows VAX assembler. So our plan became this: write a program in assembler which would either rename /tmp to /etc, or make /etc, assemble it on another VAX, uuencode it, type in the uuencoded file using my gnu, uudecode it (some bright spark had thought to put uudecode in /usr/bin), run it, and hey presto, it would all be plain sailing from there. By yet another miracle of good fortune, the terminal from which the damage had been done was still su'd to root (su is in /bin, remember?), so at least we stood a chance of all this working.

        Off we set on our merry way, and within only an hour we had managed to concoct the dozen or so lines of assembler to create /etc. The stripped binary was only 76 bytes long, so we converted it to hex (slightly more readable than the output of uuencode), and typed it in using my editor. If any of you ever have the same problem, here's the hex for future reference:

        070100002c000000000000000000000000000000000000000000000000000000 0000dd8fff010000dd8f27000000fb02ef07000000fb01ef070000000000bc8f 8800040000bc012f65746300

        I had a handy program around (doesn't everybody?) for converting ASCII hex to binary, and the output of /usr/bin/sum tallied with our original binary. But hang on - how do you set execute permission without /bin/chmod? A few seconds thought (which as usual, lasted a couple of minutes) suggested that we write the binary on top of an already existing binary, owned by me... problem solved.
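The "handy program" for converting ASCII hex to binary is not shown in the story, so the following is only a present-day sketch of the same idea in Python, not the tool Mario used. The helper name `hex_to_bytes` is made up for illustration; the hex string is copied from the dump quoted above.

```python
# The hex dump as quoted in the story, with its spaces kept as separators.
HEX_DUMP = (
    "070100002c000000000000000000000000000000000000000000000000000000 "
    "0000dd8fff010000dd8f27000000fb02ef07000000fb01ef070000000000bc8f "
    "8800040000bc012f65746300"
)


def hex_to_bytes(text: str) -> bytes:
    """Convert ASCII hex (spaces and newlines allowed) to raw bytes."""
    return bytes.fromhex("".join(text.split()))


binary = hex_to_bytes(HEX_DUMP)

print(len(binary))   # the story gives the stripped binary as 76 bytes
print(binary[-5:])   # the NUL-terminated path string at the end of the dump
```

The decoded length agrees with the 76 bytes mentioned in the story, and the last five bytes spell out the directory name the program was written to create.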

        So along we trotted to the terminal with the root login, carefully remembered to set the umask to 0 (so that I could create files in it using my gnu), and ran the binary. So now we had a /etc, writable by all. From there it was but a few easy steps to creating passwd, hosts, services, protocols, (etc), and then ftp was willing to play ball. Then we recovered the contents of /bin across the ether (it's amazing how much you come to miss ls after just a few, short hours), and selected files from /etc. The key file was /etc/rrestore, with which we recovered /dev from the dump tape, and the rest is history.

        Now, you're asking yourself (as I am), what's the moral of this story? Well, for one thing, you must always remember the immortal words, DON'T PANIC. Our initial reaction was to reboot the machine and try everything as single user, but it's unlikely it would have come up without /etc/init and /bin/sh. Rational thought saved us from this one.

        The next thing to remember is that UNIX tools really can be put to unusual purposes. Even without my gnuemacs, we could have survived by using, say, /usr/bin/grep as a substitute for /bin/cat.

        And the final thing is, it's amazing how much of the system you can delete without it falling apart completely. Apart from the fact that nobody could login (/bin/login?), and most of the useful commands had gone, everything else seemed normal. Of course, some things can't stand life without say /etc/termcap, or /dev/kmem, or /etc/utmp, but by and large it all hangs together.

        I shall leave you with this question: if you were placed in the same situation, and had the presence of mind that always comes with hindsight, could you have got out of it in a simpler or easier way?

        1. Anonymous Coward

          "The next idea was to write a program to make a device descriptor for the tape deck,"

          It was possible to mistakenly issue a "label" command for the mainframe system disk which also usually contained customer applications - source and binaries. After which point the disk then appeared to be effectively blank. An operator did that one day causing great panic. There was no backup to the system disk - as there was a known bug in the system which the customer had been ignoring for several months.

          I sat down and wrote a bit of hex code on paper - and transcribed it to papertape with a hand-held dibber. Then using the engineer's panel I set up the magic sequence on the keys to boot and run it.

          What the code had to do was set the single bit on a sector of the disk that identified a particular "file" area as the System partition. I knew that the labelling command only changed the label and cleared that bit.

          Only one chance - but it worked! Much relief. My bosses regarded that as par for the course - even though no-one else in their more highly paid staff had the experience and knowledge to do such low level tricks.

          1. CrazyOldCatMan Silver badge

            for the mainframe system disk which also usually contained customer applications - source and binaries. After which point the disk then appeared to be effectively blank

            We had an operator attach an untested DASD string to the live system once (after a liquid lunch). Said untested DASD string hadn't been properly initialised either and appeared to accept the writes quite happily.

            Reading the data back was another matter - every attempt either returned a disk error or garbage.

            Said operator wasn't an operator for long.

        2. Doctor Syntax Silver badge

          "rm -rf /"

          You can wreak havoc with mv * issued from root level. Just saying. Of course it's never quite as obvious as that. It's usually something like stuff * where you'd intended a / instead of a space.

        3. Anonymous Coward

          Re: Ahhh, which reminds me of this gem

          @ Anonymous South African Coward

          We had an Ubuntu server (office use only - mainly document and software repository, but I also used it as our main WPKG server whilst I was building a distributed WPKG server setup for a major client) that all of a sudden refused ssh logons. It was still working perfectly in the morning, but sometime during the afternoon it point blank refused all ssh sessions.

          I went to a colleague of mine and asked him if he knew what had happened, or if he had changed something, he said no - it is still working fine (he was logged in through an ssh session). I explained what my problem was and, to demonstrate that all was fine, he logged out and then logged right back in, except that he could not.

          I then connected a monitor and keyboard to the server to sort it out, but it also refused local logons, saying permission denied. Rebooting in single user mode also did not help - I could not access any command (thinking that he somehow changed all the passwords or deleted/corrupted the password file).

          It turned out that all that he had done was to clean up our repositories, as it had become a bit tatty, with duplicate documents and folders in the wrong place, etc (since I was the only one using a linux machine, the others being Windows, access was via SAMBA shares). Luckily that was still accessible and writeable.

          Further probing revealed that he had decided to change permissions in the document repository to 777, as there apparently were problems with some documents having incorrect permissions, preventing other users from opening or editing certain documents.

          So I very gently asked him in which folder he was when he changed it, and he said that he double-checked that he was in the correct folder. I then asked him what exactly he had typed, and he said "chmod -R 777 /*, of course".

          Then the penny dropped: working mostly in Windows, he had forgotten that / means root - he had changed permissions on all files on the machine! (For those of you unfamiliar with linux/unix - it does not take kindly to such abuse).

          Another anonymous South African in order to protect the guilty.

      2. Baldrickk

        RE: They had to retype it from the latest listings.

        Well, it's a forced code review, and one where you can't just go "yup, I see what they are trying to do, looks ok" and hit the OK button. If there were bugs, it's perhaps not so surprising that they were found.

        1. Anonymous Coward

          Re: RE: They had to retype it from the latest listings.

          "If there were bugs, it's perhaps not so surprising that they were found."

          IIRC one of the bugs meant they had been publishing significantly wrong results about a commodity product's statistics.

      3. Wensleydale Cheese

        "Before my time someone had done a similar thing for a major application source that had no back up. They had to retype it from the latest listings. What was surprising was how many code bugs they found in the process."

        Our junior programmer had just completed his first big program, and with the intention of printing out the finished listings the next day, threw his working printouts in the bin.

        Then the source disk was discovered to be corrupt. All other sources were intact, but not his program's. That had him skip diving to retrieve his working printouts, to no avail. I did have a go at scavenging the empty spaces on disk, but couldn't find anything but snippets of his code.

        Oh, well, he got to write his program again from scratch.

        It was some years later that it occurred to me that he might have been covering up some unexplained cockup. We'll never know.

    2. Doctor Syntax Silver badge

      "Or an SQL "Delete" command with autocommit on and an untested "where" clause that doesn't do what you thought it did..."

      When developing SQL always start with a scripted BEGIN WORK and rely on manual input for the COMMIT. Or for the ROLLBACK as appropriate.
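A rough sketch of that habit, using Python's built-in `sqlite3` module as a stand-in database. The table, the addresses and the "expected one row" check are all invented for illustration, and SQLite spells the statement `BEGIN` rather than `BEGIN WORK`.

```python
import sqlite3

# isolation_level=None puts the sqlite3 module in autocommit mode, so the
# only transaction is the one opened and closed explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE mailing_list (email TEXT, opted_out INTEGER)")
conn.executemany(
    "INSERT INTO mailing_list VALUES (?, ?)",
    [("a@example.com", 0), ("b@example.com", 1), ("c@example.com", 0)],
)

conn.execute("BEGIN")
# Untested WHERE clause: meant to remove one address, but matches every row.
cursor = conn.execute("DELETE FROM mailing_list WHERE email LIKE '%'")
deleted = cursor.rowcount
expected = 1

# Stand-in for the manual decision: commit only if the damage is as intended.
if deleted == expected:
    conn.execute("COMMIT")
else:
    conn.execute("ROLLBACK")

remaining = conn.execute("SELECT COUNT(*) FROM mailing_list").fetchone()[0]
print(deleted, remaining)
```

Here the delete hits all three rows instead of one, the rollback fires, and the table still holds three rows afterwards. With autocommit and no explicit transaction there would be nothing to roll back.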

      1. Anonymous Coward

        Done that one too

        Think managing to remove the entire email listing directory on a customer database.. so any opt-outs were no longer recorded. OR weekly only mailers, that kind of thing. I also could have sworn I was connected to the test database so ran it live without any 'Begin work' to roll back from...

        Yeah, that was a fun morning (turns out the data was duplicated through other means so I could just use that rather than have to recover the data from one table from the backup which would have taken HOURS since it was on tape and the daily backup was already running..and that was occasionally known to take longer than 24 hours. For a daily backup... yeah. Again fun times and a lot of data I didn't want to go through)

    3. phuzz Silver badge

      I've taken to putting a z (or other character) at the start of dangerous commands, until I'm really, really sure, eg:

      phuzz@testbox:~$ zrm -rf /

      No command 'zrm' found, did you mean:

      [....]

      Of course, it's the days when I forget to do that when I accidentally press enter instead of backspace etc.

    4. jfm

      Yes - I heard, years ago, the almost-instantaneous time interval between hitting ENTER and realising you shouldn't've described as an ohnosecond.

  7. Terry 6 Silver badge

    It's not usually what you don't know that's the problem

    It's what you don't know you don't know. It should be called The Rumsfeld Rule (unknown unknowns).

    And it holds true everywhere.

    1. Steve the Cynic

      Re: It's not usually what you don't know that's the problem

      It's what you don't know you don't know.

      There's an alternative formulation:

      The problem isn't the things you don't know. No, the problem is the things you *do* know, but aren't true.

      1. jake Silver badge

        Re: It's not usually what you don't know that's the problem

        "It ain't what you don't know that gets you into trouble. It's what you know for sure that just ain't so." --Samuel Langhorne Clemens

        1. Anonymous Coward

          Re: It's not usually what you don't know that's the problem

          "Samuel Langhorne Clemens"

          AKA that man of timeless quotes - Mark Twain.

  8. steamnut

    Not totally accurate?

    If the computer was a 360 or 370 then the actual IBM 1403 printer could be taken away without doing much harm as it was connected to a 2821 control unit. The 2821 was used to connect a 1403 printer and the 2540 card reader/punch to IBM's byte multiplexer bus. If you removed the (bus & tag) cables from a 2821 without then re-terminating them then you caused all sorts of problems with any other devices on the same bus.

    As a number of control units could be daisy chained on the same bus then what probably happened is he removed other devices from the bus as the chain was broken.

    1. jake Silver badge

      Re: Not totally accurate?

      Correct. The 1403 was actually built for the 1401, not a mainframe. When used with a mainframe, they required a bit of a kludge, but removing one from such a system wouldn't crash anything.

      1. Anonymous Coward

        Re: Not totally accurate?

        Well, all I can tell you is that when I removed the B&T cables from the printer, the system crashed. Yes, accurate.

  9. Gene Cash Silver badge

    Active SCSI termination

    Don't forget the devices where the termination was done by some sort of fancy autodetect in the device... which only worked when it was on, so you had to power it on first if it was at the end of the chain. Yup, the thing SCSI needed... device power-on dependencies.

    I sure as hell do not miss SCSI, even though I got started at a mass-storage company where it was our bread-and-butter.

    Edit: I have a pile of 20 or 30 different SCSI terminators, just to remind me of the bad ol' days.

    People bitch about how hard it is to put in a USB plug, and I have to refrain from slapping the ever-livin' out of them...

  10. Kobus Botes
    Facepalm

    Involuntary server shutdown

    We had a SUN SPARC server that handled local accounting information which was synchronised with the main server after hours.

    A problem developed at some stage, so one of the sysadmins at head office called me to log on to the server, then type in some commands he dictated (I had no UNIX experience, so we verified the commands a couple of times to make sure it was correct. Also no remote login capability - this was in the late eighties. Also a 16kb diginet line) and then note the responses and call him back (way before cell-phones and there was no phone in the server room - it was actually just a closet).

    It went swimmingly and at the end, having carefully written down the results, I turned away and went out. As I closed the door, I noticed that I had forgotten to turn off the monitor, so I went back and turned it off, only to hear the gut-wrenching sound of the server dying on me.

    It turned out that those servers had only one switch that ruled all, and it just so happened that that switch was located on the monitor - the monitor itself could not be turned off independently.

    Which just reminded me - I also happened to kill an Exchange server once; I was sent to physically check the disk sizes on all our servers, as they were planning upgrades, but were not sure that their information was correct.

    I was able to obtain the needed info from all the servers (x86 - 386's if I remember correctly) except for the Exchange server (NT 3.5 on an Alpha RISC server). I removed the side panels, but could not see any hard drives, as they were located in a cage in the top of the tower. The top panel was also removable, so I did, hoping to be able to get at the drives, only to be met by the same terrifying sounds of a server dying.

    In this case it turned out that the Alpha had an intrusion-detection switch protecting the top panel, which very cleverly killed the machine the moment it was triggered.

    HO was not very fond of me, to be honest (It just so happened that the main Exchange admin also happened to be a director). Thankfully the server booted up without problems, despite the brutal treatment meted out by me (I suspect it must have been the only time in history that an NT 3.5 server recovered unscathed from such an unceremonious shutdown).

    1. CrazyOldCatMan Silver badge

      Re: Involuntary server shutdown

      It turned out that those servers had only one switch that ruled all, and it just so happened that that switch was located on the monitor

      Or an old IBM server with attendant IBM monitor that had identical power switches with about two inches of vertical separation between them.

      Said server is next to the door and a junior servery-and-networky type person may well hit the server power button on their way out of the room, rather than the monitor power button.

      If such a hypothetical situation were to occur, said person may well have called out to his colleague to go round the floor and ask people to save their work while said hypothetical person held the power switch in so that the server didn't actually turn off..

      It made my^W the hypothetical person's finger hurt. I reckon that my^W the hypothetical person's colleagues took their own sweet time to talk to everyone on the basis that it might make the object lesson (never poke a power button you can't see clearly) stick..

  11. FlippingGerman

    *cough*Voyager*cough*

    Oopsie.

    1. Anonymous Coward

      *cough*Viking*cough*

      Big oopsie.

  12. Stevie

    Bah!

    I pulled out one set of paired redundant multipath optical cable thingies from an array under the aegis of Veritas volume manager.

    Nae problemo as the Nac Mac Feegle say. Computers chugging along as per, no screams of anguish from The Pit or blinkenlites or worrying mechanical noises.

    Having demonstrated the robustness of the volume manager I plugged the cables back in.

    Cue the apocalypse. The console flooded with “Aargh! There are two devices with the same World Wide Number!” messages. Screams from The Pit. Clonking noises from the disks in the array. The Works.

    Everyone had a good shout at me. Two weeks later one of the shouters did the exact same thing.

    Oh how we laughed.

    1. CrazyOldCatMan Silver badge

      Re: Bah!

      Nac Mac Feegle say

      No - mostly he says "meow". Unless he is hissing at New Kitten who is still smelling like intact female and thus is above him in the pecking order..

      (First two cats are Feegle [ginger and white] and Kelda [pure blackness]. They both very much live up to their names..)

  13. broomy

    You needed Vary Off

    Eons ago I was the 1st Unix geek in a mainframe (MF) company. We were a print shop, junk mail, many dozens of VERY LARGE printers you did not want to fall into, that were fed from huge rolls of paper, and ended up running through multiple printers (to print different colors on the same page) before being chopped, folded, and inserted into envelopes. We had several football field sized warehouses filled with these printers. All running via bus&tag connectors (initial post reference). The correct process to remove these devices is via the “vary off” command. You did not need to halt the computer. Later on the connectors became ESCON (17mbit?) then FICON (100mbit?)

    We got a forklift upgrade every 18 months, new CPU for a couple million dollars. We had dozens of 3480 cartridge drives feeding the MF sort / merge / purge processes. We had about 6 tape monkeys on site at all times that did nothing but run around, gather cartridges, and feed the beast. We also had a big EMC disk pack, but it was trivially small compared to the amount of work we pushed through via the tape drives.

    The MF director HATED me. I spent years cherry picking projects, moving them to Unix, building my teams, and taking floor space. When the initial Xeon chips showed up, these devices had CPU speed equivalency, so my systems would kick ass on compute bound tasks, but I had no desire to drive those printers. At this point I had about 50 Linux based compute servers and a dozen Sun systems.

    The MF director had a favored bit of software called copycross. This software pretended disk files were tapes. It would also compress/decompress them on the fly. This allowed decades of tape based software to run seamlessly on disk based files, no recoding, plus it ran a lot faster since disks are faster than tapes (sometimes). As long as we were willing to buy enough disk space to support it (sold by EMC of course).

    Except – it was a PIG. It would suck down about 30% of MF compute cycles, dramatically impacting other tasks concurrently running. The MF director denied it. He brought out reams of performance reports to “prove it”. Our boss told me to prove otherwise or back off. I was given a MF login and access to a status program. The program was like “ps” on steroids, with a full screen deep dive interface. Forget what it was called, but it was kick-ass. I used Perl driving a 3270 session to screenshot the status display every 10 seconds. I then parsed that out, pulled the REAL performance data, proved my point, and torpedoed his project.

    But: Each time I ran the process monitoring screen, it took up a job number in the mainframe. And I didn’t run the correct exit command to collect/free up the job. After about 20 hours of running, I consumed every available job id on the mainframe. CRASH.

    Oops.
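
The failure mode above (every poll takes a job number and none is ever handed back) can be sketched in a few lines. This is only an illustration under stated assumptions: `JobIdPool`, its size, and the `session()` helper are hypothetical names invented here, not the mainframe's real job mechanism. At one screen every 10 seconds, 20 hours works out to about 7,200 polls, so any fixed pool smaller than that runs dry.

```python
from contextlib import contextmanager


class JobIdPool:
    """Hypothetical fixed-size pool of job IDs (illustration only)."""

    def __init__(self, size):
        self.free = list(range(size))
        self.in_use = set()

    def acquire(self):
        # Hand out one ID, or fail once the pool is empty.
        if not self.free:
            raise RuntimeError("no job IDs left")
        job_id = self.free.pop()
        self.in_use.add(job_id)
        return job_id

    def release(self, job_id):
        # Return an ID to the pool so it can be reused.
        self.in_use.remove(job_id)
        self.free.append(job_id)

    @contextmanager
    def session(self):
        # Guarantees the ID is released even if the poll raises.
        job_id = self.acquire()
        try:
            yield job_id
        finally:
            self.release(job_id)


def polls_until_exhausted(pool, max_polls):
    """Poll without ever releasing; return how many polls succeeded."""
    for n in range(max_polls):
        try:
            pool.acquire()  # leaked: never released
        except RuntimeError:
            return n
    return max_polls


def polls_with_cleanup(pool, max_polls):
    """Poll with a proper exit each time; the pool never drains."""
    done = 0
    for _ in range(max_polls):
        with pool.session():
            done += 1
    return done
```

The leaky loop stops after exactly as many polls as the pool has IDs, while the version with cleanup can run for as long as you like.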

    1. Anonymous Coward

      Re: You needed Vary Off

      We installed a comms front-end system for a customer's mainframe. For resilience it used three identical custom front end boxes.

      Each new terminal connection was routed by the network to the next front-end in a cyclic sequence to give a reasonable load sharing effect. If one front end died then the terminal reconnections would go to one of the other two fairly seamlessly. Under stress testing to the limits of the customer's terminal numbers it proved resilient.

      Then one day one of the front ends crashed - immediately followed by the second one - and the third one. The customer was very unhappy - we were in big trouble.

      The dumps showed that the number of extant terminal connections on each one far exceeded the design expectations. Furthermore the excess all came from the same physical source.

      It turned out that one of the customer's teams had decided to set up a continuous terminal performance program to test connection response times every few seconds. Unfortunately it never terminated its previous connection - so gradually filled up the front ends with idle connections. The load sharing ensured they all reached near crash capacity at the same time - so the remaining two couldn't handle the reconnections after the first one died. Our software supplier then provided patches to stop such a crash in future - and the customer's performance program was tidied up too.
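
The cascade described above can be reproduced with a toy simulation. It is a rough sketch, not the real front-end software: the capacity of 100, the 60 genuine terminals per box, and the function name `simulate_leak` are all assumptions made up for illustration. A leaky tester opens one connection per tick and never closes it, the network routes each new connection to the next front end in turn, and when a box goes over capacity its genuine terminals reconnect to the survivors.

```python
def simulate_leak(capacity=100, normal_load=60, front_ends=3):
    """Return a list of (tick, front_end_index) crash events."""
    load = [normal_load] * front_ends   # open connections per front end
    alive = [True] * front_ends
    crashes = []
    cursor = 0                          # round-robin routing position
    tick = 0
    while any(alive):
        tick += 1
        pending = 1  # the tester opens one more connection and never closes it
        while pending and any(alive):
            # Skip front ends that have already died.
            while not alive[cursor % front_ends]:
                cursor += 1
            i = cursor % front_ends
            cursor += 1
            load[i] += 1
            pending -= 1
            if load[i] > capacity:
                # Over capacity: this box dies and its genuine terminals
                # immediately try to reconnect to the remaining boxes.
                alive[i] = False
                load[i] = 0
                crashes.append((tick, i))
                pending += normal_load
    return crashes


if __name__ == "__main__":
    for tick, index in simulate_leak():
        print(f"tick {tick}: front end {index} crashed")
```

Because the load sharing spreads the leaked connections evenly, all three boxes sit at the limit together, so the reconnections from the first crash push the other two over within the same tick.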

  14. biolo

    Back in the day when I worked in an engineering department attached to the factory floor of a large Unix hardware manufacturer, we kept a rubber chicken in the server room. It was kept on prominent display on one of the server cabinets, and was there to be ritually sacrificed to the SCSI gods when we couldn't get one of the servers to see its storage. Management was well aware of the chicken's purpose, and that it was used quite regularly, as it was cited in the weekly team meetings on several occasions as the method by which a system was fixed.
