Vint Cerf: Everything we do will be ERASED! You can't even find last 2 times I said this

Cyber-pioneer Vint Cerf has warned – once again – that our digital lives are in danger of being wiped from human history. Cerf, who was speaking at the American Association for the Advancement of Science annual meeting, reiterated calls for a "digital vellum" – referring to the ancient parchment made from calf skin and known …

  1. Electron Shepherd
    Unhappy

    Turtles all the way down?

    The project takes snapshots of digital files, including the technical details of the computers for which they were designed.

    And how do we know how to read the details of the snapshots and the computer specifications? No problem, we'll just document it in this digital file here...

    I agree it's a problem, but I'm not sure that storing things digitally is the best solution to the problem that when we store things digitally, we will probably eventually lose track of how to read them back.

    1. Anonymous Coward
      Anonymous Coward

      Re: Turtles all the way down?

      You mean all those videos of my cat that I burned onto 1000-year M-DISC Blu-ray discs were all for naught?

    2. Anonymous Coward
      Anonymous Coward

      Re: Turtles all the way down?

      "And how do we know how to read the details of the snapshots and the computer specifications? No problem, we'll just document it in this digital file here..."

      Digital storage is only a problem if you

      a) don't keep copying the data as you update storage media

      b) don't have the technical know-how to figure out how data are stored on the old media, and retrieve that information.

      What's needed is somewhere you can put Very Important Information that will automatically be backed-up and then transferred from older to newer media as the newer media arrive. Fortunately, in the modern era a lot of this has become "transfer all our old data from the old storage array we thought was large and fast, to the new storage array that's bigger and faster than we thought was possible when we bought the old storage array."
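
A minimal sketch of the "keep copying the data as you update storage media" idea above, assuming both the old and new media are mounted as ordinary directories. The names `migrate` and `sha256_of` are made up for illustration; the point is only that each copy is checked against its source, so a bad transfer is caught at migration time rather than decades later.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(old_root: Path, new_root: Path) -> dict:
    """Copy every file under old_root to new_root and verify each copy.

    Returns a manifest mapping relative path -> SHA-256 digest.
    Raises IOError if a copy does not match its source.
    """
    manifest = {}
    for src in sorted(p for p in old_root.rglob("*") if p.is_file()):
        rel = src.relative_to(old_root)
        dst = new_root / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copies contents and timestamps
        if sha256_of(src) != sha256_of(dst):
            raise IOError(f"copy of {rel} does not match source")
        manifest[rel.as_posix()] = sha256_of(dst)
    return manifest


if __name__ == "__main__":
    # Hypothetical directory names standing in for the two storage arrays.
    with tempfile.TemporaryDirectory() as tmp:
        old, new = Path(tmp) / "old_array", Path(tmp) / "new_array"
        (old / "cats").mkdir(parents=True)
        (old / "cats" / "video.bin").write_bytes(b"meow")
        print(migrate(old, new))
```

Keeping the returned manifest alongside the data gives the next migration something to verify against.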

      1. Uncle Slacky Silver badge

        Re: Turtles all the way down?

        Obligatory XKCD:

        https://xkcd.com/1360/

      2. Anonymous Coward
        Anonymous Coward

        Re: Turtles all the way down?

        It's not the media the data is stored on that he's worried about, it's the format it's stored in. When we switch to binary encoding, bits don't have the nice fixed meaning that letters on a page do. Instead we assign meaning to large collections of bits. If we forget how we've done that, then we're unable to recover the meaning from the data.

        This basically means that we not only need to keep all your .doc files on a reliable storage medium, but we also have to keep around a copy of MS Word that can read that version of .doc files and a computer that can run that version of MS Word. Keeping the data eternally is really the easy part of all this.
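
To make the "we assign meaning to large collections of bits" point concrete, here is a small sketch in which the same four bytes are read under four different conventions. The byte values and the `interpretations` helper are arbitrary choices for illustration; nothing here comes from a real file format.

```python
import struct


def interpretations(raw: bytes) -> dict:
    """Decode the same 4 bytes under several different conventions."""
    return {
        "ascii_text": raw[:2].decode("ascii"),         # first two bytes as letters
        "uint32_big": struct.unpack(">I", raw)[0],     # big-endian unsigned integer
        "uint32_little": struct.unpack("<I", raw)[0],  # little-endian unsigned integer
        "float32_big": struct.unpack(">f", raw)[0],    # big-endian IEEE 754 float
    }


if __name__ == "__main__":
    raw = bytes([0x42, 0x48, 0x00, 0x00])
    for name, value in interpretations(raw).items():
        print(f"{name:14} {value!r}")
```

Without a record of which convention the writer used, none of these readings is more "correct" than the others, which is the format problem in miniature.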

        1. Jan 0 Silver badge

          Re: Turtles all the way down?

          > "This basically means that we not only need to keep all your .doc files on a reliable storage medium"

          Is there any useful data in .doc files?

        2. Daniel B.
          Boffin

          Re: Turtles all the way down?

          It's not the media the data is stored on that he's worried about, it's the format it's stored in. When we switch to binary encoding, bits don't have the nice fixed meaning that letters on a page do. Instead we assign meaning to large collections of bits. If we forget how we've done that, then we're unable to recover the meaning from the data.

          The fun thing about this is that he's talking about this now, when the issue has been well known in the IT world for quite some time. Even my dad, who isn't in the IT world, already knows about this. Why? Because the following things are no longer readable:

          His probability programs written in college, which are stored in a big-ass magnetic tape roll. We don't even know which format the files are in.

          His PhD thesis, which was written in either Aldus PageMaker 1.0, 2.0 or 3.0 and is stored in a lot of 3.5" floppies. And they're all in HFS Mac format. Extracting that data requires getting at least PageMaker 2.0, 3.0 and 4.0 to get them up to a point where we might extract that data into a Windows PageMaker version, a PPC Mac or a Snow Leopard-toting Intel Mac that would be able to run PageMaker.

          All my Commodore 64 programs and data.

          All the stuff we stored in Jasmine Removable 45 HDDs.

          All the stuff we stored in MDS88 Removable HDDs.

          All the stuff stored in iomega Jaz or ZIP drives. (Fortunately, I had a wee bit of foresight on this, so I managed to rescue most of my ZIP cartridge data before I was no longer able to read 'em. No such luck for my dad.)

          I'm pretty sure that anyone who got into the whole "computers" thingy back in the 80s like me has already lost something to the "digital dark age" by now.

          1. Dave 126 Silver badge

            Obligatory Iain Banks:

            "So," Ash said slowly. "Let me get this straight: you don't know the machine, but it's probably some ancient nameless Apple clone from the dark grey end of the market, almost certainly using reject chips; it probably had a production run that lasted until the first month's rent fell due on the shed the child-labourers were assembling them in, it used an eight-inch drive and ran what sounds like dodgy proprietorial software with more bugs than the Natural History Museum?"

            - The Crow Road, Iain Banks

          2. Charles 9

            Re: Turtles all the way down?

            Actually, nearly 20 years ago I was able to preserve a lot of C64 and C128 data I had by shuttling the data from the C128 to a nearby 486 using modems and a phone cable. Okay, it was slow and tedious at 1200bps using Xmodem, but at least it worked.

            As for the Mac HFS format, I recall there are Windows programs capable of reading them since around 1995.

          3. JamesTQuirk

            Re: Turtles all the way down?

            Noticed this today, So some are trying ..

            5D ‘Superman memory’ crystal could lead to unlimited lifetime data storage

            http://www.southampton.ac.uk/mediacentre/news/2013/jul/13_131.shtml

    3. cyber7
      Alert

      Re: Turtles all the way down?

      This has already been happening, to a lesser extent! I used to work for a company that re-mediated and dumped the data from vintage 9-track and 21-track tape reels, primarily for the oil industry. It's been a frequent occurrence that shooting seismic data in a location is now impossible due to new construction or political motivations, so companies are looking to re-process vintage data with new technologies. We frequently ran into issues with proprietary formats, undocumented fields, and some forms of obfuscation, forcing us to scour archives for format documentation, hoping it wasn't trashed.

      Fortunately, the generations of 30-70 years ago frequently used paper, though I had to decode my share of UNIX 8" floppy disks at times. A few generations from now, our descendants will not have that luxury given our tendency to keep everything digital, the push for cloud storage, and the plethora of disk formats generated over the last century alone.

      1. Anonymous Coward
        Anonymous Coward

        Re: Turtles all the way down?

        "I used to work for a company that re-mediated and dumped the data from vintage 9-track and 21-track tape reels, primarily for the oil industry"

        For a brief moment I feared jake had started using sock puppets. Then I realised you were describing a real world scenario of the kind jake likes to fantasise about. Sorry about that.

        1. Fibbles

          Re: Turtles all the way down?

          It can't be Jake, there's no mention of doing it whilst horseback riding through his vineyards.

          1. Trevor_Pott Gold badge

            Re: Turtles all the way down?

            "It can't be Jake, there's no mention of doing it whilst horseback riding through his vineyards."

            Clearly while en route to his handcrafted mahogany helicopter which he will use for his quarterly trip into the city to gather the few supplies he doesn't make himself artisanally on his massive plot of prime land.

  2. Joseph Eoff

    Not really much of a problem

    99.999999 (repeat 9s for as long as you care) percent of everything on the Internet is of no real relevance to anyone - not even those who post it (including this comment.)

    If YouTube were to go TITSUP permanently what would we lose? Nada. Cats will still be around so we can recreate all the cat videos at will.

    The crap posted to Facebook (and other related sites) won't be of any use in a couple of hundred years - it isn't even of any use NOW.

    Imagine some poor bastard who digs up a bunch of Facebook backups a thousand years from now. All the hard work and dedication needed to decode it (under the assumption it must be important,) and all he gets for it is the photos some assclown posted of his lunch.

    Your life isn't so damned interesting that you need to preserve it for posterity. Get over it (and yourself.)

    1. Roger Greenwood

      Re: Not really much of a problem

      Au contraire - social historians etc. will love the treasure trove - e.g. what drove the fascination with some kid called Bieber for a year or two at the beginning of the 21st century? (Thankfully now forgotten). Besides, their research grants will depend on it.

      1. Joseph Eoff

        Re: Not really much of a problem

        Actually, I suspect any social scientist who investigated the "Bieber incident" would lose all faith in himself and humanity and commit suicide - possibly together with the technicians who put so much effort into recovering the records only to find "The Bieber" and the previously mentioned lunch photos.

        1. Zot

          Re: Not really much of a problem

          The Bieber story will be told and re-told with more and more embellishment over the next thousand years, until all the rhetoric will be written down freshly from snippets of teenage writings and Lo, he'll appear to be a Jesus like figure - it's what happens, you know.

      2. Anonymous Coward
        Anonymous Coward

        Re: Not really much of a problem

        Given that midden heaps (trash dumps) are the source for much of what we know about past cultures, it's singularly appropriate that Facebook and its ilk will likely be used for the same.

      3. Mark 85

        Re: Not really much of a problem

        What you say is true. The biggest treasure trove for archaeologists is usually trash dumps. Other than stone, pottery, and some tools, much doesn't survive time. With the digital age, things are even more transitory than with paper.

        Nothing really is forever. Time moves on. Things disintegrate. If we could travel a thousand years into the future, it would be interesting to see what survived and what didn't. And how what was found was interpreted. I'm sure that many things will be glossed over as "being of religious or ceremonial use".

        There's a story I remember from decades ago about some archaeologists excavating a motel in the future and how they interpreted things. Hilarious, yet not far off the mark, IMO.

        1. DaiKiwi

          Re: Not really much of a problem

          The Toot 'n' Come In Motel, or some such. I remember reading it in a Reader's Digest 25+ years ago. I must look it up...

          Ok, I went & looked it up - It was an extract from David Macaulay's 'Motel of the Mysteries' (1979). Macaulay's book begins by noting that America was destroyed in 1985 when it was suddenly covered by a huge flood of junk mail, unleashed by an accidental reduction in postal rates, followed by the sudden fall of solid pollutants from the atmosphere, placing another layer over the buried country. Found it mentioned in a Locus magazine article about the Archaeology of the Future. Now to find the book - and some of the other stories mentioned in the article.

    2. Naselus

      Re: Not really much of a problem

      While stuff like Facebook isn't worth saving, there's still serious work which is worth keeping - some academic journals are now online only. And who knows, the digital archaeologists of 2250 might actually care about those photos of your lunch. The Roman man-in-the-street's view is considerably more valuable than the written histories we have simply because no-one at the time thought the plebs' lives were interesting or worth recording.

      1. Joseph Eoff

        Re: Not really much of a problem

        In the main, the "plebs'" lives aren't very interesting - not even of today's "plebs."

        How many life stories do you have to reconstruct to realize that people then are pretty much people now. They work, they eat, they sleep, they do stupid shit, they die. You do it, I do it, the rich do it, the famous do it, the brilliant do it, the stupid do it.

        Certainly there are things that need to be preserved - the background of historical events and agreements, records of scientific progress, and a zillion other things.

        We don't need perfectly preserved records of the lives of billions of people.

        1. Electron Shepherd

          Re: Not really much of a problem

          We don't need perfectly preserved records of the lives of billions of people.

          We as a society don't, but individual people do. How many people these days with a new-born baby will realise in 25 years time that they have no ability to look at any of their photos of their children growing up?

          1. Joseph Eoff

            Re: Not really much of a problem

            That is purely the stupidity of trusting your personal things to a corporation.

            Keep your pictures locally, and print the important ones - or do you really think you need every single picture you took of your kids spitting up?

          2. Anonymous Coward
            Anonymous Coward

            Re: Not really much of a problem

            And how many of them then think "You were a child, just like every human ever, so who cares, let's stream a film no-one will care about in 2 years time."

          3. Anonymous Coward
            Anonymous Coward

            Re: Not really much of a problem

            That's why pictures and videos of my kid are stored on three different hard drives, replaced/upgraded every 4 or 5 years, and backed up to MDisc BluRay media, with an unused BluRay drive kept in storage until the next major interface upgrade (from SATA to ...???)

            1. Dave 126 Silver badge

              Re: Not really much of a problem

              >That's why pictures and videos of my kid are stored on three different hard drives, replaced/upgraded every 4 or 5 years,

              That's a good start. There are still some issues that might affect you, especially if your images are in a compressed format such as JPG. A single bit error can be enough to trash a compressed image. True, if you spot an issue yourself, you can of course manually recover the image from the back-ups - but this can't be done automatically if the file system doesn't know that the file has been damaged. This is an issue that ZFS, amongst other file systems, addresses.

              http://openpreservation.org/system/files/Bit%20Rot_OPF_0.pdf

              http://arstechnica.com/information-technology/2014/01/bitrot-and-atomic-cows-inside-next-gen-filesystems/
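
A rough sketch of the bit-rot point above: a single flipped bit is invisible unless a checksum was recorded while the file was known to be good, and with that checksum a healthy copy can be picked from the back-ups automatically. This is a hand-rolled illustration of the principle, not how ZFS is implemented; the helper names `digest`, `flip_bit` and `pick_good_copy` are invented for the example.

```python
import hashlib
from typing import List, Optional


def digest(data: bytes) -> str:
    """SHA-256 hex digest, recorded while the data is known to be good."""
    return hashlib.sha256(data).hexdigest()


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted (simulated bit rot)."""
    damaged = bytearray(data)
    damaged[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(damaged)


def pick_good_copy(copies: List[bytes], recorded: str) -> Optional[bytes]:
    """Return the first copy whose digest matches the recorded one, else None."""
    for candidate in copies:
        if digest(candidate) == recorded:
            return candidate
    return None


if __name__ == "__main__":
    original = b"pretend these bytes are a JPEG"
    recorded = digest(original)

    drive_a = flip_bit(original, 37)  # one drive has silently rotted
    drive_b = original                # another drive is still intact

    print("drive A intact:", digest(drive_a) == recorded)
    print("repair possible:", pick_good_copy([drive_a, drive_b], recorded) is not None)
```

The three-hard-drive routine only helps if something like the recorded digest says which of the three copies to trust.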

        2. Doctor Syntax Silver badge

          Re: Not really much of a problem

          'In the main, the "plebs'" lives aren't very interesting - not even of today's "plebs."...Certainly there are things that need to be preserved - the back ground of historical events and agreements'

          Maybe not from your point of view. However as you're not the only person in this planet that's a fairly limited one. Social historians find these lives much more interesting than the doings of the political classes.

          To take one tiny area - how did the domestic textile industry of the West Riding evolve? Just how did the clothiers operate? How did this differ from the domestic industries of other areas such as the Cotswolds? How did it differ from the development of the metal-working industry of the Sheffield area? Manorial and parish records are remarkably unforthcoming about this aspect of their inhabitants. These are not trivial historical concerns; these trades launched parts of the industrial revolution and yet our understanding of them is quite limited.

        3. Michael Wojcik Silver badge

          Re: Not really much of a problem

          How many life stories do you have to reconstruct to realize that people then are pretty much people now.

          Ah. This sort of story never fails to bring out the sophomores in droves.

          "Digital Humanities" research is already turning up all sorts of important historical information using corpora of documents from the past couple millennia, which have been scanned, OCR'd, and corrected.

          An example I may have mentioned here before: I saw a presentation some years ago at MLA on identifying the time period of the modal shift in imitatio christi - when people went from saying "what did Jesus do?" as a moral touchstone to "what would Jesus do?". The presenter had done searches through a number of massive corpora of late-medieval and early-modern documents for variations of those phrases in Latin and vernaculars, and found good evidence that the shift occurred relatively rapidly in, if memory serves, the late sixteenth century.

          So what? Well, in the Christian-dominated European cultures of the era, the imitatio christi modal shift - which happened among the "organic intellectual" members of the working class, so yes, very much the plebs - is a hallmark of modernity. It happens when Christians no longer see the present day as equivalent to the historical moment of Jesus' life, but instead a new and different milieu. And that's just what "modern" means.

          That's a change in the historical episteme that we couldn't track without computer analysis of huge text corpora. The vast majority of content produced online every day may be dross as individual pieces, but in aggregate it can tell us a lot about ourselves.

          And anyone who looks at the actual research people are conducting with it now, instead of indulging in idle armchair speculation, would know that.
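
For readers wondering what that kind of corpus search looks like mechanically, here is a toy sketch. It assumes the corpus is already available as (year, text) pairs and only matches the English phrasings; the research described above searched Latin and vernacular variants across real scanned corpora, none of which is reproduced here. The function name, the 50-year bucket size and the sample sentences are all invented for illustration.

```python
import re
from collections import Counter

# Two phrasings to track; real work would need many spelling and language variants.
PATTERNS = {
    "did": re.compile(r"\bwhat\s+did\s+jesus\s+do\b", re.IGNORECASE),
    "would": re.compile(r"\bwhat\s+would\s+jesus\s+do\b", re.IGNORECASE),
}


def counts_by_period(docs, period=50):
    """Count each phrasing per time bucket.

    docs: iterable of (year, text) pairs.
    Returns {bucket_start_year: Counter({"did": n, "would": m})}.
    """
    buckets = {}
    for year, text in docs:
        start = year - year % period
        counter = buckets.setdefault(start, Counter())
        for label, pattern in PATTERNS.items():
            counter[label] += len(pattern.findall(text))
    return buckets


if __name__ == "__main__":
    # Invented sample sentences, not real historical documents.
    sample = [
        (1510, "Ask always: what did Jesus do?"),
        (1590, "Ask rather: what would Jesus do?"),
    ]
    for start, counter in sorted(counts_by_period(sample).items()):
        print(start, dict(counter))
```

Any single document says nothing; the signal only shows up when the counts are compared across periods.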

          1. JamesTQuirk

            Re: Not really much of a problem

            Armchair speculation ? Isn't that what Christians live on ? NO Proof, Just Believe, in a story made to create a religion that was the largest "business" this Planet has ever seen, until the population became educated, and realised what a pile of crap it is. How it's been perverted into a Control Mechanism for Governments & Arseholes who think "they" know all, 'cause, they say, a "voice in their head" said it was OK to rort people ...

            You can't quote fairy Tales as a basis for truth....

            1. Michael Wojcik Silver badge

              Re: Not really much of a problem

              You can't quote fairy Tales as a basis for truth....

              And no one in this thread has done so.

              If you have some free time, could I suggest you learn to read? And then maybe you'd like to study how to write.

      2. Robert Helpmann??
        Childcatcher

        Re: Not really much of a problem

        While stuff like Facebook isn't worth saving...

        Hold on just a moment there! Archaeologists have been digging through modern landfills for more than 40 years. One man's garbage is another man's research paper.

      3. John Sanders
        Holmes

        Re: Not really much of a problem

        If only someone at the time of the Romans had made a blog about Roman concrete mixing...

    3. gerryg

      Re: Not really much of a problem

      If YouTube were replaced then I agree with you; however, loss without replacement would negatively affect me. Albeit a bit niche: old videos of Jimi Hendrix concerts, interviews with him, Mitch Mitchell etc. on the one hand, Go tutorials on the other; then there are all those "how to replace that widget on your gadget" videos.

      Boring to most I'm sure, but they might have their own examples.

    4. bjr

      Re: Not really much of a problem

      One century's garbage is another century's treasure. That's literally true, the thing that archaeologists love most of all is garbage dumps and cesspits. Historians in the distant future will find Facebook fascinating. They will also love our primitive cat videos. Undoubtedly they will have some form of cat video which is as unimaginable to us as YouTube would have been to the Egyptians who built The Sphinx (the oldest known cat video).

    5. John Sanders
      Holmes

      Re: Not really much of a problem

      In the future, (if political correctness hasn't busted civilization yet) historians will find your post amusing.

    6. Anonymous Coward
      Anonymous Coward

      Re: Not really much of a problem

      I worked on US Census data in the 1980s, stored on magnetic tape, with an access program called CENSPAC, a derivative of COBOL. I suspect much of that demographic data (and back then, the Census asked a lot) is unreadable now.

    7. JamesTQuirk

      Re: Not really much of a problem

      or try to find it again @ ... Internet Archive: Digital Library of Free Books, Movies ...

      https://archive.org/index.php

      or if Cat vid's are your drug of choice, maybe ....

      WayBackMachine - 452 billion web pages saved over time

      https://archive.org/web/web.php

      I like here ....

      https://archive.org/details/software

      PS: I used 8", 5.25" (360K/1.2M), 3" and 3.5" (720K/1.44M) floppies, and HDDs of 5MB, 20MB, 40MB & up. I have files and HDDs from Amigas, old Macs, PCs, Atari STs (lookin' for a VM); even my Commodore 64 files are on HDD ....

      My point is the Data will move, just some buggers got to do it ....

  3. Captain Hogwash

    I don't see why he's so concerned.

    GCHQ/NSA will take care of it.

  4. Anonymous Coward
    Anonymous Coward

    GCHQ and NSA alternatives

    The GCHQ and NSA archives aren't open to the public.

    On the other hand, archive.org (the wayback machine) is accessible to the public for free (at the moment) and access just needs a browser and not much else. How hard can that be?

    Anything more obscure than that and there are a few computer museums, the classic computers mailing list, and so on.

    If Cerf had said we need to preserve *knowledge*, the kind of stuff that used to be in the non-fiction section in buildings called libraries, then he'd have had an excellent point.

  5. RyokuMas
    Joke

    Family tree...

    Is this so in a few decades time Google can fling ads at people based on their genealogy?

    I can just see it now - some little tyke with his brand new neural plugs chips into the Googletrix only to be confronted with "popular with your forefathers..."

  6. Androgynous Cupboard Silver badge

    That's what PDF/A is designed for

    If it's digital you're going to need hardware and software to decode it of course but once you get past that "bootstrap" problem PDF/A is designed to be completely self-contained.

  7. Peter Gathercole Silver badge

    It's hopeless

    We need a technology that can be abandoned and still be readable in future times.

    Any technological solution is bound to fail because maintaining it requires repeated investment in either maintaining what will become an obsolete storage format in the future, or repeatedly re-writing it as new media are invented.

    It's all very well suggesting that technology from people such as "Carnegie Mellon University and IBM Research" might be worth using, but this assumes a certain amount of continuity to maintain the physical storage that requires organisations to survive. You cannot rely on government or industry to still be around in the future, and the 'Cloud' (whatever is meant by that) needs to be maintained as well.

    You end up with stupid chicken-and-egg situations if the description of the programs and machines necessary to read the media is only stored on the media itself.

    I respect Vint Cerf. He's very influential. But he's not, in the grand scheme of things, an engineer (his degrees are in Mathematics, and he's managed various teams and companies mainly on data communication). Nowadays, he's good at the grand scheme thinking, not the detail.

    He was being interviewed on Radio 4 this morning, and I got the feeling that he was either dumbing down what he was saying for a non-technical audience, or that he did not fully understand various fundamentals of machine architecture and what would be necessary to maintain in order to run a program from a current generation of machines. I would hope that it was the former, but I was not convinced. When talking about the systems, he talked about taking a snapshot of the software "with a description of the machine it runs on", glossing over that the description would have to be incredibly detailed to capture all the nuances of machine architecture to allow a working machine to be reconstructed from that description.

    I would suspect strongly that it would already be nigh on impossible to reconstruct systems from people like DG, Prime or Tandem (amongst others) unless working physical instances exist.

    Trying to capture all of the operating characteristics of a complex modern processor like Power 8 or a Haswell and the associated support chipsets to allow it to be reimplemented in the future on architectures unimaginable at the moment would be a herculean task!

    Much better would be to ban the use of all proprietary closed file formats, and keep the definition of the open file formats in enough detail to reconstruct the data stored in those formats.

    But this does not alter the fact that there needs to be readable media maintained in perpetuity.

    1. a well wisher

      Re: It's hopeless

      And didn't he also say that they keep all their info on archival paper EVEN though it's all been scanned in?

  8. Pen-y-gors

    Easy!

    Very large slab of granite and a chisel, and just chip out the bits.

    Not sure about the cat videos though...

    1. fritsd

      zoopraxiscope for cat videos

      Better glue those slabs of granite on tight.. wouldn't want one to fly off and drop on someone's toe.

      http://en.wikipedia.org/wiki/Zoopraxiscope

  9. AbortRetryFail

    Not such a new concept

    People have been saying this for decades. I remember a New Scientist article in the late 1980's postulating that we were entering a new "Dark Ages" for the reason that hundreds of years from now historians would be unable to read most of our digital records.

    Also, Iain Banks' book "The Crow Road" centres on trying to read a diskette from an obsolete computer, and it was published in 1992.

    1. Michael Wojcik Silver badge

      Re: Not such a new concept

      Yeah, Cerf's way behind on this one. The ELO's Acid-Free Bits recommendations predate his "vellum" proposals by ten years, and as you note people have been complaining about losing data due to format and encoding obsolescence since at least the 1980s, and probably before. And as other posters have noted, there's an entire industry around recovering this stuff.

      Doesn't make his point less important, of course, but it's certainly not novel.

  10. Bob Wheeler
    Boffin

    Rinse and Repeat.

    What he's talking about is the 100 year archive.

    How to keep access to data - be that knowledge or cat videos - across the generations.

    http://www.theregister.co.uk/2011/06/04/snia_100_year_archive/
