Archive.org web trove hits FOUR HUNDRED BEEEELLION pages

The Internet Archive's "Wayback Machine" has announced that it has indexed four hundred billion web pages. The trove dates back to late 1996 and comprises at least fourteen petabytes, a figure we base on a 2012 declaration the archive hit 10 petabytes and a later post explaining that a fund-raising drive for another four …

COMMENTS

This topic is closed for new posts.

Monday 12th May 2014 05:12 GMT Paul J Turner

4 Billion pages...

Let's hope that they didn't use 32-bit unsigned index values or they may find themselves more way-back than they planned!

5 0
1. Monday 12th May 2014 06:46 GMT Annihilator
  
  Re: 4 Billion pages...
  
  Four hundred billion...
  
  4 0
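
The arithmetic behind the joke above can be checked directly. This is a minimal sketch, not anything the Internet Archive is known to use: the helper name `wrap_uint32` and the constant names are made up for illustration, and the page count is simply the four hundred billion figure from the article.

```python
# Hypothetical illustration: what a 32-bit unsigned counter would do
# with the page count quoted in the article.
UINT32_MAX = 2**32 - 1      # largest value a 32-bit unsigned integer can hold
PAGES = 400_000_000_000     # "four hundred billion" pages, per the article


def wrap_uint32(n: int) -> int:
    """Return n as a 32-bit unsigned counter would store it (modulo 2**32)."""
    return n & 0xFFFFFFFF


if __name__ == "__main__":
    print(PAGES > UINT32_MAX)    # True: the count does not fit in 32 bits
    print(wrap_uint32(PAGES))    # 568041472: the value after wrapping around
    print(PAGES.bit_length())    # 39: bits needed to hold the real count
```

So the count needs 39 bits, and a 32-bit index would have wrapped around 93 times on the way to four hundred billion.
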
Monday 12th May 2014 05:13 GMT MrT

Brilliant project!

Can't remember visiting El Reg back then, but like most people who had anything online in the early days, it's a blast visiting stuff that no longer exists (some of mine from '94 was still around for the earliest snapshot). I like the roll-back feature, showing how things have changed over the years.

2 0
Monday 12th May 2014 05:28 GMT T. F. M. Reader

Understatement of the year?

Who had more beer last night: me, The Reg, or the Wayback Machine? Their announcement says FOUR HUNDRED BEEELLION pages, or at least that's what I saw. Twice.

2 0
Monday 12th May 2014 06:52 GMT h3

I think I first read El Reg in either 1999 or very early 2000.

(Think it looked fairly similar to how it is now).

1 0
Monday 12th May 2014 08:04 GMT mix

I miss low res animated gifs

Ahh, those pre-flash days, twirly gif logos all over everyone's page because they were 'dynamic' and 'cool'...

* goes all misty eyed*

3 0
1. Monday 12th May 2014 09:41 GMT Annihilator
  
  Re: I miss low res animated gifs
  
  Not to mention the yellow and black striped sign with "UNDER CONSTRUCTION" emblazoned all over it.
  
  6 0
  1. Monday 12th May 2014 12:04 GMT WraithCadmus
    
    Re: I miss low res animated gifs
    
    If you fancy putting one into your next web project here's a selection
    
    And if like the commentards below you miss the <blink> tag then CSS3 can help.
    
    1 0
  2. Monday 12th May 2014 22:09 GMT MrT
    
    Under construction...
    
    ... don't forget little graphics of envelopes with "No Junk Mail" on them, right under a plaintext email address ... because that stopped spam in its tracks ;-)
    
    0 0
Monday 12th May 2014 08:28 GMT JDX

even shows how El Reg looked as a young vulture in the summer of '97

I prefer that version!

2 0
Monday 12th May 2014 08:33 GMT Zog_but_not_the_first

HMTL of yesteryear

</blink>

How we've missed you.

3 0
1. Monday 12th May 2014 11:28 GMT Anonymous Coward
  
  Re: HMTL of yesteryear
  
  "</blink>
  
  How we've missed you."
  
  You only miss it once it's gone. No really, no need to reimplement... please...
  
  2 0
Monday 12th May 2014 08:39 GMT SW10

Tip-top technical guidance from Ye Olde El Reg

"Use the back key on your browser to return to the previous page..."

You have to wonder who the target audience were!

0 0
Monday 12th May 2014 08:41 GMT Khaptain

Remash of old material

Ah ha, now we have proof that the El Reg hacks simply remash existing material.

Win X Rollout

El Reg hack replaces Win 98 for W8, changes one or two names, drops in the obligatory threat from company X who is "prepared" to move to an alternate OS et voila Bob's your auntie.

Bill Gates was already the world's richest man, for the 4th year running, with an estimated 51 billion - so we can easily remash that as well....

Interesting to see how nothing much changes.

2 1
Monday 12th May 2014 08:49 GMT Winkypop

Ahh the wayback machine

How many times have I used ye to retrieve old content long since deleted from our servers?

Praise be to the wayback machine!

3 0
Monday 12th May 2014 09:02 GMT Vociferous

It's a sad testament to the state of the world...

...that "Gates Owns Even More of Everything -- Official" is now the Good Old Days.

2 0
Monday 12th May 2014 10:03 GMT ProperDave

I had a friend show me an odd 'bug' in the Wayback Machine once - he bought a domain and set up his own personal website on it, which as it turned out had been owned, several years before his purchase, by a small foreign telecoms company.

The TelCo had a blanket denial robots.txt file which told all spiders to F- off, and because of this, the Wayback Machine would refuse to allow him to browse the historical snapshots of the domain during the time he owned it, despite indexing his site according to his web traffic logs.

I just shudder at what Wayback Machine holds on me - I can see my very first websites thanks to the history, back when I did terrible things like build websites in Lotus Word Pro (which was marginally less of a sin than building them in Word).

2 0
1. Monday 12th May 2014 10:17 GMT Anonymous Coward
  
  I just hit the same 'feature'. My domain goes back on there to 2000, yet I can't see it due to the robots.txt thing. I just changed my robots.txt to allow all, and it still refuses to show me the archived pages.
  
  So I don't understand how/why this works like this?
  
  0 0
  1. Monday 12th May 2014 11:19 GMT ProperDave
    
    Seems there's some discussion on it on the Archive.org forum;
    
    ( http://archive.org/post/406632/why-does-the-wayback-machine-pay-attention-to-robotstxt )
    
    Doesn't appear to be any sensible consensus on what they should do to fix this... but this is totally off-topic for this article. :)
    
    Pirate flag, as I've partially hijacked the topic! (we need a tangent icon).
    
    2 0
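
For readers unfamiliar with the "blanket denial" file described in this thread, here is a minimal sketch of how such rules are evaluated, using Python's standard `urllib.robotparser`. It only shows how a crawler would read the two kinds of file the commenters mention; it says nothing about how the Wayback Machine itself applies robots.txt to existing snapshots. The helper `is_blocked`, the user-agent string, and the example.com URL are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# The kind of file the thread describes: every crawler is told to stay out.
BLANKET_DENY = ["User-agent: *", "Disallow: /"]
# The "allow all" replacement: an empty Disallow permits everything.
ALLOW_ALL = ["User-agent: *", "Disallow:"]


def is_blocked(robots_lines, user_agent, url):
    """Return True if the given robots.txt lines forbid user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse rules from a list of lines, no network access
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    agent = "ia_archiver"                  # assumed crawler name, for illustration only
    url = "http://example.com/index.html"  # placeholder URL
    print(is_blocked(BLANKET_DENY, agent, url))  # True
    print(is_blocked(ALLOW_ALL, agent, url))     # False
```
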
Monday 12th May 2014 19:34 GMT ecofeco

Love the Wayback Machine!

After several PC upgrades, I eventually lost the files that were my first websites I had built back in the gay 90s. Some of them quite good!

Gone for ever I thought. Oh well.

Spent a few years trolling the Internet Archive thinking they might show one day. After many, many years, lo and behold! They did!

Now when I tell people I used to build websites back in the early days of dinosaurs, fire and stone wheel, I now have proof and you know what, they still look pretty good.

Here's one sample. Check the date. https://web.archive.org/web/20010406065812/http://www.worldtv.org/index.htm

Flame on. :)

0 0
1. This post has been deleted by its author
  1. Wednesday 14th May 2014 15:37 GMT ProperDave
    
    Re: Love the Wayback Machine!
    
    Check out http://www.fabricland.co.uk/ - it's like playing bad web design bingo!
    
    "New Page 1", Framesets, pointless gifs, horrific colours, marquees, table layouts, center aligned text, broken links, personal drawings/quotes unrelated to the site... the list is almost endless!
    
    ... I hope that site never changes... it's a fantastic example of everything bad *and* it's an actual live site! :o
    
    0 1
Tuesday 13th May 2014 00:34 GMT C. P. Cosgrove

Early logo

I wasn't a reader then, but I have to say that the early logo still looks quite stylish incorporating as it did both the 'R' and the vulture.

Yes, a definite 'like'

Chris Cosgrove

0 0
Tuesday 13th May 2014 08:10 GMT theotherbill

Old Copies of "As The Apple Turns"

appleturns.com was responsible for many a noser back in the day. Where is Jack Miller?

0 0

The Register Biting the hand that feeds IT

Situation Publishing

Copyright. All rights reserved © 1998–2024