Google's encrypted search casts shadow on web analytics

In adding SSL encryption to its primary search engine, Google isn't just protecting your traffic from anyone sniffing your network. It's also preventing third-party webmasters from tracking the search terms you used to find their sites. That may be a good thing for netizens intent on privacy lockdown. But for webmasters, it …

COMMENTS

This topic is closed for new posts.
  1. Anonymous Coward
    Thumb Up

    Good

    "And yet somehow, not knowing this visitor's specific search term is protecting their privacy? Please. The only thing it does is make the life of a web master a much bigger pain in the ass than it was before."

    Yeah, it isn't protecting their privacy at all, it's just.. erm.. protecting their privacy by not telling you the referer.

    This is only really a "problem" with Google search anyway (and possibly bing, I've never used it). Most of the other search engines I use (ixquick, yauba, scroogle etc) provide a referer that doesn't show the search terms at all.
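A minimal sketch of what is at stake for the destination site, assuming the search engine puts the terms in a `q` parameter and that the browser sends no referer at all when moving from an HTTPS page to a plain HTTP one. The function name and the example URL are illustrative, not taken from any analytics product.

```python
from typing import Optional
from urllib.parse import urlsplit, parse_qs


def search_terms_from_referer(referer: Optional[str]) -> Optional[str]:
    """Return the q= search terms carried by a referer URL, if there are any."""
    if not referer:
        # No referer header arrived (for example, the visitor came from an HTTPS page).
        return None
    params = parse_qs(urlsplit(referer).query)
    terms = params.get("q")
    return terms[0] if terms else None


# Plain HTTP search: the terms are readable by the site that receives the click.
print(search_terms_from_referer("http://www.google.com/search?q=web+design&hl=en"))
# Encrypted search landing on an HTTP site: nothing to read.
print(search_terms_from_referer(None))
```

A referer that names only the search engine, with no query string, gives the same `None` result, which matches the behaviour described for the other search engines above.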

    Privacy FTW :)

  2. Pawel 1
    Thumb Up

    Excellent

    Will make the job of analyzing various malware sites easier - remember that many of them will only send the payload if you came from Google, even going as far as checking whether the search was for plain keywords or for a URL, and discriminating between users on that basis.

  3. Anonymous Coward
    WTF?

    Oh really?

    Then why do I get a 'This page contains both secure and nonsecure items. Do you want to display the nonsecure items?' when I go to https://www.google.com.

    There is nothing evidently different on the page itself when you click Yes vs No, except that in the 'No' case it takes a little bit longer to load.

    This is possibly because with the 'No' option an additional get (https://www.google.com//compressiontest/gzip.html) is done before the page loads fully, while with the 'Yes' option the same get is done after the page has loaded.

    Also the get of https://www.google.com//compressiontest/gzip.html gives me a 'Done but with errors on page' which on double-click gives the following error message:

    Line; 71, Char: 50, Error: Access is denied, Code: 0, URL: http://www.google.co.uk

    And the privacy report tells me it has blocked a cookie for http://www.google.co.uk

    Is this because I'm *cringe* using IE 6/ XP or because of some goof up at the google end?

    1. Matt Bradley

      The answer is in the question

      "Also the get of https://www.google.com//compressiontest/gzip.html gives me a 'Done but with errors on page' which on double-click gives the following error message:

      Line; 71, Char: 50, Error: Access is denied, Code: 0, URL: http://www.google.co.uk

      And the privacy report tells me it has blocked a cookie for http://www.google.co.uk"

      Look at the protocol in the two URLs above - it is HTTP, not HTTP*S*

      Hence the

      "This page contains both secure and nonsecure items. Do you want to display the nonsecure items?"

      The above URLs are the "non-secure" items which you are preventing from loading by clicking no.

      So yes, Google has made a boo-boo. :)

    2. Steve Coffman

      Hmmm

      Works fine in IE8 - I get no prompt for secure/non-secure items... looks like it's your outdated browser that's having the issue...

  4. Anonymous Coward

    web analytics

    The future of the internet is encrypted transmissions. So what if it screws up web analytics.

    It won't be the end of the internet if they can't analyse how you got to their site.

  5. Anonymous Coward

    Yes, well.

    Other sites could plug ssl more. What google could do is check for http/https versions and refer to the https version if coming from google's https version. Easy does it.

    Only we then have to fess up to the fact even more that the ssl certificate infrastructure is horribly broken and not trustable (``verisign^Wsymantec only protects you from those they don't take money from''), and we may have to finally admit it and try and find a better solution. Oh deary deary us.

  6. Ian 35

    This was inevitable...

    ...once Phorm tried their product on the market. So perhaps all the e-commerce people who jumped up and down excitedly at the thought of what the Phorm offering might do for them should actually have thought about the unintended consequences.

  7. Stefing
    Grenade

    If...

    If Google gave every little girl in the world a pony, you could be sure Cade Metz would find a downside.

    1. Anonymous Coward

      Would...

      ...several million tons of pony shit be enough of a downside?

  8. Anonymous Coward
    Thumb Down

    oh bugger

    there goes my bonus

  9. Paul_Murphy

    What happens if EVERYONE uses SSL?

    Would the info be passed from one https site to another https site?

    If so then problem solved!

    The beta seems pretty fast I must say - haven't used it too much, but it seems to do the job.

    ttfn

  10. Anonymous Coward
    Thumb Up

    About time!

    Being a web-master I say screw myself. Why the hell should anyone know how we got there.

    1. CD001

      To be fair

      It's not normally webmasters who give a stuff about the analytics - we just make sure the damned thing is working properly. The only analytics I'm interested in are load times, the user's OS/Browser combination and any non HTTP 200 responses really.

      Anything else is marketing :)

    2. Anonymous Coward
      Flame

      Agreed

      As a web developer I have to say that the concept of analytics and SEO (that's another rant) is a right pain and something I hate. Yes, we design and build our websites to be friendly to text readers, and therefore search engine bots, but it should stop there. I don't want middle management types poncing into my office like they own the place, insisting we use completely irrelevant page titles because "it's good for google". Well bollocks to you, I'll tell you what's good for google: writing the damn page content properly and updating it on a regular basis with the Content Management System I went to great pains to implement for you, so you didn't have to bother me with your shit every 5 minutes and I could actually get on with my job of constructing websites, rather than pissing about with body text on the old ones.

      So now we can farm less data from our users? Well as far as I'm concerned, that's a good thing, maybe this company will learn to actually engage with its user-base rather than paying idiots to decide what it is they are thinking and how best to exploit that.

      Anon. for obvious reasons.

  11. Ole Juul

    Some winners, some losers

    To me this has only a little to do with privacy. Letting web sites know what you were searching on is a very bad idea because some of them will modify their content accordingly. As long as we have that kind of sleazy behaviour, it is a good idea to block referrers. It will help keep websites honest.

    On the other hand, I'm not sure that Google particularly cares how I feel about that. Since they will have this information and web masters won't, could they be planning to sell it? Either way, I win.

  12. Eponymous Cowherd
    Stop

    Bonus blocking

    In order to protect little miss Cowherd from an assortment of web nasties I have, among other measures, configured my firewall to block URLs that contain certain words.

    One of those words, I have discovered, also blocks Google Analytics.

    Bonus!

    Now the problem. If little miss Cowherd uses Google's encrypted search, then all of the query string will be encrypted and I won't be able to block based on what she is searching for (either deliberately or accidentally).

    So, probably going to have to block port (outgoing) 443 to her PC.

    1. Anonymous Coward
      Alert

      or...

      configure your network to use OpenDNS and block the resolution of the GoogleAnalytics domain (along with loads of other stuff - pron, adware, etc)

      works for me (and all machines on my home network) :-)

      1. Anonymous Coward
        Unhappy

        Re: Or...

        I agree. Blocking port 443 (outgoing) also blocks all encrypted forms for logins and views (banking, IRA self-service, medical sites, etc).

      2. Eponymous Cowherd

        The problem is...

        Even using OpenDNS, you can either block Google (entirely), or not. I want to filter on the query string, and you can't do that over SSL.

        So 443 gets the chop on the kiddies' computers.

        There isn't a lot that Little Miss Cowherd would need SSL for, and when she does she will have to use the main computer (under supervision).

        OpenDNS is a good suggestion though. Will certainly use it to block the pron. (It just can't be used to block searches (accidental, or otherwise) for pron using SSL Google.)
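The limitation discussed in this sub-thread can be sketched roughly as follows, assuming a filter that sits on the network path and, for HTTPS, can learn the hostname but not the path or query string. `BLOCKED_WORDS`, `visible_to_filter` and `should_block` are hypothetical names, and the word list is a placeholder.

```python
from urllib.parse import urlsplit, parse_qs

# Placeholder word list, purely for illustration.
BLOCKED_WORDS = {"badword"}


def visible_to_filter(url: str) -> str:
    """Return the part of a request URL that an on-path filter is assumed to see."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        # Assumption: only the hostname leaks; path and query are encrypted.
        return parts.hostname or ""
    return url


def should_block(url: str) -> bool:
    """Block when any word in the visible query string is on the list."""
    seen = visible_to_filter(url)
    words = set()
    for values in parse_qs(urlsplit(seen).query).values():
        for value in values:
            words.update(value.lower().split())
    return not words.isdisjoint(BLOCKED_WORDS)


print(should_block("http://www.google.com/search?q=some+badword"))
print(should_block("https://www.google.com/search?q=some+badword"))
```

Under these assumptions the filter can still make a whole-host decision for the HTTPS request, which is the all-or-nothing choice described above.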

    2. Anonymous Coward

      would that word be...

      anal ?

    3. Anonymous Coward
      WTF?

      Get a grip

      Why not just lock her up in a cupboard until she turns 18... nothing like having trust in your kids huh

      1. Eponymous Cowherd
        Terminator

        Funny isn't it....

        If you don't control your kid's Internet activity you get one group of twatspanners berating you. If you *do* control their Internet activity a completely different group of assholes has a pop at you.

        Just can't win.

        I am left wondering at what this particular AC was attempting to achieve with that post and why he/she/it is so bereft of anything remotely resembling a pair of bollocks that he/she/it felt the need to post AC in the first place.

  13. dct
    Black Helicopters

    unbiased comments?

    Clicky:"the only thing this does is make things a pain in the ass for webmasters"

    or rather

    Clicky:"the only thing this does is put me out of a job"

  14. marksalter

    The shadow of an arriving invoice...

    Of course the detail of what keywords the user used to reach a site is not lost; google has it, and I'm sure will be more than willing to start charging webmasters for access to it.

    It won't be long before google are adding their own detail to the link to allow tracking and enquiry by webmasters for a *fee*.

    Have fun...

    M

  15. This post has been deleted by its author

    1. This post has been deleted by its author

    2. Anonymous Coward
      FAIL

      RE: What?

      Good title, that's exactly what I said when I read your post.

      Either you have no idea how ssl works or I'm reading your post wrong.

      Please tell me you are actually analysing packets captured "off the wire" from another device on your network (or not even that; packets captured on the same computer should do it) and NOT just looking at the https google search query in your web browser's address bar?

      Because if you are, then words cannot describe the utter cataclysmic fail of comprehension you have on so many levels of the subject matter that you comment on.

  16. Number6

    Price Comparisons

    Does this mean that all those irritating sites that do no more than provide their own form of search for whatever you were looking for and get in the way of real results will get clobbered? If so, I'll be using https from now on, just to irritate them.

    1. max allan
      Thumb Up

      Ah yes!

      I really hope so. It will make searching so much better not to have to use google to find a search engine that gives more ads and less results, but you still need to read it through for a moment anyway before realising what toss it is.

    2. Captain Thyratron

      I hope so!

      I hate those rotten bastards! This could be the next best thing to leaving flaming bags of canid excreta on their doorsteps.

      I think this place could use a torches-and-pitchforks icon, come to think of it.

  17. DZ-Jay

    Referrer

    So other sites won't see what users searched for? Boo-hoo. This is useless for privacy anyway; everyone knows that Google is the first and worst offender in this regard, and they still have access to your information. Now they'll do so exclusively.

    -dZ.

  18. yoinkster
    Thumb Up

    2p

    everything should run on https, fuck the analytics companies.

    so the webmasters can't see all my traffic info - boohoo

    my isp can't spy on my traffic easily - boohoo

    the legalised mafias can't stalk me as I surf the web - boohoo.

  19. Anonymous Coward
    Thumb Up

    Good

    Google still has options to deliver search term data to the destination site. But of course, you can always just use HTTPS yourself:

    https://secure.grepular.com/Getting_The_Search_Term_From_Clicks_From_Google_SSL

  20. max allan

    Oh really?

    "we would have no idea what people were searching for to get to our site, which is arguably the #1 reason to run analytics in the first place"

    Yeah right.

    Why do you care what people were searching for?

    Answer : So you can tailor your site's apparent responses to get more hits that you don't deserve.

    So, this is a bad thing for consumers how?

    If sites stuck to the content they really have they might get less hits overall but they'd only get relevant hits rather than the huge number of mis-hits I find when googling.

  21. Robert Carnegie Silver badge

    Not necessarily intrusive, but

    If you want to know how I got to your web site and what I was looking for, then ask me while I'm there. I may be willing to tell you.

    I assume that unless you're not only up to no good but insane, you won't want to offer, or to attract people with, something that you can't provide. (Although Googling "{name of latest product} review" often indicates otherwise. I want a review, you don't have one, you still want my eyeballs...)

    Stealing search terms from my browser? I'd prefer you didn't do that, you creepy person.

  22. Havin_it

    One thing I'll miss

    Some sites that specialise in reference information use the referrer info to colour-code your search terms in the page they send, much as Google does for you if you go for the cached version. That can be a very handy time-saver on big pages.

    There you go - at least one purely altruistic webmaster activity that's been stamped in the balls by this change.

    PS @ "All sites should go SSL" - sod off. SSL adds overhead, and for a huge percentage of sites (and visitors thereof) there's absolutely no point to it. I agree that it'd be nice if all sites could have it as an option (and search engines could then direct like-for-like to SSL or non-SSL versions of the result), but asking all webmasters to shell out for a "trustworthy" certificate from one of the anointed sellers is bullshit. The trust that underpins SSL would be better entrusted to an NGO such as ICANN than brokered by a cartel of douchebags.

    1. Anonymous Coward
      Alert

      No need to shell-out

      Just create your own certificate. Remember that the certificate is used for TWO reasons - 1 to 'ensure' the identity of the server, and 2 to be able to encrypt traffic. We are only interested in the second reason here, so VeriSign (Symantec) need not be involved.

      I agree about the overhead though - massive decrease in throughput unless you go for SSL accelerator cards.

      1. The First Dave
        Boffin

        @AC

        I take it that you are perfectly happy to visit a site that thinks it needs SSL, but only has a self-signed cert? Pretty sure all modern browsers are going to warn you that this is dangerous.

  23. A J Stiles
    Happy

    Google Analytics

    is just one of many sites blocked by my nameserver anyway.

  24. Daniel Brandt

    Google doesn't know what you clicked on

    At the moment, Google does not see the links you click on from a Google results page. This is true with or without SSL. Hold your mouse over a link, and in your browser's status bar you will see the real URL of the target page.

    Compare this to Yahoo. Hold your mouse over a link that looks like the target page, and you will see a long ugly string that goes to rds.yahoo.com. Occasionally Google has done some redirects too, but as far as I know, only on a very limited "research" basis. Yahoo is a major culprit here -- for many years they've had at least one, and sometimes two, redirects on every link. I scraped Yahoo for about three years, and it was not much fun parsing out those redirects to get to the actual URL. Obviously, Yahoo and Google both know what search terms you used. But Yahoo also knows what you clicked on from their results page.

    One thing to watch out for is whether Google will be tempted to install redirects on their links on SSL pages. That way they'd know what you clicked on, and could sell this data to webmasters. I actually don't think they will do this because it would be too obvious. It would undermine the public relations value of the SSL option they just rolled out.

    1. Cardare Anbraxas
      Badgers

      Oh they did

      Before the current refit of Google search, hovering over the link showed the link you'd expect to go to, but the second you clicked it, it changed to a Google link, with the real link encoded in to the end and your browser would use that instead.

      They probably still can but the above doesn't work, possibly instead some strange javascript stuff in the background dynamically changing hrefs, but I'm too lazy to run Wireshark while Google searching to test.

      Interesting results to be found if you Wireshark while installing Google Chrome, and then doing anything with Google Chrome after install. Each install has a unique ID. IP addresses? Google don't need 'em when they have your Chrome install's unique ID. No idea if Chromium suffers with this, not tried.

    2. Michael B.

      Sorry to burst your bubble...

      ...but Google has placed an "onMouseDown" handler of the form "return clk(this.href,'','','res','8','','0CEIQFjAH')" on every search result and this clk function posts all of the parameters back to Google in a request for an image that returns a "204 No Content"

      So they absolutely know what you are clicking on, what number on the page was the result that you clicked on, and even the style of result, i.e. whether it is a video result.

      1. Steven Knox
        Boffin

        Not Absolutely

        "So they absolutely know what you are clicking on ,.."

        IF AND ONLY IF you are allowing their javascript to run...

    3. Leo Rampen
      Alert

      They do!

      You think google don't know what link you clicked on? They almost certainly do. It's pretty trivial to track link clicks by javascript, and google analytics will do it for you. Google used to use redirects for link tracking, but they probably realised people like to be able to copy links from the google search page, and the vast majority of people have JS turned on. There's a few people who block analytics or use noscript, but they're insignificant compared to the majority of people who allow JS.

      Google will not give up on knowing what link you clicked through to. That's a major part of their search algorithm improvement. They want as much data on you as possible.

    4. pauljohnson
      Alert

      actually does

      both ssl and normal versions use this type of redirect despite the fact they show the actual url in the status bar

      https://www.google.com/url?sa=t&source=web&ct=res&cd=2&ved=0CCYQFjA&url=http://en.wikipedia.org/wiki/Sex&rct=j&q=sex&ei=_Cf9S7WFJZSqnAOi3cSXCg&usg=AFQjCNHPSeEKdDPWJip-BQmed0EUGYBzGw
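To see what a redirect link of this shape hands back to the search engine, here is a small sketch that splits the URL quoted above into its parameters. Only the parameter names that appear in that URL are used; what fields such as `cd`, `ved` or `usg` mean is not stated in the thread, so the code does not interpret them.

```python
from urllib.parse import urlsplit, parse_qs

# The redirect link exactly as quoted in the comment above.
REDIRECT = (
    "https://www.google.com/url?sa=t&source=web&ct=res&cd=2&ved=0CCYQFjA"
    "&url=http://en.wikipedia.org/wiki/Sex&rct=j&q=sex"
    "&ei=_Cf9S7WFJZSqnAOi3cSXCg&usg=AFQjCNHPSeEKdDPWJip-BQmed0EUGYBzGw"
)


def decode_redirect(link: str) -> dict:
    """Return each query parameter of the redirect link with its first value."""
    params = parse_qs(urlsplit(link).query)
    return {name: values[0] for name, values in params.items()}


decoded = decode_redirect(REDIRECT)
print(decoded["url"])  # the page the visitor actually wanted
print(decoded["q"])    # the search terms that led there
```

Both the destination and the search terms travel in the same request, so whoever serves the redirect sees the pair together.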

  25. Anonymous Coward
    Stop

    SSL performance

    "I agree about the overhead though - massive decrease in throughput unless you go for SSL accelerator cards." ... on the server side, only. Heavy caching can reduce this even further. If you have once established an SSL connection to Google, there is no need to do the expensive Public Key session key exchange from this computer for the next couple of months.

    Search for "session cache" here:

    http://www.rtfm.com/openssl-examples/part2.pdf

    Actually, a big hard disk might be as effective as a crypto coprocessor, and more cost-effective. Also, powerful CPUs are cheap these days. Go for a couple of AMD chips if you need more than a trivial server.

  26. William 6

    ungreen?

    the extra power to do this encrypting at Google's end must add an overhead?

  27. Gilbo
    Grenade

    Alternatively...

    You could just disable / fudge your own HTTP referrer if you're using Firefox and RefControl:

    https://addons.mozilla.org/en-US/firefox/addon/953/

    Works a treat.

  28. Matt Bradley

    Simple solution

    It seems to me that the solution is simple. I'm sure it's one that Google has already thought of:

    In Google Webmaster Central for my site, I'd have some additional settings:

    "Encode referrer data in GET request" Yes/No

    "Variable Name for domain" (text input - default "dom")

    "Variable Name for query string" (text input - default "q")

    Then, when Google shows my links on the SERPs, instead of linking to:

    http://www.inventpartners.com/

    It would link to:

    http://www.inventpartners.com/?dom=google.com&q=web%20design

    ... and I can analyse that request string using analytics or logfile analysis.
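A sketch of the receiving side of this proposal, assuming the landing URL carries the two variables with the default names suggested above (`dom` and `q`). The function name is made up for illustration, and nothing here reflects an actual Google feature.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit, parse_qs


def referrer_from_landing_url(
    url: str, domain_var: str = "dom", query_var: str = "q"
) -> Tuple[Optional[str], Optional[str]]:
    """Pull the referring domain and search terms out of a landing URL."""
    params = parse_qs(urlsplit(url).query)
    domain = params.get(domain_var, [None])[0]
    terms = params.get(query_var, [None])[0]
    return domain, terms


print(referrer_from_landing_url(
    "http://www.inventpartners.com/?dom=google.com&q=web%20design"
))
print(referrer_from_landing_url("http://www.inventpartners.com/"))
```

Because the data sits in the request line rather than in the referer header, it would show up in ordinary access logs as well as in script-based analytics.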

    1. Steven Knox
      Thumb Down

      Wrong solution

      No, the solution is to leave it as it stands and stop inventing ways to spy on your customers.

  29. brycec
    Stop

    Google Power Grab

    You really think Google have put this massive extra load on their servers (with massive financial cost for the SSL acceleration to process tens of thousands of queries per second) for the sake of your privacy?

    Wake up. This move brilliantly consolidates their dominance of the web analytics market.

    That they have dressed this up as a beneficial move for consumers and you have bought it is tragic. If even El Reg readers think this is a good move then God help us.

    Google Analytics tracks you, as does their search engine. They know everything in your click stream as you progress from search terms on their site to external sites which load their analytics code.

    The reported comment from Google about them being in the same boat is ludicrous and disingenuous. Their JS code tracks you by cookie, irrespective of SSL. Simple IP and user agent time series analysis from their data silo would easily reveal your true offsite clickstream in any case, irrespective of cookies.

    The only change is that now they have stopped third parties being able to access this useful information, and therefore forced them to give up even more of your data by signing up to analytics on their site.

    The net effect is:

    - increased knowledge by Google Inc of your behaviour on general non-Google websites

    - increased revenue for SSL certificate providers

    Both of these are bad.

    Doh.

  30. Anonymous Coward
    Go

    costs...

    "...with massive financial cost for the SSL acceleration... "

    I venture to say that they will fund it from their petty cash. Just five million dollars will buy a truckload of SSL accelerator cards. And they will easily handle all SSL requests for this 20 billion dollar revenue corporation. Remember that only the first handshake needs to perform an expensive Public Key Operation. Everything else uses cheap symmetric ciphers. Google search traffic itself is much, much less data volume than other things like YouTube or BitTorrent.

  31. BuckBrinkley

    Who will use it ?

    People who take the extra step to use https are not in the demographic that I target. As long as Google doesn't make it easy like a big shiny button that says "Search securely" in place of the good ole' "I'm Feeling Lucky" button, it shouldn't affect my ability to understand what my search engine traffic is looking for.

    I don't think G is gonna make it easy. It costs more to serve. Not much but I hear they get a lot of traffic so it could add up fast.

  32. Anonymous Coward
    Grenade

    Gee us yir data...

    So G alone will have access to my search terms. Cuts Phorm etc out of the pic. No more ISP cashing in. He who controls the data....still, this is capitalism, they are there to turn a profit...we use their search...they own our search terms...every keypress...that's how it is...they're just making sure that *only* Google can have that data. Makes sound business sense to me.

  33. brycec

    Implementation Cost

    @jlocke, yes you're right that Google has pant loads of money. The primary reason for this change IMHO is to extend its reach yet further onto other websites (all those using alternative analytics solutions and who have so far resisted surrendering your page view data to the power of G).

    The SSL implementation cost is clearly significant in absolute terms, but perhaps not in relative terms. Google is very efficient at data center management and we can conclude that its search servers are pretty much near capacity. Google reportedly has in excess of 1.5 million "servers". This probably includes a variety of device classes (network devices such as load balancers etc).

    Let's group it all together as a big compute hardware cloud. SERPs are G's core business (without them it would have only a tiny fraction of its current power), but clearly only a fraction of those servers are dedicated to real time SERPs generation (the rest are doing maps, search indexing, crawling, hosted apps etc). Let's say it's 10% for the sake of argument. That's approx 150k nodes.

    Now if Google rolled out SSL across all search globally, even if the Google system architects optimize the cost of adding SSL acceleration to say a modest 10% increase in compute resource, you're looking at adding 15k new nodes.

    I guess the cost could only be in the low tens of millions then.
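The back-of-envelope estimate above can be written out as follows. The server count and the two 10% figures come from the post; the per-node cost is a purely hypothetical number, chosen only because it lands in the "low tens of millions" range the post mentions.

```python
# Figures quoted in the post.
TOTAL_SERVERS = 1_500_000   # "in excess of 1.5 million servers"
SERP_SHARE_PCT = 10         # share assumed to serve real-time search results
SSL_OVERHEAD_PCT = 10       # extra compute assumed for SSL

# Hypothetical figure, NOT from the post: cost of one extra node in US dollars.
COST_PER_NODE_USD = 2_000

serp_nodes = TOTAL_SERVERS * SERP_SHARE_PCT // 100
extra_nodes = serp_nodes * SSL_OVERHEAD_PCT // 100
total_cost_usd = extra_nodes * COST_PER_NODE_USD

print(f"search nodes:  {serp_nodes:,}")
print(f"extra nodes:   {extra_nodes:,}")
print(f"hardware cost: ${total_cost_usd:,}")
```

Every input here is a guess stacked on a guess, so the result is only as good as the 10% assumptions and the assumed node price.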

    Not loose change for most companies, but as you say not majorly significant for the king of text ads.

    I think Google's ambition is not to organize all the world's information, but more accurately, to capture everything every individual does on the web.

  34. Jay Hawk
    Megaphone

    Eff the anal ysers

    Good on you, Google. Go ahead and make SSL access the default. Web analytics firms are just bottom dwelling scavengers. Create a way to make money by offering something of value that doesn't depend entirely upon the works of others. Death to Clicky and their ilk. Go ride Microsoft's coat tails. You deserve each other.

    We don't depend in any way upon search terms to draw web visits. Yet we can barely keep up with demand. But then we sell something of value that people need, instead of pipe dreams. We aren't running a money grubbing scam. Good product at a fair price sells!

  35. Captain Thyratron

    It hoses web analytics? Wonderful!

    I can hardly bring myself to complain about this "problem", as I'm quite tired of websites that give a rat's ass how I got there. Screw the lot of them. The only people who care about that crap are marketroids and the sort of websites that warn you not to use the "back" button or they'll break. If Google had done this on purpose, I might send them flowers or something.

  36. E 2

    ssl is good

    period

  37. brycec

    SSL is good?

    @E2, yes, if you're Verisign. The people suggesting sites use self signed certs are delusional. Want a massive warning box every time a new user visits your site? Yes, that's great for user confidence. So better get the wallet out and hand over some more dosh to the American corporates.

    This whole move is a consolidation of power.

    Like everything G does, there's an upside and a downside. They're clever that way, but ultimately we are surrendering more and more clickstream and therefore power to them.

  38. brycec

    No Analytics = death

    @Captain Thyratron et al,

    Analytics are very useful for webmasters and the foundation for UI design on most popular websites that you use and enjoy every day. Without them, they would be harder to use as the designers wouldn't have basic user flow patterns to analyze in detail.

    This move will not reduce the net amount of analytics in any way, it merely grabs dramatically more analytics market share for Google.

    Shop sites especially really must use analytics or they don't do very well. By analogy, it would be like running a high street shop with a blindfold on - you have no idea what your customers are looking at or which in store presentations are working well etc.

    This move has a small benefit for consumers and a big one for Google.

  39. techjacker

    the one way webmasters will still be able to see search data

    is through Adwords. Paid search advertising on google will now be the only way that you will be able to see which search terms people used to reach your website.

    This is just another in a long series of moves to eliminate businesses being able to effectively use natural search and so by necessity start using google's paid alternative.

    Stop being so naive and realise that everything google does is motivated by money and trying to entrench their monopoly status.

  40. The_Police!
    FAIL

    Can I just say...

    ... Google and Privacy? Bwahahahahaahahaa
