T-Mobile JavaScript comment stripper breaks websites

Attempts by T-Mobile to speed up mobile data connections are breaking websites. The bug intermittently affects mobile device users and PC users on tethered connections. It is caused by "optimisations" to sites' JavaScript code made on the fly, in an attempt to reduce the amount of data received. Instead of stripping out …

COMMENTS

This topic is closed for new posts.


  1. Mike Judge
    FAIL

    Errm,...

    People aren't using Opera Mini?

    1. Anonymous Coward
      Pint

      That's right

      That's correct, people aren't using Opera Mini.

      1. Anonymous Coward
        FAIL

        Missed the point...

        I think you will find they are.. Or at least those clued up that don't want to suffer dog-slow mobile data....

        The point is, Opera Mini is a proxy, but it's one that's innovative and works spectacularly well.

        1. Miek

          You missed the point too ...

          Regardless of Opera's strengths for avoiding this problem, I still don't want to use Opera nor do countless other users.

          1. Sam Liddicott

            I think you are confused

            I think you are confusing Opera Mini with Opera - perhaps because they have similar names, or are made by the same company, or because they look a bit the same.

            Opera mini is fast web-browsing and low data charges.

            With that, who cares what browser it is. It's not like any of the other web-browsers are normal either.

            They all work pretty much the same on a mobile phone.

            1. DrXym

              @Sam

              Opera Mini is fast, but strictly speaking it is not a browser but a thin client: a server is browsing on behalf of the client and sending it page data.

              Also, there is the rather fundamental point that you are allowing Opera to browse on your behalf. They literally screenscrape everything you browse whether it is over http or https to feed to the client. They know where you're going, how long you stayed there, what links you followed, what sites you have accounts on, how often you check your email, your banks, how often you browse, where your focus falls most on a page. They're in the same business as Google, sucking up personal data to monetize you.

              On the flip side, it can be useful if you feel like browsing a site without revealing who you are. To the site, the request comes from opera, not your phone so it is hard to figure out who is making the request.

    2. DrXym

      I use Opera Mini

      But not for AJAX heavy sites. The thing tries its best but the reality is it's a thin client serving munged content from a server side renderer. There are limits on how it can behave. From playing around with a few ajax sites it appears to deliver snapshots of content in response to taps which is better than nothing but it wouldn't account for a lot of things sites could do. For example a site which has a ticker, or pops up glasspanes, or otherwise does out of band things isn't going to work too well with Opera Mini.

  2. PsychicMonkey
    FAIL

    I assume they have asked permission

    from the user to do this?

    It seems to me that if they want to save data by messing with the content then they should check that the user doesn't mind first?

    although to be fair, they shouldn't do it at all in my opinion. If I want to see something it's not up to them to decide if there are bits in there I don't need.

    1. The Mole

      copyright

      This does seem rather interesting as not only are they creating derivative works without permission, more to the point they will be actually stripping out the copyright notices from the files

      1. Colin Millar
        Boffin

        @ the mole - No they won't

        These are web pages being delivered - not pages intended for code redistribution - your copyright notice won't be in a JS comment, cos if it is, no-one will see it.

        1. Loyal Commenter Silver badge
          Stop

          @Colin Millar

          So, at the top of an included .js file, you wouldn't expect to see a comment stating that the file is under the GPL licence? If I could be arsed, I'm sure I could find several examples where this is exactly what you get.

          1. Colin Millar

            @ Loyal commenter

            As I said - these are web pages being requested and in this context potential code re-use is not an issue. If users do use source code direct from web pages delivered as content then it is the user's responsibility to ensure that they are complying with any IP provisions and in the absence of an assertion they cannot assume that any code is free from such.

            There is no obligation on ISPs to redistribute licence assertions which are included in sections that are, by design, not intended to be displayed as part of the response. Indeed the GPL could, if it wanted to, include such a provision, but it does not - and one can only assume it omits such a provision (easily deliverable) because the people who wrote it didn't want one. It is not as if code optimisers are exactly new.

          2. Anonymous Coward
            Mushroom

            Looks pretty bad for MIT licensed code

            "... deal in the Software without restriction, including without limitation the rights

            to use, copy, modify, merge, publish, distribute ..."

            "The above copyright notice and this permission notice shall be included in

            all copies or substantial portions of the Software."

            Release the hounds!

      2. Jean-Luc
        Thumb Down

        @copyright

        I thought lawyers weren't that popular around these parts. Granted, they may be helpful in this particular instance, maybe.

        But about their involvement the other 99.99999% of the time?

    2. Jdoe1

      I don't mind.

      Stripped data is data I don't have to pay for.

      1. PsychicMonkey
        Unhappy

        but then

        you are paying for the rest of the data for the broken site because the stripped data broke it.

        So you are paying for something that would work, but doesn't, because T-Mobile screwed with it.

  3. Anonymous Coward
    Boffin

    VPN?

    OK, so this bug should be browser independent. Work around it by first VPN tunneling through T-Mobile to another network?

    1. Anonymous Coward
      Flame

      Perhaps

      That might work if they don't strip those annoying "comment packets" from your VPN protocol too...

    2. Anonymous Coward
      Anonymous Coward

      Browsers

      It's not browser independent. You can get round it by setting your UA string to something completely bogus.

      "MySociety first publicised the issue last week."

      I first mentioned it on El Reg a while back. It's been going on for at least two years.

  4. Anonymous Coward
    Mushroom

    How is this not illegal interception...

    ... a la Phorm? How dare they tamper with, forge and falsify people's private communications? It doesn't matter whether the motive is advertising or "optimising", no carrier has the right to tamper with the data we ask them to convey any more than they have the right to change the words or voices in our phone calls.

    1. BristolBachelor Gold badge
      Coat

      Is that so?

      In that case, you will also want them to also transmit all the silence in calls to you too (taking more data and meaning higher call charges for you)?

      You will probably want them to turn off all the background noise removers, and echo cancellation that actually make mobile conversations so much better?

      Possibly you could also sue them because any frequencies above 4kHz (or 8kHz or higher depending on codec) are also stripped to reduce bandwidth requirements and lower transmission costs.

      If you want them to just be a dumb data carrier, I'd suggest using an encrypted VPN so that they don't do anything to what you send/receive.

  5. chuBb.
    FAIL

    Cus positive and negative lookahead/backs are so difficult...

    Regex fail

    if only: http://xkcd.com/208/

    1. Anonymous Coward
      FAIL

      @chuBb

      You're going to need more than regular expressions to parse something like JavaScript properly. Besides which, the code will probably be broken up into blocks as it traverses the network, and since regexps can't maintain state they'd work on each block individually. Not much use if your comment or string starts in one block and ends in another.

      Writing a simple stream parser to do this, on the other hand, would be a piece of piss.
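A minimal sketch of such a stream parser (in Python for brevity; this is an illustration, not anything T-Mobile is known to run). It is a character-by-character state machine, so it keeps its state between chunks and handles a comment or string split across network blocks. Simplifying assumptions: only // and /* */ comments and plain '…'/"…" string literals; JS regex literals and template strings are ignored.

```python
class JSCommentStripper:
    """Incremental JS comment stripper: feed() it chunks split anywhere.

    States: code, slash (saw "/"), line (// comment), block (/* comment),
    star (saw "*" inside a block comment), str (inside a string literal),
    esc (character after a backslash inside a string).
    """

    def __init__(self):
        self.state = "code"
        self.quote = ""

    def feed(self, chunk):
        out = []
        for c in chunk:
            s = self.state
            if s == "code":
                if c == "/":
                    self.state = "slash"        # might start a comment
                else:
                    out.append(c)
                    if c in "'\"":
                        self.state, self.quote = "str", c
            elif s == "slash":
                if c == "/":
                    self.state = "line"
                elif c == "*":
                    self.state = "block"
                else:                           # plain division slash
                    out.append("/")
                    out.append(c)
                    self.state = "code"
                    if c in "'\"":
                        self.state, self.quote = "str", c
            elif s == "line":
                if c == "\n":
                    out.append(c)
                    self.state = "code"
            elif s == "block":
                if c == "*":
                    self.state = "star"
            elif s == "star":
                if c == "/":
                    self.state = "code"
                elif c != "*":
                    self.state = "block"
            elif s == "str":
                out.append(c)
                if c == "\\":
                    self.state = "esc"
                elif c == self.quote:
                    self.state = "code"
            else:                               # esc: keep escaped character
                out.append(c)
                self.state = "str"
        return "".join(out)
```

Because all the context lives in two instance variables, a comment opened in one TCP segment and closed in the next is stripped correctly - exactly the case a per-block regex cannot handle.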

    2. Loyal Commenter Silver badge
      Boffin

      Probably in this case, yes

      Given that you would want to exclude the matches that occur within a string, which will be delimited with either single or double quotes (which JavaScript allows IIRC), possibly at some arbitrary point away from the match in question, the processing cost of the pattern match may be more than you expect. I seem to recall an example given in The Camel Book of a regex using lookaheads which would take longer than the lifetime of the universe to complete.
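The Camel Book anecdote is about catastrophic backtracking, which is easy to reproduce: nested quantifiers give the engine exponentially many ways to partition the input before it can report failure. A textbook pathological pattern (shown here against Python's re engine; not a claim about any pattern T-Mobile actually uses):

```python
import re
import time

# (a+)+b can never match "aaa…c", but a backtracking engine must try
# roughly 2**(n-1) ways of splitting the run of a's between the two
# + quantifiers before giving up.
pathological = re.compile(r"(a+)+b")

for n in (10, 16, 22):
    t0 = time.perf_counter()
    assert pathological.match("a" * n + "c") is None
    print(n, f"{time.perf_counter() - t0:.4f}s")  # time roughly doubles per extra 'a'
```

Lookahead-heavy patterns trying to decide "is this /* inside a string?" can hit the same wall, which is one more reason a one-pass scanner is the usual answer.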

      1. Loyal Commenter Silver badge

        ...also...

        ...you'd have to deal correctly with matched pairs of nested quotes, possibly escaped, possibly escaped for interpolation into other protocols (such as SQL escapes). The phrase non-trivial springs to mind.

      2. Eddie Edwards
        Facepalm

        Jesus

        Jesus, guys, read a book on lexical analysis. Stripping comments is a trivial O(N) process with one character of lookahead.

        1. Loyal Commenter Silver badge
          Boffin

          @Eddie

          Counter example:

          /* This is to be displayed to the user: */

          var sometext= "You 'might' see a comment like /* comment */ in the code.";

          Show me a trivial regex which simply removes the first comment, but not the one embedded in the string.

        2. Anonymous Coward
          Anonymous Coward

          @Eddie Edwards

          "Jesus, guys, read a book on lexical analysis. Stripping comments is a trivial O(N) process with one character of lookahead."

          Come on, show us then if its so easy. Don't forget about comments embedded in strings and gotchas like this:

          /* "hello */" "cruel" /* " world" */

          1. M Gale

            Last time I solved this...

            ...it was for a little homebrew interpreter based on Multi User Forth. Stack-based reverse polish is a lot easier to parse than conventional languages, plus your programs end up looking like Yoda made them (whether this is a bonus or not is up to your own preference).

            Anyway, the script file was converted into objects via two stages. In the first stage, I'd detect and objectify strings using a variable called in_string, which could be either a single-quote, double-quote or nil. Okay, so there was a little more complexity for detecting escaped quote characters within strings, but not all that much. If you're not in a string (if in_string is nil) and you come across a quote character, fill in_string with the character and continue. You're now in a string (and therefore code comment stripping should not apply) until you receive another matching, non-escaped, quote character. Detecting escapes is really simple stuff (at least compared to parsing and organising multiple nested IF statements, which was eventually solved using a depth counter and object-in-object-in-object recursion), but I wouldn't want to do T-Mobile's job for them for free.

            Simple enough algorithm, and could be applied to stripping comments from web pages "properly" in just one pass. That is unless the web page has some really funky way of using quotes within <script> blocks. As for my choice of writing a language interpreter in Ruby, well.. I did it because I could. That's my excuse.

            : main

            "Hello, World" "#test_channel" say

            time "The time is " swap concat "#test_channel" say

            2 2 + 4 = IF

            "I'm working fine." "#test_channel" say

            ELSE

            "Something's not quite right." "#test_channel" say

            THEN

            ;

            Yeah, I suppose writing a language interpreter in Ruby was a minor crime compared to the language itself.

        3. Michael Wojcik Silver badge

          Trivial indeed

          It's so simple, a push-down automaton could do it.

  6. Colin Millar
    Facepalm

    Oh do get back in your prams

    To those seeing illegal interceptions every time someone processes a file.

    Code comments are not content - you didn't request them and they were never intended to be delivered to you as part of the response - they are not a part of the data you asked to be conveyed.

    The only problem here is TMobile's inadequate knowledge of how to achieve what they wanted to do.

    They should really have left it alone - it is a matter for web site owners to optimise. If their pages are slow they will lose hits.

    1. Loyal Commenter Silver badge

      Code comments are not content

      But copyright notices are, especially if the code in question is under the GPL. If I remember rightly, removing the GPL copyright notice from a GPL copyrighted file violates the terms of the GPL licence. Given that someone receiving the page may well look at the source, think "that's a useful piece of code", not see a copyright notice, and appropriate it, I would have thought that in this situation, T-Mobile would be liable for the breach of the GPL. (IANAL)

      1. DrXym

        GPL

        A web site with some GPL'd JS is running as a program, not source code. So copyright messages are meaningless in what it delivers to a browser to execute. I also don't see how it would be the site's fault if something subsequently stripped off the notice if it left the site with the notice intact.

        I doubt even the FSF at their most obstinate could reasonably argue that it's a proxy's job to honour comments inside a piece of code or HTML. They might however recommend that if there is a proxy that a user be allowed to choose to circumvent it if they wish.

        A more serious question is what liabilities you open yourself up to if you run GPL'd JS on your site. After all you're not "linking" GPL'd code to other content of your site, but it is all being executed in the same JS runtime instance. Therefore you may inadvertently end up tainting your entire site this way. The answer therefore would be to avoid GPL'd JS like the plague. Not that there is much need to use it anyway since the major AJAX libs use another licence, probably for this reason.

        1. Anonymous Coward
          FAIL

          Licenses

          @DrXym - It may be all well and good arguing that it's not distributing source code, except that when I say "I'd like the source code" (as you definitely can with web services and the AGPL) they also strip it. And when you run "wget http://ajax.googleapis.com/ajax/libs/jquery/1.6.2/jquery.js" to fork it and make awesomeQuery 4.0 instead, they strip that too.

          So no, it's clearly not harmless from the perspective of the licenses

          1. DrXym

            Yes it is

            There is no obligation to deliver human-readable JS in your site even if it's under the GPL. At the point that a browser requests and executes it, it can reasonably be regarded as a "program" and munged into any form you like, e.g. through a minifier. The GPL requires you to release the "source", which is the input to the program. The site can provide a link to the source (i.e. the unminified code) somewhere else.

            Also, it's not the site's fault if they do honour the GPL and some bloody proxy screws it up. They've met their obligations and that's all that matters.

            As I said a more pressing issue is if you minified your JS with the GPL'd code, have you tainted your site. And even if the JS files are separate and running through the same JS instances, does that qualify as a program? What about if the JS comes from one domain and the site is on another. You can see how fraught this is. Which is why AJAX tools generally do not licence with the GPL or dual licence.

        2. Anonymous Coward
          Anonymous Coward

          @DrXym - Twit

          "A more serious question is what liabilities you open yourself up to if you run GPL'd JS on your site. After all you're not "linking" GPL'd code to other content of your site, but it is all being executed in the same JS runtime instance."

          Yeah cos the GPL is so contagious that it'll spread to the other JS. FFS next you'll be telling me I've tainted every process on my system because they are running on the same core as a GPL'd piece of code.

          The answer therefore would be to _READ UP ON LICENSES_ rather than believing that shit that gets spouted (about any license for that matter).

          "A web site with some GPL'd JS is running as a program, not source code." Yes, but how is that JS delivered to the browser? Does the server send a binary or the source? Ah yes.... the source. So if someone uses "View Source"?

          "I also don't see how it would be the site's fault if something subsequently stripped off the notice if it left the site with the notice intact."

          You're quite right, not the site's fault. However, it would be pretty shit if you did "view source", looked at some JS which carried no copyright notice (having been stripped by TMob), decided "hey, I can use it" (a pretty stupid choice IMO, but it happens) and then got shafted.

          Comments are there for a reason; they may be useless to a lot of users, but less so than a page that doesn't run because someone cocked up removing them.

    2. Anonymous Coward
      FAIL

      Code comments ARE content and I DID request them.

      I sent a request to www.foo.bar asking it for the contents of index.html, not to T-Mobile asking them for their summary of it. If that content includes comments, I wanted them; what I wanted is the *exact* response from the www.foo.bar server, and not a paraphrase from someone else who has no right to even be reading my communications. You can't change that just by using the euphemism "processes a file" to describe what's going on; I didn't ASK anyone to apply any process to a file, and particularly not some third-party who has no business doing anything with it whatsoever.

      (Half the jokes in the TDWTF's front-page articles are embedded in the source as html comments, it would ruin the fun if they were taken out.)

  7. Havin_it
    Holmes

    Nothing new

    I don't know what they're doing, but when viewing this very site via an O2 data connection, article pages frequently load with no content. So I assume their proxy is up to some similar ill-conceived jiggerypokery. Anyone else find this?

    1. Anonymous Coward
      Anonymous Coward

      Aye O2 are probably worse

      They even hack the included .js files straight into the webpage, so a JS-heavy site will not only be broken, but will actually load slower because all the potential for caching has been eliminated. Unbelievably retarded.

  8. Dave Murray
    FAIL

    T-Mobile has broken websites permanently

    Surely not. Clearing your browser cache once you're on a decent network again will remove the offending broken JS, magically fixing these permanently broken websites.

    So more like T-Mobile has broken websites temporarily until you clear your cache or it expires?

    1. Anonymous Coward
      Anonymous Coward

      For 99% of the population

      For the 99% of the population who've never even heard of a cache, let alone worked out how to clear it: yes. For the 1% of nerds whose first thought is "ooh, I'll clear the cache": no.

      I wouldn't be surprised if they're modifying the cache expiry dates whilst doing all this too. Lots of evil proxies have tried similar in the past.

  9. Robin Bradshaw
    Facepalm

    YAY!

    Now they are removing the massive overhead that comments add to JavaScript, I will have so much more bandwidth to watch YouTube videos and stream MP3s to my phone.

    Is this not a piss in the ocean?

  10. Anonymous Coward
    Anonymous Coward

    High latency, poor image quality and broken pages are the cost of all of this pissing about

    If T-Mobile didn't compress images to the point where they look awful on a modern smartphone with a decent display, and didn't try this type of stuff, we might actually see decent latency:

    Just now, on T-Mobile, HSDPA full signal strength:

    --- www.bbc.net.uk ping statistics ---

    3 packets transmitted, 3 packets received, 0% packet loss, time 2797ms

    rtt min/avg/max/mdev = 421.298/489.761/539.252/49.991 ms

    1. Anonymous Coward
      FAIL

      Missing the picture?

      Since when did ICMP echo request/replies go via HTTP proxies, transparent or otherwise?

  11. Stewart Atkins

    Mobile HTTP Compression

    I was under the impression that all the UK mobile ISPs did this (granted, T-Mobile screwed it up) - compressing JavaScript/CSS down, dropping it into the page inline, compressing images (and adding a JS file to your pages so that you can say 'gimme the original version of this' if you need to) and so on. While it does save some bandwidth, the speed-constrained part of the transfer will be completed faster, thus improving the end-user experience.

    At least this was the experience I had with my O2 3G cellular modem dongle. I wouldn't even have noticed that they were fudging with things, except that on a desktop some of the graphics look visibly lower quality, and there's a tooltip on them saying press (key combination) to load the original.

    When done correctly, it's a useful feature, especially when you have a particularly weak signal, as long as you aren't trying to do code debugging etc.

    1. Anonymous Coward
      FAIL

      um, nope

      I used to have this problem on AT&T's network. I'd download an image from my website and then diff it with a copy on my local machine, and while they were supposed to be the same, and the timestamp in the HTML was the same, the saved file was always smaller. Turns out one of AT&T's updates to the Windows client enabled automatic compression.

      I don't see why AT&T wants to buy T-Mobile, they know how to fuck up already.

  12. Peter Galbavy
    FAIL

    one2none

    They should have stuck to their old name.

  13. PassiveSmoking
    Thumb Down

    T-Mobile are the worst network for smartphones

    Not content with attempting to slice your 3 gig cap to 500 megabytes unilaterally with no warning or compensation, and not content with applying such utterly overaggressive JPEG compression that images become almost unviewable, they now break page code too?

    Anyone got any recommendations regarding phone providers? I'm not renewing my T-Mobile contract.

    1. Anonymous Coward
      Anonymous Coward

      3

      Don't know about smartphones, but for dongles I prefer 3. T-Mob mess with and compress images, Vodafone are expensive. 3 do block access to mail servers unless you use an alternative port; Vodafone don't, but seem to block content on innocuous sites too often. Logging into a 3 account from Linux is easier than the others as it doesn't require user IDs and passwords.

      1. Luke McCarthy

        3 mail

        I have no problem connecting to my IMAP/SMTP servers on the default ports on 3.


