Welcome to Ubuntu 18.04: Make yourself at GNOME. Cup of data-slurping dispute, anyone?

Ubuntu 18.04, launched last month, included a new Welcome application that runs the first time you boot into your new install. The Welcome app does several things, including offering to opt you out of Canonical's new data collection tool. Ever since Edward Snowden confirmed so many once outlandish conspiracy theories, the …

  1. Steve Davies 3 Silver badge

    Welcome to the 'new' Canonical

    aka the 'Microsoft of the Linux World'.

    sorry, No, just No.

    Linux and BSD were once the only places you could go to avoid the OS Snooping. No longer.

    Memo to self, block all Ubuntu and Canonical Domains in home firewall.

    1. Teiwaz

      Re: Welcome to the 'new' Canonical

      aka the 'Microsoft of the Linux World'.

      sorry, No, just No.

      Linux and BSD were once the only places you could go to avoid the OS Snooping. No longer.

      Memo to self, block all Ubuntu and Canonical Domains in home firewall.

      If thine own hand offends thee, cut off your own head with a spoon and nuke the vegetable garden.

    2. viscountstyx

      Re: Welcome to the 'new' Canonical

      I think you need to run wireshark and look at the actual traffic on a minimal install ubuntu. I think you'll be surprised.

      1. Anonymous Coward

        Re: Welcome to the 'new' Canonical

        "I think you need to run wireshark and look at the actual traffic on a minimal install ubuntu. I think you'll be surprised."

        It's not going to be very minimal if you've got Wireshark on it. Perhaps you meant tcpdump? Anyway, I've just done a Bionic minimal - https://help.ubuntu.com/community/Installation/MinimalCD - install and there is no sign of any data slurping.

        1. JohnFen

          Re: Welcome to the 'new' Canonical

          "It's not going to be very minimal if you've got Wireshark on it."

          I don't think he said to run Wireshark on the Ubuntu machine.

    3. FIA Silver badge

      Re: Welcome to the 'new' Canonical

      Linux and BSD were once the only places you could go to avoid the OS Snooping. No longer.

      I’m pretty sure Debian has had data collection for many years.

      1. JohnFen

        Re: Welcome to the 'new' Canonical

        "I’m pretty sure Debian has had data collection for many years."

        Yes, but it's opt-in, therefore you must actively choose to participate.

  2. karlkarl Silver badge

    What will they do with the data?

    Remember, Canonical's Ubuntu is simply a Linux distro, whereas kernel.org is where the drivers and compatibility are improved. So what will Canonical do with the data? Email it to kernel.org with suggestions on which drivers to implement first? I am pretty sure the guys at kernel.org will say "shut up and get in line. We work on the drivers that interest us, not you".

    So what else is Canonical going to do with the data? Other than sell it of course. No, this is a gateway onto more invasive data collection because they are jealous of Microsoft, Apple, Google and all those other fscks.

    1. LosD

      Re: What will they do with the data?

      You DO realize that a distro is much (MUCH) more than the kernel, right?

      You DO realize that Ubuntu developers contribute to the kernel, right?

      You DO realize that a lot of userspace tools interact with hardware, right?

      1. James Hughes 1

        Re: What will they do with the data?

        No he doesn't realise.

        Also doesn't realise that you don't need to be working on upstream to write drivers. It's open source! Everyone can write a driver, and many people do. And then get them upstreamed.

  3. Frumious Bandersnatch

    ``because if you uninstall rather than opt out, [...]

    Canonical never knows you opted out and you've lost your chance to let the Ubuntu-maker know you didn't like the data collection.''

    I'm sorry. Can you explain that? You seem to be saying that if you opt out, a message is sent to Canonical saying that you have opted out.

    So either your reporting/logic here is wrong, or you are saying that the package is reporting your opt-out status to Canonical, despite you clicking the box that says you don't want to share anything.

    Which is it?

    1. David Nash Silver badge

      Re: ``because if you uninstall rather than opt out, [...]

      "Which is it?"

      Read the article properly. He's saying that if you UNINSTALL as apparently recommended in these YT vids, that is not OPTING OUT, it's just removing the s/w. So a message is not sent to Canonical and consequently they don't know that you've effectively opted out.

  4. Guus Leeuw

    Dear Sir,

    There's a typo around "date of the hardware"... probably wanted to type data there instead of date...

    "... the server doesn't even record the IP it's sent from ... ": How do you *know*? Somebody said so? Or did you actually see what the server is doing? If the server is at all logging access requests, they are very likely also logging the Client IP address. The log entry will have some form of timestamp as well. Do they know when your data record was stored in the DB? If so, GDPR applies, because well all of a sudden they can link the DB record to your IP address, and IP addresses are PII...

    Best regards,

    Guus

    1. Mark 75

      Ok I'll give you my IP address. Let's see you identify me from it....

      1. Camilla Smythe

        Dude

        I would give you mine but you would be able to identify me from it.

        Your point was?

        1. Peter Gathercole Silver badge

          Re: Dude @Camilla

          Most people have dynamically allocated IP addresses provided by their ISP. The ISP can identify the account from the IP address and the time, but whether the IP address is enough for the ISP and everyone else probably depends on how long the lease time is for the dynamic IP address.

          But even the account owner name does not definitely identify the user by itself, unless only one person uses it. For example, during the week I stay in a shared flat with four other people, and the broadband account is in the landlord's name.

          Of course, if you pay for a static IP, then yes, it is likely that you will be easier to identify, and of course by combining the IP address with other information (like the cookies in your browser, and whether you're logged in to a Firefox or Google account) many more things can be found out about you (I'm pretty sure Firefox ties together multiple devices I use by profiling the usage pattern, even though I don't enable the sync feature).

          Expect this last behavior to increase as time goes by.

          1. Camilla Smythe

            Re: Dude @Camilla @Peter @Guus

            As you surmise I have a fixed IP address linked to a registered domain and it would be a maximum of a two step process to find out who I am.

            Anyone who thinks they cannot be identified via their IP address when it is associated with the other breadcrumbs that are Hoovered by a Kirby on Turbo out of their 'Improved Browsing Experience' is being silly.

            I guess we just trust to Ubuntu not to store IP addresses, some or anywhere as part of the process, or maybe they could file an RFC to propose a method whereby such data might be transmitted to their servers without including such information in the communication.

        2. Anonymous Coward

          Re: Dude

          The point was very few people waste money paying for a fixed IP. Most people have randomly assigned IP's that change every few days.

          1. JohnFen

            Re: Dude

            "Most people have randomly assigned IP's that change every few days."

            That may depend on your ISP. Mine is Comcast, and I have a dynamic IP address. It hasn't changed for a year now.

            1. Ken Hagan Gold badge

              Re: Dude

              If you have an always-on connection (like, not dial-up) then the only reason for an ISP to change your IP address every few days is because they get a kick out of updating tables. I think most DHCP servers default to letting you stay on the same address when you come to renewing the lease. It's no less efficient and certainly less effort.
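
Editor's sketch of the renewal behaviour described above, using a toy lease table. This is not how any particular ISP or DHCP server is implemented; the `LeaseTable` class, its method names, the client identifiers and the addresses are all hypothetical. It only illustrates that a server keyed on a client identifier naturally hands back the same address when that client renews.

```python
from typing import Dict, List, Optional


class LeaseTable:
    """Toy model of a DHCP-style address pool (hypothetical, illustration only)."""

    def __init__(self, pool: List[str]) -> None:
        self.free: List[str] = list(pool)   # addresses not yet handed out
        self.leases: Dict[str, str] = {}    # client id -> leased address

    def request(self, client_id: str) -> Optional[str]:
        # A known client renewing its lease keeps the address it already has.
        if client_id in self.leases:
            return self.leases[client_id]
        # A new client takes the next free address, if any remain.
        if not self.free:
            return None
        address = self.free.pop(0)
        self.leases[client_id] = address
        return address


table = LeaseTable(["203.0.113.10", "203.0.113.11"])
first = table.request("modem-a")
table.request("modem-b")
renewed = table.request("modem-a")  # same address as `first`
```

Changing the address on renewal would take extra bookkeeping in this model, which is the point the comment makes about effort.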

              1. Anonymous Coward

                Re: Dude

                "then the only reason for an ISP to change your IP address every few days is because they get a kick out of updating tables. "

                Or because they charge $15 for each static IP and force a new IP address assignment on everyone else just often enough to make it worth paying the extra each month.

                1. FrankAlphaXII

                  Re: Dude

                  Sounds like CableOne, AT&T uVerse, and Spectrum (Or as I call them, CableNone, American Theft and Thoughtlessness and Speculum). They'll charge 15 to 30, if not require you to pay out the ass for "business class" internet service if you want a static IP.

                  1. Anonymous Coward

                    Re: Dude

                    By all the downvotes I can assume that BT & Sky, my current and last providers, are unique among the world.

                    They charge for static IP and my dynamic IP changes every few days, which is why I get "you are logging in from a new IP address" warnings on a few websites with alarming consistency, across both providers.

                    Pray tell where in the UK I can get a free static address from?

          2. Hstubbe

            Re: Dude

            I've only gotten my ip changed once in the past 15 years. I moved to a different town.

            Static ip's are pretty standard these days, dynamic ip's are silly with 24/7 internet connections.

            1. Updraft102

              Re: Dude

              Static ip's are pretty standard these days, dynamic ip's are silly with 24/7 internet connections.

              They are? I've never had one from any of my ISPs, ever.

      2. MJB7

        Re: "IP address is PII"

        No it isn't - but PII is an American term. The GDPR term is "PD" - "Personal Data", and an IP address absolutely *is* PD. GDPR is much wider than American rules (there's a surprise).

    2. Anonymous Coward

      the server doesn't even record the IP it's sent from

      Inevitably the IP address will need to be a part of the transaction that sends the data to Canonical, but presumably what they mean is that they store only the information that they have said they store, and discard the IP address and anything else that was an "incidental" part of the data transfer?

      Personally, I think Canonical's choices are reasonable, but I would certainly agree that they should (probably, legally must) alert users if they wish to collect any additional data (and it would be reasonable to only do so once, whenever a new LTS release is made, as people would understandably and reasonably be annoyed at any more frequent requests).

      On the other hand, you have Firefox, who are unfortunately somewhat vague as to what exactly they would like to collect "data _such as_..." (which does not form a closed list), and therefore I'm afraid I always turn that telemetry off. If I could be absolutely certain that no identifying data was sent, I would be more sympathetic; I do understand how telemetry data can be useful (and I mean genuinely useful for debugging and development purposes, definitely not for Teh Evil Spamming).

  5. Starace

    I've opted out of 18.04

    Tried using both the server and desktop versions for stuff, gave up on both and went back to the previous LTS.

    All basically because of the combination of stupid feature decisions and because some basic stuff just flat out doesn't work.

    Far too much technical wankery change just for the sake of it breaking stuff, and some really stupid basic faults. I really have serious doubts about how much of this stuff was actually tested or used in anger before release vs. just pissing about feeling smug about a new shiny.

    Not that this is exactly uncommon with some of the big projects but this is the first time I've felt compelled to burn out the mess rather than persevere.

    Wake me up when the latest theological war is over and some sanity has returned.

    1. onefang

      Re: I've opted out of 18.04

      I opted out of Ubuntu some time ago and went upstream to Debian. Though once Devuan ASCII is fully released (any day now), I'll switch to that.

    2. AJ MacLeod

      Re: I've opted out of 18.04

      If you know exactly what you want (and don't want) then Gentoo is likely a good home for you. It's the only workable way to get pretty much exactly the distro you want, whatever that may be. There are other options that get you kind of close-ish to what you want; and some that get you exactly what you want but are a nightmare to maintain long-term.

      I've personally found it just a bit too much hard work to maintain on servers but on my desktop I couldn't live with anything else.

  6. Anonymous Coward

    "Perhaps, if GNOME started gathering some basic data on a larger scale about how people use GNOME the project would make different decisions."

    Gnome does what RedHat says

    1. JohnFen

      Not to mention, after observing other major software projects that rely primarily on telemetry to inform their decisions, relying on telemetry to make design decisions seems to ensure that your software is never going to be better than average, at best. And probably not even that good.

      1. doublelayer Silver badge

        I can't figure out exactly what Ubuntu is going to do with the data they have. We all know what that data looks like; it's a list of pretty much all the Intel and AMD processors released in the last eight years with quite a few from before that. The RAM table: 512mb, 1gb, 2gb, 4gb, 6gb, 8gb, 12gb, 16gb, 24gb, 32gb, 48gb. I'm sure it'll be fun to see how many people are running it on something really old (they would see an Intel Core 2 Duo P8600 for an old backup machine from me if I wasn't still on 16.04), but how is that going to help them? They could go to a lot more effort to figure out what users want by involving them directly.

        1. onefang

          I don't think it's just a list of tech used that's important to them, but what's popular. If only a very tiny fraction uses a particular CPU that has only recently been revealed to have a certain bug, it becomes a very low priority to get a fix for that pushed out as an update. If the great majority of users have 4GB or more, not much point working on squeezing things into 2GB. It's all about setting priorities based on what sort of equipment the bulk of their users use.

        2. thames

          @doublelayer - They'll use the data to decide what ought to be the defaults for the next release. They will be making decisions based on actual data rather than someone's wild guesses. A major problem has been that developers often assume that the sort of hardware they have on their desks is typical of what everyone else has.

          In the past they've had to make decisions on things such as "should the default install disk be CD sized so that it will work with PCs which have CD drives but not DVD drives, or should it be DVD sized so that the user is less dependent on having network access at the time of installation to install stuff that wouldn't fit on the CD?".

          They've also had to worry about things like graphics support, what CPU optimisations to compile in as default (some packages have optional libraries for older CPUs), etc.

          Apple know exactly what hardware they ship. Microsoft can simply assume that the non-Apple PC market is the same as the Windows market. Linux distros can't make these assumptions so they either just pull numbers out of the air, use opt-in surveys which are usually wildly unrepresentative of the user base, or do something like this.

          Before this they had a detailed opt-in hardware data survey which so few people bothered with that it was pretty much useless. The new one collects far less information, but does so from a sample which will likely be representative of the overall user base.

          1. Anonymous Coward

            > They'll use the data to decide what ought to be the defaults for the next release. They will be making decisions based on actual data rather than someone's wild guesses.

            Really hoping they're not this stupid. Kind of suspecting they will be though.

            The reason it's stupid, is because data like this is extremely easy to game.

            As a random example, let's say you're a manufacturer that has a line of custom Linux laptops. Want really good support added to them for nearly no cost? Well then, send in ten or twenty thousand entries for your stuff, randomising things to look legit and using fake source IP info. Make sure the entries are done over time too, so there's no obvious faking attempt.

            That's the kind of thing that can be scripted and put into play in just a few hours, and will completely skew stats on what Canonical should be targeting.

            And there are likely people/places out there who will do this. Some of them just for the hell of it. Some of them because they just don't like Canonical or compete with them. Either way, the data is way too easy to game and shouldn't be used for business decisions.
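
Editor's sketch of the arithmetic behind the skew argument above. Every figure here is made up for illustration (the comment only mentions "ten or twenty thousand entries"); nothing is known about the real number of reports Canonical receives, and "laptop X" is a hypothetical model.

```python
def share(matching: int, total: int) -> float:
    """Fraction of reports that mention a given piece of hardware."""
    return matching / total


# All figures below are invented for illustration.
real_reports = 100_000
real_matching = 2_000    # 2% of genuine reports mention "laptop X"
fake_reports = 20_000    # forged entries, all claiming "laptop X"

before = share(real_matching, real_reports)
after = share(real_matching + fake_reports, real_reports + fake_reports)
print(f"before: {before:.1%}, after: {after:.1%}")  # before: 2.0%, after: 18.3%
```

Under these assumed numbers a niche model jumps from 2% to roughly 18% of the sample, which is the kind of distortion the comment is warning about.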

            1. thames

              @AC said: "As a random example, let's say you're a manufacturer that has a line of custom Linux laptops. Want really good support added to them for nearly no cost? Well then, send in ten or twenty thousand entries for your stuff, randomising things to look legit and using fake source IP info."

              Or just send an email to Canonical telling them that you are a manufacturer who is planning on coming out with a line of custom Linux laptops and that you would like them to work with Ubuntu out of the box on launch. Then ask them if their developers would like some free laptops. They're happy to work with anyone who wants to support Linux.

              However, just have a look at the type of information being collected. According to the story it just amounts to the following:

              • Ubuntu Version.
              • BIOS version.
              • CPU.
              • GPU.
              • Amount of RAM.
              • Partitions (I assume that is number and size of disk partitions).
              • Screen resolution and frequency, and number of screens.
              • Whether you auto log in.
              • Whether you use live kernel patching.
              • Type of desktop (e.g. Gnome, Mate, etc.).
              • Whether you use X11 or Wayland.
              • Timezone.
              • Type of install media.
              • Whether you automatically downloaded updates after installation.
              • Language.
              • Whether you used the minimal install.
              • Whether you used any proprietary add-ons.

              There are basically two types of information there. One is some basic parameters such as RAM, CPU, GPU, hard drive size, etc. That tells you what you should be targeting in terms of hardware resources, and so whether your desktop (e.g. Gnome) is getting too fat for the average user (as opposed to the average complainer, at which point you are far too late to be addressing the issue).

              The other is what install options people changed compared to the default install. If most people don't pick live kernel patching, then you know not to make that option the default. If a lot of people are selecting Urdu as the language, then you might want to make sure that language has better default support. Etc.

              Ubuntu will publish this information publicly. Personally I am looking forward to the RAM and CPU type data, as that will give me information on what CPU features to target in certain software I have been working on. I have been relying on Steam data, but that may not be very representative of the science and engineering field which my software relates to.
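
Editor's sketch of the "compare against the defaults" idea above: tally how many reports switched an option on and use that share to decide whether it deserves to be the default. The `Report` record, its field names, the sample data and the 50% threshold are all assumptions for illustration, not Canonical's actual schema or policy.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Report:
    """Hypothetical subset of one submitted report (field names are assumptions)."""
    ram_gb: int
    live_patch: bool
    language: str


def uptake(reports: List[Report]) -> float:
    """Share of reports where live kernel patching was switched on."""
    if not reports:
        return 0.0
    return sum(r.live_patch for r in reports) / len(reports)


sample = [
    Report(ram_gb=8, live_patch=False, language="en"),
    Report(ram_gb=4, live_patch=False, language="ur"),
    Report(ram_gb=16, live_patch=True, language="en"),
    Report(ram_gb=8, live_patch=False, language="de"),
]
make_default = uptake(sample) > 0.5  # False here: only 1 of 4 enabled it
```

The same tally over `language` or `ram_gb` would give the kind of distribution the comment hopes to see published.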

              1. Anonymous Coward

                > However, just have a look at the type of information being collected. According to the story it just amounts to the following ...

                Apologies, I was trying to explain the concept of why using this kind of data is bad. The example I chose looks like it didn't work for you as it was too specific.

                Let's say that you're a competitor of Canonical, or they've somehow managed to piss you off a bunch (they're kind of known for doing that). You'd be able to really screw up their stats by submitting false data.

                If they make business decisions based on it, you can lead them up the garden path, so to speak.

                I mean, it's up to them what they do with the data... I personally wouldn't use it for anything meaningful though.

      2. Updraft102

        Not to mention, after observing other major software projects that rely primarily on telemetry to inform their decisions, relying on telemetry to make design decisions seems to ensure that your software is never going to be better than average, at best. And probably not even that good.

        You mean like Firefox's decision to remove the ability to use most of the addons, because most of the people who left telemetry on only have a small number of addons or no addons?

        Or how about Microsoft's decision to remove the Start button, since their telemetry data suggested nobody actually uses it anymore?

        That latter case may be part of why they're so adamant about forcing everyone to have telemetry on... they don't want to exclude the data from those technically oriented enough to know what telemetry is and how to turn it off. That, and the fact that they don't have beta testers anymore, so the end users have to be the beta testers now.

  7. Camilla Smythe

    Bionic Beaver

    What's with them needing to know my preferred bestiality partner. Bob the Beaver will not be happy and I will so miss a good spanking.

  8. Anonymous Coward

    Linux Unplugged

    "... that several of these videos claim the solution is to remove a package that – wait for it – has nothing to do with data collection."

    IIRC the video Linux Unplugged obliquely referred to showed that if you opt in, Ubuntu sends Canonical the data, and if you opt out, Ubuntu still sends a message, but one showing that you have opted out. The video also showed how to remove the packages that apparently send the data. It was made clear that as popularity-contest is a dependency of the ubuntu-standard meta-package, removing it also takes out a core package, which can adversely affect the OS.

    Of the few YouTube videos I have seen on this subject none have really objected to what is being sent at this time. There is more a concern that it is "opt-out" and that there is potential for "mission creep" later.

    1. Ken Hagan Gold badge

      Re: Linux Unplugged

      "It was made clear that as popularity-contest is a dependancy of the ubuntu-standard meta-package it also takes out a core package that can adversely affect the OS."

      Sounds like FUD. If anyone else has a dependency on popularity-contest, or if it has been installed explicitly, then it won't be removed. Obviously it will be removed if no-one is using it or has expressed any interest in it, but it is difficult to see that as "adversely affecting the OS".

      1. Anonymous Coward

        Re: Linux Unplugged

        @Ken Hagan

        I suppose Linux Unplugged could be distributing FUD.

        The fellow that was trying to find a way of stopping all Ubuntu reporting said that removing ubuntu-standard could "adversely affect" the system.

        Linux Unplugged were more specific saying that removing ubuntu-standard means that you will not be able to upgrade Ubuntu 18.04 thereafter. I'd call that an adverse effect.

        Is this not true?

        1. onefang

          Re: Linux Unplugged

          I dunno, my Windows 8.1 development box is all but completely firewalled off from the Internet, specifically from Microsoft update servers. I won't be able to upgrade it. I'd not call that an adverse effect.

    2. Robert Carnegie Silver badge

      Re: Linux Unplugged

      I would expect if I opt out, the company I am opting out from doesn't get told that I am opting out. I want the default state to be that they don't know I exist. Though I may want to convey that fact later.

  9. JohnFen

    Spying

    "there's one thing that must be said very clearly: Canonical is not "spying" on users."

    True, because they have an opt-out and call attention to it. It's close to that line, though.

    Any collection of data about me, my hardware, or my use of my hardware that is collected and transmitted without my knowledge or consent counts as "spying", no matter how innocuous that data may appear to be.

    1. dajames

      Re: Spying

      Any collection of data about me, my hardware, or my use of my hardware that is collected and transmitted without my knowledge or consent counts as "spying", no matter how innocuous that data may appear to be.

      Methinks most people would agree that to be "spying" it has to be done without your knowledge. In this case they tell you about it and offer you the chance to opt out, so it can't reasonably be called "spying".

  10. Wingel

    IP address alone is NOT personal data

    The IP address alone is not personal data unless you have the means to link it to a natural person and until you do so.

    Context is everything.

  11. Anonymous Coward

    Maybe I'm missing something?

    Every decent article I've read about hardening a computer specifically states to avoid broadcasting as much technical details of your box to the outside world as possible.

    And every decent penetration testing article I've read specifically looks for technical details of intended targets such as OS, patch level, web browser type and version etc.

    Opt out my ass, I for one will: "sudo apt-get --purge remove slurp"

  12. Snar

    This

    Is part of the reason why I've had El Reg as my home page for 20-odd years.

    Balanced reporting.
