"opt-out was probably the best choice"
Not if you want to be GDPR-compliant.
Methinks Canonical might be getting a call from some lawyer soon . . .
Ubuntu 18.04, launched last month, included a new Welcome application that runs the first time you boot into your new install. The Welcome app does several things, including offering to opt you out of Canonical's new data collection tool. Ever since Edward Snowden confirmed so many once outlandish conspiracy theories, the …
It was, until all the evil $"q$w started grabbing everything they could vacuum up about us.
Unfortunately, this exercise does comprehensively fingerprint the host machine, and Snowden did move the (admittedly fuzzy) paranoid/sane boundary a long way into the previously paranoid side.
Are there any ideas out there about how we give useful anonymous feedback to developers?
you need to read up on GDPR.
Nothing collected by Canonical is GDPR infringing. It's basic hardware metrics on a machine, similar to (but more techy than) the advertising blurb you see on Dell's website giving RAM and CPU specs etc. A machine's info isn't covered under GDPR or DPA.
You're helping to pour fuel on a heated debate by having no idea what you're actually talking about. Stop it, it's silly.
"Nothing collected by Canonical is GDPR infringing. It's basic hardware metrics on a machine, similar to (but more techy than) the advertising blurb you see on Dell's website giving RAM and CPU specs etc. A machine's info isn't covered under GDPR or DPA."
^ Exactly this. Not only that, and unlike with Microsoft and Google, a clear binary choice is offered - to take part or not take part in data sharing.
"PII is a very specific legal term from the US about HIPAA."
And, "PII" as used in the US is a bit of a lie. There's an awful lot of personally identifying information that isn't considered PII. And, in sufficient quantities, all information about you becomes personally identifying.
>>PII is a very specific legal term from the US about HIPAA
Only in some contexts.
For example the Census Bureau has its own definition of what PII is (as do most other Federal Agencies that use your data to provide some kind of a service, the IRS has a different definition, as does FEMA), as do the States. And like usual with the Federal and State bureaucracies, there's no one definition to rule them all.
Afraid the lawyers wouldn't have much of a chance. Repeat after me:
GDPR ONLY AFFECTS PERSONALLY IDENTIFIABLE INFORMATION.
GDPR ONLY AFFECTS PERSONALLY IDENTIFIABLE INFORMATION.
This data is anonymised, no compliance required.
Doesn't matter if it's an opt-in or an opt-out. It's anonymous data.
(UK spelling)
I'm pretty sure the article said that the IP addresses weren't being logged... so "not personally identifying" and "not personal data", which is fine with me. I might consider letting Ubu (and others) know stuff about what I install and where I install it, next time I install one of their distros.
I used to allow that, long ago, even for Micro-shaft, until it became obvious we were being snooped and tracked and whatnot by aggressive advertising firms that seek to target us with their marketing.
Perhaps this article is like the pendulum swinging back towards the middle again?
They are not including the IP address.
No, but they are sending the data to their servers over the internet, so the addressing information will be available from the received IP packet headers. It wouldn't be rocket science to associate the data with an IP address if they wanted.
I wouldn't say that an IP address should count as personally identifiable data, though; there are enough dynamic IPs and enough NATted shared IPs to make it difficult to associate an individual user with a particular hardware fingerprint.
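To make the packet-header point concrete, here is a minimal sketch, not anything taken from Canonical's actual reporting tool. It assumes a plain TCP connection on localhost and a made-up report with hypothetical fields `cpu` and `ram_gb`. The report contains no address, yet the receiving side gets the sender's address from the connection itself and can store the two together if it chooses to.

```python
import json
import socket
import threading


def receive_one_report(server_sock, log):
    """Accept one connection; store the payload next to the sender's address."""
    conn, peer = server_sock.accept()  # peer is (ip, port) from the connection itself
    with conn:
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # sender closed the connection
                break
            chunks.append(data)
    payload = json.loads(b"".join(chunks).decode("utf-8"))
    log.append({"source_ip": peer[0], "payload": payload})


def run_demo():
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    log = []
    worker = threading.Thread(target=receive_one_report, args=(server, log))
    worker.start()
    # Hypothetical report: note that it carries no address field at all.
    report = {"cpu": "example-cpu", "ram_gb": 8}
    with socket.create_connection(server.getsockname()) as client:
        client.sendall(json.dumps(report).encode("utf-8"))
    worker.join(timeout=5)
    server.close()
    return log


demo_log = run_demo()
print(demo_log)
```

Whether a real server keeps that address is a policy decision on the receiving end; the sketch only shows that leaving it out of the payload does not hide it.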
What worries me more is that the data collected will enable them to discover which CPU types (for example) are only being used by a tiny fraction of the userbase, and prematurely discontinue support for those chips in order to make use of some new feature nobody has ever heard of in the very latest.
I don't mind downvotes, but I'm honestly curious here -- why is this comment getting them? Can someone give me the counterargument?
I believe what I said is true because I have yet to see "anonymized" data collection that can't be de-anonymized whenever the entity holding the data wants to do it.
"I don't mind downvotes, but I'm honestly curious here -- why is this comment getting them? Can someone give me the counterargument?
I believe what I said is true because I have yet to see "anonymized" data collection that can't be de-anonymized whenever the entity holding the data wants to do it."
They aren't collecting any PII data. There's nothing they have that they could de-anonymise.
"They aren't collecting any PII data. There's nothing they have that they could de-anonymise."
Sure there is -- if you have enough non-PII data on someone, then you can identify the person who generated it. And it's been shown repeatedly that "enough" such data is a shockingly small amount.
That said, I was responding to a comment that stated that there's no need to worry because the data is anonymized by pointing out that anonymizing data does not actually mean that much. Of course, that depends on what is meant by "anonymized". For instance, if the data is aggregated with many other people and the original collections are deleted, that's pretty safe, but requires trusting that the original data records are actually being deleted.
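A rough sketch of the two ideas in this comment, using invented records (the field names and values are assumptions, not the real report schema). `unique_profiles` finds attribute combinations that only one report has, which is what makes "anonymous" data identifying. `aggregate` keeps only per-value counts, drops groups smaller than a hypothetical threshold `min_group`, and the raw records are then discarded.

```python
from collections import Counter

# Hypothetical raw reports; every field name and value here is made up.
raw_reports = [
    {"cpu": "cpu-a", "ram_gb": 8, "gpu": "gpu-x"},
    {"cpu": "cpu-a", "ram_gb": 8, "gpu": "gpu-x"},
    {"cpu": "cpu-a", "ram_gb": 16, "gpu": "gpu-y"},
    {"cpu": "cpu-b", "ram_gb": 4, "gpu": "gpu-z"},
]


def unique_profiles(reports):
    """Return the attribute combinations that appear in exactly one report."""
    counts = Counter(tuple(sorted(r.items())) for r in reports)
    return [dict(profile) for profile, n in counts.items() if n == 1]


def aggregate(reports, field, min_group=2):
    """Count reports per value of one field, dropping groups below min_group."""
    counts = Counter(r[field] for r in reports)
    return {value: n for value, n in counts.items() if n >= min_group}


singletons = unique_profiles(raw_reports)  # profiles that single out one machine
cpu_totals = aggregate(raw_reports, "cpu")  # only the totals are kept
raw_reports.clear()  # the "delete the originals" step that has to be trusted

print(len(singletons), cpu_totals)
```

The last line is the part nobody outside can verify: the code shows what safe handling would look like, not that any given collector actually does it.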
"if you have enough non-PII data on someone, then you can identify the person who generated it. And it's been shown repeatedly that "enough" such data is a shockingly small amount."
well, the definition of "identify" there is somewhat subtle, isn't it? You can *fingerprint* them, yes - in that if you see the same data profile again, you know it's the same person. But you don't actually know *who they are*, in the sense of 'this is Joe Bloggs of 41 Lark Terrace'. All you know is it's the same person (or, rather, the same computer) that sent the same profile before.
The bar to actually *figure out where that computer is and who owns it* is somewhat higher. Facebook and Google can do it, of course. I can't see how Canonical possibly could, from this data.
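The fingerprint-versus-identity distinction can be sketched like this. It is only an illustration with hypothetical profile fields: the same profile always hashes to the same value, so a repeat visit is recognisable, but nothing in the hash says who owns the machine.

```python
import hashlib
import json


def fingerprint(profile):
    """Hash a profile in a key-order-independent way."""
    canonical = json.dumps(profile, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical profiles; the fields are invented for the example.
first_visit = {"cpu": "cpu-a", "ram_gb": 8, "screen": "1920x1080"}
second_visit = {"screen": "1920x1080", "ram_gb": 8, "cpu": "cpu-a"}
other_machine = {"cpu": "cpu-a", "ram_gb": 16, "screen": "1920x1080"}

same_machine = fingerprint(first_visit) == fingerprint(second_visit)
different_machine = fingerprint(first_visit) != fingerprint(other_machine)
print(same_machine, different_machine)
```

Getting from "same profile as before" to a name and address needs some other dataset that ties the profile to a person.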
This is a very late reply as I was on vacation, but...
"the definition of "identify" there is somewhat subtle, isn't it?"
I mean "identify" as in "determine the identity of the user", not just "fingerprint the user". Researchers have repeatedly shown this is a trivial thing to do given just a small amount of non-PII data about someone. You don't have to be Facebook or Google to do it, you just have to be able to afford access to the data, and that's only a question of money. A couple thousand dollars and the use of free data-mining software and you are home free.
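For anyone wondering what that kind of re-identification looks like mechanically, here is a minimal sketch of a linkage join. Everything in it is hypothetical: the "anonymous" reports, the auxiliary dataset with names, and the fields used as join keys. It says nothing about what such data costs or whether it exists for hardware reports; it only shows that a unique match on a few shared attributes is enough to attach a name.

```python
def link(anonymous_reports, auxiliary_records, keys):
    """Attach a name to each report whose key values match exactly one person."""
    index = {}
    for record in auxiliary_records:
        index.setdefault(tuple(record[k] for k in keys), []).append(record["name"])
    matches = {}
    for report in anonymous_reports:
        names = index.get(tuple(report[k] for k in keys), [])
        if len(names) == 1:  # an ambiguous match identifies nobody
            matches[report["report_id"]] = names[0]
    return matches


# Hypothetical data: no names in the reports, names only in the auxiliary set.
reports = [
    {"report_id": "r1", "timezone": "tz-1", "cpu": "cpu-a", "ram_gb": 8},
    {"report_id": "r2", "timezone": "tz-2", "cpu": "cpu-b", "ram_gb": 4},
]
auxiliary = [
    {"name": "person-1", "timezone": "tz-1", "cpu": "cpu-a", "ram_gb": 8},
    {"name": "person-2", "timezone": "tz-2", "cpu": "cpu-b", "ram_gb": 4},
    {"name": "person-3", "timezone": "tz-2", "cpu": "cpu-b", "ram_gb": 4},
]

linked = link(reports, auxiliary, keys=("timezone", "cpu", "ram_gb"))
print(linked)
```

Report `r2` stays unlinked because two people share its attributes, which is the same reason larger, coarser groups are harder to re-identify.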
"if the data is aggregated with many other people and the original collections are deleted, that's pretty safe, but requires trusting that the original data records are actually being deleted."
Exactly. Trusting is naive. Developers tend to disable data/log deletion when something breaks, then forget to turn it back on again after fixing the problem.
Furthermore, there are always rogue managers/employees/volunteers who feel the rules don't apply to them. If the data is collected, there's a real chance someone will use it for nefarious purposes. Doxxing for dollars, maybe, or for noble social justice causes. "Don't worry, we're only targeting fascists!"
"I don't mind downvotes, but I'm honestly curious here -- why is this comment getting them? Can someone give me the counterargument?"
Beats me. I'm not voting any of these comments; the reeeeeeeee is winning by a landslide.
"this exercise does comprehensively fingerprint the host machine"... 3-18
"The data collected is not PII"... 20-1
I'll submit another unpopular truth: It phones home even if you opt out. Can I get 50 downvotes for this?
While this is small potatoes compared to the very personal data collected by Facebook et al, what I'm looking for is ZERO TOLERANCE for tracking, profiling, and thoughtless analytics-driven decision making. Just a hunch: it actually WORSENS developers' decisions. There is no silver lining.
As long as you opt me in by default into anything, opting out is all you're going to see from me, even if your goddamn survey is going to magically save all Somalian children forever. ASKING is fine; the moment you PUT YOUR FOOT in the door and assume consent I'm reaching for the shotgun, pal.
"As long as you opt me in by default into anything, opting out is all you're going to see from me, even if your goddamn survey is going to magically save all Somalian children forever. ASKING is fine; the moment you PUT YOUR FOOT in the door and assume consent I'm reaching for the shotgun, pal."
I hate to break it to you - but you're a tiny minority. That's why Canonical did this. They need representative data.
Lots of internet commenters say the above, but most people don't actually behave that way, as anyone who's ever designed a system like this will tell you. If you make it opt-out, very few people opt out. If you make it opt-in, almost nobody opts in. That's human nature, apparently. That doesn't mean it's *right* to make things opt-out, of course. It can't answer that question. It's just a fact: opt-out always results in more participation than opt-in.
Agree. Opting in would put Canonical in the position of having to invite the user to join and provide the user with convincing statements to make that happen in a number of cases sufficient for their purposes.
However and beyond that, Canonical and Ubuntu are private organizations. As such, there is no guarantee other than their word that what they are saying is true (many other such institutions have either shaved the truth or outright lied about it), and even if they are being honest and sincere, they can change their minds tomorrow (as many other such institutions have done in the past to the detriment of user privacy).
That is, it is not paranoia if there is 1) a broad and long history in the industry of such promises broken and 2) endless efforts to bury personal tracking policies under heaps of legalese jargon and flowery PR statements about their commitments to "do no evil" etc. There really are a lot of "bad guys" - re personal tracking - out there.
Bravo for their implementation and transparency but, frankly, I'm still going to opt all the hell out because I perceive this to be the thin end of the wedge. Is Canonical going to pop up a notification, asking for my consent, every single time that data file's schema changes because someone decided it would be cool to add an extra field? Do I have the time to vet all those changes, even should they do that?
Perhaps, if GNOME started gathering some basic data on a larger scale about how people use GNOME the project would make different decisions.
Doubt it. If you take the other example (Firefox), it turned into a competition between UXers to see how they could out-stupid each other, using metrics to justify their decisions where they could and ignoring them where they couldn't.
Wasn't talking about the good Linux distro side. I was just referring to the household name bit.
If I went to my family and said Android, they would go 'What about it?'
If I went to my family and said 'Ubuntu', they would go 'You having a stroke or is that a new cordial?'
If I went to my family and said 'Canonical', they would go 'is that a small camera?'
I love my family. I hate the fact I am the only one who works in IT, in my family.
"Wasn't talking about the good Linux distro side. I was just referring to the household name bit."
You say kleenex, everyone knows it's a tissue; you say hoover, everyone knows it's a vacuum cleaner.
You say android, it's a phone to most, unaware as many are that it runs on the linux kernel.
Linux may be all around, but like a popular brand of sewer plumbing it doesn't get much upfront advertising, it's not a popular enough household product to have a household name.
"Canonical makes an easy target for this sort of thing because it's the closest thing Linux has to a household name."
I would have said Android.
There you have it, at least one 'housewife' 'can't tell the difference between whizzo butter and a dead crab'
Android isn't a good household name for Linux, as it throws the Gnu out with the bathwater and replaces it with private googlies.
GNOME could be nice, but for me it is harder to make changes to the bar. I use two keyboard layouts, and that gets messy (at least for me) with some desktops such as GNOME. LXDE, Lubuntu, Xubuntu or Enlightenment are easier to change. Maybe there are others like that which are also light.
What generated the controversy, in my view, is the need for click-baity headlines in this day and age of advertising-driven, small publishers.
The internet is full of click-baity crap, so I'm hardly surprised.
I don't know of any other data collection by a large company that offers that level of control
Steam do a hardware survey that allows you to see what is sent to them but it has a lot more information on it than what Canonical want.