Friday, 6 August 2010

Data Protection Directive reform postponed

French privacy / data protection regulator CNIL has said the long overdue overhaul of the EU Data Protection Directive is to be postponed.

Originally Commissioner Reding said a draft of the proposed new Directive would be issued by the end of this year following a consultation last year, but now it won't be ready till late 2011 - though the Commission will issue a statement this November instead.

The news seems to have been broken by law firm Hunton & Williams earlier this week.

©WH. This work is licensed under a Creative Commons Attribution Non-Commercial Share-Alike England 2.0 Licence. Please attribute to WH, Tech and Law, and link to the original blog post page. Moral rights asserted.

Wednesday, 4 August 2010

Behavioral tracking / de-anonymisation reality - WSJ article

Privacy nightmare? Lots of personal info is deducible from one click, as the Wall Street Journal article On the Web's Cutting Edge, Anonymity in Name Only (a must-read by Emily Steel & Julia Angwin) has just reported -

"From a single click on a web site, [New York ad company] [x+1] correctly identified Carrie Isaac as a young Colorado Springs parent who lives on about $50,000 a year, shops at Wal-Mart and rents kids' videos… deduced that Paul Boulifard, a Nashville architect, is childless, likes to travel and buys used cars… determined that Thomas Burney, a Colorado building contractor, is a skier with a college degree and looks like he has good credit.

The company didn't get every detail correct. But its ability to make snap assessments of individuals is accurate enough that Capital One Financial Corp. uses [x+1]'s calculations to instantly decide which credit cards to show first-time visitors to its website…

…firms like [x+1] tap into vast databases of people's online behavior—mainly gathered surreptitiously by tracking technologies that have become ubiquitous on websites across the Internet. They don't have people's names, but cross-reference that data with records of home ownership, family income, marital status and favorite restaurants, among other things. Then, using statistical analysis, they start to make assumptions about the proclivities of individual Web surfers.

…A Wall Street Journal investigation into online privacy has found that the analytical skill of data handlers like [x+1] is transforming the Internet into a place where people are becoming anonymous in name only."

Kudos to the WSJ for their investigation. The article explained the technology further:

"A visitor lands on Capital One's credit-card page, and [x+1] instantly scans the information passed between the person's computer and the web page, which can be thousands of lines of code containing details on the user's computer. [x+1] also uses a new service from Digital Envoy Inc. that can determine the ZIP code where that computer is physically located. For some clients (but not Capital One), [x+1] also taps additional databases of web-browsing history.

Armed with its data, [x+1] taps consumer researcher Nielsen Co. to assign the visitor to one of 66 demographic groups.

In a fifth of a second, [x+1] says it can access and analyze thousands of pieces of information about a single user. It quickly scans for similar types of Capital One customers to make an educated guess about which credit cards to show the visitor."

See the WSJ's detailed predictions for different testers (including some of the code sent, what personal data [x+1] got right and what they guessed wrong), and also the WSJ's What They Know page generally.

I'd like to know though exactly what they meant by "containing details on the user's computer". It rather looks as if the code is generated through scanning the cookies (and Flash cookies etc etc?) saved by different advertisers on the user's computer from their prior visits to other websites.

The article also pointed out that the algorithms' evaluation of one tester came "extremely close" to identifying him personally. EFF staff scientist Peter Eckersley worked out that the tester's location (a small town) and his Nielsen demographic segment together gave 26.5 bits of info about him, meaning they'd know that he had to be one of only 64 or so possible people in the whole world. Add just one more piece of info, like his age, and they could probably totally de-anonymise him, i.e. identify him precisely.

(I'll blog "bits of entropy" properly another time; meanwhile see privacy researcher Arvind Narayanan's short explanation of why he calls his blog, another must-read, "33 Bits of Entropy". And this blog. There are 6.6 billion people in the world and log2 of 6.6 billion is about 33, if you must know!)
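For anyone who wants to check the arithmetic, here is a rough sketch in Python. It assumes the 6.6 billion world population figure used above and treats every bit as halving the pool of candidates; the function names are my own, purely for illustration, and are not taken from the WSJ or EFF.

```python
import math

# Assumption: world population figure quoted in this post (2010).
WORLD_POPULATION = 6.6e9


def bits_to_identify(population: float) -> float:
    """Bits of information needed to single out one person in `population`."""
    return math.log2(population)


def remaining_candidates(population: float, known_bits: float) -> float:
    """Size of the pool left once `known_bits` bits are known.

    Each bit of information halves the number of possible people.
    """
    return population / 2 ** known_bits


total_bits = bits_to_identify(WORLD_POPULATION)
candidates = remaining_candidates(WORLD_POPULATION, 26.5)
bits_left = math.log2(candidates)

print(f"bits to identify one person: {total_bits:.1f}")
print(f"candidates left after 26.5 bits: {candidates:.0f}")
print(f"bits still needed: {bits_left:.1f}")
```

With these inputs the pool comes out at around 70 people, the same ballpark as the 64 quoted (the exact number depends on the population figure you start from), leaving about 6 bits still to find. A single attribute such as age could plausibly supply most of that.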

I shall just leave you with the unforgettable image conjured up by this screenshot of the article, dear readers (increasing my collection of funny typos) -

I, too, can make predictions about people based on if they lick a website. But I'd rather not.


Monday, 2 August 2010

Cloud computing privacy - Google's Peter Fleischer speaks

Thanks to the IIEA for the heads up on its event of 28 July 2010, Peter Fleischer and Billy Hawkes on Regulating for the Cloud: Updating the EU Data Protection Framework for Cloud Computing, with speeches by Google's Global Privacy Counsel Peter Fleischer and the Irish Data Protection Commissioner, Billy Hawkes. See videos below.

The Fleischer speech covered:

  • what's cloud computing - architecture, services, platform, computing model
  • how can privacy stay protected in the cloud - potential benefits of cloud computing e.g. allowing online subject access requests and some control via Dashboard-style management tools
  • problems (and some suggested possible solutions) on -
    • what law applies & overlapping privacy regimes
    • governments getting data from cloud providers, and Google's publication of government requests for personal data
    • data protection / privacy requirements being based on location of data
    • the EU controller-processor model doesn't fit cloud computing and there are issues with the standardised contract terms e.g. -
      • "forcing people to take responsibility for auditing and doing that for hundreds and thousands of cloud users all of whom are supposed to "audit" Amazon, I don't even know what that means, I get questions about, does that mean they are supposed to visit a data centre??"

He thinks the ultimate solution is obvious, though that doesn't mean it isn't hard -

"We need to come up with global privacy standards in this space, else we'll spend forever debating jurisdiction, applicable law, contorting ourselves in attempts to comply with these divergences as best or as poorly as can be done".

I see Peter Fleischer has also briefly pointed to these speeches.
