Sunday, 27 June 2010

Behavioral advertising - developments & links

Some notes and links, and a brief history of key UK, EU and US developments over the last few months, in relation to online behavioural advertising (or OBA - variously known as behavioral targeting, behavioural tracking or behavioral marketing).

As online advertising generally grows, OBA will be increasingly important. In September 2009 a study of UK online advertisers including agencies found that 57% had increased their spend on online display advertising over the previous year, with 72% expecting their spending to increase over the next year. (And increase it did.)

Just a couple of days ago, a separate report found that UK online advertising hadn't been affected by the recession as much as other types of media marketing, with UK advertisers spending £3.54 billion online in 2009 (5.7% more than in 2008) even while other types of advertising fell by 16%. The report predicted that although online ad spending would level out in 2011, it would rise again in 2012, the UK Olympics year.

Research studies

Does behavioural advertising help advertisers or is it counter-productive; what do consumers think of OBA?

Some research indicates OBA actually hurts advertisers, other studies suggest it helps them. It seems the outcome might perhaps depend on who's doing the research, or rather who commissioned it.

Rotman, University of Toronto. A study from this university, released 14 June 2010, said "You could match an ad to a web page’s content – such as putting a car ad on an auto consumer website. Or, you could make it stand out with eye-catching pop-up graphics and video. But don’t waste your marketing budget putting the two strategies together. The first large-scale study looking at thousands of online ad campaigns says that in combination, these approaches make viewers feel like their privacy is being invaded – and turns them off." See "Online Display Advertising: Targeting and Obtrusiveness", Avi Goldfarb and Catherine Tucker, February 2010.

From the abstract (emphasis added):

"We use data from a large-scale field experiment to explore what influences the effectiveness of online advertising. We find that matching an ad to website content and increasing an ad’s obtrusiveness independently increase purchase intent. However, in combination these two strategies are ineffective. Ads that match both website content and are obtrusive do worse at increasing purchase intent than ads that do only one or the other. This failure appears to be related to privacy concerns: The negative effect of combining targeting with obtrusiveness is strongest for people who refuse to give their income, and for categories where privacy matters most. Our results suggest a possible explanation for the growing bifurcation in internet advertising between highly targeted plain text ads and more visually striking but less targeted ads."

This seems to be the "creepy" factor.

University of California, Berkeley and the University of Pennsylvania. A survey by academics (including lawyers), reported on 30 Sept 2009 in the New York Times, showed that 2/3 of Americans object to online tracking by advertisers. See "Contrary to what marketers say, Americans Reject Tailored Advertising and Three Activities that enable it", by Joseph Turow, Jennifer King, Chris Jay Hoofnagle, Amy Bleakley and Michael Hennessy.

In contrast -

NAI. 2010 study of 12 advertising networks, sponsored by US ad networks industry body Network Advertising Initiative (NAI). (An ad network sits between advertisers and publishers, reselling publishers' advertising space to advertisers.)

This study on the effect of behaviorally targeted advertising on advertising rates and revenues, headlined that OBA was "more than twice as valuable, twice as effective". It found that the average CPM (cost per 1000 impressions i.e. page views) for behaviorally targeted advertising was just over twice the average CPM for run-of-network advertising; "On average across participating networks, the price of behaviorally targeted advertising in 2009 was 2.68 times the price of run of network advertising"; advertising using behavioral targeting was "more successful than standard run of network advertising [conversion rate 6.8% compared with 2.8%], creating greater utility for consumers from more relevant advertisements and clear appeal for advertisers from increased ad conversion"; and most network advertising revenue was spent on acquiring inventory from publishers, "making behavioral targeting an important source of revenue for online content and services providers as well as third party ad networks".

See "The Value of Behavioral Targeting", Howard Beales, George Washington University, March 2010, and associated NAI press release.
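To make the study's headline figures easier to compare, here is a minimal Python sketch of the arithmetic. The function names and the campaign numbers are made up for illustration; only the 6.8% and 2.8% conversion rates come from the study as quoted above.

```python
def cpm(total_cost: float, impressions: int) -> float:
    """Cost per 1000 impressions."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return total_cost / impressions * 1000


def lift(targeted: float, baseline: float) -> float:
    """How many times larger the targeted figure is than the baseline."""
    return targeted / baseline


# Conversion rates quoted from the study (in percent).
conversion_lift = lift(6.8, 2.8)

# Hypothetical campaign (not from the study): 500 currency units
# spent on 250,000 impressions.
example_cpm = cpm(500.0, 250_000)

print(f"conversion lift: {conversion_lift:.2f}x")
print(f"example CPM: {example_cpm:.2f}")
```

On the quoted conversion rates, behaviourally targeted ads convert at roughly 2.4 times the run-of-network rate, which is consistent with the "twice as effective" headline.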

IAB. In October 2009 the Internet Advertising Bureau, which is the UK industry body for digital marketing, announced research with UK law firm Olswang showing that "the appeal of behavioral advertising increased from 23% to 75% once consumers were given further information, such as what information is actually collected and used and their right to opt-out… This research highlights the need for further education and supports our approach in providing greater reassurance about behavioural advertising. We know that once internet users are presented with all the facts the appeal of targeting advertising increases…" (See the IAB press release for more findings on consumers' attitudes to online privacy.)

Wilkinson Plus. One concrete example at least of where OBA seems to have helped a company is the case of Wilkinson Plus which used an "intelligent recommendations engine" "to generate, in real-time, individual recommendations relevant to that person. These are based on data including where visitors have come from, where they go, what they add to the basket as well as what they buy." They said in November 2009 that after introducing this system, online conversion rates increased by 22.9% while average order values rose by 16%.

Privacy and behavioral economics

"Nudging Privacy - The Behavioral Economics of Personal Information", Nov/Dec 2009, by Alessandro Acquisti of Carnegie Mellon University, may also be of interest generally -

"individuals are less likely to provide personal information to professional-looking sites than unprofessional ones, or when they receive strong assurances that their data will be kept confidential (see http://ssrn.com/abstract=1430482). We’ve found that individuals assign radically different values to their personal information depending on whether they’re focusing on protecting data from exposure or selling away data that would be otherwise protected. We’ve found that they might also suffer from an illusion of control bias that make them unable to distinguish publication control from control of access to personal information…"

Seems counter-intuitive, so doubly interesting. Acquisti has done a lot of work on privacy and privacy economics, see this blog and this blog.

Regulation - and self-regulation?

IAB

Good Practice Principles. In March 2009, the IAB had launched their publication Good Practice Principles for Online Behavioural Advertising (PDF download), "the UK’s first self-regulatory guidelines to set good practice for companies that collect and use data for online behavioural advertising purposes." There have since been audited certifications of compliance with these principles by 7 organisations including some AOL, Google, Microsoft and Yahoo entities.

"Your online choices" site. In March 2009 the IAB also launched "Your online choices - a guide to online behavioural advertising", a website targeting consumers, billed as "guide to online behavioural advertising and online privacy."

Opt out. The site provides a page to opt out of behavioural advertising, but there are only 6 participating companies including MSN and Yahoo at the date of this blog. The page also links to NAI's own opt out page whose participating companies include Google and Microsoft. (See also NAI's opt out protector, an add on for the Firefox browser).

Guide to OBA. Soon after their consumer study, mentioned above, in November 2009 the IAB launched "A guide to online behavioural advertising" (PDF download), saying "The guide demystifies the technique and explains how behavioural advertising works, how it differs to other types of targeted advertising on the internet, its benefits to web publishers and advertisers, consumer attitudes as well as an introduction to online privacy and industry good practice."

Children? The DCSF's "The Impact of the Commercial World on Children’s Wellbeing Report of an Independent Assessment For the Department for Children, Schools and Families and the Department for Culture, Media and Sport" published in December 2009 and mentioned by the IAB, noted -

"There have been a number of best practice guides (e.g. Network Advertising Initiative) which agree that it should be very clear to consumers what data is being collected about them, where it goes, and how they can opt out of data collection. There have also been some guidelines (e.g. Internet Advertising Bureau) for behavioural targeting which state that it must be clear to consumers that they are being tracked for the purposes of advertising and that they are given the choice to opt out of being tracked. However, none of these guidelines takes children into consideration or make provision for age-appropriate information on children’s sites."

UPDATED:

Internet Advertising Bureau Europe (IABE)

European Self-regulation for Online Behavioural Advertising - Transparency and Control for Consumers, April 2011

UK MPs and Lords - ApComms

In October 2009, ApComms (formerly known as the All Party Internet Group or APIG, so the name change wasn't a bad idea) published “Can we keep our hands off the net?” Report of an Inquiry by the All Party Parliamentary Communications Group, following a 6 month inquiry. There's a summary of the report too.

One question they asked was "Should the Government be intervening over behavioural advertising services, either to encourage or discourage their deployment; or is this entirely a matter for individual users, ISPs and websites?"

Their report contains a substantial section discussing the pros and cons of regulating behavioural advertising, and OBA to children. Their conclusions on OBA:

"115. We were deeply disappointed that Google, the clear market leader in online advertising, and the operator of a rather different type of behavioural advertising system, did not submit any evidence to this inquiry, despite a specific request from us to do so.

116. We do not believe that it is at all appropriate to consider the deployment of any type of behavioural advertising system without explicit, informed, “opt-in” by everyone whose data is to be processed, and whose behaviour is to be monitored and whose interests are to be deduced. We do not believe that “opt-out”, however commercially convenient, is the way that these systems should be run. To that extent, the Good Practice Principles promoted by the Internet Advertising Bureau are insufficient to protect people.

117. We recommend that the Government review the existing legislation applying to behavioural advertising, and bring forward new rules as needed, to ensure that these systems are only operated on an explicit, informed, opt-in basis.

118. We are particularly concerned that behavioural advertising systems may be being deployed without sufficient consideration being given to protecting the interests of children and young people. We did not receive sufficient evidence to form a view as to the way forward, but it is a matter that requires urgent consideration. We recommend the UK Council for Child Internet Safety (UKCCIS) consider how behavioural advertising that is aimed at children and young people should be regulated."

Office of Fair Trading

On 25 May 2010, UK regulator OFT, which has some responsibility for consumer issues, published a market study setting out its current views on behavioural advertising and targeted pricing practices. They announced the commencement of their study in October 2009, just before the IAB published the results of their own consumer study.

The OFT report found that (emphasis added) -

"although industry self-regulation addresses some concerns about behavioural advertising, more could be done to provide consumers with better information about how personal information is collected and used. It also sets out how regulation might apply to these new and emerging practices… while behavioural advertising may offer benefits to consumers such as, for example, free access to content, there are objections to the practice which centre around privacy issues and the possibility for the misuse of personal data.

To address these concerns, the OFT will encourage the IAB, the trade association for online advertising, to work with the industry to provide clear notices alongside behavioural adverts and information about opting out

…the Consumer Protection from Unfair Trading Regulations 2008 (CPRs) could also apply to business practices in this area, for example misleading consumers about the collection of information where this would lead consumers to alter their 'transactional decision'. This would include, in the OFT's view, a decision on whether or not to visit a website.

Should industry action prove ineffective, the OFT and the ICO [UK data protection regulator] are strengthening the effectiveness of regulation by seeking to agree a Memorandum of Understanding to establish in which circumstances the ICO, or the OFT, would take enforcement action… [the possible overlap with the Information Commissioner is because of the Privacy and Electronic Communications (EC Directive) Regulations 2003 (Privacy Regulations) implementing the EU Directive, and see the ICO's draft Personal Information Online Code of Practice which is to be published shortly, in July 2010.]

The study also examines the prospects for the online targeting of pricing based on previous purchases, browsing behaviour or geographic location. Research suggests that consumer opposition to such practices would be very strong. This has led the OFT to conclude that consumers who knew that targeted prices were being applied would change their behaviour, meaning that failure to inform consumers about the practice could breach the CPRs and in such an event the OFT would consider enforcement action."

See OFT1231 "Online Targeting of Advertising and Prices - A market study", May 2010.

And there's a short article on the UK law on OBA generally by Nigel Williams of law firm Fox Williams in Computing, May 2010.

EU

Article 29 Working Party

The Article 29 Working Party's opinion on OBA, "Opinion 2/2010 on online behavioural advertising" WP171 22 June 2010, was recently published, as presaged in their work programme for 2010-2011. (A fuller report will follow in due course.)

The EU data protection regulators have interpreted Directive 2002/58/EC (Directive on privacy and electronic communications aka e-Privacy Directive), amended by the telecoms reform package last year, as requiring explicit consent to be positively given before advertisers can plant a cookie on website visitors' computers - it's not enough to say there must have been implicit consent because the consumer's browser was set up to accept cookies automatically (which is the default setting on all major browsers). In other words, it's an opt in rather than opt out approach. And OBA is forbidden in relation to children.

Advertisers have protested strongly that it's "an overly strict interpretation", is "out of step with the relationships that businesses and consumers are building online and flies in the face of the reality of the Internet."

Other EU pronouncements

Commissioner Reding in a speech of 6 October 2009 said "European privacy rules are crystal clear: a person's information can only be used with their prior consent. Transparency and choice are key words in this debate. The Commission is closely monitoring the use of behavioural advertising to ensure respect for our privacy rights. I will not shy away from taking action where an EU country falls short of this duty. A first example is the infringement action the Commission has taken with regard to the United Kingdom in the Phorm case [see more on Phorm's "industrial scale snooping", the EU Phorm action and other EU action against the UK on data protection in June 2010 and October 2009]." Much of this echoed her earlier speech in March 2009.

Subsequently, in a speech on 5 November 2009 consumer Commissioner Kuneva (as she then was) announced "a Stakeholder Forum on Fair Data Collection that will meet several times next year", in relation to "online collection of personal and behaviour data. This is currently being done on an unprecedented scale on a massive scale and mostly without any user awareness at all." (See also a previous speech by her in this regard in March 2009; and see generally COMMISSION STAFF WORKING DOCUMENT - Report on cross-border e-commerce in the EU, March 2009.)

And in January 2010 and March 2010 Commissioner Reding again reiterated that behavioural advertising is very much on the EU's radar, along with RFIDs, social networking, video surveillance etc, as they produce proposals to update the EU Data Protection Directive, due to be published by end 2010.

Her latest pronouncement on the issue, just last week on 22 June 2010 - "online operators use behavioural advertising to create profiles of users' online activities to better target them with advertising. There are estimates that this market will grow to over €3 billion in 2012, eight times as much as in 2007. Advertising pays for a large part of the services that makes the internet world turn. But our data protection principles say that peoples' emails and online activity can only be used this way if individuals are fully aware of the use and they do not object. So we need rules that make the obligations for respecting privacy rights very clear."

Reding's March 2009 speech talked about use of personal information only "with their prior consent", but interestingly last week (as quoted above) she used the phrase "do not object". Although Commissioner Reding seemed to be against an opt out approach generally, might this phraseology signal some degree of softening in the Commission's stance? In an even more recent speech earlier this month, she said "Internet users must have effective control of what they put online and be able to correct, withdraw or delete it at will… I see the need for having more clarity about what "users' consent" means in practice. This is one of the essential steps if we are to build a firm basis of trust. Users must have informed consent to use of their personal data. In practice, that means working to avoid ambiguous and confusing information or the absence of any real information. More trust also means more legal certainty for the "merchants of data.""

There will undoubtedly be lots of lobbying in the EU on OBA over the next few months or years.

UPDATED:

Impact of advertising on consumer behaviour - resolution adopted by European Parliament, 15 Dec 2010

USA

The NAI had updated their NAI Principles Code of Conduct for OBA, in December 2008 (press release).

In February 2009 the US Federal Trade Commission, which like the UK OFT has some consumer responsibilities, issued an FTC Staff Report: Self-Regulatory Principles For Online Behavioral Advertising (which NAI summarised). Their 4 principles were -

  1. Transparency and Consumer Control
  2. Reasonable Security, and Limited Data Retention, for Consumer Data
  3. Affirmative Express Consent for Material Changes to Existing Privacy Promises, and
  4. Affirmative Express Consent to (or Prohibition Against) Using Sensitive Data for Behavioral Advertising.

In July 2009, a coalition of ad industry bodies including the IAB (a separate US organisation, not the UK one) then released "Self-Regulatory Principles for Online Behavioral Advertising" (PDF) comprising principles intended to correspond with the FTC's principles (with the addition of Education and Accountability), including the requirement for "enhanced notices" to consumers on websites collecting their data, with common wording and a prominent link or icon.

On 1 September 2009, a coalition of several consumer and privacy groups, unhappy with self-regulation which they felt was inadequate to protect consumers, urged the US Congress to enact legislation to protect consumer privacy "in response to threats from the growing practices of online behavioral tracking and targeting."

See the CDT press release for the principles espoused by the coalition, including their Online Behavioral Tracking and Targeting: Legislative Primer of September 2009 and short overview. (The Electronic Frontier Foundation's 3-part series on how tracking can be carried out is very instructive, and explains the technical aspects well for non-computer scientists.)

In April 2010, NAI and US IAB released "CLEAR (Control Links for Education and Advertising Responsibly) Ad Notice Technical Specifications", to enable implementation of the July 2009 "enhanced consumer notice" approach on third party websites carrying advertising. CLEAR comprised -

"a set of common technical standards enabling enhanced notice in online ads. These technical specifications will allow advertisers and ad networks to begin offering a clickable icon in or near online ads that directs users to additional information about online behavioral advertising and choices about such ads…

…The CLEAR Ad Notice Technical Specifications detail how third-party media companies, such as advertising networks, can provide enhanced notice to consumers through an industry standard set of metadata tags that are delivered along with behaviorally targeted advertisements. Those metadata tags include information on which organization(s) served the ad, where to find their advertising policies and how to opt-out of such targeting in the future…."

Later in April 2010, the FTC decided to review the Children’s Online Privacy Protection Rule (COPPA Rule) to make sure that it is still adequately protecting children’s privacy, including "Whether Web site operators have the ability to contact specific individuals using information collected from children online, such as persistent IP addresses, mobile geolocation data, or information collected from children online in connection with behavioral advertising, and whether the Rule’s definition of “personal information” should be expanded accordingly". And they're consulting on iSAFE proposed guidelines to help website operators comply with COPPA.

The FTC have also hosted roundtables on consumer privacy issues, including OBA - see the Wall Street Journal report on one which addressed behavioural advertising.

And recently, in June 2010, the FTC submitted their views, including on behavioral advertising, to the US Department of Commerce's Internet Policy Task Force's "comprehensive review of the nexus between privacy policy and innovation in the Internet economy."

So in the USA too, things are still in a state of flux on OBA.

©WH. This work is licensed under a Creative Commons Attribution Non-Commercial Share-Alike England 2.0 Licence. Please attribute to WH, Tech and Law, and link to the original blog post page. Moral rights asserted.

Thursday, 24 June 2010

Commission to UK - fix your inadequate data protection laws

The European Commission thinks that neither the UK Data Protection Act 1998 nor its application by the UK courts properly implements the requirements of the EU Data Protection Directive, and in a "reasoned opinion" to the UK (the second stage of EU infringement procedures) they've asked the UK to remedy the shortcomings.

If the UK doesn't comply within 2 months, the Commission could refer the UK to the European Court of Justice. The Commission had previously tried, in October 2009, to take the UK to task over privacy and data protection, specifically on the interception of electronic communications like email, but nothing much seems to have happened on that front. So who knows if something will happen on this. (The UK aren't alone, the Commission are also unhappy about Finland allowing taxpayers' personal data to be effectively public, and to be bought and sold on and on and on - you thought the UK DVLA were bad…?)

The Commission press release 24 June 2010 said (emphasis added):

"…In the UK, national data rules are curtailed in several ways, leaving the standard of protection lower than required under EU rules. The UK now has two months to inform the Commission of measures taken to ensure full compliance with the EU Data Protection Directive…

The Commission has worked with UK authorities to resolve a number of issues, but several remain, notably limitations of the Information Commissioner's Office's powers:

  • it cannot monitor whether third countries' data protection is adequate. These assessments should come before international transfers of personal information;
  • It can neither perform random checks on people using or processing personal data, nor enforce penalties following the checks.

Furthermore, courts in the UK can refuse the right to have personal data rectified or erased. The right to compensation for moral damage when personal information is used inappropriately is also restricted.

These powers and rights are protected under the EU Data Protection Directive and must also apply in the UK. As expressed in today’s reasoned opinion, the Commission wants the UK to remedy these and other shortcomings."

"Data protection authorities have the crucial and delicate task of protecting the fundamental right to privacy. EU rules require that the work of data protection authorities must not be unbalanced by the slightest hint of legal ambiguity. I will enforce this vigorously," said Vice-President Viviane Reding, Commissioner for Justice, Fundamental Rights and Citizenship. "I urge the UK to change its rules swiftly so that the data protection authority is able to perform its duties with absolute clarity about the rules. Having a watchdog with insufficient powers is like keeping your guard dog tied up in the basement."

I've mentioned before that I think the main reason why PETs are not used is because regulators can't monitor systems properly for security or other issues, and can't make data controllers use "privacy by design" technologies, so this is an interesting development. We'll see whether the UK does let the dogs out!


Monday, 14 June 2010

Google, Microsoft, Yahoo - search data retention periods table

For convenience, here's the current position on the main search engines' retention and deletion or "anonymization" of search query data, extracted from my much longer blog setting out the history of the discussions between the main search engines and EU data protection regulators (and explaining IP addresses and the last octet, cookies and hashing).

The EU regulators, i.e. Article 29 Working Party, ideally want all personal data related to search queries to be deleted (or rendered fully and effectively anonymous) after 6 months. They want more info on the hashing techniques used, which should be independently audited, before they can be satisfied as to their effectiveness to anonymise personal data. And, of course, hashing or substitution of other identifiers doesn't stop searches from being linked across sessions to the same person or at least same computer or browser.
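To illustrate why hashing alone doesn't break the link between sessions, here is a minimal sketch. It assumes a keyed one-way hash (HMAC-SHA256 from Python's standard library) applied to a cookie ID. The key, the cookie values and the function name are made up, and nothing here is meant to describe what any search engine actually does.

```python
import hashlib
import hmac

# Hypothetical secret key, for illustration only.
SECRET_KEY = b"hypothetical-secret-key"


def pseudonymise(cookie_id: str, key: bytes = SECRET_KEY) -> str:
    """Replace a cookie ID with a keyed one-way hash of it."""
    return hmac.new(key, cookie_id.encode("utf-8"), hashlib.sha256).hexdigest()


# The same (made-up) cookie seen in two different sessions.
token_monday = pseudonymise("cookie-abc123")
token_friday = pseudonymise("cookie-abc123")
# A different browser's cookie.
token_other = pseudonymise("cookie-xyz789")

print(token_monday == token_friday)  # same cookie gives the same token
print(token_monday == token_other)   # different cookies give different tokens
```

The original cookie value can't easily be recovered from the token, but because the same cookie always maps to the same token, searches made weeks apart can still be tied to the same browser.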

If anyone has more up to date info on the following, please let me know.

Search data retention periods

Google
  • IP addresses: Last octet only of IP address is deleted after 9 months. (As mentioned previously, that's like deleting the building number from a street address where 255 other people live on the same street.)
  • Cookies: 18 months.
  • Notes: Not clear if cookies are deleted or hashed after 18 months. Probably deleted.

Microsoft
  • IP addresses: Deleted after 6 months.
  • Cookies: Hashed possibly immediately after search, but it seems certainly after 6 months (registered users) or 18 months (unregistered users). Deleted after 18 months.
  • Notes: Other cross session identifiers are deleted after 18 months.

Yahoo
  • IP addresses: Deleted after 90 days (i.e. 3 months).
  • Cookies: 1-way secret hash applied to cookies of unregistered users (and to registration identifiers of registered users); then 50% of the hashed registration identifiers deleted or truncated.
  • Notes: Not clear exactly when cookies are hashed - probably only after 90 days, rather than immediately after the search?
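For the technically minded, deleting the last octet of an IP address can be sketched as follows, using Python's standard ipaddress module. The function names are mine, and the address is from a range reserved for documentation, not a real user's.

```python
import ipaddress


def drop_last_octet(address: str) -> str:
    """Zero the final octet of an IPv4 address."""
    network = ipaddress.IPv4Network(f"{address}/24", strict=False)
    return str(network.network_address)


def candidate_count(truncated: str) -> int:
    """Number of full addresses that share a truncated address."""
    return ipaddress.IPv4Network(f"{truncated}/24").num_addresses


truncated = drop_last_octet("192.0.2.57")
print(truncated)                   # 192.0.2.0
print(candidate_count(truncated))  # 256
```

So a truncated address only narrows the searcher down to one of 256 possible addresses - the person concerned plus the 255 "neighbours" on the same street.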


Digital privacy - ICO personal information online Code of Practice - ICO summary of responses, & Code due out July

A few months ago the UK Information Commissioner consulted on a draft Personal Information Online Code of Practice.

The ICO have recently released a summary of the main consultation issues and the responses they received to the consultation (over 200 responses). It seems the changes are considerable, but the ICO don't appear to have provided the text of the redraft following the consultation, perhaps because the changes are mainly explanatory rather than substantive.

The final Code is to be issued, presumably effective immediately, quite soon - in July 2010 - and will be available online including in PDF format.

Key points at a glance -

  1. Scope - the Code applies to anyone processing personal information online - ISPs, websites, businesses, consumers too. There will be additional material aimed specifically at SMEs and individual users of online services.
  2. Law vs. best practice - the draft Code will be revised to provide a better explanation of the relationship between the code and the Data Protection Act (DPA).
  3. Key terms - there'll now be a glossary plus explanatory material showing the various roles of the organisations that collect personal data online and deliver content to service users. They've revised the section on internet-based computing and more clearly defined the terms used.
  4. Personal data - the Code now states clearly the ICO view that in many cases IP addresses will be personal data, and that the DPA will therefore apply. "We continue to recognise the practical difficulties in complying with all aspects of the DPA with respect to non-obvious identifiers."
  5. Vulnerable users, children - meaning of "vulnerable" clarified, and specific reference to non-English speakers deleted. They've also expanded and clarified the section dealing with children, making it clearer when parental consent for collection of information about children is needed and in what form.
  6. Online marketing - revised extensively following further consultation with industry experts. "Online marketing and advertising is now explained clearly, and will be supplemented with a series of visual demonstrations of the processes involved."
  7. Security - the ICO is revising its security guidance and advice on securing personal data, but that work won't be finished till after the Code is published. However, the Code will now have a section on Privacy Enhancing Technologies and a simple security checklist.
  8. Example scenarios - examples will be added to illustrate both good and bad practice, and additional material too to help organisations with compliance issues.
  9. Data protection kitemark scheme - was suggested by some respondents; the ICO said they can't commit to introducing such a logistically complex scheme, but "will give it serious consideration".

©WH. This work is licensed under a Creative Commons Attribution Non-Commercial Share-Alike England 2.0 Licence. Please attribute to WH, Tech and Law, and link to the original blog post page. Moral rights asserted.

Saturday, 12 June 2010

Search engines, data protection & Article 29 Working Party - table summary - data retention, anonymization & the last octet, and hashing

Search data can be very personal. The search engines can get to know an awful lot of info about a person and their doings from their search queries.

I've created a table summarising the current positions of the EU Article 29 Working Party and the search engines on search query data, based on the public Article 29 Working Party letters about the various exchanges between the Working Party and the 3 main search engines Google, Microsoft Bing and Yahoo! (the latest being letters sent by the Working Party a few weeks ago, in May 2010).

You'll see from the table that Microsoft seems to have engaged most with the Working Party (comprising the EU data protection regulators) and Yahoo have gone the furthest, but in all cases the Working Party wants personal data to be deleted ASAP whereas the search engines want to retain it for as long as they can, and also they want to "anonymize" the data rather than delete it completely - usually by substituting another identifier for an IP address or cookie.

So it's not just a question of how long the search engines keep personal data, but also, if they don't delete the data, how well they scrub the data of identifying information, i.e. the quality of the anonymisation techniques used, not to mention transparency about the search engines' hashing or anonymisation techniques. This will clearly be a big issue for the future.

And, as the WP have said, even if you get rid of all other data to do with a query, the search terms themselves can be identifying - e.g. of someone who does an egosearch - and if you can link the search queries made in different sessions to the same person, whether through the searcher having the same IP address, cookie or an artificial substitute ID number as an "anonymous ID", you can still put them together to identify the individual concerned. As the WP said, the capacity to link individual searches may reveal enough personal data to identify an individual data subject.

I've taken the liberty of framing some of the following in the form of "WP said, Google said" for ease of reading - obviously they're informal paraphrases. And I've not included everything from the WP letters, just some key points. (I've outlined IP addresses, last octet, cookies and hashing at the end of this blog, for anyone not familiar with them.)

Article 29 Working Party WP148 opinion 1/2008, April 2008 to search engines

Personal data related to search queries is very sensitive, and search history should be treated as confidential personal data.

    Note - see the well known Google video, above, giving away lots of info about a (presumably fictional) person's life - his story, literally - just from his search queries.

The retention period shouldn't be longer than necessary for the specific purpose, and then the data should be deleted.

Even if IP address or cookie is replaced by a unique identifier, the individual can still be identified by correlating stored queries.

Responses from search engines (notably at hearings of the search engines with the WP in February 2009)

Google

IP addresses - "anonymized" after 9 months by deleting the last octet.

    Note - deleting the last octet is like deleting the building number from your street address (building, street, city, country) where 255 other people live on the same street.

Cookies - kept for 18 months.

Microsoft

Search history is immediately de-identified by storing search logs separately from registration data (name, address etc), and search logs are effectively anonymised.

We're willing to reduce retention periods for cookies and IP addresses to 6 months - if the other search engines do too.

Yahoo

We were going to reduce our retention period to 13 months - but now we'll reduce that to 90 days (with limited exceptions for fraud detection, security, legal obligations.)

IP addresses - last octet is deleted, but for fraud detection a 1-way secret hash is applied to the last octet.

Cookies - a 1-way secret hash is applied to cookies of unregistered users (and to registration identifiers of registered users; then we delete (truncate) 50% of the hashed registration identifiers).

Article 29 Working Party to search engines, October 2009 (see the letters for more details) -

To Google

"Anonymisation" isn't proper anonymisation unless it's fully effective and irreversible.

IP addresses - it should be 6 months, and deleting the last octet isn't good enough.

Cookies - retention of cookies allows correlation of individual search queries, and seems to allow easy retrieval of IP addresses for new queries made in those 18 months.

Caches - it seems your caches are updated nowhere near quickly enough; removal tool needs improvement.

To Microsoft

Well done on the search history anonymisation, leadership medal to you!

But really, you should reduce your retention period to 6 months whatever your competitors do.

To Yahoo

Hashing still doesn't prevent linking / associating different searches.

    Note - if you hash my user ID (e.g. smith) to change it to e.g. h5s, you'll still know my searches are by the same person, i.e. h5s, and you can still correlate them.

50% deletion may not be good enough anonymisation.

How much and for how long are search data kept for litigation or legal obligation purposes?

Search engine response to Article 29 Working Party - these represent the current position of the search engines as at the date of writing

Google

Silenzio, it seems.

So, still the same as before - deletion of last octet of IP address after 9 months, and deletion (or is it just "anonymisation"?) of cookies after 18 months.

Microsoft

OK.

Immediately after a search - we'll de-identify cookies by applying a 1-way hash.

IP addresses - will be deleted after 6 months.

Cookies - will be deleted, together with other remaining cross session identifiers, after 18 months.

    Note - it's said the de-identification procedure and hash are applied to cookies after 6 months (for registered users, if logged in at the time, presumably) or 18 months (for unregistered users), but that seems to contradict de-identification "immediately after a search"?

Yahoo

Fine.

IP addresses - will now be deleted in full, after 90 days - not just the last octet.

Article 29 Working Party to search engines, letters 26 May 2010

Google dominates the EU search market, with 95% market share in some countries.

Fair and lawful personal data processing by search engines is increasingly crucial given audiovisual data and geolocation.

Google's "apparent lack of focus on privacy in this area is concerning".

Using an "anonymous ID" still seems to allow cross-matching of search queries for a long time.

Hashing techniques - you haven't given us sufficient info to assess the technical quality of your anonymisation policy.

Deletion of only part of the data isn't true anonymisation.

And you've not given sufficient details on the hashing, especially of user identifiers and cookies.

You're all still not compliant with data protection law.

You should review your anonymisation claims and make the anonymization process verifiable, preferably by developing a credible audit process involving an external and independent auditing entity.

Microsoft & Yahoo haven't given enough info about their techniques for hashing user identifiers and cookies in order for us to assess how effective their anonymisation policy is. (Google don't seem to hash at all.)

The actual anonymisation techniques used deserve open debate and public scrutiny in light of now well known anonymisation failures [i.e. where supposedly anonymised data was successfully linked back to identified individuals - see more on anonymization techniques].

We're asking the US FTC to examine your behaviour under section 5 Federal Trade Commission Act (unfair or deceptive practices). And we're copying this to the European Commission Vice-President in charge of Justice, Fundamental Rights and Citizenship.

I imagine most people who read this blog will be familiar with IP addresses, cookies and hashing, but for those who aren't, I'm planning to write detailed blogs in due course. Meanwhile -

IP address

The internet address of your computer, or strictly your router (the box provided by your ISP to connect to your phone line or cable). It's automatically recorded by websites like search engines when you visit them. Often your IP address is assigned dynamically by your ISP, e.g. BT or Virgin, so it could (but needn't) change each time you dial up or disconnect and reconnect. If you have a permanent fixed IP address, which usually you have to pay more to your ISP for, then by definition it shouldn't change.

So a website could identify you from your IP address. With the twist that obviously different people might use the same computer, or use different computers connected to the same router, which means that different searches made from the same IP address could in fact be made by different people from the same household or business, rather than by the same person. Alternatively, of course, they could still be by the same person…

An IP (version 4) address consists of a series of 32 binary digits (bits), i.e. 32 ones and zeroes in a row - 4 groups of 8 bits each, hence the references above to "octet". Example - 11010001.01010101.11100011.10010011

A slightly more human-friendly version of an IP address breaks it up into 4 decimal numbers in a row, separated by dots, in what's called dot decimal notation. Many people will have seen an IP address in this form, e.g. 192.168.1.254. Each of the 4 dot-separated numbers can't exceed 255 (as the biggest octet possible, 11111111, equals 255 in decimal).
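For anyone who wants to check the arithmetic, here's a minimal Python sketch of converting between the binary form and dot decimal notation. The function names are my own, purely for illustration, and it only handles IP version 4 addresses.

```python
def binary_to_dotted_decimal(binary_ip: str) -> str:
    """Convert 4 dot-separated groups of 8 bits into dot decimal notation."""
    octets = binary_ip.split(".")
    if len(octets) != 4 or any(len(octet) != 8 for octet in octets):
        raise ValueError("expected 4 dot-separated groups of 8 bits")
    # int(octet, 2) reads the string as a base-2 number
    return ".".join(str(int(octet, 2)) for octet in octets)


def dotted_decimal_to_binary(ip: str) -> str:
    """Convert dot decimal notation into 4 dot-separated groups of 8 bits."""
    # "08b" means: binary, padded with leading zeroes to 8 digits
    return ".".join(format(int(part), "08b") for part in ip.split("."))


print(dotted_decimal_to_binary("192.168.1.254"))
print(binary_to_dotted_decimal("11111111.00000000.00000001.11111110"))
```

The padding to 8 digits matters: a number like 1 has to be written as 00000001 for the octet to be complete.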

With an IP address like 209.85.227.147 (which is 11010001.01010101.11100011.10010011 in binary), deleting the last octet would involve deleting the 10010011, i.e. the 147.
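As a rough illustration of what "deleting the last octet" might look like in practice, here's a short Python sketch using the standard ipaddress module. It assumes the last octet is replaced with zero, which is only one way of doing it - the search engines haven't published exactly how they do it, and the function names here are mine.

```python
import ipaddress


def drop_last_octet(ip: str) -> str:
    """Zero the last 8 bits of an IPv4 address (assumed form of 'deletion')."""
    # Treat the address as a member of its /24 network, i.e. keep the
    # first 3 octets and discard the 4th.
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return str(network.network_address)


def possible_originals(truncated_ip: str) -> int:
    """How many full addresses share the same first 3 octets."""
    return ipaddress.ip_network(f"{truncated_ip}/24").num_addresses


truncated = drop_last_octet("209.85.227.147")
print(truncated, possible_originals(truncated))
```

So the truncated address still narrows things down to one of 256 possible addresses, which is the point of the street address analogy.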

Cookie

A file that can be stored on your computer via your web browser when you visit a website. The site decides what info to store, and it can if it wishes allow third party sites (typically advertisers) to store cookies on your computer too. Cookies can be persistent and last across different visits to the site, even on different days. (Some sites set the expiry date to a ludicrous one like 99 years away!)

When you visit the same site again later it can read the info contained in the cookie it stored, so, quite independently of IP addresses, cookies can enable the site to recognise that the same web browser is visiting it again and, depending on the info you gave the site previously and the info stored in the cookie, can even identify you personally. I.e., cookies can be personal data.

An advertiser can retrieve its cookie if you browse to another site that has the same advertiser's ads, even if the cookie was stored by the advertiser on your computer while you were visiting a different site. So advertisers can track your visits across different websites.
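To make the cross-site tracking point more concrete, here's a much simplified Python simulation. It isn't real browser or ad server code: the AdServer class, the "ad_id" cookie name and the example site names are all hypothetical, and a plain dictionary stands in for the cookies the browser holds for the advertiser.

```python
import uuid


class AdServer:
    """Hypothetical advertiser whose ads appear on several different sites."""

    def __init__(self):
        # cookie ID -> list of sites on which this browser was shown an ad
        self.visits = {}

    def serve_ad(self, browser_cookies: dict, site: str) -> None:
        # Read the advertiser's own cookie back if the browser already has it,
        # otherwise store a new one with a random ID.
        cookie_id = browser_cookies.get("ad_id")
        if cookie_id is None:
            cookie_id = uuid.uuid4().hex
            browser_cookies["ad_id"] = cookie_id
        self.visits.setdefault(cookie_id, []).append(site)


ads = AdServer()
browser = {}  # the cookies this one browser holds for the advertiser
ads.serve_ad(browser, "news.example")
ads.serve_ad(browser, "shop.example")
print(ads.visits[browser["ad_id"]])
```

The advertiser never needs to know who you are to build this list - the same cookie ID turning up on both sites is enough to link the visits.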

I'll leave flash cookies and the like to another blog.

Hashing

Applying a particular mathematical process or procedure to something, e.g. a name or a file, to produce something else, usually a shorter series of letters and numbers that is, ideally, unique to the original.

Here's a highly unrealistic example. Say my extremely original and exciting hash for a name involves counting the number of letters in the name, taking the 1st letter and last letter, and then combining them in the order: last letter, number of letters, and 1st letter. Applying this never seen before (nor likely to be seen again) "T&L hash" to the name "smith", I'd get "h5s".

Of course in practice it's much, much more complicated and cryptic (and effective!) than that, but you get the drift.
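For the curious, the toy "T&L hash" described above can be written in a few lines of Python. The function name is my own, and this is strictly for illustration - it's nothing like a real hash function.

```python
def tl_hash(name: str) -> str:
    """Toy hash: last letter, then number of letters, then 1st letter."""
    if not name:
        raise ValueError("name must not be empty")
    return f"{name[-1]}{len(name)}{name[0]}"


print(tl_hash("smith"))
print(tl_hash("sarah"))
```

Both names come out as "h5s", which shows one reason why the toy version is so unrealistic: different inputs collide far too easily.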

A "secret" hash simply means, well, it's a secret isn't it. The process used is kept secret.

And "1-way" means the procedure is irreversible, you can only go 1 way, you can't work out from the result what the original name (or whatever) was. From "h5s" you might, if you knew the process I used, figure out that the original name was "s---h", but that won't give you the middle 3 letters. (Though you could guess them based on your independent background knowledge if you knew it was an English surname, of course.)
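Here's a slightly more realistic sketch in Python, to show why hashing on its own doesn't stop searches being linked together. It uses HMAC-SHA256 from the standard library as a stand-in for a "1-way secret hash", with a secret key playing the part of the secret - that's my assumption, as the search engines haven't said what they actually use. The key, function names and sample queries are all made up.

```python
import hashlib
import hmac
from collections import defaultdict

# Assumption: a made-up key standing in for the "secret" part of the hash.
SECRET_KEY = b"hypothetical-secret-key"


def pseudonymise(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Replace an identifier (user ID, cookie value) with a keyed 1-way hash."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def group_queries(log):
    """Group search queries by hashed identifier, as a log analyst could."""
    grouped = defaultdict(list)
    for identifier, query in log:
        grouped[pseudonymise(identifier)].append(query)
    return dict(grouped)


log = [
    ("smith", "john smith london"),
    ("jones", "cheap flights"),
    ("smith", "divorce lawyers near me"),
]
for hashed_id, queries in group_queries(log).items():
    print(hashed_id[:8], queries)
```

The same identifier always produces the same hash, so both of smith's searches end up under one "anonymous ID" even though the name itself is gone - and an egosearch among them could then give the game away.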


EDPS - recent & forthcoming work

The European Data Protection Supervisor has been busy lately.

1. Recent publications especially Lisbon Treaty speeches

include -

2. Priorities and future opinions

Opinions are going to be issued in 2010 on the following subjects (from the EDPS June 2010 Inventory; see the overview of priorities):

Data protection legal framework

  • Revision of the EU data protection regime established in particular by the data protection Directive and Framework Decision 2010/JLS/279 - including further defining the concepts of 'controller' and 'processor' and clarifying the notion of 'accountability' and the issue of applicable law and jurisdiction
  • Communication on data protection - A strategy for the protection of the fundamental right to data protection after the entry into force of the Lisbon Treaty 2010/JLS/166

Freedom, security & justice; international data transfers

  • Recommendations from the Commission to the Council to authorise the opening of the negotiations
    • on Passenger Name Records -
      • with the US on the transfer and processing of PNR data. 2010/JLS/286
      • with Australia on the transfer and processing of PNR data. 2010/JLS/287
      • for an agreement between the EU and Canada on the transfer and processing of passenger name record (PNR) data to prevent and combat terrorism and other transnational serious crime, including organised crime 2009/JLS/207, & Agreement between the EU and Canada on the exchange of PNR data 2010/JLS/038
    • with the US on data protection and data exchange 2010/JLS/165
  • Data retention - Report from the Commission to the Council on the implementation of the Directive 2006/24/EC on the retention of data generated or processed in connection with the provision of publicly available electronic communications services or of public communications networks and amending Directive 2002/58/EC 2010/JLS/043
  • Identity, biometrics, biodata -
    • Framework for Electronic identity (eID) and authentication 2010/INFSO/021
    • FP7 Integrated project TURBINE (TrUsted Revocable Biometric Identities)
    • Guidelines on EU coding system on tissues and cells 2009/SANCO/011 (Directive 2004/23/EC on human tissues and cells)
  • Selected other issues
    • Legislative proposal for a regulatory framework on smart grids (issues on the establishment of smart grids, including data protection, electric vehicles and open access to the grids)
    • Communication on Privacy and Trust in the Ubiquitous Information Society – responding to emerging and existing challenges 2008/INFSO/018
    • European Digital Agenda 2010/INFSO/001 (there's already been an EDPS opinion on privacy in the digital age to feed into that)
    • Commission Communication on policy and instruments for a reinforced network and information security 2010/INFSO/019 & 2010/INFSO/020
    • Revision to Regulation (EC) No 831/2002 on access to confidential data for research purposes
    • Proposal for a Council Decision on a Union position within the EU-Japan Joint Customs Cooperation Committee concerning the mutual recognition of Authorised Economic Operator programmes in the European Union and in Japan, COM(2010) 55 final


Thursday, 10 June 2010

Digital rights conference, London

The Open Rights Group are holding ORGCON, a 1-day conference in London, on Saturday 24 July 2010, covering UK digital rights issues such as copyright, data protection, data retention and freedom of information.

There's an excellent lineup including leading SF author and writer Cory Doctorow, and Heather Brooke the journalist who was instrumental in exposing the MP expenses scandal. The talks will feature -

Plus "training sessions about how to lobby your MP and more volunteer workshops. There'll also be discussions on the state of UK politics after #GE2010 and why this is a key moment to push harder for reform on digital issues from surveillance to copyright to DRM."

Sounds very good value as it's free to attend if you join ORG, £5 for existing supporters, and £10 for anyone else - so if you're interested, sign up for ORGCON.

Note - some may think I'm a little biased as I support ORG and used to know Cory. But I do think it's worth attending for anyone interested in these topics, especially when compared with the fees charged for most academic / business conferences.


Article 29 Working Party letters to Facebook, other social networking sites

I've updated my previous blog post on the EU Article 29 Working Party criticisms of Facebook and other social networking sites that had signed the Safer Social Networking principles, to add links to the texts of the letters sent out by the Working Party.

The only difference seems to be that in their Facebook letter, the Working Party added at the end:
"In this context the Working Party finds it unacceptable that your company has chosen to fundamentally change the default settings on the Facebook platform only days after the hearing of SNS providers on 30 November 2009 in which your representatives took part.
Article 29 Working Party is also concerned at subsequent changes made by Facebook which also give rise to data protection and privacy concerns. In the EU, data protection and privacy are fundamental rights which must be respected and taken into account especially by operators like you."


UK government - views by end today

It'll be interesting to see if this ends up being merely e-democracy theatre - and to see who replies, and to what extent their views are taken on board - but we have until the end of today UK time, Thursday 10 June 2010, to comment on the new UK coalition government's programme.

Name and email are required to post a comment, but it looks like you can put down a pseudonym if you wish.

There seem to have been hardly any comments on the Digital Economy Act so far.

They've not provided a search facility for the site, so here's a form I made earlier which searches the site via Google. Feel free to use it to check out the site content or comments made so far on the programme. (Please click the Search button rather than hit Enter)


