Data Retention, Protection, and Privacy

Data retention is always a sensitive issue: what is retained, for how long, under what conditions, and who can access the data? Recently, Ireland’s Memorandum of Understanding (MoU) between the government and telecommunications providers was leaked, providing members of the public with a non-redacted view of what these MoUs look like and how they integrate with the European data retention directive. In this post, I want to give a quick primer on the EU data retention directive, then identify some key elements of Ireland’s MoU and of the Article 29 Data Protection Working Group’s evaluation of the directive more generally. Finally, I’ll offer a few comments concerning data protection versus privacy protection and use the EU data protection directive as an example. The aim of this post is to identify a few deficiencies in both data retention and data protection laws and argue that privacy advocates and government officials should defend privacy first, approaching data protection as a tool rather than an end in itself.

A Quick Primer on EU Data Retention

In Europe, Directive 2006/24/EC (the Data Retention Directive, or DRD) required member-nations to pass legislation mandating retention of particular telecommunications data. Law enforcement sees retained data as useful for public safety reasons. A community-level effort was required to facilitate harmonized data retention; differences in members’ national laws meant that the EU was unlikely to have broadly compatible cross-national retention standards. As we will see, this concern remains well after the Directive’s passage.

The DRD applies only to traffic data and location data, excluding “the content of electronic communications, including information consulted using an electronic communications network” (Art. 1(2)). It is important to note that the DRD extends the definition of traffic data beyond what initially appeared in the EU e-Privacy Directive, which defines traffic data as “any data processed for the purpose of the conveyance of a communication on an electronic communications network or for the billing thereof” (Art. 2(b)). The DRD covers both traffic data and the related data needed to identify users/subscribers. This information is to be disclosed in accordance with the national legislation enacting the DRD. Data is retained for 6-24 months, and includes the following:

  • data necessary to trace and identify the source of a communication;
  • data necessary to identify the destination of a communication;
  • data necessary to identify the date, time and duration of a communication;
  • data necessary to identify the type of communication;
  • data necessary to identify users’ communication equipment or what purports to be their equipment;
  • data necessary to identify the location of mobile communications equipment.

Importantly, yearly statistics on how the DRD legislation is used must be submitted. Per Article 10, this information must include:

  • the cases in which information was provided to the competent authorities in accordance with applicable national law;
  • the time elapsed between the date on which the data were retained and the date on which the competent authority requested the transmission of the data;
  • the cases where requests for data could not be met.
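To make the reporting requirement a little more concrete, here is a minimal sketch of how a provider or regulator might tally the three Article 10 figures. The `DataRequest` record, its field names, and the choice of a median for the elapsed time are my own assumptions for illustration; the Directive does not prescribe any format or aggregation method.

```python
from dataclasses import dataclass
from datetime import date
from statistics import median


@dataclass
class DataRequest:
    """One hypothetical request from a competent authority."""
    retained_on: date   # date on which the requested data were retained
    requested_on: date  # date on which the authority asked for the data
    fulfilled: bool     # False when the request could not be met


def article10_summary(requests):
    """Tally the three figures listed above for a batch of requests."""
    provided = sum(1 for r in requests if r.fulfilled)
    unmet = sum(1 for r in requests if not r.fulfilled)
    # Age of the data, in days, at the moment each request was made.
    ages = [(r.requested_on - r.retained_on).days for r in requests]
    return {
        "cases_provided": provided,
        "cases_unmet": unmet,
        "median_age_days": median(ages) if ages else None,
    }


sample = [
    DataRequest(date(2010, 1, 1), date(2010, 1, 11), True),
    DataRequest(date(2010, 1, 1), date(2010, 3, 2), True),
    DataRequest(date(2009, 1, 1), date(2010, 1, 1), False),
]
summary = article10_summary(sample)
```

The age figure is the interesting one for evaluation purposes: if most requests concern data that is only days or weeks old, long retention periods become harder to justify.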

These statistics play an important role in actually evaluating the (in)effectiveness of the DRD and national laws. The date for evaluating the DRD has actually passed – it was September 15, 2010 – and the Parliament and Council are expected to evaluate the applications of the DRD and its impacts on consumers, citizens, and government. The review will be based on the statistics provided, and results will be made public.

The Irish Leak!

The Irish instantiation of the DRD isn’t really all that controversial in most ways. The Memorandum of Understanding (MoU) between the government and industry ‘partners’ takes note of costs, specifically stating that “[t]his MoU seeks to minimize the costs, time delay and audit requirements of complying with data access requests under the Act and to promote efficient administration of its requirements within the Communications Industry working with agencies of the state” (7). The MoU, and related Act, applies to mobile network operators, fixed line network operators, and ISPs. Any and all data that the public ISPs/telecom operators collect is not to be independently verified by operators. Further, while the retained data (e.g. IP addresses, endpoint information, subscriber information) could be used to roughly guess who might be using a device, the MoU recognizes that this information cannot certify which individual is actually using a device at any particular time.

Data concerning fixed and mobile telephony must be retained for 24 months and ISP information (e.g. Internet Access, Internet email and Internet Telephony) retained for 12 months. Procedures must be established for making and servicing data requests by authorities. Perhaps reflecting both discussions with Irish industry and an acknowledgement of the wider concerns raised by ISPs facing lawful access and retention laws around the world, the parties to the MoU recognize that the development, testing, and deployment of systems needed to comply with the Irish law may impose significant time delays before full compliance is met. Data that is retained as a result of the Irish Act must be disposed of after the above-mentioned times, unless there are independent business motivations to retain data for longer periods of time.
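As a rough illustration of the disposal rule, the sketch below checks whether a record has outlived its retention period. The category names, the `business_need` flag, and the decision to treat the expiry date itself as the first day of disposal are all assumptions on my part; the MoU describes the periods, not an implementation.

```python
import calendar
from datetime import date

# Retention periods in months, as described above. Category names are hypothetical.
RETENTION_MONTHS = {
    "fixed_telephony": 24,
    "mobile_telephony": 24,
    "internet": 12,
}


def add_months(d, months):
    """Return d shifted forward by whole months, clamping to the month's last day."""
    total = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def disposal_due(category, retained_on, today, business_need=False):
    """True when the record should be disposed of under the retention rule."""
    if business_need:
        # An independent business reason lets the operator keep the data longer.
        return False
    return today >= add_months(retained_on, RETENTION_MONTHS[category])
```

A nightly job that runs `disposal_due` over every stored record is the kind of automated deletion that operators would need in order to actually meet these deadlines.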

Of interest, the MoU recognizes that a standard electronic mail and paper form should be developed to identify data requested by authorities and provided by telecommunications groups, but it doesn’t go so far as to require parties to use this form. As we will soon read, this stands in contravention of proposals by the Article 29 Data Protection Working Party.

Working Group Recommendations

This year the Article 29 Data Protection Working Group released a report on the Data Retention Directive, and it was damning. Whereas the DRD was meant to harmonize retention processes and disclosure procedures, this has yet to be evidenced. At the time of the report’s writing very few member states had provided the DRD-required statistics. The Working Group acknowledged the concerns raised about the DRD, writing:

…the availability of traffic data allows disclosing preferences, opinions, and attitudes and may interfere accordingly with the users’ private lives and impact significantly on the confidentiality of communications and fundamental rights such as freedom of expression. These scenarios are unfortunately likely to occur both because of intentional activities and on account of negligent mechanisms.

The Working Group recognizes that in many cases the companies retaining data lack an automated system to delete logs, and thus are out of line with the DRD. Automation is required. Moreover, nations implementing the DRD have generally required ISPs to adopt self-regulatory regimes to be/remain DRD compliant. Such regimes are insufficient because there are considerable imbalances of power between ISPs and law enforcement. In terms of disclosure information, the Working Group has proposed a single data handover format that requires a single contact to provide the following:

  • user data, containing a known, finite number of fields related to service subscription and the terminals made available to users;
  • traffic data, containing known, finite fields;
  • provider code, containing a unique EU-wide ID to identify the communications provider and/or ISP;
  • LEA code, to identify which law enforcement agency (LEA) made the request;
  • judiciary code, to identify the judicial authority requiring the disclosure;
  • timestamp and request number;
  • request type, to specify the data request category (e.g. by serious crime or by amount of requested traffic data).
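To show what a ‘known, finite number of fields’ could mean in practice, here is a minimal sketch of such a handover record. Every field name and the two allowed-field sets are hypothetical placeholders of mine; the Working Group’s proposal lists the categories above but I am not reproducing any actual schema.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical closed sets of permitted fields; a real format would define its own.
USER_FIELDS = frozenset({"subscriber_name", "subscriber_address", "terminal_id"})
TRAFFIC_FIELDS = frozenset({"source", "destination", "start_time", "duration_s", "comm_type"})


@dataclass(frozen=True)
class Handover:
    """One disclosure from a provider to an authority, in a single fixed format."""
    user_data: dict
    traffic_data: dict
    provider_code: str   # EU-wide ID of the provider and/or ISP
    lea_code: str        # which law enforcement agency asked
    judiciary_code: str  # which judicial authority required the disclosure
    timestamp: datetime
    request_number: int
    request_type: str    # e.g. category of crime or volume of data requested

    def __post_init__(self):
        # Reject anything outside the agreed field sets, so disclosures cannot
        # quietly grow to include extra information.
        unknown = (set(self.user_data) - USER_FIELDS) | (set(self.traffic_data) - TRAFFIC_FIELDS)
        if unknown:
            raise ValueError(f"fields outside the agreed format: {sorted(unknown)}")
```

The point of closing the field sets is the same as the point of the proposal: a uniform record makes disclosures auditable and comparable across providers and states, which an optional form does not.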

The Value in Transmission Data

So, no content data is monitored, but what does this mean, really? What can European states and authorities do with the data being retained?

Critically, with traffic data you can map out social networks, fixing each individual’s position within a larger group of associates. While this bit of information is obvious, what is less so follows:

…the position of an agent in the social network is in many ways more characteristic of them than any of their individual attributes. This position determines their status, but also their capacity to mobilize social resources and act (social capital). (Danezis and Clayton 2008: 99).
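A toy example shows how little is needed to start locating people in a network. The sketch below computes degree centrality, one simple measure of position, from nothing but source and destination pairs of the kind the DRD requires to be retained. The function name and sample records are mine, and degree centrality is only one of many measures an analyst might use.

```python
from collections import defaultdict


def degree_centrality(records):
    """Map each party to the share of other parties it has been in contact with.

    `records` is an iterable of (source, destination) pairs; no content is used.
    """
    contacts = defaultdict(set)
    for src, dst in records:
        if src != dst:
            contacts[src].add(dst)
            contacts[dst].add(src)
    n = len(contacts)
    if n < 2:
        return {node: 0.0 for node in contacts}
    return {node: len(peers) / (n - 1) for node, peers in contacts.items()}


# Hypothetical call records: repeated contacts count once.
records = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "D"), ("A", "B")]
centrality = degree_centrality(records)
most_connected = max(centrality, key=centrality.get)
```

In this sample, B is in contact with everyone else and so stands out as the hub of the group, without a single word of any conversation being known.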

Traffic analysis can even be effective in identifying individuals engaged in encrypted (SSH) communications; when working in interactive mode “SSH transmits every key stroke as a packet and, hence, the password length is trivially available.” Further, there is enough variability in typing patterns themselves to plausibly identify particular users with enough traffic data. Remember: no content is ‘touched’ or captured in this kind of analysis. Such approaches work well enough on civilian communications because they’re not permanently encrypted, nor are there persistent levels of traffic. As a result, it is possible to capture information from packet payloads if required and an attacker can identify when encrypted communications are taking place (itself a significant piece of information). (As a sidenote: Sensitive military communications address this eavesdropping problem by having ongoing streams of encrypted data traffic between nodes of networks; when someone transmits actual (i.e. non-noise) data between nodes it is effectively invisible amongst the stream of ongoing encrypted data.)
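The keystroke observation can be sketched in a few lines. The toy model below takes only packet timing, size, and direction, counts the small client-to-server packets in a captured burst, and records the gaps between them. The `Packet` type, the 100-byte threshold, and the sample capture are assumptions for illustration, and a real analysis would need to handle noise, retransmissions, and protocol details that I ignore here.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    t: float        # capture time in seconds
    size: int       # bytes on the wire; the payload is never read
    outbound: bool  # True for client-to-server traffic


KEYSTROKE_MAX_SIZE = 100  # assumed upper bound for a single-keystroke packet


def keystroke_profile(packets, max_size=KEYSTROKE_MAX_SIZE):
    """Return (number of likely keystrokes, gaps in seconds between them)."""
    times = [p.t for p in packets if p.outbound and p.size <= max_size]
    gaps = [round(later - earlier, 3) for earlier, later in zip(times, times[1:])]
    return len(times), gaps


# Hypothetical capture of someone typing a short password over an interactive session.
capture = [
    Packet(0.00, 84, True),
    Packet(0.21, 84, True),
    Packet(0.35, 84, True),
    Packet(0.40, 1400, False),  # large inbound packet, ignored
    Packet(0.62, 84, True),
]
length, gaps = keystroke_profile(capture)
```

The count stands in for the password length, and the sequence of gaps is the raw material for the typing-pattern identification mentioned above.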

That ‘only’ traffic data is captured assumes that we can make a clear distinction between the ‘administration’ versus ‘content’ of digital communications; information in packet headers is administrative whereas information in payloads constitutes content. Unfortunately, this ‘clear’ distinction is misleading given the capacities of data mining as it pertains to traffic information. Using pattern-based mining techniques (in comparison to subject-based datamining) it is possible to leverage theories of data linkages’ predictive power to identify suspicious individuals. When behavioural marketers use this kind of information they lack any kind of necessary marker – they don’t necessarily know that they have a certified piece of authentic information about a person’s identity – whereas the breadth of the DRD and its national instantiations include requirements for subscriber information. As a result, it is possible to map where individuals reside in broader social networks, where they traverse online, who they communicate with, and so forth. Lacking access to the content of communications isn’t necessarily the same thing as being ‘privacy protective’, and given the sheer amount of data transiting around the world each day it’s likely impossible to capture it all anyways. Thus, the ‘limitations’ on what is captured should be recognized as reflective of technical realities rather than the limitations somehow being ‘privacy protective’. The end result is that surveillance, rather than privacy, becomes the necessary default for all online communications barring exceptional circumstances.

Data Protection versus Privacy

Clearly then, while the Irish case is within the boundaries of the DRD and (perhaps) data protection requirements, it is a gross infringement of individuals’ privacy and out of alignment with the Working Group’s recommendations. We should consider the effectiveness of the DRD against the backdrop of legal data protection, as well as the broader issue of privacy.

Legal protections of privacy have proliferated over the past several decades but there is academic uncertainty about their effectiveness. Writing in 1997, Gellman observed that “it is difficult to see whether the law is really an effective device for protecting privacy” (Gellman 1997: 212). This is certainly the case given that the DRD came into being within a comprehensive privacy protection regime. Lyon has suggested that data protection laws promote a “culture of care regarding personal information” (Lyon 2007: 173) as a critique of these laws’ abilities to actually prevent collection of data in the first place. Perhaps depressingly, we might critique his position as being optimistic: it is unclear just how much ‘care’ is provided to personal information gathered during routine data retention given the failures in reporting and standardization around the DRD. While the Working Party didn’t suggest that the DRD was illegal on the basis of the European Convention on Human Rights, Breyer does. Specifically, the DRD’s harms to civil rights are arguably disproportionate to the aims of the legislation in question, it may reduce the sharing of information critical of the government and thus affect freedom of expression, and it puts an undue burden on ISPs (Breyer 2005).

In effect, formal data protection doesn’t seem to be securing privacy as a fundamental right. In fact, the present landscape lends credence to Farrell’s argument that “if an epistemic community of privacy experts helped drive the international convergence on data protection principles at an earlier juncture, officials in justice, home affairs, and security ministries are now playing a similar – but much less privacy friendly – role in driving many pertinent areas of policy” (Farrell 2008: 382). The present epistemic community driving ‘anti-privacy’ initiatives regularly uses the language of ‘balancing interests’ to push through their projects, but the very language used needs to be challenged. Balances often see privacy traded away “in concessions to managing surveillance, rather than restricting it.” As a result, governmental protectors of privacy need to abandon the language of balance and adopt a revised paradigm emphasizing “steering as the essential part of a decision-making process in which balancing is an instrument to be manipulated in the interests of privacy, rather than a desirable outcome at any level” (Raab 1999: 83).

The adoption of balancing as a tool amongst others repositions privacy advocates and commissioners as champions of privacy as opposed to advocates of broader social responsibility; it reorients them as staunch defenders. This isn’t to suggest that protecting privacy is contra to social responsibilities, but merely that advocates’ and commissioners’ tasks are to defend privacy; others will be responsible for making the broader arguments. At the same time, this doesn’t mean that in defending their particular visions of privacy these parties cannot use the tool of balance to better achieve protections. In effect, this would see data protection laws themselves as tools for defending privacy rather than ends in themselves.

In the vein of data protection as a tool, I have to question the accuracy of David Flaherty’s assertion that there is no privacy issue that cannot be satisfactorily addressed “by the application of fair information practices, broadly defined, to include such critically important notions as transparency of data collection and processing to the affected public, the need-to-know principle for personal data sharing, and the crucial importance of audit trails to monitor compliance during and after data transfers, as required” (Flaherty 1999: 35). While Flaherty’s position has the benefit of legal clarity, I worry that fair information practices (FIPs) aren’t necessarily complete enough to independently address the breadth of privacy issues facing individuals (and society) today. Privacy issues are becoming broad enough that seemingly innocuous data can be used for substantial profiling and discrimination purposes. Transparency is certainly important for determining whether a practice is legitimate or not, but the DRD and its national instantiations are relatively transparent. The addition of better access controls and audits wouldn’t alleviate the fact that individuals lack the agency necessary to effect changes in the DRD-mandated surveillance process. Neither does transparency resolve the broader social and constitutional harms that arise as individuals cease seeing themselves as authors and addressees of mass-surveillance law. The capacity to assert agency is as important as access to knowledge, and it is unclear to me how transparency of mass surveillance processes alone facilitates the agency to resist directives like the DRD.


Art. 29 Data Protection Working Party. (2010). “Report 01/2010 (WP172),” available at:

Breyer, Patrick. (2005). “Telecommunications Data Retention and Human Rights: The Compatibility of Blanket Traffic Data Retention with the ECHR,” in European Law Journal 11. 365-375.

Danezis, George and Clayton, Richard. (2008). “Introducing Traffic Analysis,” in Digital Privacy: Theory, Technologies, and Practices. A. Acquisti, S. Gritzalis, C. Lambrinoudakis, and S. Vimercati (eds). 95-116.

Directive 2002/58/EC [The e-Privacy Directive] – link:

Directive 2006/24/EC [The Data Retention Directive] – link:

Flaherty, David. (1999). “Visions of Privacy: Past, Present, and Future,” in Visions of Privacy: Policy Choices for the Digital Age. C. J. Bennett and R. Grant (eds). Toronto: University of Toronto Press. 19-38.

Gellman, Robert. (1997). “Conflict and Overlap in Privacy Regulation: National, International, and Private,” in Borders in Cyberspace. B. Kahin and C. Nesson (eds). Cambridge MA: The MIT Press. 255-282.

Lyon, David. (2007). Surveillance Studies: An Overview. Cambridge, UK: Polity Press.

Raab, Charles. (1999). “From Balancing to Steering: New Directions for Data Protection,” in Visions of Privacy: Policy Choices for the Digital Age. C. J. Bennett and R. Grant (eds). Toronto: University of Toronto Press. 68-93.

Solove, Daniel J. (2008). “The New Vulnerability: Data Security and Personal Information,” in Securing Privacy in the Internet Age. A. Chander, L. Gelman, and M. J. Radin (eds). Stanford: Stanford University Press. 111-136.
