Thoughts: P2P, PET+, and Privacy Literature

Peer-to-peer (P2P) technologies are not new and are unlikely to disappear anytime soon. While I’m tempted to talk about The Pirate Bay, or ‘the Pirate Google‘, in the context of P2P and privacy, other people have discussed these topics exceptionally well, and at length. No, I want to talk (in a limited sense) about the code of P2P and how these technologies are (accidentally) used to share personal information, as a way of reflecting on what the privacy literature might offer to the debate concerning the regulation of P2P programs.

I’ll begin with code and P2P. In the US there have been sporadic discussions in Congress about requiring P2P companies to alter their UIs to make it more evident what individuals are, and are not, sharing on the ‘net when they run these programs. Matthew Lasar at Ars Technica has noted that Congress is interested in cutting down on what is termed ‘inadvertent sharing’ – effectively, members of Congress recognize that individuals have accidentally shared sensitive information using P2P applications, and want P2P vendors to design their programs in a way that limits the accidental sharing of personal/private information. Somewhat damningly, the United States Patent and Trademark Office declared in 2006 that P2P applications were “uniquely dangerous,” and capable of causing users “to share inadvertently not only infringing files, but also sensitive personal files like tax returns, financial records, and documents containing private or even classified data” (Source).

In calling on P2P coders to redesign their UIs so that they limit or prevent individuals from sharing personal information, Congress is, in a sense, calling for coders to adopt Ann Cavoukian‘s Privacy-Enhancing Technologies Plus (PET+) model. Cavoukian advocates that we move beyond seeing privacy and surveillance technologies as being in a zero-sum relationship; it should be possible for these two kinds of technology to happily co-exist. In terms of P2P, we can imagine the programs as ‘surveillant’, insofar as they watch directories for files to share with the world. Rather than broadly sharing folders and their contained files, they could (for example) target specific locations where commonly shared files are routinely found: to share your music, the P2P application would look for iTunes directory structures, as well as the more general ‘My Music’ folder in Windows. This targeted sharing would be coded into the application itself, and sharing an entire hard drive would require authentication. When sharing music folders, if a text file is located among them the user might be challenged to authorize the sharing of that file before the P2P application will serve it to the world at large.
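To make the idea concrete, here is a minimal sketch of what whitelist-based, targeted sharing could look like; the folder paths, extension list, and `prompt_user` callback are illustrative assumptions on my part, not any vendor’s actual design.

```python
from pathlib import Path

# File types the client will share automatically from designated music folders.
AUDIO_EXTENSIONS = {".mp3", ".m4a", ".flac", ".ogg", ".wav"}

# Hypothetical default share locations (Windows 'My Music', iTunes library).
DEFAULT_MUSIC_FOLDERS = [
    Path.home() / "Music",
    Path.home() / "Music" / "iTunes" / "iTunes Media" / "Music",
]

def build_share_list(folders=DEFAULT_MUSIC_FOLDERS, prompt_user=None):
    """Return (shared, held_back) lists of files.

    Audio files in recognized music folders are shared by default; anything
    else (a stray tax return, a text file) is held back unless the user
    explicitly authorizes it via the prompt_user callback.
    """
    shared, held_back = [], []
    for folder in folders:
        if not folder.is_dir():
            continue
        for path in folder.rglob("*"):
            if not path.is_file():
                continue
            if path.suffix.lower() in AUDIO_EXTENSIONS:
                shared.append(path)
            elif prompt_user and prompt_user(path):
                shared.append(path)      # user explicitly opted in
            else:
                held_back.append(path)   # never served without authorization
    return shared, held_back

if __name__ == "__main__":
    # In a real client the prompt would be a UI dialog; here we simply refuse.
    shared, held_back = build_share_list(prompt_user=lambda p: False)
    print(f"Sharing {len(shared)} audio files; withheld {len(held_back)} other files.")
```

The design choice worth noting is the default: anything outside the whitelist is withheld unless the user actively opts in, which is roughly what the ‘inadvertent sharing’ concern seems to be asking for.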

Of course, there are several questions that might immediately jump out:

  1. Can we consider accidental sharing a privacy problem, or is it just a UI/user problem?
  2. What kind of privacy problem would this constitute?
  3. Is the solution PET+, or just better education?

Concerning the first question, as someone who is generally comfortable with technology I’m inclined to say that it’s ‘just’ a user problem; if you read the manuals, follow the instructions, and ask questions about the technology before deploying it, you’re unlikely to accidentally share sensitive files. As such, I would initially be inclined to say that this isn’t a privacy problem but a UI/user problem. Education and a better UI should be enough to alleviate the issues that crop up with P2P programs.

Then I think of my dad and step-mother, who are likely to have accidentally shared files because a P2P UI sucks. They might even agree that they should have learned a bit more about the program before they used it, but would more generally argue that the technology itself should be designed to keep them safe from themselves. Much as we include air bags in cars, something should be included in P2P software to shield users from sharing sensitive files that might facilitate identity theft. My parents would likely want to spin this as a ‘privacy problem’ – but how might we understand it as a privacy problem, and what insights might the privacy literature offer?

Priscilla Regan, in her book Legislating Privacy: Technology, Social Values and Public Policy, argues that privacy is both a social and an individual value. Privacy has three central values for the public/society at large:

  • All individuals value privacy in their lives to some extent;
  • Privacy facilitates public cohesion by allowing individuals to shield some of their differences;
  • Privacy functions as a collective good given that it is (in an economic sense) non-divisible and non-excludable.

Under her account, privacy is something that the public should be interested in for a broad spectrum of reasons and, as it applies to the P2P case, something like PET+ should be implemented at the level of code to ensure that some of individuals’ differences are shielded from public light. IP addresses aren’t anonymous, and as soon as personal data hits a P2P network there is a decent chance it will never again be secluded from the gaze of third parties who have no business knowing an individual’s personal details. Thus, Regan might argue that rather than focusing exclusively on the damage that unintended sharing does to the individual, society as a whole has an interest in seeing P2P applications coded in a way that mitigates the chances of personal details accidentally going online. We need some kind of ‘flow control’ built into the design of these technologies.

Talking about the sharing or not sharing of personal information as ‘flow control’ is not accidental – it is meant, in part, to evade the challenges that arise when we use the language of privacy ‘invasions’ or ‘intrusions’. Privacy advocates have regularly used the language of invasion and intrusion to get people riled up about having their personal information shared in ways they did not approve of. This isn’t new; we can look to how Warren and Brandeis absconded with Judge Cooley’s notion of “being let alone” in their effort to establish legal footing to stop paparazzi from taking pictures of Boston’s elite, as an early piece of work that saw privacy problems as problems of intrusion. They correlated intrusion with physical violations of one’s person (though, of course, the mental intrusion was ‘worse’ than physical touch…) to try and develop a tort-based claim to privacy.

With the benefit of hindsight, we can recognize that torts alone are generally ineffective at addressing contemporary privacy problems. The dependence on showing personal harm (especially in the US) has regularly demonstrated how difficult it is for the law to grasp that something has happened when one’s personal data is released to the wilds of the ‘net. (Example: is the theft of banking data, without that data being used, a privacy violation? Not per most tort claims.) Torts demand injuries, and intrusions demand understanding what kind of a bubble surrounds a person. Neither torts nor intrusions are a particularly comprehensive or fulfilling way of understanding and redressing privacy problems.

The language of something like ‘flow control’ is meant to pick up on some of Helen Nissenbaum‘s work on privacy as contextual integrity. In brief, she argues that we should try to recognize that each situation carries with it particular informational norms that attend to the appropriateness of revealing particular information, as well as norms of distribution that address when information should be transferred between parties. She is attempting to say that even when data is emitted in public, privacy norms accompany that data – even when we share files, we do not necessarily think of those files as ‘public’. When I share my music directory using a P2P application, as an example, the norms of appropriateness and distribution might be interpreted thusly (a rough sketch of how these norms could be encoded follows the list):

  • I am willing to reveal my musical tastes, and the directory structure of my music collection. I may also be willing to share my screen name, and am less likely to reveal my personal information to anyone but associates with whom I build a relationship using any social networking features built into the P2P software. It is inappropriate for the software to disclose any more information than is normally transmitted in the relationship-type I have with the individual(s) I am sharing with;
  • I am willing to distribute music files to other users on the network – I am not willing to share non-music files that were accidentally saved into the directory holding those music files, nor do I authorize the retransmission of those non-music files.
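Here is that rough sketch: a minimal policy check a P2P client might run before serving a file, with norms of appropriateness and distribution kept separate. The context name, information types, and `check_flow` function are hypothetical labels I have chosen to illustrate Nissenbaum’s distinction, not an implementation drawn from her work.

```python
from dataclasses import dataclass

@dataclass
class ContextNorms:
    """Norms of appropriateness and distribution for one sharing context."""
    appropriate_types: set      # information types that may be revealed at all
    distributable_types: set    # types that may be transmitted onward to others

# Illustrative norms for 'sharing music with strangers on a P2P network'.
MUSIC_SHARING = ContextNorms(
    appropriate_types={"music_file", "music_directory_listing", "screen_name"},
    distributable_types={"music_file"},
)

def check_flow(info_type: str, action: str, norms: ContextNorms) -> bool:
    """Return True only if this flow respects the context's norms.

    action is 'reveal' (show to a peer) or 'retransmit' (serve onward).
    A flow that violates either norm is, on this reading of Nissenbaum,
    a privacy problem and should be blocked.
    """
    if info_type not in norms.appropriate_types:
        return False
    if action == "retransmit" and info_type not in norms.distributable_types:
        return False
    return True

# A stray tax return saved into the music folder fails the check:
assert not check_flow("tax_return", "retransmit", MUSIC_SHARING)
# Music files flow freely within the same context:
assert check_flow("music_file", "retransmit", MUSIC_SHARING)
```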

Under this account, a privacy problem is registered when the norms of the situation are not respected. This account recognizes that data is always being shared between parties – data is always flowing – but what we want is the ability to adjust those flows depending on what we are doing and who we are doing it with. The problem that arises out of Nissenbaum’s work is that we are led to ask:

  • What norms might we appeal to in a judicial environment?
  • How ‘contextual’ do we need to get – is ‘contextual’ to be understood at the level of the individual user, at a regional (state/province) level, or at the national level?
  • Should we classify individuals according to their technical proficiency, treat each purely as an individual, or group them in some other way?

While Nissenbaum’s work offers only ambiguous replies to these sorts of questions (at least on my reading), it does fit into a discussion of PET+. Using the PET+ notion outlined above, individuals should be able to fine-tune the level of sharing that they are willing to engage in. Privacy is valued by society (hence the need to code P2P applications in line with a PET+ philosophy) in a way that remains sensitive to individuals’ decisions (context comes into play down to the level of the individual user). Conceiving of the problem holistically, by recognizing privacy as an issue that is both public and private, lets us re-articulate the very discussions that we can have about privacy, including as it pertains to P2P file sharing programs.
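As a small illustration of what fine-tuning the level of sharing could mean in code, here is a sketch of per-user sharing profiles a PET+-minded client might expose; the profile names and defaults are assumptions of mine, not any existing product’s settings.

```python
# Illustrative per-user sharing profiles; a privacy-protective profile is the default.
SHARING_PROFILES = {
    "conservative": {
        "auto_share_extensions": {".mp3", ".m4a"},
        "prompt_for_other_files": False,   # never share non-music files, don't even ask
        "allow_full_drive_sharing": False,
    },
    "standard": {
        "auto_share_extensions": {".mp3", ".m4a", ".flac", ".ogg"},
        "prompt_for_other_files": True,    # ask before sharing anything else
        "allow_full_drive_sharing": False,
    },
    "open": {
        "auto_share_extensions": {".mp3", ".m4a", ".flac", ".ogg"},
        "prompt_for_other_files": True,
        "allow_full_drive_sharing": True,  # would still require re-authentication
    },
}

def settings_for(user_choice: str) -> dict:
    """Return the sharing settings for a user's chosen profile.

    Falls back to the most restrictive profile if the choice is unrecognized,
    so the default keeps users 'safe from themselves'.
    """
    return SHARING_PROFILES.get(user_choice, SHARING_PROFILES["conservative"])
```

Defaulting to the most restrictive profile is the point: the technology protects users who never open the settings screen, while still leaving the decision in the individual’s hands.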

There are other advantages to adopting a ‘flow control’ model – it means that we have a very real way of evaluating how P2P programs are designed. Former privacy commissioners David Flaherty and Spiros Simitis have seen most ‘privacy’ issues as ones of data control. Should P2P vendors not design PET+ into the heart of their programs, and thus undermine privacy at the individual and societal level, then legislation should come crashing down on them on the basis that the vendors have neither adequately enabled individuals to ‘control’ their data nor respected legally recognized socio-juridical privacy norms. This isn’t to say that these two regulators (and others, such as Cavoukian) would not develop a far more nuanced understanding of control as it relates to P2P and the sharing of personal information, but merely to suggest that the Commissioners’ accounts and understandings of regulating privacy can be taken alongside the social and individual understandings of privacy.

Perhaps most significantly, if we can see the deficiencies of P2P applications as generating privacy problems, we open up a space for regulatory bodies concerned with privacy to try and force changes. While I recognize that focusing on Commissioners with regulatory powers means that the US isn’t captured by this discussion, it does mean that Canada and the EU (and all of the P2P vendors in these nations) would be subject to privacy commissioners’ weight. There is at least some value in this approach, and perhaps a similar argument could be made in terms of the FCC or another regulatory body in the US (though I’m less certain on this point). Ultimately, I see it as a conceptual win-win to draw the idea of PET+ and Commissioners’ regulatory powers into the P2P discussion, along with a flow-based understanding of privacy that recognizes privacy as a value to both individuals and societies more broadly. How to actually work this out in policy, of course, is a separate matter entirely…