Facebook Got Off Easy: Third-Parties and Data Collection

I’m on Facebook, and have been for years. I also dislike Facebook, and have for several years. I don’t dislike the social networking service because it’s bad at what it aims to do, but because it’s far too good at what it does. Let’s be honest: Facebook does not exist to ‘help me connect to my friends’. Maybe that was its aim when it was first dreamt up, but the current goal of Facebook is to make money from my data. Part of this involves Facebook mining my data, and another (and more significant) part entails third-party developers mining my data. I want to think out loud about this latter group and their practices.

A core issue (amongst several others) that the Office of the Privacy Commissioner of Canada (OPC) raised in their recent findings about Facebook focused on the data that third-party application developers gain access to when an individual installs a Facebook application. Before getting into this in any depth, I just want to recognize the full range of information that application developers can call on using the Facebook API:

…your name, your profile picture, your gender, your birthday, your hometown location (city/state/country), your current location (city/state/country), your political views, your activities, your interests, your musical preferences, television shows in which you are interested, movies in which you are interested, books in which you are interested, your favorite quotes, the text of your “About Me” section, your relationship status, your dating interests, your relationship interests, your summer plans, your Facebook user network affiliations, your education history, your work history, your course information, copies of photos in your Facebook Site photo albums, metadata associated with your Facebook Site photo albums (e.g., time of upload, album name, comments on your photos, etc.), the total number of messages sent and/or received by you, the total number of unread messages in your Facebook in-box, the total number of “pokes” you have sent and/or received, the total number of wall posts on your Wall™, a list of user IDs mapped to your Facebook friends, your social timeline, and events associated with your Facebook profile (Source).

That’s a lot of information, much of it personal. Now, when culling this data from your profile using the API, developers are ethically obliged to hold this information for no more than 24 hours. Data that is input to applications themselves, however, is owned by the third-party developer. Thus, while it’s illegal for American authorities to requisition your public library records, and while application developers can only cache this kind of information for 24 hours if you put it into your profile, once you provide the same information to a third-party Facebook application the data can be held onto forever (and also be subject to law enforcement demands). This means that private corporations can gain insights into users’ lives that are otherwise obfuscated by legal protections, and also that authorities could do end-runs around standing laws to get access to information they’re legally obliged to leave alone. While some might smirk at that last point, I would suggest that law enforcement is increasingly moving to social networking sites to engage in preemptive policing. Police recognize the usefulness of social networking, and we should expect that they will turn to it more and more in the course of investigations over the coming years.

On the positive side, the OPC has noted that there is no need for developers to gain unlimited access to users’ information, and that Facebook should implement technological safeguards to limit the range of data that developers can harvest. The Commissioner’s office recognizes that there are few/no measures presently in place that limit the transfer of cached personal information to secondary servers, which enables developers to harvest information and retain it for indefinite periods of time. I know of researchers who have exploited this technological deficiency in the course of their work, and see no reason why third-party commercial bodies wouldn’t have engaged in similar practices. Such practices need to stop, and Facebook should be made responsible for stopping them.

While the OPC is to be commended for recognizing that third-party developers can access and manipulate data, I worry that the Office has not gone far enough. I would have liked to see a strident demand that Facebook force developers to more clearly identify what information they collect and to establish deletion/opt-out systems similar to those that the OPC insists Facebook implement for its own services. Of course, the challenge with such a suggestion is that Facebook is not a third-party developer, but is developing the ecosystem. Facebook’s response to a demand that third-parties be subject to the OPC’s suggestions is likely along the lines of, “We shouldn’t be responsible for the applications that live in our ecosystem, any more than Microsoft is responsible for the applications that are written for the Windows ecosystem.”

This said, Facebook is in the practice of encouraging the disclosure of personal information for commercial benefit and, as such, I think that they have (at the very least) ethical requirements to limit how this information is transmitted and used. Facebook has already encouraged developers to use collected data in ethical ways; why not integrate mandatory data transfer controls into their freely available API? There are almost one million developers right now; a code of ethics alone is insufficient to guarantee compliance with Facebook’s privacy expectations. This is one area where I agree with Lessig: code should be used to enforce norms as law, and Facebook is fully capable of such enforcement.
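To make the idea of enforcing norms in code a little more concrete, here is a minimal sketch of what a platform-side control might look like. It is not Facebook’s API: the names `ScopedGrant`, `release_profile_data`, and `RETENTION_SECONDS` are hypothetical, as is the assumption that a user approves a specific set of profile fields per application. The sketch releases only the approved fields and refuses requests once the approval is older than the 24-hour window.

```python
import time
from dataclasses import dataclass, field

# Assumption: the retention window mirrors the 24-hour caching limit.
RETENTION_SECONDS = 24 * 60 * 60


@dataclass
class ScopedGrant:
    """Hypothetical record of the profile fields a user approved for one app."""
    app_id: str
    allowed_fields: frozenset
    issued_at: float = field(default_factory=time.time)

    def expired(self, now=None):
        """Return True once the grant is at least RETENTION_SECONDS old."""
        now = time.time() if now is None else now
        return now - self.issued_at >= RETENTION_SECONDS


def release_profile_data(profile, grant, requested_fields, now=None):
    """Return only the fields that were both requested and approved.

    Raises PermissionError when the grant has expired, so an application
    has to ask the user again instead of relying on stale access.
    """
    if grant.expired(now):
        raise PermissionError(f"grant for {grant.app_id} has expired")
    permitted = grant.allowed_fields & set(requested_fields)
    return {name: profile[name] for name in permitted if name in profile}


# Illustrative usage with made-up data: the quiz app asks for two fields
# but the user only approved one of them.
profile = {"name": "Alice", "birthday": "1980-01-01", "political_views": "private"}
grant = ScopedGrant("quiz_app", frozenset({"name"}))
released = release_profile_data(profile, grant, ["name", "political_views"])
print(released)
```

A control like this only limits what leaves the platform in the first place. It cannot, by itself, stop a developer from copying released data to a secondary server, which is why it would complement rather than replace policy and auditing.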

I maintain that when a technology is provided, developers and policy makers should be responsible for the technology’s intended uses. If atomic energy is developed for the express purposes of destruction, then the developers of that technology and/or policy makers should be held responsible for those expected and intended effects of the technology. Facebook is intentionally developing a marketplace that trades in people’s personal information, and I’d argue that few of those people recognize the full implications of indiscriminately feeding their personal information to application developers. When a user inputs personal information into a quiz, for example, the application developer can then retain that information beyond the 24-hour Facebook data retention period. Moreover, this data can be monetized – it can be sold to other third-parties that are outside of the formal Facebook ecosystem. This means that the ‘private space’ that many users think that they are playing in (e.g. applications and the games they let people play) can actually be used to discriminately provide other products and services in non-Facebook environments.

The common reaction that I hear, at this point, is that many application developers have privacy protection clauses that clearly note that developers will not share the information that they collect. It’s important to note that if a business goes bankrupt, its assets are sold off in an effort to recoup outstanding debts. This includes databases the company has developed, even if they hold personal information (in fact, unique databases of personal information can be particularly useful in appeasing creditors haunting dying businesses). Moreover, given that many applications permit sharing information amongst corporate affiliates, there isn’t always a need to share collected data with third-parties that aren’t affiliated with the application developer’s studio. If General Electric (for example) is ultimately a corporate controller of 1,000 application development studios, then whatever data those studios collect could be funneled to GE businesses without violating any of the developers’ privacy policies, and without the user having a real notion of which businesses can actually make use of their personal information. This is worrisome, and an increasingly troubling issue as large conglomerates enter the social networking space to develop ‘relationships’ with consumers.

The challenge for the OPC, or any regulator for that matter, is that typically each of these third-parties would need to be examined individually. I would argue that with the limited resources available to privacy regulators, regulators should focus on Facebook to limit (or at least make users aware of) third-parties’ uses of personal data. Failing to do so means that the leeches of the Facebook ecosystem, those that are potentially the sneakiest and most likely to upset or injure an individual’s life, are free to continue exploiting individuals’ ignorance over the full range of possible harmful uses of their personal information. This isn’t right, and some regulatory body needs to step in and add some salt to the Facebook ecosystem.

Without adding a little salt, Facebook will have gotten off easy, insofar as they can leave application development to its own devices. Applications are a core reason why Facebook is so ‘sticky’; as long as app development remains the wild-west situation that it is presently, a core driver of the Facebook ecosystem will continue to operate in a more or less unregulated fashion, without anyone addressing the potential harms that emerge from this lack of regulation. I should note that, in making these criticisms, I don’t want to suggest that the OPC’s recommendations have not resulted in significant changes, or that they do not blunt many practices that are of concern. Unfortunately, as I read the OPC’s decision, their recommendations don’t really address the 500 lb. elephant in the room, which means that some other regulator needs to step in and follow the OPC’s lead in mediating the worrisome practices of social networking sites and their affiliated developers.