Thoughts: Deep Packet Inspection and Copyright Protection

In Lessig’s most recent book, Remix, he avoids directly endorsing any particular method of alleviating the issues with copyright infringement. Rather, he notes that models have been proposed to alter how monies are collected for copyright holders. I want to briefly attend to the notion that file signatures can be used to identify particular copyrighted works, and to how deep packet inspection (DPI) could be used to facilitate this identification process.

The idea of using file signatures to track the movement of copyrighted files goes like this: when you create a work that you want to have copyrighted, the work is submitted to a body responsible for maintaining records of copyrighted works. We can imagine that this body could be a national library. When the library receives the work, it creates a unique signature, or hash code, for it. This signature is stored in the national library’s database, and is known to the copyright holder as well. We can also imagine choosing what kind of signature a copyrighted work should have – there could be a full-stop copyright, a share-alike non-commercial style copyright, and so forth. By breaking copyright up in this fashion, it would be possible to identify at a finer grain how content can and should be used.
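To make the registration step concrete, here is a minimal sketch in Python. Everything named in it (the Registry class, the Registration record, the license labels) is hypothetical and invented for illustration, and it assumes the "signature" is simply a SHA-256 hash of the whole file, which is only one way such a scheme might work.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Registration:
    """One hypothetical record in the library's database."""
    title: str
    holder: str
    license_type: str  # e.g. "all-rights-reserved", "share-alike-noncommercial"


class Registry:
    """Illustrative stand-in for a national library's signature database."""

    def __init__(self):
        self._records = {}

    def register(self, content: bytes, title: str, holder: str, license_type: str) -> str:
        # Assumption: the signature is the SHA-256 hash of the full file contents.
        signature = hashlib.sha256(content).hexdigest()
        self._records[signature] = Registration(title, holder, license_type)
        return signature

    def lookup(self, signature: str):
        # Returns the Registration, or None if the signature is unknown.
        return self._records.get(signature)


registry = Registry()
thesis_signature = registry.register(
    b"contents of my MA thesis",
    title="MA thesis",
    holder="the author",
    license_type="share-alike-noncommercial",
)
```

The copyright holder keeps the returned signature, and anyone who later computes the same hash from a file can look up which license type was attached to it.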

Now, when a work is digitally transmitted, it would be possible to identify the signature that is encoded into the media file. Thus, when a copy of my MA thesis is sent from one person to another, it would be possible to identify the file’s signature as it is being transmitted and correlate that with information held in the national library’s database. The question, however, becomes: where does the data holding the signature lie? If we presume that the metadata would be held once in the file – rather than repeated in full in each of the file’s packets – then it would be necessary to use deep flow capture technologies to gather the file’s contents, identify the signature, and then either notify the appropriate body that the file was being transferred or make a record of the transfer. Ignoring the fact that encrypted traffic evades DPI analysis (especially where the file itself must be identified for a ‘successful’ analysis), and that there would quickly be a way to strip this metadata out of files, what should we make of this kind of use of DPI technologies?
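A rough sketch of what that flow capture step involves, under some loud assumptions: packets are modelled as hypothetical (sequence number, payload) pairs, the signature is taken to be a SHA-256 hash of the reassembled file rather than embedded metadata, and the library's database is reduced to a plain dictionary. Real DPI equipment is far more involved than this.

```python
import hashlib


def reassemble(packets):
    """Rebuild file contents from (sequence_number, payload) pairs.

    Packets may arrive out of order, so they are sorted by sequence number
    before their payloads are joined.
    """
    ordered = sorted(packets, key=lambda packet: packet[0])
    return b"".join(payload for _, payload in ordered)


def identify(packets, known_signatures):
    """Hash the reassembled flow and look it up in a signature table.

    known_signatures is a hypothetical dict mapping signature -> license type.
    Returns the license type, or None if the flow matches nothing registered.
    """
    signature = hashlib.sha256(reassemble(packets)).hexdigest()
    return known_signatures.get(signature)


thesis = b"contents of my MA thesis"
known = {hashlib.sha256(thesis).hexdigest(): "all-rights-reserved"}

# The same file, split into three packets that arrive out of order.
flow = [(2, thesis[16:]), (0, thesis[:8]), (1, thesis[8:16])]

whole_flow_match = identify(flow, known)                  # every packet captured
single_packet_match = identify(flow[:1], known)           # only one packet seen
altered_match = identify([(0, thesis + b" ")], known)     # one extra byte appended
```

Only the complete, unaltered flow matches. A single packet on its own identifies nothing, which is why the whole flow has to be captured, and appending a single byte is enough to produce a different hash.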

By and large, such a use appears to be a particularly obtrusive method of securing copyright. This method does, however, have the advantage of perfectly securing copyright over files that haven’t been altered. Moreover, supposing that it would be possible to establish different license types, it would be possible to deeply encode relatively free or open licenses in a machine-readable form. Beyond the notion that this is obtrusive, what is significant about this and any other use of ISPs to monitor files that pass through their networks en masse is that innovation is occurring in the middle of the ‘net. The Internet has been predicated on end-to-end intelligence, with the routing devices being ‘dumb’ insofar as they have only minimally interfered with or affected the movement of content beyond the rules of TCP/IP (and similar protocols). By suggesting that something should be going on at the network level, we are making the middle intelligent.

As it pertains to copyright, it seems to me that any attempt to read signatures is going to be privacy invasive, at least if you want 100% enforcement. You will need to check every ‘storage container’ (i.e. packet), something that can’t be done in the U.S. despite billions of dollars thrown into the war on terror. What is perhaps of even greater concern is that a packet-by-packet analysis will focus on the act of copying, rather than the use of the copy. I am unabashedly convinced by Lessig’s argument that we need to reform (not abolish) copyright – copyright needs to address how material is used in an age of remixes and creative sharing. If we were to use DPI devices to sniff each file, we would merely reaffirm a copyright strategy that has criminalized massive populations, by demanding that their gatekeepers to the ‘net watch over everything that they do.

While I don’t want to be so extreme as to say that I want the ‘middle’ of the ‘net to be dumb (I think, for example, that there is a real advantage to local caching of regularly accessed data so that ISPs don’t need to pay Tier-1 ISPs to move data along the backbone [yes, I realize this can be read as a blow against network neutrality]), I do think that any suggestion that ISPs act as copyright police by watching for metadata is almost necessarily privacy invasive.

(As a note: there are presently plans in the U.S. to have ISPs watch for signatures of files that are related to child pornography. I’m as against sniffing for that sort of traffic as I am against sniffing for copyrighted material, on the basis that it is broadly privacy invasive. Measured responses that don’t amount to pulling over each person and conducting a full body cavity search for drugs are needed in digital spaces, and the use of DPI equipment is the equivalent of that cavity search.)