Deep Packet Inspection: The Good, the Bad, and the Ugly

In this post, I want to try to lay out where I see some of the Deep Packet Inspection (DPI) discussions. This is to clarify things in my head that I’ve been thinking through for the past couple of days and to lay out for readers some of the ‘bigger picture’ elements of the DPI discussion (as I read them). If you’ve been fervently following developments surrounding this technology, then a lot of what is below is just rehashing what you know – hopefully the summary is useful – but if you’re relatively unfamiliar with what’s been going on this might help to orient what’s been, and is being, said.

Participants and Themes

The uses of DPI appliances are regularly under fire by network neutrality advocates, privacy advocates, and people who are generally concerned about communication infrastructure. DPI lets network operators ‘penetrate’ data packets that are routed through their networks and this practice is ‘new’, insofar as prior networking appliances were generally prevented from inspecting the actual payload, or content, of the data packets that are shuttled across the ‘net. To make this a bit clearer, when you send email it is broken into a host of little packets that are reassembled at the destination; earlier networking appliances could determine the destination, the kind of file being transmitted (e.g. a .mov or .jpeg), and so forth but they couldn’t accurately identify what content was in the packet (e.g. the characters of an email message held within a packet). Using DPI, network operators can now (in theory) configure their DPI appliances to capture the actions that users perform online and ‘see’ what they are doing in real time.
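To make the header/payload distinction a bit more concrete, here is a minimal Python sketch. It is an illustration only: the `Packet` class, its field names, and both functions are hypothetical simplifications invented for this example, and they do not describe how any actual appliance works. The first function only looks at addressing information, while the second looks inside the content itself.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    """A deliberately simplified packet (hypothetical, for illustration)."""
    src: str        # sender address (header)
    dst: str        # destination address (header)
    dst_port: int   # destination port (header)
    payload: bytes  # the actual content being carried


def shallow_inspect(pkt: Packet) -> dict:
    """Header-only view: where the packet is going, nothing about its content."""
    return {"src": pkt.src, "dst": pkt.dst, "dst_port": pkt.dst_port}


def deep_inspect(pkt: Packet, keyword: bytes) -> bool:
    """Payload view: report whether the content contains a given keyword."""
    return keyword in pkt.payload


# One fragment of an email message, carried in a single packet.
pkt = Packet("10.0.0.5", "192.0.2.25", 25, b"Subject: lunch?\r\n\r\nSee you at noon.")

print(shallow_inspect(pkt))         # addressing information only
print(deep_inspect(pkt, b"lunch"))  # looks at the message text itself
```

The point of the sketch is simply that the second function needs access to `payload`, which is the step that the debates below are about.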

Somewhat crudely, I want to think through DPI in the phases of ‘the good, the bad, and the ugly’. When talking about the Good I want to really quickly acknowledge that this technology is not somehow inherently evil. The Bad addresses why some individuals and groups have taken a stance to oppose or insist on regulating the deployment of DPI appliances. The Ugly is meant to lay out some of the problematics in the discussions between network owners and those who oppose the technology.

None of these sections are ‘complete’ – I’m not doing a total canvass of the entire breadth of literature on this topic – and some are more detailed than others. I’m not planning on examining the technology itself at length (though I’ve put together a paper that aims to do just that), but am instead focusing on the stakes various parties have in the discussion.

Now – onto the good!

The Good

There is real value in understanding what is being shuttled along ISPs’ fibre-optic lines: DPI can be ‘good’. When there are major viral outbreaks, analyzing packet streams can assist in mitigating a widespread dissemination of said virus – while it’s almost impossible to stop a virus in ‘meat space’, it might be possible to ‘stop’ or limit the spread of a serious code-based ‘pandemic’ using DPI to quarantine vast areas of ISPs’ networks. DPI can also be used to develop incredibly rigorous understandings of what is going on on an ISP’s network, which enables ISPs to better allocate resources, build capacity, and so forth. DPI can be used to provide superior Quality of Service – VoIP applications might be given priority over P2P applications, online gaming packets might be prioritized over latency-insensitive packets generated by email programs. As consumers, this is something that (generally) we want – we want the best possible service that is feasible, and DPI can be used to facilitate better services.
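The Quality of Service idea can be sketched as a priority queue. The following is a rough Python illustration under stated assumptions: it assumes that packets have already been labelled with an application type, and the `PRIORITY` table, the `QosQueue` class, and the `drain` helper are all hypothetical names made up for this example. The ordering of VoIP and gaming ahead of email and P2P follows the paragraph above; the relative ranking of email and P2P is my own assumption.

```python
import heapq
from itertools import count

# Lower number = sent sooner. The table itself is an assumption for illustration.
PRIORITY = {"voip": 0, "gaming": 1, "email": 2, "p2p": 3}


class QosQueue:
    """Packets leave in priority order; ties keep their arrival order."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # arrival counter used to break ties

    def enqueue(self, app: str, packet_id: str) -> None:
        # Unrecognized applications go to the back of the line.
        prio = PRIORITY.get(app, max(PRIORITY.values()) + 1)
        heapq.heappush(self._heap, (prio, next(self._seq), packet_id))

    def dequeue(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


def drain(arrivals):
    """Queue every (app, packet_id) pair, then return the sending order."""
    queue = QosQueue()
    for app, packet_id in arrivals:
        queue.enqueue(app, packet_id)
    return [queue.dequeue() for _ in range(len(queue))]


arrivals = [("p2p", "p1"), ("email", "e1"), ("voip", "v1"), ("gaming", "g1"), ("voip", "v2")]
print(drain(arrivals))  # ['v1', 'v2', 'g1', 'e1', 'p1']
```

Note that the same table that gives VoIP a smooth ride is also what pushes P2P to the back, which is why this ‘good’ use shades so quickly into the concerns discussed next.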

The Bad

Of course, most people who have heard about DPI have learned of it because of the individuals, groups, and parties who have outed DPI as a ‘bad’ (or, to be fair to many, potentially bad). This is the space where many of the anti-DPI advocates tend to sit. There are a bunch of angles to this – I’ll try to spell out a few of them, but you can be sure that there are a lot more than the ones that I’m presenting below.

1. The Network Neutrality (NN) Angle

I need to be up front, and say that I’m far from knowledgeable about the various NN positions. As such, what I’ll present is likely unfair/incorrect under some interpretations of NN, but I’m trying to present this argument in good faith.

NN advocates are deeply concerned about how DPI lets ISPs pick and choose what packets (and associated applications) are ‘good’ and are ‘bad’ – if P2P applications are regularly throttled after being identified by DPI applications, and direct download services are not, then consumers will implicitly ‘learn’ that some applications are better than others. Those that consume bandwidth in a particular way will be ‘bad’ and others ‘good’.

Commonly, this position (implicitly) adopts some form of an ‘End-to-End’ (E2E) understanding of network design. Briefly, E2E sees the ‘intelligence’ of networks at the edges of the network – rather than routers being ‘smart’, they should (very generally!) ‘just’ follow a set of protocols to transmit data from point A to B, and points A and B actually generate, interpret, and modify the packets that are sent and received. Jonathan Zittrain has recently argued in The Future of the Internet and How to Stop It that the E2E environment promotes ‘generativity’, or the capacity for a wide range of individuals to create innovative new applications because they don’t need to negotiate with an ‘intelligence’ in the middle of the network itself (this, of course, bears resemblance to some of Lawrence Lessig’s work as well). By locating intelligence in the network itself, developers have to ‘game’ that intelligence and may be unable to produce new applications if the intelligence in the network refuses to let certain applications send and receive packets.

DPI appliances locate intelligence in the network itself, and NN advocates point to the fact that ISPs are targeting some applications to slow/block their packet transmissions and receipts to indicate that situating intelligence in the hands of ISPs is detrimental to consumers. Consumers are paying to transmit packets – the ISP should not be concerned with what content is being sent, or what application is sending or receiving packets. The worry is that by targeting particular applications we are seeing a situation that bears resemblance to the monopolies over telephone lines – a few decades ago, it was illegal (in the US, at least) to attach a non-telecom-approved device to a phone line. This has changed, but DPI threatens to put the genie back in the bottle. DPI potentially (not necessarily) lets ISPs regulate what can be connected to their ‘pipes’, in the sense that certain sending and receiving applications may be permitted to access the pipes, and others disallowed from connecting to them. Per this, DPI is ‘bad’ because it introduces a new way of regulating data traffic that potentially upsets the E2E principles while simultaneously denying individuals the ability to evade what are perceived as heavy-handed uses of DPI appliances. This is (arguably) at the core of what most NN advocates are worried about in relation to DPI, and what they are actively campaigning against.

2. Privacy Advocate (PA) Angle

Privacy advocates are not necessarily separate from NN advocates, or people with more general interests/concerns in DPI. What is unique about these individuals is that they often tend to focus on what DPI appliances might mean for how individuals (and society more generally) can realize privacy online. Many of these advocates attend to the theoretical possibilities of DPI – this is important, because few ISPs actively want to be seen as infringing on individuals’ privacy (it’s just bad for business!). What makes DPI so special is that, because ISPs stand as a necessary gateway between individuals and the ‘net at large, this technology can be used to examine all of a person’s data traffic. This means that privacy concerns around DPI are of a different caliber than those around, say, Google, which cannot totally examine all of a user’s non-encrypted data traffic.

Given that DPI technologies can penetrate all layers of a packet, PAs are very concerned about these devices being deployed throughout ISPs’ networks. This isn’t necessarily because ISPs are presently using the devices to upset norms of what is and isn’t private; Canadian ISPs, at least, are not all that interested in actually inspecting the contents of packets at a level that would reveal conversations, email text, etc. When we look to how various actors, such as members of the media industry, would like to configure these devices once they are deployed there is a very real concern that communications/data traffic will be examined and subsequently classified, labeled, branded, or otherwise segregated. Now, such ‘classification’ may not entail a total analysis of every single packet; a packet stream might be examined, a fraction of the payload examined (as opposed to its totality), and based on the information gleaned from the fraction the packet stream is subsequently affected by the DPI appliance. This can happen using a variety of different methods, such as signature-/hash-based analysis, ‘fingerprinting’, behavioural analysis, and so forth.
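The idea of classifying a stream from only a fraction of the payload can be sketched as follows. This is a toy Python illustration, not a description of any vendor’s product: `INSPECT_BYTES`, the two tables, the flagged sample, and `classify` are hypothetical names and values chosen for the example. The one real-world detail is that a BitTorrent handshake begins with the byte 19 followed by the text ‘BitTorrent protocol’.

```python
import hashlib

# Only the first few bytes of a payload are examined (an assumed cutoff).
INSPECT_BYTES = 32

# Signature-based: well-known byte patterns at the start of a payload.
PREFIX_SIGNATURES = {
    b"\x13BitTorrent protocol": "bittorrent",
    b"GET ": "http",
}

# Hash-based: digests of the opening bytes of content someone wants flagged.
# The sample below is made up for this example.
FLAGGED_SAMPLE = b"hypothetical flagged blob " * 4
KNOWN_HASHES = {
    hashlib.sha256(FLAGGED_SAMPLE[:INSPECT_BYTES]).hexdigest(): "flagged-content",
}


def classify(payload: bytes) -> str:
    """Label a payload using only its first INSPECT_BYTES bytes."""
    head = payload[:INSPECT_BYTES]
    for signature, label in PREFIX_SIGNATURES.items():
        if head.startswith(signature):
            return label
    digest = hashlib.sha256(head).hexdigest()
    return KNOWN_HASHES.get(digest, "unknown")


print(classify(b"\x13BitTorrent protocol" + b"\x00" * 48))  # bittorrent
print(classify(b"GET /index.html HTTP/1.1\r\n"))            # http
print(classify(FLAGGED_SAMPLE + b"the rest is never read"))  # flagged-content
print(classify(b"hello"))                                    # unknown
```

Nothing past the first 32 bytes is read here, yet the stream still ends up with a label attached, and that label is what any subsequent throttling, blocking, or profiling would act on.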

Now, DPI vendors want to say that if their technologies are not examining the totality of a packet then their appliances are not, in fact, infringing on any person’s reasonable expectations of privacy. Those who are familiar with networking technology recognize that packets are, in a sense, like postcards – they are usually transparent, and are not ‘secured’ against analysis (i.e. not secured against being read). Given that packets can be thought of as analogous to postcards, DPI vendors might be praised for developing a system that, by default, typically analyzes, rather than tries to understand, the packets that are going through the appliance (though one might argue that behavioural analysis does, in fact, attempt to ‘understand’ the packet stream).

As someone who worked in networking, I don’t find the postcard analogy and subsequent argument ‘new’ – in fact, it was something that I regularly told my users when they sent email or did anything in a non-encrypted session that they thought of as ‘private’. At the same time, DPI is different from the postal system – when I send a postcard there is only a chance that someone will read what I wrote. There is also (more likely?) a chance that no one will read what I have written on the back of the postcard because they just don’t care/have time/etc. DPI does, at least briefly, look to the postcard’s content, and this vast analysis of all packet content (even just the teetering edges of it) makes it a systematic analysis of content. This is very different from the randomized analysis of content by postal workers. Whereas a different postal worker might happen to read my postcards every time one goes through a postal outlet (and thus reduce the likelihood of the postal outlet developing a profile about me that notes what kinds of postcards/content I send and receive in the mail), when you have an ISP ‘reading’ the postcard you have a centralized body that can then (potentially, and not necessarily) develop application-, bandwidth-, and content-based customer profiles.

(It should also be noted that people got used to sending messages to one another via sealed letter – this is a relatively affordable way of sending messages. It is more technically challenging, and costly, however to secure all digital communications using encryption than it is to lick the back of a letter. While some websites, such as banks’, do use encrypted sessions, the majority of non-commerce websites do not. This invites a problem with the postcard analogy – there is no ‘simple’ digital equivalent to the letter’s simplicity that would secure most Internet users’ data traffic.)

Now, I stated at the head of this section that PAs are often interested in or concerned about the theoretical capacities of these devices. Vendors legitimately note on a regular basis that most of their customers (i.e. ISPs, major businesses) would only capture data to create the above mentioned kinds of profiles to comply with government-imposed regulations/demands/laws (e.g. CALEA). This is a fair position. What worries many PAs is that DPI invites the danger of ‘function creep’ – while DPI appliances may be initially purchased by ISPs for traffic analysis and management, government, law enforcement, and private interests (e.g. Big Media) may apply pressure on ISPs to reconfigure their appliances and use them for purposes beyond those motivating their initial purchase. Function creep commonly occurs when you have a generative technology (e.g. the ‘net), and while not always a bad thing (e.g. VoIP, online gaming) it is critical to think of how a technology might be used in the future and sound alarm bells about potential uses that might be construed as harmful. I will note that this is where discussions of regulation and law usually begin, but I don’t want to dip my toe into that tepid pool just now.

Generally, then, PAs see DPI as potentially bad given what it can potentially be used for, though they can point to instances where individuals’ psychological and communicative privacy have been upset (e.g. ISPs using DPI to alter web pages and layer ads and messages on top of pages). Badness for a PA is often realized in particular practices and uses of DPI, and there is a strong worry that function creep will promote ‘bad’ uses of the technology.

3. Communications Infrastructure Angle

This is a huge group. I can’t talk about nearly all of the various positions that this group holds, in part because I just don’t understand some of them (some are devilishly technical!). That said, some worry that by inserting ‘intelligence’ into the network it is opened to threats that a less-intelligent network is not/less susceptible to. Diffie and Landau have noted in Privacy on the Line (2007) that as soon as the capacity to analyze data traffic is built into a network, network operators are essentially building a security vulnerability into their network. Effectively, if we assume that communications networks are meant to shuttle data from point A to B, and keep unauthorized individuals from getting access to that data, then giving ISPs the ability to identify what, exactly, is going across their networks introduces an angle of attack that previously didn’t exist.

Now, before we get hysterical and argue that ISPs are integrating security vulnerabilities into their network and they must be stopped from doing so, we have to recognize that it can be costly (in technical skill, time, ability, motive, and access) to configure DPI appliances. I’m actually uncertain about what kind of access is required, or how new analysis processes/signatures/heuristic patterns are generally added – all of the literature that I have access to on this topic is very ‘market speak’, which prevents me from knowing precisely whether it is possible to reconfigure these devices remotely (I expect it is) or if you need to be on-site to update them (I doubt this). I suspect that the response is probably along the lines of, “it depends what device, what you want to update, and how the appliance has been integrated into the ISP network.” As far as I can tell, since this kind of information is not made public (or at least I haven’t found it), members of the public are talking about what marketers are suggesting is possible, rather than what might be practical/actually possible. Given that we’re talking about core telecommunications infrastructure, it seems (to me) that it is critically important for this kind of information to be made public, but the semi-privatization of telecommunications infrastructure means that we don’t often get access to this information.

Other individuals in this group are worried about what DPI might mean for expanding network capacity. In the United States, recent outcry against the abysmally low bandwidth caps that were proposed by Time Warner has led the company to state that they’re not necessarily going to deploy DOCSIS 3.0 equipment – rather than expand their network capacity, there is a looming threat that we’re going to see just what it means to push and pull more data across the ISP’s network than it can handle. That will demonstrate to the consumer the need for caps, bandwidth management, and so forth! The corresponding worry surrounding DPI is twofold: first, that ISPs might adopt a position similar to Time Warner’s on bandwidth caps (i.e. if government and consumers don’t submit to DPI-based traffic management, then network upgrades will be put on hold. The exaflood myth just might be realized.); second, that DPI will be used to ‘manage’ consumer bandwidth in a way that will let ISPs delay capacity expansions and leave North Americans in the dust in terms of broadband speeds and access to next-generation content delivery systems.

There are also people in this broad group who are concerned about the technical standards (or lack thereof) that DPI appliances adhere to. Similarly, there are questions about whether or not DPI remains in ‘the spirit’ of open networks – are the basic principles that underlie the Internet itself being undermined by DPI? If we were re-building the Internet, today, would we integrate superior QoS principles into our routing protocols and, if so, is DPI a kludge that interferes with genuinely redesigning the ‘net?

These are just a few positions from within this massive group, and I’m not doing justice to many (any?) of them. Suffice it to say that it isn’t just network neutrality and privacy advocates that have an interest in worrying about DPI, and so the technology isn’t just ‘bad’ for the arguments that those two (often loud!) groups tend to emit.

The Ugly

So, I tried to note that DPI appliances can be very good for consumers, and also that it is possible to read these technologies as very bad. In the process, a gulf opened between the crude position that I situated ISPs in, and the equally crude descriptions of network neutrality, privacy, and communication infrastructure advocates. ISPs are responsible for actually managing the network and have to turn a profit – they are private corporations, not government, and so appeasing shareholders is their primary goal. On the other side, you see individuals and groups who see the ‘net as a necessary part of daily life in the Global North, and perceive DPI as threatening to restructure how people can engage with one another and express themselves.

Money, rights, privacy, standards – in this competition, who wins out?

At the moment, DPI is in the spotlight across the Global North. In Canada, there are some hopes that the CRTC will find in favor of rights, privacy, and standards, but those hopes are dim at best. Should an unfavorable decision be made by the CRTC about network management, we will likely see an immediate complaint to the Office of the Privacy Commissioner of Canada (who has an information site about DPI), and then we’ll see what happens from there.

In the US, Congress is (again) taking an interest in the use of DPI appliances by ISPs. It’s a marvel that members of Congress continue to be interested in the technology, and speaks well of both the politicians involved and the ability of groups like the EFF to keep the technology on the agenda. Presumably, if any regulations are passed in the US they are going to be narrow and pointed at particular practices – the US is not well known for its progressive privacy policies.

The EU is an interesting case. The UK is letting Phorm integrate DPI technologies into ISPs’ networks to deliver targeted advertising, but the EU privacy commissioner might yet look into whether this use of the technology infringes on Europeans’ rights to privacy. I’m not an EU scholar, so I don’t know how this might play out, but it is something to keep an eye on.

In each of these cases, different regulatory systems will be used by proponents for, and advocates against, the use of DPI technologies. From a policy perspective, it will be curious to see if reactions to DPI will align with policy scholarship that has addressed different policy regimes around the world, or if DPI throws a wrench into current theories of privacy policy capacity and the ability for regulators to actually affect technocratic practices.

Concluding Thoughts

The DPI ‘issue’ is interesting for a series of reasons, not the least being that it will indicate where the ‘net will be in a few years. Will innovation surround expanding capacity as rapidly as possible so that the ‘exaflood’ can be avoided, or will we instead move towards a tiered system that enforces some kind of bandwidth management, transforming bandwidth into a scarce resource and subsequently creating some kind of a very granular usage-based payment model? Will we see a ‘balancing’ of privacy as DPI is rolled out across ISP networks, or will regulatory bodies actually stand up and fight for citizens’ rights?

The stance taken by those supporting ‘the good’ is not necessarily in contrast with all parties who worry about the ‘bad’ of DPI appliances. What is ugly is that the debate has become so polarized that each side is often unwilling to extend an olive branch to the other for fear that it will be read as a weakness and preyed upon. This said, do we want an olive branch, or would we rather focus on a polarized discussion of whether we want DPI use to grow versus burning the technology to the ground?

I worry that it will be the latter dichotomous discussion that will continue to consume the public sphere, and thus limit the range of potentially very interesting discussions that could be had in the middle spaces about the technology, its deployment, and its governance. Hopefully my worries are unfounded, but I doubt that they are.