Review: Delete – The Virtue of Forgetting in the Digital Age

Viktor Mayer-Schonberger’s new book Delete: The Virtue of Forgetting in the Digital Age (2009) is a powerful effort to rethink basic principles of computing that threaten humanity’s epistemological nature. In essence, he tries to impress upon us the importance of adding ‘forgetfulness’ to digital data collection processes. The book is masterfully presented. It draws what are arguably correct theoretical conclusions (we need to get a lot better at deleting data to avoid significant normative, political, and social harms) while proposing absolutely devastatingly incorrect technological solutions (key: legislating ‘forgetting’ into all data formats and OSes). In what follows, I sketch the aim of the book, some highlights, and why the proposed technological solutions are dead wrong.

The book is concerned with digital systems defaulting to store data ad infinitum (barring ‘loss’ of data on account of shifting proprietary standards). The ‘demise of forgetting’ in the digital era is accompanied by significant consequences: positively, externalizing memory to digital systems preserves information for future generations and facilitates ease of recall through search. Negatively, digital externalizations dramatically shift balances of power and obviate temporal distances. These latter points become the focus of the text, with Mayer-Schonberger arguing that defaulting computer systems to either delete or degrade data over time can counter the collapse of temporal distance that presently accompanies digitization.

Key to the text’s argument is that reconstructing past events from memories, memories that have ‘degraded’ from forgetting certain elements of situations and contexts, has the effect of permitting generalizations and principle-drawing activities. The absence of perfect memory “helps us to reason swiftly and economically, to abstract and generalize, and to act in time, rather than remain caught up in conflicting recollections” (21). The externalization of memory – through verbal language, script, ‘shared’ memory constituted through common media (e.g. newspapers, TV) – has assisted in transferring and transmitting information and knowledge, but recalling externalized memory in the analogue era remained complex and time consuming, and thus costly. This cost is largely diminished with digitization processes.

The shift to digital media is accompanied by cheap storage, rapid search/recall, and reliable data retention processes. Further, the birth of digital communications networks has made physical proximity to data sources a (relative) non-issue; the economics of search and retrieval of information are drastically reshaped when one can retrieve information from a database on the other side of the world without leaving one’s local coffee shop. Information vendors who exclusively trade in these databases may not know precisely what information an individual wants and so seek to aggregate a larger bundle of information that might satisfy a wider set of consumer preferences in order to enhance the company’s revenue generation. The result is that collecting and remembering information is a major business. Such massive data aggregation is accompanied by two problems. First, it is hard to ‘delete’ information once it has been released into the wild – deleting it from one database doesn’t guarantee that the information has been removed from the entirety of the information-ecosystem. Second, each online interaction is itself information about oneself that one’s interaction partner(s) now have and can potentially share with others. In essence, the economics of information mean that there are strong disincentives for the past to ever be forgotten, on the basis that someone, at some point, might find value in the information.

What this showcases, in part, is that there has been a reduction of control – a loss of power – over the information that we produce. This loss occurs from three features of digital memory:

  1. Accessibility – As a result of social norms, we are willing to share information with others in particular contexts and for specific purposes. This approach to information works well if it is stored in discrete databases (and is often referred to as ‘practical obscurity’ in privacy circles) but is less suited for the increasingly networked world we operate in. Our data is more accessible than ever before, and such accessibility shifts the control/power we have over where externalizations of our memories (and memories of others about us) reside and who can access them.
  2. Durability – In an era of analogue records it was difficult to keep non-critical externalized memories accessible without extreme effort. In an era of semantic search and total search recall, external memories are more durable, insofar as they are readily accessible. This shifts power to the searcher, and away from the searched.
  3. Comprehensiveness – Records can be massively collated and thus produce more comprehensive, centralized, memory banks than previously possible. Not only is the access to such records a rebalancing of previous power relationships, but the lack of context in such records is also threatening to the formation/maintenance of one’s identity; if one doesn’t know how their utterances will be used, and by whom, in the future they must assume the worst and self-censor. Whereas the spatial panopticon suggests that individuals will self-discipline based on surveillance, Mayer-Schonberger suggests that this fear of future looking constitutes a temporal panopticon.

The digital also obviates time, and thereby threatens our ability to decide rationally. This threat manifests because recalled external memory may: (1) act as a memory cue, recalling what we had forgotten; (2) exacerbate human difficulties in putting past events in their proper temporal sequence; (3) confront us with too much of the past and therefore prevent us from acting; (4) cause us to lose faith in human memory when the digitally externalized memory conflicts with the human memory. Moreover:

…because digital memory amplifies only digitized information, humans like Jane trusting digital memory may find themselves worse off than if they’d relied solely on their human memory, with its tendency to forget information that is no longer important or relevant (123).

What is critical to take away from this is that comprehensive digital remembering collapses history and thus impairs our ability to judge and act in time, while denying humans the chance to evolve, develop, and learn. This leaves us to helplessly oscillate between two equally troubling options: a permanent past and an ignorant present.

Mayer-Schonberger takes up several potential responses to the difficulties posed by digitally externalized memory. I’ll briefly note each solution’s name and problem(s).

  • Digital abstinence – Even if individuals adjust their information sharing behaviour, information processors may have little incentive to follow suit.
  • Information privacy rights – Rights to informational self-determination are largely problematic because tort-based approaches are costly for individuals to take up. The idea of individuals controlling and protecting their information could work, but only if enforcement can be drastically simplified.
  • Digital privacy rights infrastructure – There are at least four issues related to an infrastructure (DRM-like) approach: (1) the potential comprehensiveness of the DRM system; (2) systems are never tamper-proof; (3) it requires establishing a significant surveillance system to ensure the DRM is appropriately applied – privacy requires extension of existing surveillance practices(!); (4) meta-information is a prerequisite for the system to work, and who would input and maintain that data is an open question from a practical point of view.
  • Cognitive adjustment – It is dubious that we will quickly ‘adapt’ to permanent digital remembering. Something must be done in the gap between now and when we ‘learn to accept’ perfect digital recall.
  • Information ecology – The question of what information should be kept or deleted is unfortunately binary – making this a blunt approach to digital recall – and information ecology norms are difficult to enact. Information retention laws are extending the duration of and access to external memory, not reducing either, in the wake of 9/11.
  • Perfect contextualization – Even if all communication and external information is recorded, digital memory misses non-digitized thinking and thus will always remain fundamentally incomplete. Further, even if perfect contextualization can re-create the information context it cannot take us back in time to frame the event at that specific moment with the recalled facts-at-hand.

In light of the deficiencies associated with the above solutions, Mayer-Schonberger argues that we need to reintroduce forgetting to the digital. When a file or record is saved, individuals could be required to input an ‘expiry date’ before the file is saved, and the file would be deleted once the expiry date was met. This deletion metadata could track with the file and a deletion negotiation of some sort could occur between parties when the file was shared. Such deletion processes could be mandated by law and instantiated in software code. For large businesses this might increase the value of their records given that the records will provide superior perspectives on individuals’ (presently) expected behaviour. Mayer-Schonberger explicitly notes that reintroducing forgetting does not solve all the challenges of informational privacy (nor is this the intention), but forgetting might resolve some challenges. Also, mandating forgetting does not guarantee perfect adherence; some individuals will get around the technological protection measures and illicitly extend memory retention periods. Perfect enforcement is not required, however, and widespread violation is unlikely so long as society frowns on circumvention efforts and forgetting is enforced by law and reinforced by technical measures. Perhaps a ‘rusting’ or ‘data rot’ process could be implemented to approximate how humans forget as well.
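To make the proposal concrete, here is a minimal Python sketch of my own; nothing like it appears in the book. The names (Record, save, negotiate, purge) are hypothetical, and the negotiation rule (the shared copy keeps the earlier of the two dates) is just one assumption about how a ‘deletion negotiation’ might be settled.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Record:
    name: str
    expires: date  # hypothetical mandatory expiry metadata


def save(name: str, expires: Optional[date]) -> Record:
    """Refuse to save unless an expiry date has been supplied."""
    if expires is None:
        raise ValueError("an expiry date is required before saving")
    return Record(name, expires)


def negotiate(shared: Record, recipient_preference: date) -> Record:
    """Assumed negotiation rule: the shared copy keeps the earlier date."""
    return Record(shared.name, min(shared.expires, recipient_preference))


def purge(records: List[Record], today: date) -> List[Record]:
    """Keep only records whose expiry date has not yet been reached."""
    return [r for r in records if r.expires > today]
```

Even in this toy form the awkward parts are visible: every save needs a date from the user, and purge only works if every system holding a copy actually runs it.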

I should note, before getting to my criticisms of the book, that I generally agree that shorter data retention periods for personal information are needed and, moreover, that Fair Information Practices (FIPs) as well as PIPEDA require that data be kept for minimal periods of time – minimal being relative to the time the data is needed to accomplish the goals the data is collected for. Of course, the ‘relative’ element of data retention means some data is held (seemingly) for indefinite periods. Thus, I agree with the principle of forgetting, agree that adopting a system by which computers somehow ‘mimic’ human memory is a good thing at a theoretical and conceptual level, and agree that associating human processes with computing is important in establishing norms for the bio-digital world. I fundamentally disagree, however, with the processes he suggests for implementing forgetting.

Mayer-Schonberger is talking about legally requiring deletion metadata to be added to all data formats, and that such metadata be filled out before a file can be saved. This would demand a fairly significant reworking of data formats, as well as require widespread OS-level buy-in and adherence to standards so that deletion periods were uniformly met across different OSes. It would also require a significant level of security around expiry information to secure all data files from a virus that modified expiry data. In essence, expiry dates as a standardized requirement threaten to open a significant security vulnerability, require world-wide adoption to deal with off-shore data havens, and demand the political will to go to war with many major technology companies that will likely resist this effort. None of these are insignificant challenges.
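As a rough illustration of what ‘security around expiry information’ might involve, here is a small Python sketch that binds an expiry date to a file’s contents with an HMAC. This is my own assumption about one possible mechanism, not something the book specifies; the function names and the idea of a single secret key are hypothetical.

```python
import hashlib
import hmac


def sign_expiry(key: bytes, content: bytes, expiry: str) -> str:
    """Bind an expiry date to file content with an HMAC-SHA256 tag."""
    message = hashlib.sha256(content).digest() + expiry.encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def expiry_is_intact(key: bytes, content: bytes, expiry: str, tag: str) -> bool:
    """Return True only if neither the content nor the expiry date changed."""
    return hmac.compare_digest(sign_expiry(key, content, expiry), tag)
```

A check like this only detects tampering by parties who lack the key. Anyone who holds the key, including malware running with the user’s privileges, can simply re-sign a modified date, so the key would have to be kept away from the file’s owner.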

Further, hindering or preventing the illicit modification of expiry date metadata will presumably require some kind of technological protection measure – a system that ‘secures’ files from their owners – instantiated across all storage-capable devices. Your computer, camera, iPad, and so forth would all need some kind of module to ‘protect’ data from you modifying it. This significantly takes away from the freedom that individuals currently have over files in their possession.

At a user interface level, I have very real concerns with adding a few clicks to all data transactions that result in the saving of data. Adding one additional click to a buying process significantly reduces the likelihood of a purchase and thus will likely be resisted by eCommerce providers, and routinely confronting users with an expiry date form will lead to click-fatigue: users will do everything in their power to get around the damn expiry dates because of frustration and, failing a workaround, will just ignore the expiry dates. Believe me when I say that this will lead to incredible headaches as IT staff around the world figure out how to recover ‘expired’ files years after they were deleted. Admittedly Mayer-Schonberger recognizes that the expiry date system must be user-friendly, but he hasn’t offered any suggestions as to what this could include.

One issue that wasn’t touched on in the discussion of the expiry date system, and that seems absolutely critical, is copyright. Given that copyright is automatically extended to unique expressions, doesn’t this mean that the ‘normal’ expiry date for many file formats (i.e. any where unique expression can be made…) should default to the period of time the file might be under copyright? Insisting that individuals delete files that contain unique expression seems to run against the notions embedded in copyright legislation. Admittedly, this might work to spur a real debate about whether files should automatically ‘expire’ (in Canada) 50 years after the author’s death or well before (and thus question the need for such extensive copyright periods), though I suspect that rights-holders would instead scream that this was a subtle way of trying to somehow ‘undermine’ the present copyright system. Maybe that’s just the pessimist in me, however…

In the end, I’d highly recommend the book. It’s well written, controversial, and exciting: all things that you want in a book dealing with cutting edge issues and topics. It’s available through Princeton as well as your ‘local’ online bookseller.