Ebook: Digital Enlightenment Yearbook 2013
The value of personal data has traditionally been understood in ethical terms as a safeguard for personality rights such as human dignity and privacy. However, we have entered an era where personal data are mined, traded and monetized in the process of creating added value - often in terms of free services including efficient search, support for social networking and personalized communications. This volume investigates whether the economic value of personal data can be realized without compromising privacy, fairness and contextual integrity. It brings scholars and scientists from the disciplines of computer science, law and social science together with policymakers, engineers and entrepreneurs with practical experience of implementing personal data management.
The resulting collection will be of interest to anyone concerned about privacy in our digital age, especially those working in the field of personal information management, whether academics, policymakers, or those working in the private sector.
I welcome the effort of the Digital Enlightenment Forum to harness the humanism, rationality and optimism of the Age of Enlightenment to better shape and to enhance the benefits to society of our evolving Digital Age. I write this foreword in the midst of a warm global debate on the value of Data – a theme for the DEF and for its Davos cousin the WEF – as well as on the balance between Westphalian security needs and personal privacy aspirations.
This yearbook's focus on the ‘Value of Personal Data’ is apt, because the explosion of data and of computing power over the last few years represents a further step-change in the development of the Internet. The statistics of the data explosion bear repetition: by some estimates, 90% of the world's current data has been produced in the last two years; in turn, this is more than was generated in the previous 2,000 years. Meanwhile, the world generates 1.7 billion bytes every minute. Big data (high volume, high velocity, high variability) is here to stay.
Beyond its purely commercial uses, the societal benefits of Big Data will be more slowly understood and developed. For the public sector, better data allows services that are more efficient, transparent and personalised. In the health sector, open results and open data permit whole new fields of research. For example, scientists at Columbia and Stanford universities analysed millions of online searches to learn about the side effects of certain drugs. This led to the unexpected medical discovery that the combination of two drugs – paroxetine, an antidepressant, and pravastatin, a cholesterol-lowering drug – causes high blood sugar. Meanwhile, the Global Viral Forecasting Initiative (GVFI) uses advanced data analysis on information mined from the Internet to identify comprehensively the locations, sources and drivers of local outbreaks before they become global epidemics. Such techniques offer guidance up to a week ahead of previous indicators.
Controversy can arise. TomTom, a Dutch manufacturer of satellite navigation devices, ran into problems with the anonymised data that it collected from its users about individual driving behaviour, which it provided to the Dutch government to help improve the national road system. However, it transpired that the data was being used in part to identify the most appropriate sites for speed cameras. Users complained that they were unaware of this application of their data, and were concerned that the police would be able to identify individual speeding violations from the data. TomTom assured consumers that the data had indeed been fully anonymised and that the company would prevent such use in the future. The lesson for all Big Data companies is to focus on the perception of data use as much as on actual use.
As that example shows, user trust is key to Big Data success. Neelie Kroes, Commission Vice-President for the Digital Agenda, has put this succinctly in many public statements on ensuring such confidence: “Privacy is not just about technical features. Without privacy, consumers will not trust the online world. And without trust, the digital economy cannot reach its full potential”. She goes on to identify her three requirements for privacy in the digital age: transparency so that citizens know exactly what the deal is; fairness so that citizens are not forced into sharing their data; and user control so that citizens can decide – in a simple and effective manner – what they allow others to know. These concepts underpin much of the material presented in this Yearbook.
In an emergent field such as Big Data, the Forum's work can inform EU-wide innovation, the EU research agenda and our vision of our common future.
On the innovation front, I believe that the right standards can accelerate demand growth. Without interoperability and harmonised formats, large datasets can be too difficult to fit together and use in practice. In this respect the Commission has engaged with stakeholders in the European public sector information ecosystem to forge the lightweight agreements and standards that are needed to enable interoperability and integration of Public Sector Information (PSI). It is also promoting standardisation of data formats on our EC Open Data portal, and one of the goals of its Pan-European Open Data Portal is to drive the harmonisation of data formats and licensing conditions in Europe.
On research and innovation, the Commission has provided on average EUR 76 million p.a. for data and language technologies. In Horizon 2020, it intends to continue to fund innovation in the area of data products and services, and has also set up the Research Data Alliance, to help scientific data infrastructure become fully interoperable. Open Data standards are also set to continue as part of the Horizon 2020 activities from 2014.
As to the longer-term vision, this is where “Digital Futures” comes in, a foresight project launched by DG Connect to prepare for the world beyond 2030. The project taps into the collective wisdom and aspirations of stakeholders to co-create long term visions (on a time horizon 2040–50) and fresh ideas for policies that can inspire the future strategic choices of DG Connect and the Commission. It draws inspiration from the long term advances at the intersection between ICT and society, economy, environment, politics, humanities and other key enabling technologies and sciences. This is why we are co-hosting with the Forum a Digital Futures Workshop on the “Future of Personal Data and Citizenship”.
In terms of the privacy implications of Big Data, these are just some of the issues that could be addressed by Digital Futures: Can we achieve better metadata management so that potential re-users understand better what uses of the data are covered by consent and/or the statutory law grounds on the basis of which data were collected? Can we build in features in data-management systems that allow a level of anonymisation of personal data compatible with legal requirements? How can we develop privacy-enhancing technologies facilitating the process of giving consent to new uses of personal data? Can we establish “data banks” – dedicated digital spaces for the management of the personal information for each data subject? What kind of role should self-regulation and co-regulation have in ensuring compliance with privacy rules? Regulators do not have all the answers, but we can at least ask the right questions.
Robert Madelin
DG Connect of the European Commission
Has today's digital society succeeded in becoming mature? If not, how might a new Enlightenment philosophy and practice for the digital age be constructed that could hope to address this situation? Such a philosophy must take into account the irreducibly ambivalent, ‘pharmacological’ character of all technics and therefore all grammatisation and tertiary retention, and would thus be a philosophy not only of lights but of shadows. Grammatisation is the process whereby fluxes or flows are made discrete; tertiary retention is the result of the spatialisation in which grammatisation consists, a process that began thirty thousand years ago. The relation between minds is co-ordinated via transindividuation, and transindividuation occurs according to conditions that are overdetermined by the characteristics of grammatisation. Whereas for several thousand years this resulted in the constitution of ‘reading brains’, today the conditions of knowledge and transindividuation result in a passage to the ‘digital brain’. For this reason, the attempt to understand the material or hyper-material condition of knowledge must be placed at the heart of a new discipline of ‘digital studies’. The pharmacological question raised by the passage from the reading to the digital brain is that of knowing what of the former must be preserved in the latter, and how this could be achieved. This means developing a ‘general organology’ through which the social, neurological and technical organs, and the way these condition the materialisation of thought, can be understood. Integral to such an organology must be consideration of the way in which neurological automatisms are exploited by technological automatisms, an exploitation that is destructive of what Plato called thinking for oneself. 
The task of philosophical engineering today should be to prevent this short-circuit of the psychosomatic and social organological layers, a task that implies the need for a thoroughgoing reinvention of social and educational organisations.
We first use Medium Theory to develop the tension between print and digital media, i.e. as contrasts between literacy-print and the secondary orality of contemporary online communication. Literacy-print facilitates high modern notions of individual selfhood requisite for democratic polities and norms, including equality and gender equality. By contrast, secondary orality correlates with more relational conceptions of selfhood, and thereby more hierarchical social structures. Recent empirical findings in Internet Studies, contemporary philosophical theory, Western virtue ethics and Confucian traditions elaborate these correlations, as do historical and contemporary practices and theories of “privacy.” Specifically, traditional conceptions of the relational self render individual “privacy” into something solely negative; by contrast, high modern conceptions of autonomous individuals render individual privacy into a foundational positive good and right. Hence, the shift towards relational selves puts the conception of selfhood – at work in current EU efforts to bolster individual privacy – at risk.
Nonetheless, contemporary notions of “hybrid selves” (conjoining relational and individual selfhood) suggest ways of preserving individual autonomy. These notions are in play in Helen Nissenbaum's theory of privacy as contextual integrity [1] and in contemporary Norwegian cultural assumptions, norms, and privacy practices.
The implications of these transformations, recent theoretical developments, and contemporary cultural examples for emerging personal data ecosystems and user-centric frameworks for personal data management then become clear. These transformations can increase human agency and individuals' control over personal data. But to do so further requires us to reinforce literacy and print as fostering the individual autonomy underlying modern democracy and equality norms.
There are many communities of ubiquitous computing users that are on the periphery of society, and these liminal users are often left to negotiate their relationship with technology without the help and support provided to more mainstream users. One such community is formed around users of Augmentative Alternative Communication (AAC) technology. Changes in the commercial landscape have brought within reach dramatic improvements in AAC and made them more accessible and supportive to their user community. These improvements, though overwhelmingly positive, amplify a family of personal data management problems that are similar to those experienced by more typical ubiquitous computing users. This paper argues that information management practices deployed by the AAC user community are ones that mainstream society may benefit from using. Accordingly, this paper explores a number of personal data management problems that arise during AAC use and considers how AAC users have developed workarounds and information management practices to protect their personal information. Whilst this paper is focused on AAC technology, the responses could be generalised for a broader spectrum of society.
In this chapter we present a new method for visualising the use of personal information by stakeholders, and the transfer of value between those same groups based on the provision of tangible and intangible goods. Value network analysis combines elements of social network analysis with value chains and social capital in order to identify where value is generated in a network of stakeholders. The privacy value network (PVN) approach develops this methodology to identify the ways in which the value of personal information is realised across a network of stakeholders. Privacy Value Networks also introduce the notion of information costs within the model – and tracking personal information across a network allows for the identification of both exogenous and endogenous costs. At present, the PVN approach is primarily an analytic and visualisation tool, but in the future it should be possible to quantify value and costs across the network, and to calculate the degree of value/cost balance (and imbalance).
Personal data in the networked world is considered “the new oil” – its collection is said to enhance user experience but is in the control and for the profit of others, leading to a lack of transparency and erosion of privacy. Expectations surrounding what constitutes a healthy privacy-protective relationship between individuals and organizations are being reset under the umbrella of the emerging Personal Data Ecosystem (PDE). The PDE is supported by new technologies and services, such as Personal Data Vaults (PDV) and data sharing platforms. These technologies and services allow individuals to control and manage their own information. While PDE developments are positive from a privacy perspective given the control they provide to the individual, in the wrong hands, one's PDV and activities within the PDE could be exploited as a major surveillance tool. The paper introduces Privacy by Design (PbD), which the author sees as essential to the success of the PDE. For several years, the Information and Privacy Commissioner of Ontario, Canada, has examined emerging technologies and best practices that are relevant to the PDE, which can assist in developing the PDE in a manner consistent with PbD. By following PbD, privacy in the PDE can indeed be assured.
From the earliest days of the information economy, personal data has been its most valuable asset. Despite data protection laws, companies trade personal information and often intrude on the privacy of individuals. As a result, consumers feel that they do not have control, and lose trust in electronic environments. Technologists and regulators are struggling to develop solutions that meet the demands of business for more personal information while maintaining privacy. However, no promising proposals seem to be in sight. We propose a 3-tier personal information market model with privacy. In our model, clear roles, rights and obligations for all actors re-establish trust. The ‘relationship space’ enables data subjects and visible business partners to build trusting relationships. The ‘service space’ supports customer relationships with distributed information processing. The ‘rich information space’ enables anonymized information exchange. To transition to this model, we show how existing privacy-enhancing technologies and legal requirements can be integrated.
While the collection and monetization of user data has become a main source for funding “free” services like search engines, online social networks, news sites and blogs, neither privacy-enhancing technologies nor their regulations have kept up with user needs and privacy preferences. The aim of this chapter is to raise awareness of the actual state of the art of online privacy, especially in the international research community and in ongoing efforts to improve the respective legal frameworks, and to provide concrete recommendations to industry, regulators, and research agencies for improving online privacy. In particular we examine how the basic principle of informational self-determination, as promoted by European legal doctrines, could be applied to infrastructures like the internet, Web 2.0 and mobile telecommunication networks.
This paper contributes to the discourse on how information technologies can empower individuals to effectively manage their privacy while contributing to the emergence of balanced personal data ecosystems with increased trust between all stakeholders. We motivate our work through the prism of privacy issues in online social networks (OSN) acknowledging the fact that OSNs have become important platforms for information sharing. We note that on OSN platforms, individuals share very intimate details about different aspects of their lives, often lacking awareness about and understanding of the degree of accessibility/visibility of information they share as well as what it may implicitly reveal about them. Furthermore, service providers and other entities participating on OSN platforms are increasingly relying on profiling and analytics to find and extract valuable, often hidden patterns in large collections of data about OSN users. These issues have caused serious privacy concerns. In the light of such issues and concerns, we argue that protecting privacy online implies ensuring information and risk awareness, active control and oversight by the users over the collection and processing of their data, and support in assessing the trustworthiness of peers and service providers. Towards these goals, we propose the Personal Information Dashboard (PID), a system that relies on usable automation tools and intuitive visualizations to empower end-users to effectively manage both their personal data and their self-presentations. With the PID, the user is able to monitor and visualize her social networking footprint across multiple OSN domains. To enable this, the PID aggregates personal data from multiple sources and links the user's multiple OSN profiles. Leveraging various inference and correlation models as well as machine learning techniques, the PID empowers users to assess and understand the level of privacy risk they are facing when sharing information on OSNs. 
Based on outputs of the underlying prediction and learning methods, the PID can suggest to the user corrective options aimed at reducing risks of unintended information disclosure. We present a prototype implementation demonstrating the feasibility of our proposal.
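The dashboard described above can be pictured with a minimal sketch: aggregate the attributes a user exposes across several OSN profiles, score the combined disclosure risk, and suggest which attributes to hide first. The attribute names, risk weights, and threshold below are purely illustrative assumptions, not the authors' implementation or models.

```python
# Hypothetical sketch of a PID-style disclosure-risk check.
# Attribute names and risk weights are illustrative assumptions only.

RISK_WEIGHTS = {
    "full_name": 0.30,
    "birth_date": 0.25,
    "home_city": 0.15,
    "employer": 0.15,
    "phone": 0.35,
}

def aggregate_profiles(profiles):
    """Merge the attributes a user exposes across several OSN accounts."""
    exposed = set()
    for attrs in profiles.values():
        exposed.update(attrs)
    return exposed

def disclosure_risk(exposed):
    """Crude additive risk score, capped at 1.0."""
    return min(1.0, sum(RISK_WEIGHTS.get(a, 0.05) for a in exposed))

def suggest_corrections(exposed, threshold=0.5):
    """Suggest hiding the highest-weight attributes until risk drops."""
    suggestions = []
    for attr in sorted(exposed, key=lambda a: -RISK_WEIGHTS.get(a, 0.05)):
        if disclosure_risk(exposed - set(suggestions)) <= threshold:
            break
        suggestions.append(attr)
    return suggestions

profiles = {
    "osn_a": {"full_name", "home_city"},
    "osn_b": {"full_name", "birth_date", "phone"},
}
exposed = aggregate_profiles(profiles)
score = disclosure_risk(exposed)  # linked profiles reveal more together
```

A real PID would replace the additive weights with the inference and learning models the chapter mentions; the sketch only shows the aggregate-score-suggest loop.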
The principle of consent is widely seen as a key mechanism for enabling user-centric data management. Informed consent has its origins in the context of medical research but the principle has been extended to cover the lawful processing of personal data. In particular, the proposed EU regulation on data protection seeks to strengthen the consent requirements moving them from unambiguous to explicit. Nevertheless, there are a number of limitations to the way that even explicit consent operates in real-life situations which suggest that an alternative, more dynamic form of consent is needed. This chapter reviews the key concerns with static forms of consent for the control of personal data and proposes a technologically mediated form of dynamic consent instead.
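The contrast between static and dynamic consent can be sketched as a data structure: rather than a one-off checkbox, consent becomes a per-purpose, revocable record that is re-checked at every use of the data. The class and field names below are illustrative assumptions, not a proposal from the chapter.

```python
# Hypothetical sketch contrasting static and dynamic consent.
from dataclasses import dataclass, field

@dataclass
class DynamicConsent:
    """Per-purpose, revocable consent, checked at every use of the data."""
    subject: str
    grants: dict = field(default_factory=dict)  # purpose -> granted?

    def grant(self, purpose):
        self.grants[purpose] = True

    def revoke(self, purpose):
        self.grants[purpose] = False

    def permits(self, purpose):
        # Default-deny: a purpose never granted is not permitted.
        return self.grants.get(purpose, False)

c = DynamicConsent("alice")
c.grant("medical_research")
assert c.permits("medical_research")
assert not c.permits("marketing")         # never granted -> denied
c.revoke("medical_research")
assert not c.permits("medical_research")  # revocable at any time
```

The key difference from static consent is that `permits` is evaluated at processing time, so a revocation takes effect immediately rather than being locked in at collection.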
The open data debate and PSI (public sector information) legislation have so far focused on access to public rather than personal data for reuse purposes. As a number of cases discussed before some European Data Protection Authorities have shown over the past years, however, conditions of access to PSI for reuse purposes may raise issues of data protection. Consider commercial and land registers, case law databases, public deliberations, vehicle registrations, socioeconomic data, and more, in light of such techniques as big data analytics. Correspondingly, it is likely that the reuse of PSI will increasingly concern principles and criteria for making personal data processing legitimate. The aim of this paper is to examine today's legal framework and the technical means that may enable the lawful reuse of personal data, collected and held by public sector bodies. Whereas openness and data protection are often conceived as opposed in a ‘zero sum game’, the paper explores whether a ‘win-win’ scenario is feasible.
This paper addresses the relationship between freedom of information and privacy. It looks into the privacy exemptions in freedom of information legislation and the way they are applied in national redress mechanisms. It argues that the balancing exercise performed between FOI and privacy can provide important insights in the discussion on the possible conflicts between open data on the one hand and privacy and data protection on the other hand.
There has been an explosion of data on the Web. Much of this data is generated by or else refers to individuals. This emerging area of personal information assets is presenting new opportunities and challenges for all of us.
This paper reviews a UK Government initiative called ‘midata’. The midata programme of work is being undertaken with leading businesses and consumer groups in order to give consumers access to their personal data in a portable and electronic format. Consumers can then use this data to help them better understand their own consumption behaviours and patterns, as well as make more informed and appropriate purchasing and other decisions.
The paper reviews the history and context, principles and progress behind midata. It describes concrete examples and examines some of the challenges in making personal information assets available in this way. The paper reviews some of the key tools and technologies available for managing personal information assets. We also summarise the legislative landscape and various legal proposals under development that are relevant to midata.
We review similar efforts elsewhere, in particular those underway in the US under a programme of work called Smart Disclosure. This work seeks to release personal information held by government and business back to citizens and consumers. Finally, we discuss likely future developments.
Data is rapidly changing how companies operate, offering them new business opportunities as they generate increasingly sophisticated insights from the analysis of an ever-increasing pool of information. Businesses have clearly moved beyond a focus on data collection to data use, but users have an inadequate model of notice and consent at the point of data collection to limit inappropriate use. An interoperable metadata-based architecture that allows permissions and policies to be bound to data, and is flexible enough to allow for changing trust norms, can help balance the tension between users and business, satisfy regulators' desire for increased transparency, and still enable data to flow in ways that provide value to all participants in the ecosystem.
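One way to picture such a metadata-based architecture is as data records that travel together with machine-readable policy, which any downstream processor must evaluate before use. The record shape and purpose vocabulary below are illustrative assumptions, not the chapter's specification.

```python
# Hypothetical sketch: permissions travel with the data as bound metadata.
from dataclasses import dataclass

@dataclass(frozen=True)
class BoundRecord:
    """A data value bound to the policy that governs its use."""
    value: str
    allowed_purposes: frozenset  # e.g. frozenset({"billing"})

def use(record, purpose):
    """Downstream processors evaluate the bound policy before any use."""
    if purpose not in record.allowed_purposes:
        raise PermissionError(f"purpose '{purpose}' not permitted")
    return record.value

rec = BoundRecord("user@example.com", frozenset({"billing"}))
assert use(rec, "billing") == "user@example.com"
try:
    use(rec, "advertising")
    blocked = False
except PermissionError:
    blocked = True
assert blocked  # the policy follows the data, not the collection point
```

Because the policy is interoperable metadata rather than a clause in a privacy notice, it can be updated as trust norms change and enforced wherever the data flows.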
Life Management Platforms (LMPs) combine personal data stores, personal cloud-based computing environments, and trust frameworks. They allow individuals to manage their daily life in a secure, privacy-aware, and device-independent way. In contrast to pure personal data stores, they support concepts which allow interacting with other parties in a meaningful way without unveiling data. This concept of ‘informed pull’ allows apps to consume information from the personal data store as well as from third parties, and to use that information for decision making, without unveiling data to any of the other parties. Think about comparing offers for insurance contracts or think about pulling articles from various sites for a ‘personalised newspaper’ without unveiling the full list of current interests. The problem with sharing data is: ‘Once it's out, it's out of control’. Even with more granular approaches on sharing data, the problem remains – once data has left the personal data store, the individual has no grip on that data anymore. LMPs thus go beyond that, by new concepts of sharing information with security and privacy in mind, but also by relying on trust frameworks. The latter are, for instance, important to rely on a set of relations and contracts once it comes to sharing data – and there is no way to fully avoid sharing data. Once a decision about the preferred insurance company has been made, there needs to be a contract. Some data has to flow. However, defined contracts can reduce the risk of abuse for that data. The chapter explains the need for LMPs, the underlying concepts, and potential benefits. It also looks at some real-world use cases.
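The ‘informed pull’ idea can be sketched as follows: a comparison app sends a decision function into the personal data store, and only the verdict comes back out; the underlying attributes never leave. The class, offers, and eligibility rule below are illustrative assumptions, not the API of any actual LMP.

```python
# Hypothetical sketch of 'informed pull': the decision leaves the
# personal data store, the raw attributes do not.

class PersonalDataStore:
    def __init__(self, attributes):
        self._attributes = attributes  # never exposed directly

    def informed_pull(self, decide):
        """Evaluate the app's decision function inside the store and
        return only its verdict, never the attributes themselves."""
        return decide(self._attributes)

# An insurance-comparison app: pick the cheapest eligible offer.
offers = [
    {"insurer": "A", "premium": 900, "min_age": 25},
    {"insurer": "B", "premium": 800, "min_age": 30},
]

def choose_offer(attrs):
    eligible = [o for o in offers if attrs["age"] >= o["min_age"]]
    return min(eligible, key=lambda o: o["premium"])["insurer"]

store = PersonalDataStore({"age": 28, "income": 40000})
best = store.informed_pull(choose_offer)
# Only 'best' leaves the store; age and income stay inside.
```

Only once the individual accepts the chosen offer and a contract is needed does any actual data flow, which is where the trust frameworks the chapter describes come in.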
The present online economy is based on organisation-centric control of personal data, identity and proof of claims. The benefits of adding individual-centric control over personal data and identity appear significant. The paper describes the research, development and deployment to mid-2013 of a platform to achieve this by the London-based social enterprise tech startup Mydex CIC.
In the coming decade, a seamless integration of our on- and offline lives will be necessary for a sustainable digital society. This requires urgent multi-disciplinary debate on the collection, control and use of personal data in society. This paper proposes a framework that can be used to shape this dialogue. It is based on a neutral and consistent terminology that may support a constructive and fruitful debate, avoiding terms that have too often led to controversy and confusion. We have attempted to position state-of-the-art technology developments within this structure, including context-awareness, user-centred data management, trust networks and personal data ecosystems. This demonstrates clear relations between ongoing work in various groups and hence an urgent need for cooperation to achieve common goals. We argue that such cooperation can lead to the emergence of a personal data ecosystem that may truly support a sustainable digital society.