Ebook: Applications of Secure Multiparty Computation
We generate and gather a lot of data about ourselves and others, some of it highly confidential. The collection, storage and use of this data is strictly regulated by laws, but restricting the use of data often limits the benefits which could be obtained from its analysis. Secure multi-party computation (SMC), a cryptographic technology, makes it possible to execute specific programs on confidential data while ensuring that no other sensitive information from the data is leaked. SMC has been the subject of academic study for more than 30 years, but first attempts to use it for actual computations in the early 2000s – although theoretically efficient – were initially not practicable. However, improvements in the situation have made possible the secure solving of even relatively large computational tasks.
This book describes how many different computational tasks can be solved securely, yet efficiently. It describes how protocols can be combined to larger applications, and how the security-efficiency trade-offs of different components of an SMC application should be chosen. Many of the results described in this book were achieved as part of the project Usable and Efficient Secure Multi-party Computation (UaESMC), which was funded by the European Commission.
The book will be of interest to all those whose work involves the secure analysis of confidential data.
We generate and gather a lot of data about ourselves and others, both willingly and unwillingly. Much of that data is considered confidential by someone, and its collection, storage and use are regulated by laws, contracts, societal norms or personal care. The restrictions on the use of data can limit the benefits we can obtain from its analysis, even if all the parties involved agree that it would be beneficial to learn the result of a certain computation applied to confidential data. Examples include the statistical analysis of personally identifiable information (PII) of a large number of people, where the results of a well-designed analysis can no longer be linked to a single person, or a small group of persons, but can be used as a basis for sound policy decisions. Examples also include the monitoring of critical infrastructure and reacting to changes in its status, where different parts of the infrastructure are owned by different entities afraid that their trade secrets might leak. Another example considers negotiations among several parties planning a cooperative activity, where the parties are reluctant to bring all the relevant facts to the negotiation table in fear of leaking their trade secrets and harming their position, but the lack of these facts may lead to a sub-optimal contract.
In all these examples, the parties would like to execute a specific program on their confidential data, and learn the outcome of the result, while making sure that nothing else about the data is leaked. There is cryptographic technology that is capable of providing the parties with exactly this functionality and guarantees. In secure multiparty computation (SMC), parties provide their inputs to a cryptographic protocol that is used to compute a pre-agreed function in such a manner that anything that a party or a sufficiently small coalition of parties sees during the protocol could be deduced from the party's or the coalition's inputs and outputs. SMC has been a subject of academic studies since the 1980s, when the first constructions were proposed and it was shown that any Boolean circuit could be securely evaluated with only a polynomial overhead. As it is possible to express any computation as a Boolean circuit (again, with only a polynomial overhead), it can also be performed securely. Since then, several different SMC techniques have appeared with different communication patterns, different numbers of parties they are able to support, different coalitions of adversarial parties they are able to tolerate, and different adversarial actions they can handle.
While SMC has been the subject of academic studies for more than 30 years, the first attempts to use it for actual computations only took place in the early 2000s. Indeed, though the first constructions of SMC were theoretically efficient, the constants hidden in their complexity measures were too large to render them practicable. The situation has steadily improved since then and by now, even relatively large computational tasks can be solved securely.
Academic research has concentrated on improving the times of securely executing Boolean circuits, or arithmetic circuits consisting of addition and multiplication gates. This makes a lot of sense from the theoretical point of view, as these operations are sufficient to express any computation with polynomial overhead. It is also advisable to use these operations when expressing certain combinatorial or numerical computation tasks as circuits. For many other kinds of computations, though, the cost of converting them into a circuit that only consists of additions and multiplications, together with the overheads of SMC, may be too much for practical applications. There have been relatively few research efforts to
• devise efficient SMC protocols for other operations, e.g. comparisons;
• efficiently combine the existing SMC protocols into secure protocols for other primitive operations used in algorithms, e.g. array access.
In our opinion, these are the advances that will encourage the take-up of SMC techniques in many different areas for many different tasks.
This book describes research, mostly performed over the last few years, that has brought forth these kind of advances. The results described show how certain algorithmic steps can be performed in a privacy-preserving manner, and how to combine these steps into larger privacy-preserving applications. We start this book with two chapters about fundamental SMC techniques. In Chapter 1, we describe the different kinds of protocols for SMC, and the ways they represent private data in a manner that prevents a small coalition from learning anything about the individual values. In Chapter 2 we discuss the composability of SMC protocols. For building large privacy-preserving applications, it is very important to ensure that the protocols can be composed freely or almost freely. Underlying the composability results is a very convenient abstraction of SMC — the ideal functionality called the arithmetic black box (ABB) that allows computing parties to perform operations with the data it stores without revealing it to those parties.
In Chapter 3 we study the society's preparedness to use SMC and adapt its processes around it. We discuss where and how SMC techniques could make the fastest inroads. Following the recommendations given in this chapter can decrease the time it takes to achieve tangible outcomes from SMC research.
We continue with an application area determined in Chapter 3 as one of the most likely to rapidly benefit from SMC techniques. In Chapter 4 we demonstrate methods for privacy-preserving statistical analysis. We explain how the functions and distributions often used in these analyses can be computed by SMC protocols. In Chapter 5 we show how we can avoid not only leaks during the computation, but also leaks through the results of statistical analyses — we show how mechanisms for differential privacy can be implemented through SMC.
Another important primitive in building privacy-preserving applications is oblivious data access — reading and writing the contents of arrays according to secret indices. In Chapter 6 we present efficient and composable methods for this building block. Some uses of these methods are presented in Chapter 7 where we show how to construct privacy-preserving protocols for language processing or business process analysis.
We continue with methods for checking the behavior of the participants of SMC protocols. In Chapter 8 we show how game-theoretic methods can encourage parties to input correct data to multiparty computation protocols. Chapters 9 and 10 describe different post-execution methods for verifying the correctness of the behavior of computing parties. The first of these is a general method where the honest majority can determine all deviating parties. The second method is applicable to tasks where, in practice, verifying the result of a computation can be much simpler than computing it. The method is demonstrated on the task of privacy-preserving linear programming.
In Chapter 11 we explore the limits of another method for privacy-preserving computation — transforming a task on the basis of its algebraic properties, publishing the transformed task and transforming the solution of the published task back into the solution of the original task. We find that, for linear programming, this method is most likely not applicable, as any transformation that preserves the confidentiality of the original task probably needs to solve the task. Still, the method can be applicable for solving systems of linear equations.
We finish the book with Chapter 12 by reviewing existing practical applications of SMC. In this chapter, we do not describe various research prototypes that have been developed by many different groups, but only the applications that have processed real data. We describe how the data was collected and processed, and how the results were made public.
Many of the results described in this book have been achieved in the project Usable and Efficient Secure Multiparty Computation (UaESMC), funded by the European Commission under the Seventh Framework Programme (grant agreement No. 284731). The project that ran from February 2012 to July 2015 under the theme ICT-2011.9.2 (HighTech Research Intensive SMEs in FET research), was a joint effort by Cybernetica AS (Estonia), the Royal Institute of Technology (Sweden), the National and Kapodistrian University of Athens (Greece), and the University of Tartu (Estonia). The goal of the project was to expand the number of areas and the kinds of algorithms that are amenable for applying SMC techniques.
In September 2014, a workshop for disseminating the results of the UaESMC project, as well as other results on applied SMC took place as a satellite event of ESORICS. In addition to talks by the project partners, there were two talks by the representatives of other research groups, for which we are very grateful. Both speakers have also generously contributed to this book as authors, and their chapters fit in well with the overall theme.
The publication of this book has also been supported by the Institutional Research Grant IUT27-1 from the Estonian Research Council, and by the Estonian Centre of Excellence in Computer Science, EXCS, funded through the European Regional Development Fund.
Peeter Laud and Liina Kamm
Tartu, April 2015
In this chapter, we formally define multiparty computation tasks and the security of protocols realizing them. We give a broad presentation of the existing constructions of secure multiparty computation (SMC) protocols and explain why they are correct and secure. We discuss the different environmental aspects of SMC protocols and explain the requirements that are necessary and sufficient for their existence.
In this chapter we present the Arithmetic Black Box (ABB) functionality. It is an ideal functionality that preserves the privacy of the data it stores and allows computations to be performed on the data stored, as well as retrieve certain pieces of data. We show that it is a very convenient abstraction of secure multiparty computation (SMC), which enables easy extensions and the construction of large privacy-preserving applications on top of it. In this chapter, we give a detailed definition of the ABB functionality and present different ways in which it can be implemented securely. We explain what extending an ABB means and how it is integrated into larger applications.
The aim of this chapter is to introduce and discuss the potential socio-technical barriers that might be hindering the adoption of Secure Multiparty Computation (SMC) techniques. By investigating the conditions of adoption of technology under development, we are able to find different solutions that support the technology adoption process. We set out to interview the potential future users of SMC technologies to investigate the most likely adopters and to find the best directions for future developments of SMC. SMC is compared with the existing practices of data handling to demonstrate its advantages as well as disadvantages. In the current development phase, the focus of SMC advances needs to be on the usefulness and on finding an appropriate niche for the technology.
This chapter gives an overview of privacy-preserving versions of the analysis methods and algorithms that are most commonly used in statistical analysis. We discuss methods for data collection and sharing, and describe privacy-preserving database joins and sorting. From simple statistical measures, we discuss the count and sum of elements, quantiles, the five-number summary, frequency tables, mean, variance, covariance and standard deviation. We look into outlier detection and explore privacy-preserving versions of several different statistical tests, such as Student's t-test, Wilcoxon rank-sum and signed-rank tests and the χ2-test. We discuss how to evaluate the significance of the test statistic in the privacy-preserving environment. We give several options for linear regression and conclude the chapter with a privacy-preserving method for data classification.
Computing aggregate statistics about user data is of vital importance for a variety of services and systems, but this practice seriously undermines the privacy of users. Recent research efforts have focused on the development of systems for aggregating and computing statistics about user data in a distributed and privacy-preserving manner. Differential privacy has played a pivotal role in this line of research: the fundamental idea is to perturb the result of the statistics before release, which suffices to hide the contribution of individual users.
In this paper, we survey existing approaches to privacy-preserving data aggregation and compare them with respect to system assumptions, privacy guarantees, utility, employed cryptographic methods, and supported sanitization mechanisms. Furthermore, we explore the usage of secure multiparty computations (SMC) for the development of a general framework for privacy-preserving data aggregation. The feasibility of such an approach is a long-held belief, which we support by providing the first efficient cryptographic realization. In particular, we present PrivaDA, a new and general design framework for distributed differential privacy that leverages recent advances in SMC on fixed and floating point numbers. PrivaDA supports a variety of perturbation mechanisms, e.g. the Laplace, discrete Laplace, and exponential mechanisms. We demonstrate the efficiency of PrivaDA with a performance evaluation and its usefulness by exploring two potential application scenarios, namely, web analytics and lecture evaluations.
In this chapter, we describe efficient protocols for performing reads and writes in private arrays according to private indices. The protocols are implemented on top of the arithmetic black box (ABB) and can be composed freely to build larger privacy-preserving applications. We present two approaches to speed up private reads and writes — one based on precomputation and the other one on sorting. We show how several different problems become significantly more tractable while preserving the privacy of inputs. In particular, our second approach opens up a large class of parallel algorithms for adoption to run on SMC platforms.
In this chapter we use secure multiparty computation (SMC) to enable privacy-preserving engineering of inter-organizational business processes. Business processes often involve structuring the activities of several organizations, for example when several potentially competitive enterprises share their skills to form a temporary alliance.
One of the main obstacles to engineering the processes that govern such collaborations is the perceived threat to the participants' autonomy. In particular, the participants can be reluctant to expose their internal processes or logs, as this knowledge can be analyzed by other participants to reveal sensitive information.
We use SMC techniques to handle two problems in this context: process fusion and log auditing. Process fusion enables the constituents of a collaboration to discover their local views of the inter-organizational workflow, enabling each company to re-shape, optimize and analyze their local flows. Log auditing enables a participant that owns a business process to check if the partner's logs match its business process, thus enabling the two partners to discover failures, errors and inefficiencies.
In this chapter we give a very brief overview of some fundamentals from mechanism design, the branch of game theory dealing with designing protocols to cope with agents' private incentives and selfish behavior. We also present recent results involving a new, extended utilities model that can incorporate externalities, such as malicious and spiteful behavior of the participating players. A new notion of strong truthfulness is proposed and analyzed. It is based on the principle of punishing players that lie. Due to this, strongly truthful mechanisms can serve as subcomponents in bigger mechanism protocols in order to boost truthfulness. The related solution concept equilibria are discussed and the power of the decomposability scheme is demonstrated by an application in the case of the well-known mechanism design problem of scheduling tasks to machines for minimizing the makespan.
We present a generic method for turning passively secure protocols into protocols secure against covert attacks. This method adds to the protocol a post-execution verification phase that allows a misbehaving party to escape detection only with negligible probability. The execution phase, after which the computed protocol result is already available to the parties, has only negligible overhead added by our method.
The method uses shared verification based on linear probabilistically checkable proofs. The checks are done in zero-knowledge, thereby preserving the privacy guarantees of the original protocol. This method is inspired by recent results in verifiable computation, adapting them to the multiparty setting and significantly lowering their computational costs for the provers. The verification is straightforward to apply to protocols over finite fields.
A longer preprocessing phase can be introduced to shorten the verification phase even more. Beaver triples can be used to make it possible to verify the entire protocol execution locally on shares, leaving for verification just some linear combinations that do not need complex zero-knowledge proofs. Using preprocessing provides a natural way of verifying computation over rings of the size of 2n.
In this chapter, we show how to guarantee correctness when applying multiparty computation in outsourcing scenarios. Specifically, we consider how to guarantee the correctness of the result when neither the parties supplying the input nor the parties performing the computation can be trusted. Generic techniques to achieve this are too slow to be of practical use. However, we show that it is possible to achieve practical performance for specific problems by exploiting the existence of certificates proving that a computation result is correct.
In this chapter we study transformation-based approaches for outsourcing some particular tasks, based on publishing and solving a transformed version of the problem instance. First, we demonstrate a number of attacks against existing transformations for privacy-preserving linear programming. Our attacks show the deficiencies of existing security definitions; we propose a stronger, indistinguishability-based definition of security of problem transformations that is very similar to IND-CPA security of encryption systems. We study the realizability of this definition for linear programming and find that barring radically new ideas, there cannot be transformations that are information-theoretically or even computationally secure. Finally, we study the possibility of achieving IND-CPA security in privately outsourcing linear equation systems over real numbers, and see that it is not as easy as for linear equations over finite fields for which it is known that efficient information-theoretically secure transformations do exist.
As secure multiparty computation technology becomes more mature, we see more and more practical applications where computation moves from a lab setting to an actual deployment where each computation node is hosted by a different party. In this chapter, we present some of the practical SMC applications that have worked on real data. A few of these have been developed by Cybernetica AS, while others have been found from public sources. All scientific prototypes or simulations that work on generated or public data have been left out of this chapter. For each chosen SMC application, we give a brief overview and refer the reader to the corresponding published sources for more details.