Ebook: ECAI 2016
Artificial Intelligence continues to be one of the most exciting and fast-developing fields of computer science.
This book presents the 177 long papers and 123 short papers accepted for ECAI 2016, the latest edition of the biennial European Conference on Artificial Intelligence, Europe’s premier venue for presenting scientific results in AI. The conference was held in The Hague, the Netherlands, from August 29 to September 2, 2016. ECAI 2016 also incorporated the conference on Prestigious Applications of Intelligent Systems (PAIS) 2016, and the Starting AI Researcher Symposium (STAIRS). The papers from PAIS are included in this volume; the papers from STAIRS are published in a separate volume in the Frontiers in Artificial Intelligence and Applications (FAIA) series.
Organized by the European Association for Artificial Intelligence (EurAI) and the Benelux Association for Artificial Intelligence (BNVKI), the ECAI conference provides an opportunity for researchers to present and hear about the very best research in contemporary AI. These proceedings will be of interest to all those seeking an overview of the very latest innovations and developments in this field.
This volume contains the proceedings of the Twenty-second European Conference on Artificial Intelligence (ECAI 2016), held from August 29th to September 2nd, 2016, in The Hague, The Netherlands. Since 1974, the biennial European Conference on Artificial Intelligence, organized by the European Association for Artificial Intelligence (EurAI, formerly named ECCAI), has been the premier venue for presenting AI research in Europe. ECAI is the place for researchers and practitioners of Artificial Intelligence (AI) to gather and to discuss the latest trends and challenges in all subfields of AI as well as to demonstrate innovative applications and uses of advanced AI technology.
ECAI 2016 was co-located with Collective Intentionality X, the interdisciplinary conference on collective intentionality, and the IEEE Symposium on Ethics of Autonomous Systems (SEAS Europe). As in the past, ECAI 2016 incorporated the Conference on Prestigious Applications of Intelligent Systems (PAIS 2016) and the Starting AI Researcher Symposium (STAIRS). The papers from PAIS are included in this volume, while the papers from STAIRS are published in a separate volume. ECAI 2016 also featured a special topic on Artificial Intelligence for Human Values, with a dedicated track and a public event in the Peace Palace in The Hague.
The program of ECAI 2016 included five invited plenary talks, two of which, by Michael Bratman and Johanna Seibt, were shared with Collective Intentionality X, as well as an extensive workshop and tutorial program.
In total, 656 papers were submitted to ECAI 2016. Of these, 177 (27%) were accepted as long papers and 123 (19%) were accepted as short papers, of which 108 were presented at the conference. This makes ECAI 2016 the largest edition in the 40-year history of ECAI conferences, reflecting the growth and vitality of AI as a research field. This year we pioneered the idea of supporting continuity of the peer-reviewing process by allowing authors to resubmit papers rejected from IJCAI 2016, alongside their reviews from that conference. We also introduced a Summary Reject process, enabling Senior Program Committee (SPC) members to use their knowledge and experience to help reduce the load on reviewers. We also tried a new process for allocating papers to Program Committee (PC) members, in which SPCs allocated papers to a team of PC members that they had recruited themselves, in order to encourage teamwork. Thanks to the dedicated support of SPC and PC members, these innovations worked well. Of the accepted short papers, 12 (9.8%) were former IJCAI submissions, as were 31 (17.4%) of the accepted long papers. Clearly, authors took advantage of the opportunity, and a large proportion of their revised submissions were rewarded with success. During the paper discussion period, four papers were given the option of transferring to PAIS. The PAIS program consisted of 10 papers presenting substantial applications of AI research.
We were lucky to be part of a dedicated team. We would like to thank Meir Kalech for putting in place an extensive workshop program of 18 workshops; Sophia Ananiadou and Leon van der Torre for attracting 13 exciting tutorials, including 5 “Spotlight” tutorials; Helena Sofia Pinto and David Pearce for developing an exciting STAIRS 2016 program; Jeroen van den Hoven and Henry Prakken for managing the AI and Human Values track; Robert-Jan Sips for organizing the AIckathon; and, last but not least, Eyke Hüllermeier and Paolo Bouquet for chairing PAIS 2016.
We would also like to thank the local organizers, Frank Dignum and Virginia Dignum, and the General Chair, Frank van Harmelen. We are indebted to our predecessor, Torsten Schaub, whose materials we found helpful, and to Thomas Preuss for help and advice in using the ConfMaster system. Finally, heartfelt thanks to all PC and SPC members, reviewers, sponsors, and all authors who submitted to ECAI 2016.
Maria Fox and Gal A. Kaminka
Program Chairs
ECAI 2016
Institutions regulate societies. Comprising Searle's constitutive counts-as rules, “A counts-as B in context C”, an institution ascribes, from brute and institutional facts (the As), a social reality comprising institutional facts (the Bs), conditional on the social reality (the contexts Cs). When brute facts change, an institution evolves from one social reality to the next. Rule changes are also regulated, by rule-modifying counts-as rules that ascribe rule changes in the past, present or future (e.g. a majority vote on a rule change counts-as a rule change). Determining the legality of a rule change is difficult, since changing counts-as rules both alters and is conditional on the social reality, and in some cases depends on hypothetical rule-change effects (e.g. not retroactively criminalising people). However, without a rigorous account of rule-change ascriptions, AI agents cannot support humans in understanding the laws imposed on them. Moreover, advances in automated governance design for socio-technical systems are limited by agents' ability to understand how and when to enact institutional changes. Consequently, we answer “when do rule changes count-as legal rule changes?” in a temporal setting with a novel formal framework.
A new algorithm named EXPected Similarity Estimation (EXPOSE) was recently proposed to solve the problem of large-scale anomaly detection. It is a non-parametric and distribution free kernel method based on the Hilbert space embedding of probability measures. Given a dataset of n samples, EXPOSE takes O(n) time to build a model and O(1) time per prediction.
In this work we describe and analyze a simple and effective stochastic optimization algorithm which allows us to drastically reduce the learning time of EXPOSE from linear to constant. Crucially, this approach allows us to determine the number of iterations based on a desired accuracy, independently of the dataset size n. We show that the proposed stochastic gradient descent algorithm works in general, possibly infinite-dimensional Hilbert spaces, is easy to implement and requires no additional step-size parameters.
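As a rough illustration of the kernel-mean-embedding idea behind EXPOSE (a sketch, not the authors' implementation), the snippet below approximates the embedding of the data distribution with random Fourier features and fits it by a stochastic update whose iteration count is independent of the dataset size; the feature dimension, kernel bandwidth and toy data are illustrative assumptions.

```python
import numpy as np

def random_fourier_features(X, n_features=256, gamma=1.0, seed=0):
    """Approximate an RBF kernel feature map phi(x) via random Fourier features."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(X.shape[1], n_features))
    b = rng.uniform(0, 2 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def fit_mean_embedding_sgd(Phi, n_iter=1000, seed=0):
    """Stochastic approximation of the kernel mean embedding mu = E[phi(x)].
    Each iteration draws one random sample; the 1/t step size yields a running
    average, so the number of iterations depends only on the desired accuracy."""
    rng = np.random.default_rng(seed)
    mu = np.zeros(Phi.shape[1])
    for t in range(1, n_iter + 1):
        i = rng.integers(len(Phi))
        mu += (Phi[i] - mu) / t          # SGD step on ||mu - phi(x_i)||^2
    return mu

def anomaly_scores(Phi, mu):
    """EXPOSE-style score: inner product with the mean embedding (low = anomalous)."""
    return Phi @ mu

# toy usage: a dense cluster plus a few off-cluster points
X = np.vstack([np.random.randn(500, 2), np.random.randn(5, 2) + 6.0])
Phi = random_fourier_features(X, gamma=0.5)
mu = fit_mean_embedding_sgd(Phi)
print(anomaly_scores(Phi, mu)[-5:])      # the outliers receive low scores
```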
Area under the curve (AUC) measures, such as the area under the receiver operating characteristics curve (AUROC) and the area under the precision-recall curve (AUPR), are known to be more appropriate than the error rate, especially for imbalanced data sets. Several algorithms exist to optimize AUC measures instead of minimizing the error rate. However, this idea has not been fully exploited in Bayesian hierarchical models owing to the difficulties in inference. Here, we formulate a general Bayesian inference framework, called Bayesian AUC Maximization (BAM), to integrate AUC maximization into Bayesian hierarchical models by borrowing the pairwise and listwise ranking ideas from the information retrieval literature. To showcase our BAM framework, we develop two Bayesian linear classifier variants for two ranking approaches and derive their variational inference procedures. We perform validation experiments on four biomedical data sets to demonstrate the better predictive performance of our framework over its error-minimizing counterpart in terms of average AUROC and AUPR values.
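As a non-Bayesian illustration of the pairwise ranking idea that BAM builds on (a sketch, not the variational procedures from the paper), the snippet below fits a linear scorer by stochastic descent on a logistic surrogate of the pairwise AUC loss; the surrogate, learning rate and synthetic data are assumptions.

```python
import numpy as np

def pairwise_auc_surrogate_fit(X, y, n_iter=500, lr=0.1, seed=0):
    """Linear scorer trained on a logistic surrogate of the pairwise AUC loss:
    for a (positive i, negative j) pair, penalize log(1 + exp(-(s_i - s_j)))."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    for _ in range(n_iter):
        i, j = rng.choice(pos), rng.choice(neg)   # sample one pair per step
        diff = X[i] - X[j]
        grad = -diff / (1.0 + np.exp(w @ diff))   # gradient of the surrogate
        w -= lr * grad
    return w

def auroc(scores, y):
    """Empirical AUROC: fraction of (positive, negative) pairs ranked correctly."""
    pos, neg = scores[y == 1], scores[y == 0]
    return np.mean(pos[:, None] > neg[None, :])

# toy usage on an imbalanced synthetic data set
X = np.random.randn(200, 5)
y = (X[:, 0] + 0.3 * np.random.randn(200) > 1.0).astype(int)
w = pairwise_auc_surrogate_fit(X, y)
print(auroc(X @ w, y))
```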
In the literature, different topic models have been introduced that target the task of viewpoint extraction. Because these studies generally do not present thorough validations of the models they introduce, it is not clear in advance which topic modeling technique will work best for our use case of extracting the viewpoints of political parties from parliamentary proceedings. We argue that the usefulness of methods like topic modeling depends on whether they yield valid and reliable results on real-world data. This means that there is a need for validation studies. In this paper, we present such a study for an existing topic model for viewpoint extraction called cross-perspective topic modeling [11]. The model is applied to Dutch parliamentary proceedings, and the resulting topics and opinions are validated using external data. The results of our validation show that the model yields valid topics (content and criterion validity), and opinions with content validity. We conclude, first, that cross-perspective topic modeling is a promising technique for extracting political parties' positions from parliamentary proceedings. Second, by exploring a number of validation methods, we demonstrate that validating topic models is feasible, even without extensive domain knowledge.
Extending qualitative CSPs with the ability to restrict selected variables to finite sets of possible values has been proposed as an important research direction with significant applications. Complexity results for this kind of formalism have appeared in the literature, but they focus on concrete examples rather than on general principles. We propose three general methods. The first two are based on analysing the given CSP from a model-theoretic perspective, while the third is based on directly analysing the growth of the representation of solutions. We exemplify our methods on temporal and spatial formalisms, including Allen's algebra and RCC5.
Q-learning associates states and actions of a Markov Decision Process to expected future reward through online learning. In practice, however, when the state space is large and experience is still limited, the algorithm will not find a match between current state and experience unless some details describing states are ignored. On the other hand, reducing state information affects long term performance because decisions will need to be made on less informative inputs. We propose a variation of Q-learning that gradually enriches state descriptions, after enough experience is accumulated. This is coupled with an ad-hoc exploration strategy that aims at collecting key information that allows the algorithm to enrich state descriptions earlier. Experimental results obtained by applying our algorithm to the arcade game Pac-Man show that our approach significantly outperforms Q-learning during the learning process while not penalizing long-term performance.
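The general mechanism, state descriptions that start coarse and are refined once enough experience has accumulated, can be illustrated with plain tabular Q-learning; the gridworld, abstraction functions and visit threshold below are illustrative assumptions rather than the Pac-Man setup from the paper, and value transfer on enrichment is omitted.

```python
import random
from collections import defaultdict

def coarse(state):
    """Coarse description: bucket the position into 2x2 regions."""
    x, y = state
    return (x // 2, y // 2)

def fine(state):
    """Enriched description: the exact position."""
    return state

def env_step(state, action):
    """Toy 6x6 gridworld: reward 1 at the goal corner, 0 elsewhere."""
    x, y = state
    dx, dy = {'up': (0, 1), 'down': (0, -1), 'left': (-1, 0), 'right': (1, 0)}[action]
    x, y = min(max(x + dx, 0), 5), min(max(y + dy, 0), 5)
    done = (x, y) == (5, 5)
    return (x, y), (1.0 if done else 0.0), done

def q_learning_enriching(start=(0, 0), n_episodes=500, alpha=0.1,
                         gamma=0.95, eps=0.1, enrich_after=200):
    Q = defaultdict(float)                    # Q[(description, action)]
    visits = defaultdict(int)
    actions = ['up', 'down', 'left', 'right']
    for _ in range(n_episodes):
        s = start
        for _step in range(50):
            # use the enriched description once a region has enough experience
            desc = fine(s) if visits[coarse(s)] > enrich_after else coarse(s)
            visits[coarse(s)] += 1
            a = (random.choice(actions) if random.random() < eps
                 else max(actions, key=lambda act: Q[(desc, act)]))
            s2, r, done = env_step(s, a)
            desc2 = fine(s2) if visits[coarse(s2)] > enrich_after else coarse(s2)
            best_next = max(Q[(desc2, act)] for act in actions)
            Q[(desc, a)] += alpha * (r + gamma * best_next - Q[(desc, a)])
            s = s2
            if done:
                break
    return Q

Q = q_learning_enriching()
```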
Given a logic-based argumentation framework built over a knowledge base in a logical language and a query in that language, the query is universally accepted if it is entailed from all extensions. As shown in [2, 14], universal acceptance is different from skeptical acceptance, as a query may be entailed from different arguments distributed over all extensions but not necessarily from skeptically accepted ones. In this paper we provide a dialectical proof theory for universal acceptance in coherent logic-based argumentation frameworks. We prove its finiteness, soundness, completeness and consistency, and study its dispute complexity. We give an exact characterization of non-universal acceptance and provide an upper bound for universal acceptance.
Hashing has proved to be an attractive technique for fast nearest neighbor search over big data. Compared to projection-based hashing methods, prototype-based ones have a stronger capability of generating discriminative binary codes for data with complex inherent structure. However, our observation indicates that they still suffer from insufficient coding, as they usually utilize the complete set of binary codes in a hypercube. To address this problem, we propose an adaptive binary quantization method that learns a discriminative hash function with prototypes associated with small unique binary codes. Our alternating optimization adaptively discovers the prototype set and the code set of a varying size in an efficient way, and together they robustly approximate the data relations. Our method can be naturally generalized to the product space for long hash codes. We believe that our idea serves as a very helpful insight for hashing research. Extensive experiments on four large-scale (up to 80 million) datasets demonstrate that our method significantly outperforms state-of-the-art hashing methods, with relative performance gains of up to 58.84%.
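To convey the prototype-based coding idea in its simplest form (a sketch, not the paper's alternating optimization), the snippet below picks prototypes with k-means and hashes a point by the code of its nearest prototype; here the codes are just the binary representations of the prototype indices, whereas in the paper the code assignment is itself optimized so that Hamming distances reflect data relations. The number of prototypes and code length are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def train_prototype_hash(X, n_prototypes=8, seed=0):
    """Learn prototypes and attach a compact binary code to each one."""
    km = KMeans(n_clusters=n_prototypes, n_init=10, random_state=seed).fit(X)
    n_bits = int(np.ceil(np.log2(n_prototypes)))
    codes = np.array([[(p >> b) & 1 for b in range(n_bits)]
                      for p in range(n_prototypes)], dtype=np.uint8)
    return km.cluster_centers_, codes

def hash_points(X, prototypes, codes):
    """Binary code of a point = code of its nearest prototype."""
    d = ((X[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)
    return codes[d.argmin(axis=1)]

# toy usage
X = np.random.randn(1000, 16)
prototypes, codes = train_prototype_hash(X)
B = hash_points(X, prototypes, codes)       # (1000, 3) array of bits
print(B[:5])
```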
Materialization is an important reasoning service for applications built on the Web Ontology Language (OWL). To make materialization efficient in practice, current research focuses on deciding tractability of an ontology language and designing parallel reasoning algorithms. However, some well-known large-scale ontologies, such as YAGO, have been shown to have good performance for parallel reasoning, but they are expressed in ontology languages that are not parallelly tractable, i.e., the reasoning is inherently sequential in the worst case. This motivates us to study the problem of parallel tractability of ontology materialization from a theoretical perspective. That is, we aim to identify the ontologies for which materialization is parallelly tractable, i.e., in NC complexity. In this work, we focus on datalog rewritable ontology languages. We identify several classes of datalog rewritable ontologies (called parallelly tractable classes) such that materialization over them is parallelly tractable. We further investigate the parallel tractability of materialization of a datalog rewritable OWL fragment DHL (Description Horn Logic) and an extension of DHL that allows complex role inclusion axioms. Based on the above results, we analyze real-world datasets and show that many ontologies expressed in DHL or its extension belong to the parallelly tractable classes.
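For intuition about the object of study, the toy sketch below performs forward-chaining materialization for subclass-style datalog rules; the round-based fixpoint loosely mirrors the parallel-friendly structure the paper formalizes (few rounds of many independent inferences), but the rule format, fact encoding and example ontology are illustrative assumptions, not the DHL fragment itself.

```python
def materialize(facts, rules):
    """Naive forward-chaining materialization for rules of the form
    ('subClassOf', C, D): every instance of C is also an instance of D.
    Each round applies all rules to the current facts; the number of rounds
    until the fixpoint is a rough indicator of how parallel-friendly the
    ontology is."""
    facts = set(facts)
    rounds = 0
    while True:
        new = {('type', x, d)
               for (_, c, d) in rules
               for (pred, x, cc) in facts
               if pred == 'type' and cc == c} - facts
        if not new:
            return facts, rounds
        facts |= new
        rounds += 1

# toy usage with a YAGO-flavoured fact
facts = [('type', 'yago:Berlin', 'City')]
rules = [('subClassOf', 'City', 'Settlement'),
         ('subClassOf', 'Settlement', 'Location')]
closure, rounds = materialize(facts, rules)
print(rounds, sorted(closure))
```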
Gaussian Process Regression (GPR) is a powerful non-parametric method. However, GPR may perform poorly if the data are contaminated by outliers. To address this issue, we replace the Gaussian process with a Student-t process and introduce dependent Student-t noise, leading to a Student-t Process Regression with Dependent Student-t noise model (TPRD). Closed-form expressions for the marginal likelihood and predictive distribution of TPRD are derived. In addition, TPRD gives a probabilistic interpretation to the Student-t Process Regression with the noise incorporated into its Kernel (TPRK), which is a common approach to Student-t process regression. Moreover, we analyze the influence of different kernels. If the kernel meets a condition, called the β-property here, the maximum marginal likelihood estimate of TPRD's hyperparameters is independent of the degrees of freedom ν of the Student-t process, which implies that GPR, TPRD and TPRK have exactly the same predictive mean. Empirically, the degrees of freedom ν can be regarded as a convergence accelerator, meaning that TPRD with a suitable ν converges faster than GPR. If the kernel does not have the β-property, TPRD performs better than GPR without additional computational cost. These results are verified on benchmark datasets.
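For reference, under the common covariance parameterization of the multivariate Student-t distribution used in prior work on Student-t process regression, the log marginal likelihood of zero-mean observations y with kernel matrix K and degrees of freedom ν > 2 has the closed form shown below; the exact expressions derived in the paper for the dependent-noise model may differ in detail.

```latex
\log p(\mathbf{y}) =
  \log\Gamma\!\left(\tfrac{\nu+n}{2}\right)
  - \log\Gamma\!\left(\tfrac{\nu}{2}\right)
  - \tfrac{n}{2}\log\bigl((\nu-2)\pi\bigr)
  - \tfrac{1}{2}\log\lvert K\rvert
  - \tfrac{\nu+n}{2}\,
    \log\!\left(1 + \frac{\mathbf{y}^{\top} K^{-1}\mathbf{y}}{\nu-2}\right)
```

As ν grows, this expression approaches the Gaussian process log marginal likelihood, which is consistent with the role of ν described above.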
Link prediction is a key problem in social network analysis: it involves making suggestions about where to add new links in a network, based solely on the structure of the network. We address a special case of this problem in which the new links are supposed to connect different communities in the network; we call it the inter-links prediction problem. This is particularly challenging, as there are typically very few links between different communities. To solve this problem, we propose a local node-similarity measure inspired by the Owen-value interaction index, a concept developed in cooperative game theory and fuzzy systems. Although this index requires an exponential number of operations in the general case, we show that our local node-similarity measure is computable in polynomial time. We apply our measure to the inter-links prediction problem in a number of real-life networks, and show that it outperforms all other local similarity measures in the literature.
Most existing word embedding models only consider the relationships between words and their local contexts (e.g. the ten words around the target word). However, information beyond local contexts (global contexts), which reflects the rich semantic meanings of words, is usually ignored. In this paper, we present a general framework for utilizing global information to learn word and text representations. Our models can be easily integrated into existing local word embedding models, and thus introduce global information to varying degrees according to different downstream tasks. Moreover, we view our models from the co-occurrence matrix perspective, based on which a novel weighted term-document matrix is factorized to generate text representations. We conduct a range of experiments to evaluate the word and text representations learned by our models. Experimental results show that our models outperform or compete with state-of-the-art models. Source code of the paper is available at https://github.com/zhezhaoa/cluster-driven.
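As a minimal illustration of the co-occurrence matrix view (a sketch, not the paper's model), the snippet below builds a TF-IDF-weighted term-document matrix and factorizes it with a truncated SVD to obtain dense text and word representations; the weighting scheme, dimensionality and example documents are assumptions.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

docs = ["the cat sat on the mat",
        "dogs and cats are pets",
        "stock markets fell sharply today",
        "investors worry about the markets"]

# Weighted term-document matrix (documents x terms).
tdm = TfidfVectorizer().fit_transform(docs)

# Low-rank factorization: rows of the result are dense text representations.
svd = TruncatedSVD(n_components=2, random_state=0)
doc_vectors = svd.fit_transform(tdm)        # (4, 2) text representations
term_vectors = svd.components_.T            # (|vocab|, 2) word representations

# Documents about the same theme end up close in the latent space.
print(np.round(doc_vectors @ doc_vectors.T, 2))
```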
We investigate how incremental learning of long-term human activity patterns improves the accuracy of activity classification over time. Rather than trying to improve the classification methods themselves, we assume that they can take into account prior probabilities of activities occurring at a particular time. We use the classification results to build temporal models that can provide these priors to the classifiers. As our system gradually learns about typical patterns of human activities, the accuracy of activity classification improves, which results in even more accurate priors. Two datasets, collected over several months and containing hand-annotated activities in residential and office environments, were chosen to evaluate the approach. Several types of temporal models were evaluated for each of these datasets. The results indicate that incremental learning of daily routines leads to a significant improvement in activity classification.
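The core idea, multiplying a classifier's likelihoods by time-dependent priors learned incrementally from past classifications, can be sketched as follows; the hour-of-day histogram prior and Laplace smoothing are assumptions, simpler than the temporal models evaluated in the paper.

```python
import numpy as np
from collections import defaultdict

class TemporalPrior:
    """Learns P(activity | hour of day) incrementally from past classifications
    and combines it with classifier likelihoods via Bayes' rule."""

    def __init__(self, activities):
        self.activities = activities
        self.counts = defaultdict(lambda: np.ones(len(activities)))  # Laplace smoothing

    def update(self, hour, activity):
        self.counts[hour][self.activities.index(activity)] += 1

    def prior(self, hour):
        c = self.counts[hour]
        return c / c.sum()

    def classify(self, hour, likelihoods):
        posterior = np.asarray(likelihoods) * self.prior(hour)
        return self.activities[int(np.argmax(posterior))]

# usage: an ambiguous classifier output is disambiguated by the learned routine
model = TemporalPrior(['sleeping', 'cooking', 'working'])
for _ in range(20):
    model.update(hour=7, activity='cooking')     # a morning routine
print(model.classify(hour=7, likelihoods=[0.35, 0.34, 0.31]))  # -> 'cooking'
```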
The Leader-Follower Markov Decision Processes (LF-MDP) framework extends both Markov Decision Processes (MDP) and Stochastic Games. It provides a model where an agent (the leader) can influence a set of other agents (the followers) which are playing a stochastic game, by modifying their immediate reward functions, but not their dynamics. It is assumed that all agents act selfishly and try to optimize their own long-term expected reward. Finding equilibrium strategies in an LF-MDP is hard, especially when the joint state space of the followers is factored: in this case, it takes time exponential in the number of followers. Our theoretical contribution is threefold. First, we analyze a natural assumption (substitutability of followers), which holds in many applications. Under this assumption, we show that an LF-MDP can be solved exactly in polynomial time, when deterministic equilibria exist for all games encountered in the LF-MDP. Second, we show that an additional assumption of sparsity of the problem dynamics allows us to decrease the exponent of the polynomial. Finally, we present a state-aggregation approximation, which decreases the exponent further and allows us to approximately solve large problems. We empirically validate the LF-MDP approach on a class of realistic animal disease control problems. For problems of this class, we find deterministic equilibria for all games. Using our first two results, we are able to solve the exact LF-MDP problem with 15 followers (compared to 6 or 7 in the original model). Using state aggregation, problems with up to 50 followers can be solved approximately. The approximation quality is evaluated by comparison with the exact approach on problems with 12 and 15 followers.
We consider Bucket Elimination (BE), a popular algorithmic framework to solve Constraint Optimisation Problems (COPs). We focus on the parallelisation of the most computationally intensive operations of BE, i.e., join sum and maximisation, which are key ingredients in several close variants of the BE framework (including Belief Propagation on Junction Trees and Distributed COP techniques such as ActionGDL and DPOP). In particular, we propose CUBE, a highly parallel GPU implementation of such operations, which adopts an efficient memory layout allowing all threads to independently locate their input and output addresses in memory, hence achieving a high computational throughput. We compare CUBE with the most recent GPU implementation of BE. Our results show that CUBE achieves significant speed-ups (up to two orders of magnitude) w.r.t. the counterpart approach, and a dramatic decrease of the runtime w.r.t. the serial version (i.e., up to 652× faster). More importantly, these speed-ups increase as the complexity of the problem grows, showing that CUBE correctly exploits the additional degree of parallelism inherent in the problem.
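For readers unfamiliar with the two operations that CUBE parallelises, the following CPU sketch (using numpy broadcasting rather than a GPU memory layout) shows join sum and maximisation on small cost tables; the variable names and table shapes are illustrative assumptions.

```python
import numpy as np

def join_sum(f, g, f_vars, g_vars):
    """Join two cost tables by summing them over their shared variables.
    Tables are numpy arrays with one axis per variable, in the order given."""
    all_vars = f_vars + [v for v in g_vars if v not in f_vars]
    # f's variables form a prefix of all_vars, so trailing singleton axes suffice
    f_exp = f.reshape(f.shape + (1,) * (len(all_vars) - len(f_vars)))
    # move g's axes to their positions in the joined variable order
    g_exp = np.moveaxis(
        g.reshape(g.shape + (1,) * (len(all_vars) - len(g_vars))),
        list(range(len(g_vars))), [all_vars.index(v) for v in g_vars])
    return f_exp + g_exp, all_vars

def maximise_out(f, f_vars, var):
    """Eliminate a variable by maximisation (the BE elimination step)."""
    axis = f_vars.index(var)
    return f.max(axis=axis), [v for v in f_vars if v != var]

# toy usage: two binary-variable cost tables f(x, y) and g(y, z)
f = np.array([[1.0, 2.0], [0.0, 5.0]])
g = np.array([[2.0, 1.0], [3.0, 0.0]])
joined, vars_ = join_sum(f, g, ['x', 'y'], ['y', 'z'])
bucket, vars_ = maximise_out(joined, vars_, 'y')
print(vars_, bucket)        # resulting table over ['x', 'z']
```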
Future smart grids will empower home owners to buy energy from real-time markets, coalesce into energy cooperatives, and sell energy they generate from their local renewable energy sources. Such interactions by large numbers of small prosumers (who both consume and produce) will engender potentially unpredictable fluctuations in energy prices, which could be detrimental to all actors in the system. Hence, in this paper, we propose negotiation mechanisms to orchestrate such interactions, as well as pricing mechanisms to help stabilise energy prices on multiple time scales. We then prove 1) that our solution guarantees that price fluctuations are constrained, 2) that it is individually rational for agents to join energy cooperatives, and 3) that the negotiation mechanisms we employ result in Pareto-optimal solutions.
Classical logic based argumentation (ClAr) characterises single-agent non-monotonic reasoning and enables distributed non-monotonic reasoning amongst agents in dialogues. However, features of ClAr that have been shown sufficient to ensure satisfaction of the rationality postulates preclude its use by resource-bounded agents reasoning individually, or dialectically in real-world dialogue. This paper provides a new formalisation of ClAr that is both suitable for such uses and satisfies the rationality postulates. We illustrate by providing a rational dialectical characterisation of Brewka's non-monotonic Preferred Subtheories defined under the assumption of restricted inferential capabilities.
When attempting to persuade an agent to believe (or disbelieve) an argument, it can be advantageous for the persuader to have a model of the persuadee. Models have been proposed for taking account of what arguments the persuadee believes, and these can be used in a strategy for persuasion. However, there can be uncertainty as to the accuracy of such models. To address this issue, this paper introduces a two-dimensional model that accounts for the uncertainty of belief by a persuadee and for the confidence in that uncertainty evaluation. This gives a better model for using lotteries, in which the outcomes involve statements about what the user believes or disbelieves, and the confidence value is the degree to which the user does indeed hold those outcomes (a more refined and more natural model than the one found in [19]). This framework is also extended with a model of the risk of disengagement by the persuadee.
Deep networks such as autoencoders and deep belief nets are able to construct alternative, and often informative, representations of unlabeled data by searching for (hidden) structure and correlations between the features chosen to represent the data, and combining them into new features that allow sparse representations of the data. These representations have been shown to often increase the accuracy of subsequent classification or regression when compared to the original, often human-chosen, representations. In this work, we investigate the relation between such discovered representations found using related but differently represented sets of examples. To this end, we combine the cross-domain comparison capabilities of unsupervised manifold alignment with the unsupervised feature construction of deep belief nets, resulting in an example mapping function that allows re-encoding examples from any source to any target task. Using the t-Distributed Stochastic Neighbour Embedding technique to map translated and real examples to a lower-dimensional space, we employ the KL-divergence to define a dissimilarity measure between data sets, enabling us to measure the similarity of the found representations across domains.
We consider a wide spectrum of regularized stochastic minimization problems, where the regularization term is composed with a linear function. Examples of this formulation include graph-guided regularized minimization, generalized Lasso and a class of ℓ1 regularized problems. The computational challenge is that the closed-form solution of the proximal mapping associated with the regularization term is not available due to the imposed linear composition. Fortunately, the structure of the regularization term allows us to reformulate it as a new convex-concave saddle point problem which can be solved using the Primal-Dual Hybrid Gradient (PDHG) approach. However, this approach may be inefficient in realistic applications as computing the full gradient of the expected objective function could be very expensive when the number of input data samples is considerably large. To address this issue, we propose a Stochastic PDHG (SPDHG) algorithm with either uniformly or non-uniformly averaged iterates. Through uniformly averaged iterates, the SPDHG algorithm converges in expectation with
In this work we address the problem of coordinated consumption shifting for electricity prosumers. We show that individual optimization with respect to electricity prices does not always lead to minimized costs, thus necessitating a cooperative approach. A prosumer cooperative employs an internal cryptocurrency mechanism for coordinating members' decisions and distributing the collectively generated profits. The mechanism generates cryptocoins in a distributed fashion and awards them to participants according to various criteria, such as contribution impact and the accuracy between stated and final shifting actions. In particular, when a scoring-rules-based distribution method is employed, participants are incentivized to be accurate. When tested on a large dataset with real-world production and consumption data, our approach is shown to provide incentives for accurate statements and increased economic profits for the cooperative.
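To illustrate the accuracy-based distribution idea under simplified assumptions (a sketch, not the paper's full scoring-rule and cryptocoin mechanism), the snippet below awards coins in proportion to a quadratic, Brier-style accuracy score of how close each member's stated shift was to the shift actually delivered; the score shape, error cap and example figures are assumptions.

```python
def quadratic_score(stated, actual, max_error=5.0):
    """Brier-style accuracy score in [0, 1]: 1 for a perfect statement,
    decreasing quadratically with the (capped) error."""
    err = min(abs(stated - actual), max_error) / max_error
    return 1.0 - err ** 2

def distribute_coins(statements, deliveries, total_coins=100.0):
    """Split newly generated cryptocoins among members according to their
    accuracy scores, so accurate statements earn a larger share."""
    scores = {m: quadratic_score(statements[m], deliveries[m])
              for m in statements}
    total = sum(scores.values()) or 1.0
    return {m: total_coins * s / total for m, s in scores.items()}

# usage: member 'a' delivered exactly what it stated, 'b' missed by 2 kWh
statements = {'a': 3.0, 'b': 4.0}    # stated consumption shift (kWh)
deliveries = {'a': 3.0, 'b': 2.0}    # shift actually delivered (kWh)
print(distribute_coins(statements, deliveries))
```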
Cost-optimal planning has become a very well-studied topic within planning. Needless to say, cost-optimal planning has proven to be computationally hard both theoretically and in practice. Since cost-optimal planning is an optimisation problem, it is natural to analyse it from an approximation point of view. Even though such studies may be valuable in themselves, additional motivation is provided by the fact that there is a very close link between approximability and the performance of heuristics used in heuristic search. The aim of this paper is to analyse approximability (and, indirectly, the performance of heuristics) with respect to lower time bounds. That is, we are not content with merely classifying problems into complexity classes; we also study their time complexity. This is achieved by replacing standard complexity-theoretic assumptions (such as P ≠ NP) with the exponential time hypothesis (ETH). This enables us to analyse, for instance, the performance of the h+ heuristic and to obtain general trade-off results that correlate approximability bounds with bounds on time complexity.
A realistic model of multi-agent planning must allow us to formalize notions which are absent from classical planning, such as communication and knowledge. We investigate multi-agent planning based on a simple logic of knowledge that is grounded in the visibility of propositional variables. Using such a formal logic allows us to prove the existence of a plan given the description of the individual actions. We present an encoding of multi-agent planning problems expressed in this logic into the standard planning language PDDL. The solvability of a planning task is reduced to a model checking problem in a dynamic extension of our logic, from which we establish its complexity. Feeding the resulting problem into a PDDL planner provides a provably correct plan for the original multi-agent planning problem. We apply our method to several examples, such as the gossip problem.