Clustering is an important data mining problem. Many clustering algorithms have been proposed, but most of them deal with numerical data. In this paper, we propose a new approach to clustering numerical data, named RCND (Random Based Clustering for Numerical Data). We suggest clustering data regardless of their distribution.
Controlling food intake is important to tackle obesity. This is achievable by developing apps that automatically classify foods and estimate their calories. However, food classification is hard, since foods are highly deformable and variable. The key to solving this problem is to find an appropriate representation for foods. In this paper, we propose a Convolutional Neural Network for representing and classifying foods. Our ConvNet differs from common ConvNet architectures in that it uses spatial pyramid pooling and directly feeds the information from the middle layers to the fully connected layer. Our experiments show that while the best-performing hand-crafted feature correctly classifies only 40.95% of the test samples, our ConvNet classifies them with 79.10% accuracy. In addition, it achieves 94% top-5 accuracy on the test set. Finally, we show that spatial pyramid pooling has a significant impact on the accuracy of our ConvNet.
Emran Saleh, Antonio Moreno, Aida Valls, Pedro Romero-Aroca, Sofia de La Riva-Fernandez
169 - 174
Diabetic retinopathy is an ophthalmic malady that is the major cause of blindness in diabetic patients. Early detection is important to minimize the risk of vision loss. A screening of the eye fundus can confirm the disease and its severity, but this test is costly and time-consuming. In this work, we propose a decision support system that uses fuzzy random forests to analyze the clinical data of each patient in order to detect any sign of developing diabetic retinopathy and to determine the necessity of the screening. The combination of fuzzy sets and a classifier ensemble for the detection of diabetic retinopathy achieves high sensitivity and specificity scores, improving on the results obtained when using a single decision tree.
Vicenç Gómez, Mohammad Gheshlaghi Azar, Hilbert J. Kappen
177 - 186
We consider the problem of inferring connectivity from time-series data in the presence of time-dependent common input originating from non-measured variables. We analyze a simple method to filter out the influence of such confounding variables in multivariate auto-regressive (MVAR) models. The method learns the parameters of an extended MVAR model with latent variables. Using synthetic MVAR models, we characterize where connectivity reconstruction is possible and useful, and show that regularization is convenient when the common input has a strong influence. We also illustrate how the method can be used to correct partial directed coherence, a causality measure often used in the neuroscience community.
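The connectivity-inference step can be illustrated with a minimal sketch that simulates a two-variable VAR(1) process and recovers its coefficient matrix by ordinary least squares. The paper's extension with latent common input is not reproduced here; the dimensions, noise level and variable names below are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)
A_true = np.array([[0.6, 0.2],    # x2 influences x1 ...
                   [0.0, 0.5]])   # ... but x1 does not influence x2
T = 5000
x = np.zeros((T, 2))
for t in range(1, T):
    x[t] = A_true @ x[t - 1] + 0.1 * rng.standard_normal(2)

# Ordinary least squares: regress each observation on its lagged value.
X, Y = x[:-1], x[1:]
B, *_ = np.linalg.lstsq(X, Y, rcond=None)  # solves X @ B ≈ Y
A_hat = B.T                                # each row of Y is x_{t-1} @ B
```

With enough samples, the zero entry of `A_true` is recovered as a near-zero estimate, which is exactly the kind of absent connection that unmeasured common input can spuriously fill in.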
Kemo Adrian, Paula Chocron, Roberto Confalonieri, Xavier Ferrer, Jesús Giráldez-Cru
187 - 196
Studying the prediction of new links in evolutionary networks is a captivating question that has received interest from different disciplines. Link prediction makes it possible to recover missing information and evaluate network dynamics. Some algorithms that tackle this problem with good performance are based on the sociability index, a measure of node interactions over time. In this paper, we present a case study of this predictor on the evolutionary graph that represents the CCIA co-authorship network from 2005 to 2015. Moreover, we present a generalized version of this sociability index that takes into account the time at which such interactions occur. We show that this new index outperforms existing predictors. Finally, we use it to predict new co-authorships for CCIA 2016.
The analysis of opinions on social networks has recently received considerable attention in many application fields. Although many specialized and generalist social networks exist, Twitter is nowadays one of the most widely used to share and criticize relevant news, and the response of citizens to news and events on Twitter is frequently taken as an indicator of their social interest. In order to understand which opinions Twitter users most widely accept and reject in different domains, in a recent work we developed an analysis system based on Valued Abstract Argumentation to model and reason about Twitter discussions under different schemes for weighting tweet relevance. The argumentative model computes the set of socially accepted tweets in a discussion by taking into account the weight assigned to each tweet and the (possible) criticism between the discussion tweets. In this paper, we propose to go one step further by considering not only the criticism between tweets but also the support between them. Our approach is not based on explicitly computing indirect attacks between tweets, but on re-evaluating tweet relevance through the spread of the supporting tweets' weights. To validate this new extension, we analyze several real Twitter discussions from the political domain and compare the results with those obtained in our previous work, which allows us to evaluate how support relations can modify the set of accepted tweets.
The Hurst exponent is the only real number required to describe a type of stochastic process known as fractional Brownian motion, which is employed to model time series of financial origin. The Hurst exponent can also be taken as a measure of the long-term memory of a time series. In this work we analyze four daily price time series (Open, Close, High and Low) of several American and European stock indices. There are very few studies of the characteristics of daily High and Low time series; however, an empirical, in-depth, comparative study of all four series reveals some consistent patterns at the day-to-day time scale. In all the cases considered, the Hurst exponents of the High and Low time series are appreciably higher than those obtained for Open and Close. Our analysis indicates that High and Low index values are more persistent (positively auto-correlated) and, therefore, more predictable than Open and Close values, whose time series fluctuate between persistence and anti-persistence (negative auto-correlation) and in the long term have a Hurst exponent close to 0.5, characteristic of a random-walk process.
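As a hedged illustration of the estimation step (not necessarily the authors' exact procedure), the Hurst exponent of a return series can be computed with classical rescaled-range (R/S) analysis; the function name, chunk scheme and sample data below are illustrative choices.

```python
import numpy as np

def hurst_rs(series, min_chunk=16):
    """Estimate the Hurst exponent of a 1-D series via rescaled-range analysis."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    sizes, rs_means = [], []
    size = min_chunk
    while size <= n // 2:
        rs = []
        for start in range(0, n - size + 1, size):
            chunk = series[start:start + size]
            dev = np.cumsum(chunk - chunk.mean())      # cumulative deviations
            s = chunk.std()
            if s > 0:
                rs.append((dev.max() - dev.min()) / s)  # rescaled range
        sizes.append(size)
        rs_means.append(np.mean(rs))
        size *= 2
    # H is the slope of log(R/S) against log(chunk size)
    slope, _intercept = np.polyfit(np.log(sizes), np.log(rs_means), 1)
    return slope

rng = np.random.default_rng(0)
# White-noise increments: H should be near 0.5 (small-sample R/S analysis
# is known to bias the estimate slightly above it).
h = hurst_rs(rng.standard_normal(4096))
```

A persistent series (H > 0.5), such as daily High or Low returns in the abstract's findings, would yield a steeper log-log slope than this white-noise baseline.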
Iván Paz, Àngela Nebot, Enrique Romero, Francisco Mugica
213 - 220
The present work describes an algorithm for the human perceptual exploration of the parameter spaces of algorithmic composition systems. It treats the values of the system parameters, together with the user's perceptual evaluation of the system output for each parameter configuration, as input-output relations. The algorithm then iteratively searches the data for combinations of parameters with the same classification (evaluation) that differ only in the value of one parameter. If the absolute difference between them is less than a pre-established threshold, the combinations are compressed into one rule. The rules have the standard if-then form. As the parameters commonly express different physical dimensions (such as amplitude and frequency), the threshold is set independently for each one. Finally, an example applied to the parameter space of a band-limited impulse oscillator is presented.
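The pairwise compression step described above can be sketched as follows. The data layout, function name and example values are assumptions for illustration, not the authors' implementation.

```python
def compress(examples, thresholds):
    """Greedily merge pairs of equally-evaluated parameter settings that
    differ in exactly one parameter by less than that parameter's threshold.

    examples: list of (params, label), where params is a tuple of floats.
    thresholds: one threshold per parameter (parameters may live on very
    different physical scales, e.g. amplitude vs. frequency).
    Returns if-then rules as (list of (low, high) intervals, label).
    """
    rules, used = [], set()
    for i, (p, label) in enumerate(examples):
        for j in range(i + 1, len(examples)):
            q, other = examples[j]
            if label != other or i in used or j in used:
                continue
            differing = [k for k in range(len(p)) if p[k] != q[k]]
            if len(differing) == 1:
                k = differing[0]
                if abs(p[k] - q[k]) < thresholds[k]:
                    rules.append(([(min(a, b), max(a, b))
                                   for a, b in zip(p, q)], label))
                    used.update({i, j})
    for i, (p, label) in enumerate(examples):  # unmerged settings stay as point rules
        if i not in used:
            rules.append(([(v, v) for v in p], label))
    return rules

# (amplitude, frequency) settings rated by a listener -- invented values.
examples = [((0.5, 440.0), "good"), ((0.6, 440.0), "good"), ((0.9, 880.0), "bad")]
rules = compress(examples, thresholds=(0.2, 50.0))
```

Here the two "good" settings differ only in amplitude (by 0.1, below the 0.2 threshold), so they collapse into one interval rule, while the "bad" setting survives as a point rule.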
Paulo N. Carrillo, Clara I. Peña, Josep Ll. de La Rosa
221 - 226
With the popularization of online games and social networks, virtual currencies have boomed as an alternative payment solution, better adapted to the particular needs of exchanging goods or virtual services, and offering faster, more secure and lower-cost transactions of value. This work proposes a case study to create the Eurakos Next cryptocurrency, based on a virtual currency named Eurakos which currently works using digital contracts. The idea is to allow agreements to be signed by two peers and validated by other peers in a mobile social community network, using Smart Contracts through the Ethereum framework to take advantage of Blockchain technology.
Atia Cortés, Javier Béjar, Cristian Barrué, Antonio B. Martínez, Ulises Cortés
227 - 232
The Ten Meter Walking Test (10MWT) has been widely used in the rehabilitation literature as an indicator of physical decline and other health-related outcomes. With an increasing senior population, it is important to analyze and estimate physical limitations in older people to prevent falls and their consequences, not only for the individual's benefit but also for their social environment and the sustainability of public health-care systems. The 10MWT as measured today yields only speed values. This paper introduces the sensing capabilities of the i-Walker and its use in measuring the 10MWT. The volunteers in this study are a subset of the participants in the pilots run during the I-DONT FALL EU-funded project. This paper also proposes a Machine Learning method for analyzing individuals' walking ability and risk of falling by using an instrumented smart walker: the i-Walker.
Xerxes D. Arsiwalla, Ivan Herreros, Clément Moulin-Frier, Marti Sanchez, Paul Verschure
233 - 238
Understanding the nature of consciousness has been an outstanding scientific puzzle at the crossroads of neuroscience and artificial intelligence. While brains have long been known to be the bearers of consciousness, and machines those of computation, the history of cybernetics is full of attempts to synthesize consciousness in computational architectures. In recent years, ideas from control theory have proven extremely useful for addressing systems-level questions in neuroscience and designing cognitive architectures. Extending these ideas to the study of consciousness, we discuss the core functions of consciousness and the control-architecture specifications of agents capable of operationalizing these functionalities. We suggest that evolutionary pressures on the social dynamics of interacting agents lead to the emergence of consciousness, understood as a process for predicting the intentional states of other agents (and of the self) in order to generate the social cooperative and competitive behaviors necessary to optimize an agent's survival drives in a world with limited resources.
Francesco Barbieri, Luis Espinosa-Anke, Horacio Saggion
239 - 244
Emojis are small images that are naturally combined with free text to visually complement or condense the meaning of a message. The set of available emojis is fixed, irrespective of a user's location; however, their interpretation and the way they are used may vary. In this paper, we compare the meaning and usage of emojis across two Spanish cities: Barcelona and Madrid. Our results suggest that the overall semantics of the subset of emojis we studied is preserved across these cities. However, some of them are interpreted differently, which suggests that there may exist cultural differences between the inhabitants of Barcelona and Madrid, and that these are reflected in how they communicate on social networks.
Jordi Sabater-Mir, Carlos Palma-Zurita, Joan Cuadros-Oller
245 - 250
There is an important corpus of research on cognitive architectures, but they tend to overlook aspects like agent motivations and social behaviour. In this article we present a first step towards a cognitive architecture with advanced social capabilities. We propose a mechanism that allows the agent to activate goals according to its basic needs, following a well-known motivational theory.
Jennifer Nguyen, Germán Sánchez-Hernández, Núria Agell, Xari Rovira, Cecilio Angulo
253 - 262
Assigning papers to reviewers is a large, long and difficult task for conference chairs and scientific committees. The reviewer assignment problem is a multi-agent problem that requires understanding reviewer expertise and paper topics for the matching process. This paper proposes to elaborate on the variables used to compute reviewer expertise and to aggregate multiple factors to find the fittest combination of reviewers for each paper. Expertise information is gathered implicitly from publicly available information, and a reviewer profile is generated automatically. An OWA (Ordered Weighted Average) aggregation function is used to summarize information coming from different sources and rank the candidate reviewers for each paper. General constraints of the RAP (Reviewer Assignment Problem) have been incorporated into a real case example: (i) conflicts of interest between reviewers and authors should be avoided, (ii) each paper must have a minimum number of reviewers, and (iii) each reviewer's load cannot exceed a certain number of papers.
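An OWA operator, as used here to fuse evidence from several sources into a single ranking score, can be sketched in a few lines; the candidate names, scores and weights are invented for illustration.

```python
def owa(scores, weights):
    """Ordered Weighted Average: the weights apply to the scores sorted in
    descending order, not to any fixed source, so the same operator can
    emphasise the strongest, weakest or middling evidence."""
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))

# Rank two candidate reviewers for one paper, each scored by three evidence
# sources; these weights favour the strongest evidence (illustrative values).
candidates = {"reviewer_a": [0.9, 0.2, 0.6], "reviewer_b": [0.5, 0.5, 0.5]}
ranking = sorted(candidates,
                 key=lambda r: owa(candidates[r], (0.5, 0.3, 0.2)),
                 reverse=True)
```

With these weights, `reviewer_a` scores 0.5·0.9 + 0.3·0.6 + 0.2·0.2 = 0.67, beating the uniformly mediocre `reviewer_b` at 0.5; a different weight vector could reverse that preference.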
Javier Segovia-Aguas, Jonathan Ferrer-Mestres, Anders Jonsson
263 - 272
In this paper we present a framework called Planning with Partially Specified Behaviors, or PPSB, for combining reinforcement learning and planning to solve sequential decision problems. Although the two are not often combined, we show that reinforcement learning and planning complement each other well, in that each can take advantage of the strengths of the other. PPSB uses partial action specifications to decompose sequential decision problems into tasks that serve as an interface between reinforcement learning and planning. On the bottom level, we use reinforcement learning to compute policies for achieving each individual task. On the top level, we use planning to produce a sequence of tasks that achieves an overall goal. We validate PPSB in experiments in which a robot has to perform tasks in a realistic simulated environment.
In this paper we introduce EventAware, a context-aware mobile recommender system that personalizes the agenda of users attending a congress. In particular, we first introduce the EventAware system, which includes an intuitive user interface with an attractive design to enhance the user experience. EventAware incorporates implicit contextual information, automatically initializes both the users' profiles (with minimal user interaction) and the properties of the items, and uses a context-aware tag-based recommender algorithm. EventAware has been specifically crafted to assist users attending a congress by providing them with smart, personalized recommendations of sessions and exhibitors during the congress. We demonstrate its usability through a live-user case study at one of the biggest mobile technology events in the world, held in Barcelona.
In the past few years, online services have become the most important tools for interaction between people. New technologies allow instant communication with anyone in the world, at any time. The Web is full of new social applications for communication that are powerful and world-changing. This new age of social systems on the Web, governed by both computational and social processes, is growing and evolving quickly. It is essential to strengthen the cooperation between Web and AI researchers to build smarter systems together, with the aim of amplifying people's capabilities on social networks. This paper is about the definition of a Web platform for discussion and consensus achievement where humans and machines interact. Thus, the document aims to demonstrate how an environment where AI and users share information may enrich all partners, and how current Web mining capabilities can be enhanced from a social perspective.
Luis Espinosa-Anke, Sergio Oramas, José Camacho-Collados, Horacio Saggion
291 - 296
Lexical taxonomies are trees or directed acyclic graph-like structures in which each node represents a concept and each edge encodes a binary hypernymic (is-a) relation. These lexical resources are useful for AI tasks like Information Retrieval or Machine Translation. Two main trends exist in the construction and exploitation of these resources: on the one hand, general-purpose taxonomies like WordNet, and on the other, domain-specific databases such as the CheBi chemical ontology or MusicBrainz in the music domain. In both cases, these resources are based on finding correct hypernymic relations between pairs of concepts. In this paper, we propose a generic framework for hypernym discovery, based on exploiting linear relations between (term, hypernym) pairs in Wikidata, and apply it to the music domain. Our promising results, based on several metrics used in Information Retrieval, show that in several cases we are able to discover the correct hypernym for a given novel term.
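One common way to exploit linear relations between (term, hypernym) pairs, sketched here with random toy vectors rather than the paper's actual embeddings and Wikidata training data, is to fit a linear projection by least squares and then retrieve the nearest candidate for a novel term by cosine similarity.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 16
# Stand-in embeddings; a real system would use distributional vectors.
vocab = {w: rng.standard_normal(dim) for w in
         ["jazz", "rock", "genre", "guitar", "piano", "instrument"]}
train_pairs = [("jazz", "genre"), ("guitar", "instrument"),
               ("piano", "instrument")]

# Fit a single linear map Phi with least squares: term @ Phi ≈ hypernym.
T = np.stack([vocab[t] for t, _ in train_pairs])
H = np.stack([vocab[h] for _, h in train_pairs])
Phi, *_ = np.linalg.lstsq(T, H, rcond=None)  # solves T @ Phi ≈ H

def discover_hypernym(term):
    """Project the term through Phi and return the closest candidate by cosine."""
    pred = vocab[term] @ Phi

    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    return max((w for w in vocab if w != term), key=lambda w: cos(pred, vocab[w]))
```

On the training pairs the fit is exact, so their hypernyms are recovered; for a genuinely novel term like "rock", the quality of the prediction depends entirely on how regular the hypernymy relation is in the embedding space.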
Sayyed-Ali Hossayni, Mohammad-R Rajati, Esteve Del Acebo, Diego Reforgiato Recupero, Aldo Gangemi, Mohammad-R Akbarzadeh-T, Josep Lluis de La Rosa I Esteva
297 - 302
WordNet-like Lexical Databases (WLDs) group English words into synsets and are utilized in several text mining applications. Synsets have also been criticized because, while synset members (word senses) are in practice treated as peers, in theory not all of them represent the synset meaning to the same degree. In response to this criticism, fuzzy synsets (which treat synsets as fuzzy sets) have been proposed. In this study, we show why standard fuzzy synsets do not adequately model membership uncertainty, and propose an upgraded version in which membership degrees are represented by intervals (similarly to Interval Type-2 Fuzzy Sets). We present an algorithm for constructing the interval fuzzy version of the WLDs of a language, given a sufficiently large multi-contextual corpus of documents and a sufficiently precise word-sense disambiguation (WSD) system for that language. Using this algorithm, we produced interval fuzzy synsets for English WordNet (for sufficiently frequent synsets). For evaluation, we compared the results with crowdsourced data, asking people to rate the min/max compatibility degree of the word senses of a synset with its definition. The comparisons showed promising accuracy. The algorithm has the drawback of being applicable only to synsets whose word senses occur frequently enough in all corpus categories; this drawback will be addressed in future work.
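The interval construction can be illustrated with a minimal sketch. It assumes, hypothetically, that the membership interval of each word sense comes from its per-category relative frequencies as tagged by a WSD system; the function name, data and counts are invented for illustration and need not match the authors' algorithm.

```python
from collections import Counter

def interval_memberships(occurrences, categories):
    """For one synset, turn WSD-tagged occurrences into interval memberships:
    compute each sense's relative frequency per corpus category, then take
    [min, max] over categories as its membership interval."""
    per_cat = {c: Counter() for c in categories}
    for cat, sense in occurrences:
        per_cat[cat][sense] += 1
    intervals = {}
    for sense in {s for _, s in occurrences}:
        freqs = []
        for c in categories:
            total = sum(per_cat[c].values())
            if total:
                freqs.append(per_cat[c][sense] / total)
        intervals[sense] = (min(freqs), max(freqs))
    return intervals

# Two corpus categories, two senses of one synset (invented counts).
obs = [("news", "car"), ("news", "car"), ("news", "auto"),
       ("fiction", "car"), ("fiction", "auto")]
iv = interval_memberships(obs, ["news", "fiction"])
```

The width of each interval reflects how much a sense's prominence varies across contexts, which is exactly the membership uncertainty that a single scalar degree cannot express.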
Authorship attribution deals with predicting the author of a (usually written) discourse. This is highly relevant to a number of applications, including plagiarism detection, authenticity verification and deception detection. So far, most state-of-the-art approaches to authorship attribution rely mainly on lexical and token (sequence) distribution features. But this neglects numerous linguistic studies that clearly indicate the high relevance of syntactic features to the characterization of an author's personal style. In an experiment with 26 authors, we show that the use of syntactic features indeed helps us achieve an accuracy above 77%.
Simon Mille, Miguel Ballesteros, Alicia Burga, Gerard Casamayor, Leo Wanner
309 - 314
We present work in progress that tackles the problem of multilingual text summarization using semantic representations. As opposed to extractive summarization, in which text fragments are selected and a summary is assembled from them, our abstractive summarizer is based on abstract linguistic structures obtained from an analysis pipeline of disambiguation, syntactic and semantic parsing tools. The resulting structures are stored in a semantic repository, from which a text planning component produces content plans that go through a multilingual generation pipeline that eventually returns text in English, Spanish, French and/or German. We focus on the multilingual generation part of the problem.