
Ebook: New Directions in Neural Networks

The book is a collection of selected papers from the 18th WIRN workshop, the annual meeting of the Italian Neural Networks Society (SIREN). As 18 marks the year young people come of age in Italy, the Society invited two generations of researchers to participate in a common discussion: those new to the field and those with extensive familiarity with the neural paradigm. The challenge lay in understanding what remains of the revolutionary ideas from which neural networks stemmed in the eighties, how these networks have evolved and influenced other research fields, and, ultimately, what new conceptual/methodological frontiers must be crossed for a better exploitation of the information carried by data.
From this discussion we selected 27 papers, which have been gathered under two general headings, “Models” and “Applications,” plus two specific ones, “Economy and Complexity” and “Remote Sensing Image Processing.” The editors would like to thank the invited speakers as well as all those who contributed to the success of the workshop with papers of outstanding quality. Finally, special thanks go to the referees for their valuable input.
We are also pleased that researchers who are not SIREN members joined us both at the meeting and in this editorial venture, bearing witness to the wide sphere of interest in the debate. We hope, moreover, that the book will make a scientific contribution to the discovery of new forms of cooperative work, so necessary today for the invention of efficient computational systems as well as of new social paradigms.
November 2008
Bruno Apolloni, Simone Bassis, Maria Marinaro
The problem of reconstructing and mining object trajectories arises, for instance, when a transport enterprise mines data on the routes followed by its delivery vans in order to optimize deliveries in time and space. The paper investigates the case of Wireless Sensor Network (WSN) technology, not primarily designed for localization, and reports a technique based on recurrent neural networks to reconstruct the trajectory shape of a moving object (a sensor on a Lego train) from the sensor accelerometer data and to recover its localization. The reconstructed trajectories are then mined to detect periodic or frequent patterns, exploiting a recently proposed technique based on clustering algorithms and association rules, in order to assess the ability of the proposed approach to track WSN mote localizations.
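To make the reconstruction step concrete, the following minimal sketch (assuming PyTorch, with purely synthetic stand-in data; the network size, sequence length and training loop are illustrative and not the paper's) trains a small recurrent network to map accelerometer samples to per-step displacements, whose cumulative sum gives the trajectory shape.

```python
# Hypothetical sketch: a small recurrent network mapping 2-D accelerometer
# samples to per-step displacements (dead-reckoning style). All names,
# shapes and data are illustrative stand-ins.
import torch
import torch.nn as nn

class TrajectoryRNN(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.rnn = nn.RNN(input_size=2, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)    # predict (dx, dy) per time step

    def forward(self, acc):                 # acc: (batch, time, 2)
        h, _ = self.rnn(acc)
        return self.head(h)                 # (batch, time, 2) displacement steps

model = TrajectoryRNN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# acc and true_steps would come from the sensor logs and a reference trajectory
acc = torch.randn(16, 100, 2)               # synthetic stand-in data
true_steps = torch.randn(16, 100, 2)
for _ in range(10):                          # a few illustrative epochs
    opt.zero_grad()
    loss = loss_fn(model(acc), true_steps)
    loss.backward()
    opt.step()

# the reconstructed trajectory is the cumulative sum of the predicted steps
trajectory = model(acc).cumsum(dim=1)
```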
In this paper a fuzzy Sugeno rule-based controller is used to manage a semi-active suspension of a quarter-vehicle model. Our semi-active suspension can change its damping constant in order to reduce the vertical oscillation of the vehicle, restrain the acceleration of the vehicle body and consequently be more comfortable for passengers. On the other hand, if no control is present, the suspension works in passive mode, i.e. without changing the damping parameters. The inputs of our fuzzy controller, three in total, are the dynamic wheel load, described by five membership functions, and its derivative and the vertical acceleration, both described by three membership functions. By comparing the performance of the suspension with and without control, the Sugeno-based controller yields a more comfortable and safer system.
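To fix ideas, a zero-order Sugeno controller of this general shape can be sketched in a few lines; the membership functions, rule base and normalized universes below are made up for illustration (the derivative input is omitted for brevity) and are not those of the paper.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function on [a, c] peaking at b."""
    return max(min((x - a) / (b - a), (c - x) / (c - b)), 0.0)

def damping_command(wheel_load, body_acc):
    # fuzzify two of the three controller inputs on illustrative normalized universes
    low  = tri(wheel_load, -0.5, 0.0, 0.5)
    med  = tri(wheel_load,  0.0, 0.5, 1.0)
    high = tri(wheel_load,  0.5, 1.0, 1.5)
    neg  = tri(body_acc, -2.0, -1.0, 0.0)
    zero = tri(body_acc, -1.0,  0.0, 1.0)
    pos  = tri(body_acc,  0.0,  1.0, 2.0)

    # illustrative rules: (firing strength, constant consequent = damping level)
    rules = [
        (min(high, pos), 0.9),   # heavy load, upward body acceleration -> stiff damping
        (min(med, zero), 0.5),   # nominal conditions -> medium damping
        (min(low, neg),  0.2),   # light load, downward acceleration -> soft damping
    ]
    w = np.array([r[0] for r in rules])
    z = np.array([r[1] for r in rules])
    return float((w * z).sum() / (w.sum() + 1e-12))   # Sugeno weighted average

print(damping_command(wheel_load=0.8, body_acc=0.6))
```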
We describe here a system, based on boosting, for the classification of defects on material running on a production line. It consists of a two-stage architecture: in the first stage a set of features is extracted from the images acquired by a linear camera located above the material. The second stage is devoted to classifying the defects from these features. The novelty of the system resides in its ability to rank the defects with respect to a set of classes, achieving an identification rate for dangerous defects very close to 100%.
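As a rough illustration of the second stage, the sketch below uses scikit-learn's AdaBoost (a generic stand-in for the boosting scheme of the paper) on stand-in feature vectors and ranks the candidate defect classes by their posterior scores; all names and sizes are hypothetical.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

# Stage 1 (feature extraction from the line-camera images) is assumed done;
# X holds the per-defect feature vectors, y the defect class labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))            # stand-in features
y = rng.integers(0, 4, size=200)          # stand-in labels for 4 hypothetical defect classes

clf = AdaBoostClassifier(n_estimators=100)
clf.fit(X, y)

# Stage 2: rank the candidate classes for each new defect by posterior score.
proba = clf.predict_proba(X[:5])
ranking = np.argsort(-proba, axis=1)      # classes ordered from most to least likely
print(ranking)
```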
Results in the literature show that the convergence of the Short-Term Maximum Lyapunov Exponent (STLmax) time series, extracted from intracranial EEG recorded from patients affected by intractable temporal lobe epilepsy, is linked to the seizure onset. When the STLmax profiles of different electrode sites converge (high entrainment), a seizure is likely to occur. In this paper Renyi's mutual information (MI) is introduced in order to investigate the independence between pairs of electrodes involved in the epileptogenesis. A scalp EEG recording and an intracranial EEG recording, including two seizures each, were analysed. STLmax was estimated for each critical electrode and then the MI between pairs of STLmax profiles was measured. The MI showed sudden spikes occurring 8 to 15 min before the seizure onset. Seizure onset thus appears related to a burst in MI: this suggests that seizure development might restore the independence between the STLmax of critical electrode sites.
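A crude plug-in estimator of Renyi mutual information between two STLmax profiles could look like the following sketch (histogram-based, with illustrative window and bin choices; it is not the estimator used in the paper).

```python
import numpy as np

def renyi_mi(x, y, alpha=2.0, bins=16):
    """Histogram-based Renyi mutual information of order alpha between two 1-D series."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    mask = pxy > 0
    # Renyi divergence between the joint and the product of the marginals
    ratio = (pxy[mask] ** alpha) * ((px @ py)[mask] ** (1.0 - alpha))
    return np.log(ratio.sum()) / (alpha - 1.0)

# Sliding-window MI between two STLmax profiles s1, s2 (random stand-in data here);
# a sudden spike in mi_track would flag an upcoming entrainment episode.
s1, s2 = np.random.randn(3000), np.random.randn(3000)
win = 500
mi_track = [renyi_mi(s1[t:t+win], s2[t:t+win]) for t in range(0, len(s1) - win, win)]
print(mi_track)
```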
The Electrocardiogram (ECG) is the recording of the effects produced by the bioelectric field generated by the cardiac muscle during its activity. Specific changes in ECG signals can reveal pathologic heart activity. For this reason, a dynamic model - one that accurately describes the heart's bioelectric behavior and that can be mathematically analyzed - could be a practical way to investigate heart diseases. The aim of this paper is to introduce a dynamic model to simulate pathological ECGs as well as to evaluate an Artificial Neural Network able to distinguish the impact of some modeling parameters on specific and peculiar features of the ECG trace.
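The abstract does not name the model; one widely used dynamical ECG model of this kind is the one proposed by McSharry et al. (2003), sketched below as an illustrative stand-in with typical PQRST parameter values from the literature. Pathological morphologies can be explored by perturbing the (a_i, b_i, theta_i) parameters.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Sketch of the McSharry et al. dynamical ECG model (not necessarily the paper's model).
theta_i = np.array([-np.pi/3, -np.pi/12, 0.0, np.pi/12, np.pi/2])   # P, Q, R, S, T angles
a_i = np.array([1.2, -5.0, 30.0, -7.5, 0.75])
b_i = np.array([0.25, 0.1, 0.1, 0.1, 0.4])
omega = 2 * np.pi               # angular frequency of the limit cycle (~60 bpm)

def ecg_ode(t, s):
    x, y, z = s
    alpha = 1.0 - np.sqrt(x**2 + y**2)                       # attraction to the unit circle
    theta = np.arctan2(y, x)
    dtheta = np.mod(theta - theta_i + np.pi, 2*np.pi) - np.pi  # wrapped phase differences
    dx = alpha*x - omega*y
    dy = alpha*y + omega*x
    dz = -np.sum(a_i * dtheta * np.exp(-dtheta**2 / (2*b_i**2))) - z   # baseline z0 = 0 here
    return [dx, dy, dz]

sol = solve_ivp(ecg_ode, (0, 10), [1.0, 0.0, 0.0], max_step=1e-3)
ecg = sol.y[2]                  # the z component is the synthetic ECG trace
```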
Fetal electrocardiogram (fECG) monitoring yields important information about the condition of the fetus during pregnancy; it consists in collecting electrical signals through sensors placed on the mother's body. In the literature, Independent Component Analysis (ICA) has been exploited to extract the fECG. Wavelet-ICA (WICA), a technique that merges wavelet decomposition and the INFOMAX algorithm for Independent Component Analysis, was recently proposed to enhance fetal ECG extraction. In this paper, we propose to enhance WICA by introducing MERMAID as the algorithm performing independent component analysis, because it has been shown to outperform INFOMAX and the other standard ICA algorithms.
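A minimal WICA-style sketch, assuming PyWavelets and scikit-learn are available: each abdominal channel is wavelet-denoised and ICA is then run on the denoised mixtures. FastICA is used here only as an accessible stand-in for the INFOMAX/MERMAID algorithms discussed in the paper, and the wavelet, level and threshold are illustrative.

```python
import numpy as np
import pywt
from sklearn.decomposition import FastICA

def wavelet_denoise(sig, wavelet="db4", level=4):
    """Soft-threshold the detail coefficients of a 1-D signal and reconstruct it."""
    coeffs = pywt.wavedec(sig, wavelet, level=level)
    thr = np.median(np.abs(coeffs[-1])) / 0.6745 * np.sqrt(2*np.log(len(sig)))
    coeffs = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[:len(sig)]

# X: (n_samples, n_channels) matrix of maternal abdominal recordings (random stand-in data)
X = np.random.randn(4096, 8)
X_dn = np.column_stack([wavelet_denoise(X[:, k]) for k in range(X.shape[1])])

ica = FastICA(n_components=8, random_state=0)
sources = ica.fit_transform(X_dn)    # candidate maternal/fetal components
```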
In this paper we propose and experimentally analyze ensemble methods based on random projections (as the feature extraction method) and SVMs with polynomial kernels (as the learning algorithm). We show that, under suitable conditions, polynomial kernels are approximately preserved by random projections, with a degradation related to the square of the degree of the polynomial. Experimental results with Random Subspace and Random Projection ensembles of polynomial SVMs support the hypothesis that low-degree polynomial kernels, which with high probability introduce lower distortions in the projected data, are better suited to the classification of high-dimensional DNA microarray data.
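The building block of such an ensemble can be sketched with scikit-learn as follows (stand-in data; the projected dimension, polynomial degree and number of members are illustrative): each member applies an independent Gaussian random projection followed by a low-degree polynomial SVM, and the ensemble aggregates the members' outputs.

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5000))          # stand-in for a high-dimensional microarray matrix
y = rng.integers(0, 2, size=100)

scores = []
for seed in range(10):                     # 10 ensemble members, independent projections
    rp = GaussianRandomProjection(n_components=200, random_state=seed)
    Xp = rp.fit_transform(X)
    svm = SVC(kernel="poly", degree=2, coef0=1.0, C=1.0)   # low-degree polynomial kernel
    scores.append(cross_val_score(svm, Xp, y, cv=5).mean())
print(np.mean(scores))                     # average member accuracy (an ensemble would vote)
```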
Automatic fingerprint classification provides an important indexing scheme to facilitate efficient matching in the large-scale fingerprint databases of Automatic Fingerprint Identification Systems (AFISs). The paper presents a new fast fingerprint classification module implemented as an embedded Weightless Neural Network (WNN, a RAM-based neural network). The proposed WNN architecture uses directional maps to classify fingerprint images into the five NIST classes (Left Loop, Right Loop, Whorl, Arch and Tented Arch) without any enhancement phase. Starting from the directional map, the WNN architecture computes the fingerprint class. The proposed architecture is implemented on a Celoxica RC2000 board employing a Xilinx Virtex-II 2v6000 FPGA and is computationally inexpensive as regards execution time and hardware resources. To validate the proposed classifier, three different tests have been executed on two databases: a proprietary one and the FVC database. The best classification rate obtained is 85.33%, with an execution time of 1.15 ms.
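For readers unfamiliar with weightless neural networks, the toy WiSARD-style sketch below shows the RAM-discriminator idea in software: the binarized directional map is split into fixed random tuples of bits, each tuple addresses a per-class RAM, and classification counts how many addressed positions were seen during training. Sizes and class labels are illustrative, not those of the FPGA implementation.

```python
import numpy as np

class WiSARD:
    """Toy RAM-based (weightless) classifier over fixed-length bit vectors."""
    def __init__(self, n_bits, tuple_size, classes, seed=0):
        rng = np.random.default_rng(seed)
        self.tuples = rng.permutation(n_bits).reshape(-1, tuple_size)   # random bit tuples
        self.rams = {c: [set() for _ in self.tuples] for c in classes}  # one RAM per tuple per class

    def _addresses(self, bits):
        return [tuple(bits[t]) for t in self.tuples]

    def train(self, bits, label):
        for ram, addr in zip(self.rams[label], self._addresses(bits)):
            ram.add(addr)                                    # write a 1 at the addressed position

    def classify(self, bits):
        scores = {c: sum(addr in ram for ram, addr in zip(rams, self._addresses(bits)))
                  for c, rams in self.rams.items()}
        return max(scores, key=scores.get)                   # class with the most RAM hits

# Example with random stand-in "directional map" bit vectors (256 bits, 5 NIST classes)
clf = WiSARD(n_bits=256, tuple_size=8, classes=["L", "R", "W", "A", "T"])
clf.train(np.random.randint(0, 2, 256), "W")
print(clf.classify(np.random.randint(0, 2, 256)))
```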
We propose the use of clustering methods to discover the quality of each element in a training set that is subsequently fed to a regression algorithm. The paper shows that these methods, used in combination with regression algorithms that take into account the additional information conveyed by this kind of quality, achieve higher performance than standard techniques.
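One plausible reading of such a scheme, sketched with scikit-learn (the actual quality definition and regression algorithm are the paper's, not reproduced here): cluster the inputs, derive a per-sample quality from the distance to the assigned centroid, and pass it as a sample weight to the regressor so that atypical points count less.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=300)

km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X)
dist = np.linalg.norm(X - km.cluster_centers_[km.labels_], axis=1)
quality = 1.0 / (1.0 + dist)              # hypothetical quality score in (0, 1]

reg = Ridge(alpha=1.0).fit(X, y, sample_weight=quality)   # quality-aware regression
```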
In this paper a natural gradient approach to blind source separation in a complex-valued environment is presented. It is shown that signals can be successfully reconstructed by a network based on the so-called generalized splitting activation function (GSAF). This activation function, whose shape is modified during the learning process, is based on a pair of bi-dimensional spline functions, one for the real and one for the imaginary part of the input, thus avoiding the restriction imposed by Liouville's theorem. In addition, recent learning metrics are compared with the classical ones in order to improve the convergence speed.
Several experimental results are presented to demonstrate the effectiveness of the proposed method.
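The natural-gradient update at the core of such approaches can be sketched as follows, with a fixed split-complex tanh used in place of the adaptive spline (GSAF) activation described above; the mixing matrix, block size and step size are illustrative.

```python
import numpy as np

def phi(y):
    # fixed split-complex nonlinearity, a stand-in for the adaptive spline activation
    return np.tanh(y.real) + 1j * np.tanh(y.imag)

rng = np.random.default_rng(0)
n, T = 3, 5000
S = rng.uniform(-1, 1, (n, T)) + 1j * rng.uniform(-1, 1, (n, T))   # stand-in complex sources
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))          # unknown mixing matrix
X = A @ S                                                           # observed mixtures

W = np.eye(n, dtype=complex)
mu, block = 1e-3, 100
for t in range(0, T, block):
    Y = W @ X[:, t:t+block]
    G = np.eye(n) - (phi(Y) @ Y.conj().T) / block      # natural-gradient direction: I - E[phi(y) y^H]
    W = W + mu * G @ W                                 # W <- W + mu (I - phi(y) y^H) W
Y = W @ X                                              # estimated sources (up to permutation/scaling)
```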
This work proposes a novel model to perform automatic music transcription of polyphonic audio. The notes of different musical instruments are extracted from a single-channel recording by using a non-linear Principal Component Analysis Neural Network. The estimated components (waveforms) are classified by using a dictionary (i.e. a database). The dictionary contains the features of the notes for several musical instruments (i.e. probability densities). A Kullback-Leibler divergence is used to recognize the extracted waveforms as belonging to one instrument in the database. Moreover, a MUSIC frequency estimator applied to the weights of the Neural Network is used to obtain the frequencies of the musical notes. Several results are presented to show the performance of this technique for the transcription of mixtures of different musical instruments, real songs and recordings obtained in a real environment.
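The note-frequency step relies on the MUSIC estimator; a generic sketch of a MUSIC pseudospectrum is given below (applied here directly to a raw test signal rather than to the network weights, which is what the paper actually does; the embedding dimension and frequency grid are illustrative).

```python
import numpy as np
from scipy.signal import find_peaks

def music_spectrum(x, n_sinusoids, m=64, n_grid=2048):
    """Generic MUSIC pseudospectrum over normalized frequencies in [0, 0.5)."""
    N = len(x) - m + 1
    X = np.stack([x[i:i+m] for i in range(N)])           # delay-embedded snapshots
    R = (X.conj().T @ X) / N                              # sample autocorrelation matrix
    _, V = np.linalg.eigh(R)                              # eigenvectors (ascending eigenvalues)
    En = V[:, :m - 2*n_sinusoids]                         # noise subspace (2 dims per real sinusoid)
    k = np.arange(m)
    freqs = np.linspace(0.0, 0.5, n_grid, endpoint=False)
    P = np.empty(n_grid)
    for i, f in enumerate(freqs):
        a = np.exp(-2j*np.pi*f*k)                         # steering vector
        P[i] = 1.0 / np.abs(a.conj() @ En @ En.conj().T @ a)
    return freqs, P

fs = 8000.0
t = np.arange(0, 0.5, 1/fs)
x = np.sin(2*np.pi*440*t) + 0.5*np.sin(2*np.pi*660*t) + 0.05*np.random.randn(len(t))
freqs, P = music_spectrum(x, n_sinusoids=2)
idx, _ = find_peaks(P)
top = idx[np.argsort(P[idx])[-2:]]
print(sorted(freqs[top] * fs))                            # expected to be close to [440, 660] Hz
```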
The discussion on the neural networks paradigm cast a bright light on the notion of network and on what we may define as the “dynamical systems approach”. Over the years, the rebounds of this emphasis have involved the most disparate scientific areas, such as biology, the social sciences and artificial life, contributing to reinforcing the foundations of complex systems science.
In this work we will discuss a particular network model, random Boolean networks, which were introduced to represent gene regulatory mechanisms, and we will comment on their similarities to, and differences from, neural network models.
Moreover, we will present the results of an analysis of the dynamical behaviour that random Boolean networks show in the presence of different sets of updating functions, and discuss the concepts of criticality and the biological suitability of such a model.
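A minimal synchronous random Boolean network can be simulated as follows (N, K and the number of steps are illustrative; in the classical Kauffman setting, K = 2 with unbiased random functions corresponds to the critical regime mentioned above).

```python
import numpy as np

# Minimal synchronous random Boolean network: N nodes, each with K randomly
# chosen inputs and a random Boolean updating function (a random truth table).
rng = np.random.default_rng(0)
N, K = 20, 2
inputs = np.array([rng.choice(N, size=K, replace=False) for _ in range(N)])
tables = rng.integers(0, 2, size=(N, 2**K))      # one random truth table per node

def step(state):
    # index each node's truth table with the integer encoded by its K input bits
    idx = np.array([int("".join(map(str, state[inputs[i]])), 2) for i in range(N)])
    return tables[np.arange(N), idx]

state = rng.integers(0, 2, size=N)
trajectory = [state]
for _ in range(50):                               # iterate; the state typically falls onto an attractor
    state = step(state)
    trajectory.append(state)
```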
In this paper, we compare the performance of some of the most popular kernel clustering methods on several data sets. The methods are all based on central clustering and incorporate in various ways the concepts of fuzzy clustering and kernel machines. The data sets span several application domains and sizes. A thorough discussion of the techniques for validating the results is also presented. The results indicate that clustering in kernel space generally outperforms standard clustering, although no method can be shown to be consistently better than the others.
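As a reference point for what clustering in kernel space means operationally, here is a plain kernel k-means sketch on a precomputed kernel matrix (not one of the fuzzy/central methods compared in the paper): squared distances to the centroids are computed entirely through kernel evaluations.

```python
import numpy as np

def kernel_kmeans(K, n_clusters, n_iter=50, seed=0):
    """Plain kernel k-means on a precomputed kernel matrix K (n x n)."""
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    labels = rng.integers(0, n_clusters, size=n)
    for _ in range(n_iter):
        D = np.zeros((n, n_clusters))
        for c in range(n_clusters):
            idx = np.flatnonzero(labels == c)
            if idx.size == 0:
                D[:, c] = np.inf
                continue
            # squared feature-space distance: K_ii - 2/|C| sum_j K_ij + 1/|C|^2 sum_jl K_jl
            D[:, c] = np.diag(K) - 2 * K[:, idx].mean(axis=1) + K[np.ix_(idx, idx)].mean()
        new_labels = D.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

# Example with an RBF kernel on toy two-cluster data
X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 4])
sq = ((X[:, None, :] - X[None, :, :])**2).sum(-1)
K = np.exp(-sq / 2.0)
print(kernel_kmeans(K, n_clusters=2))
```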
Most of the emphasis in machine learning has been placed on parametric models in which the purpose of the learning algorithm is to adjust weights according to appropriate optimization criteria. However, schemes based on direct data inference, such as K-nearest neighbor, have also become quite popular. Recently, a number of authors have proposed methods to perform classification and regression that are based on different forms of diffusion processes from the labelled examples. The aim of this paper is to motivate diffusion learning from the continuum setting by using Tikhonov's regularization framework. Diffusion learning is discussed in both the continuous and the discrete setting, and an intriguing link is established between the Green function of the regularization operators and the structure of the graph in the corresponding discrete structure. It is pointed out that an appropriate choice of the smoothing operators allows one to implement a regularization that gives rise to Green functions whose corresponding matrix is sparse, which imposes a corresponding structure on the graph associated with the training set. Finally, the choice of the smoothness operator is given a Bayesian interpretation in terms of a prior probability on the expected values of the function.
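In the discrete setting, the diffusion-from-labelled-examples idea amounts to propagating labels on a graph built from the training set. The sketch below follows the well-known "local and global consistency" style of propagation as a concrete stand-in, not the paper's own derivation; the affinity width, damping factor and data are illustrative.

```python
import numpy as np

def diffuse_labels(X, y, alpha=0.9, sigma=1.0, n_iter=100):
    """Label diffusion on a Gaussian-affinity graph; y holds class ids, -1 for unlabelled points."""
    sq = ((X[:, None, :] - X[None, :, :])**2).sum(-1)
    W = np.exp(-sq / (2 * sigma**2))
    np.fill_diagonal(W, 0.0)
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))                  # symmetrically normalized affinity
    classes = np.unique(y[y >= 0])
    Y = np.zeros((len(y), len(classes)))
    for k, c in enumerate(classes):
        Y[y == c, k] = 1.0
    F = Y.copy()
    for _ in range(n_iter):
        F = alpha * S @ F + (1 - alpha) * Y          # diffusion step anchored at the labelled points
    return classes[F.argmax(axis=1)]

X = np.vstack([np.random.randn(30, 2), np.random.randn(30, 2) + 3])
y = -np.ones(60, dtype=int)
y[0], y[30] = 0, 1                                   # only two labelled examples
print(diffuse_labels(X, y))
```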
With a view to discussing the genuine roots of the connectionist paradigm, in this paper we address the non-symmetry features of the random phenomena involved. Reading these features in terms of the intentionality with which we drive a learning process away from a simple random walk, we focus on elementary processes whose trajectories cannot be decomposed as the sum of a deterministic recursive function plus a symmetric noise. Rather, we look at nonlinear compositions of the above ingredients as a source of genuinely non-symmetric atomic random actions, like those at the basis of a training process. To this aim we introduce an extended Pareto distribution law with which we analyze some intentional trajectories. With this model we offer some preliminary considerations on the elapsed times of training sessions for some families of neural networks.
Random projections in Euclidean space reduce the dimensionality of the data while approximately preserving the distances between points. In the hypercube a weaker property holds: random projections approximately preserve the distances within a certain range.
In this note, we show an analogous result for the metric space ⟨Σ^d, d_H⟩, where Σ^d is the set of words of length d over the alphabet Σ and d_H is the Hamming distance.
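The flavour of such a result can be illustrated numerically: projecting binary words onto a random subset of k coordinates and rescaling the Hamming distance by d/k gives an estimate that concentrates around the true distance for distances that are not too small. This sampling-based projection is only an illustration of the kind of statement made above, with assumed parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 1024, 128
u = rng.integers(0, 2, size=d)
v = u.copy()
v[rng.choice(d, size=300, replace=False)] ^= 1        # word at Hamming distance 300 from u

coords = rng.choice(d, size=k, replace=False)          # random "projection" onto k coordinates
estimate = (u[coords] != v[coords]).sum() * d / k      # rescaled projected Hamming distance
print((u != v).sum(), estimate)                        # true distance vs. projected estimate
```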
Many applications that involve inference and learning in signal processing, communication and artificial intelligence can be cast into a graph framework. Factor graphs are a type of network that can be studied and solved by propagating belief messages with the sum/product algorithm. In this paper we provide explicit matrix formulas for inference and learning in finite alphabet Forney-style factor graphs, with the precise intent of allowing rapid prototyping of arbitrary topologies in standard software like MATLAB.
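On a toy two-variable factor graph the matrix view is immediate; the sketch below uses NumPy (the paper itself targets MATLAB) with a made-up binary factor table: messages are matrix-vector products and marginals are entrywise products of incoming messages.

```python
import numpy as np

# Chain x1 -- f -- x2 with binary variables: priors p1, p2 and a factor table F[x1, x2].
p1 = np.array([0.6, 0.4])
p2 = np.array([0.3, 0.7])
F = np.array([[0.9, 0.1],
              [0.2, 0.8]])          # factor f(x1, x2), values made up for illustration

msg_f_to_x1 = F @ p2                # sum over x2 of f(x1, x2) * p2(x2)
msg_f_to_x2 = F.T @ p1              # sum over x1 of f(x1, x2) * p1(x1)

marg_x1 = p1 * msg_f_to_x1
marg_x1 /= marg_x1.sum()            # normalized marginal of x1
marg_x2 = p2 * msg_f_to_x2
marg_x2 /= marg_x2.sum()            # normalized marginal of x2
print(marg_x1, marg_x2)
```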
The portfolio fund consistency studied in the paper refers to a pension scheme whose beneficiaries enter retirement at the same time. It arises from the payment stream consisting of cash flows due at the beginning of each year if the pensioner is alive: during the deferment period, in the form of premiums flowing into the fund, and during the retirement period, in the form of instalments flowing out of it. As a function of the valuation time, the fund increases thanks to the interest maturing on the accumulated fund and decreases because of the benefits paid to the survivors. Both the mortality trend, which determines the number of payments at each time, and the interest maturing on the fund are considered deterministically unknown at the moment the contract is issued. Stochastic assumptions about the behaviour of the interest rate process, together with a description of the improvement in the mortality trend that is most likely to be expected over very long periods in the developed world, provide a dynamic description of the pension scheme and, in particular, of the pension fund. In the paper we examine how the financial and demographic bases chosen for premium calculation affect the portfolio fund consistency and the risk arising from the randomness in the choice of the survival model used for the valuations. To this aim, we assume for the fund description different projection levels for the survival probabilities and the Vasicek hypotheses for the interest rates maturing on the fund itself; moreover, we determine the premium amounts by introducing different scenarios in a safety-loading perspective. Clearly, in the premium quantification a lower interest rate combined with higher survival probabilities constitutes a form of implicit loading. The numerical application presented in the paper shows how changes in the premium calculation conditions affect the portfolio fund consistency in the different scenarios, and investigates the impact they have on the Demographic Model Risk Measure. Useful information for management purposes can be derived.
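For the interest-rate component, a Vasicek short rate can be simulated with a simple Euler scheme as below (parameters are illustrative and not calibrated; the fund itself would then be rolled forward year by year with these simulated rates and the projected survival probabilities).

```python
import numpy as np

# Euler scheme for the Vasicek model dr = a(b - r)dt + sigma dW, with made-up parameters.
rng = np.random.default_rng(0)
a, b, sigma, r0 = 0.2, 0.04, 0.01, 0.03
years, n_paths, dt = 40, 1000, 1/12
n_steps = int(years / dt)

r = np.full(n_paths, r0)
paths = [r.copy()]
for _ in range(n_steps):
    r = r + a * (b - r) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
    paths.append(r.copy())
paths = np.array(paths)             # (n_steps + 1, n_paths) simulated short-rate paths
```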
Over the past two decades the globalization of market economies has led to a large number of financial crises in emerging markets. The case of Paraguay in the early 1990s or, more recently, the crises in Turkey, Argentina and the Far East Asian markets have taught the important lesson that such phenomena, originally arising on a local basis, can spread contagiously to other markets as well. At the same time, this made clear the importance of Early Warning System (EWS) models to identify economic and financial vulnerabilities among emerging markets and, ultimately, to anticipate such events. With this in mind, we have introduced an EWS model based on the powerful clustering capabilities of Kohonen's Self Organizing Maps. Using macroeconomic data of several emerging countries, our analysis has been twofold. We first provided a static snapshot of the countries in our dataset, according to the way their macroeconomic data cluster in the map. In this way, we were able to capture possible reciprocities and similarities among the various emerging markets. As a second step, we dynamically monitored their evolution paths in the map over time. As the main results, we were able to develop a crisis indicator measuring the vulnerability of countries, and we also provided a proper framework for deducing probabilities of future crises.
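A compact hand-rolled SOM (used here instead of any particular library) illustrates the mapping step: countries' standardized macroeconomic vectors are projected onto a 2-D grid of units, and a country's position, or its path over successive years, can then be read off the map. The grid size, data and training schedules are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_countries, n_indicators = 40, 8
data = rng.normal(size=(n_countries, n_indicators))   # stand-in standardized macro indicators

gx, gy = 6, 6
W = rng.normal(size=(gx, gy, n_indicators))           # codebook vectors of the map units
coords = np.stack(np.meshgrid(np.arange(gx), np.arange(gy), indexing="ij"), axis=-1)

for epoch in range(200):
    lr = 0.5 * np.exp(-epoch / 100)                   # decaying learning rate
    rad = 3.0 * np.exp(-epoch / 100)                  # decaying neighbourhood radius
    for x in data[rng.permutation(n_countries)]:
        d = ((W - x)**2).sum(axis=-1)
        bmu = np.unravel_index(d.argmin(), d.shape)   # best-matching unit
        h = np.exp(-((coords - np.array(bmu))**2).sum(-1) / (2*rad**2))
        W += lr * h[..., None] * (x - W)              # pull the neighbourhood towards the sample

def map_position(x):
    """Grid coordinates of the unit closest to a country's indicator vector."""
    return np.unravel_index(((W - x)**2).sum(-1).argmin(), (gx, gy))

print(map_position(data[0]))
```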
In this paper we propose a flexible method for optimally choosing, and sequencing in time, a subset of pairwise comparisons between the alternatives in large-dimensional AHP problems. Two criteria are taken into account in defining the choice rule: the fair involvement of all the alternatives in the pairwise comparisons and the consistency of the elicited judgements. The combination of the two criteria guarantees the best reliability of the information collected so far. At each step the method indicates the two alternatives to be compared next, and it stops the process by taking into account both the reliability of the already elicited judgements and the need to bound the potentially large number of judgements to be submitted to the decision maker.
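The consistency of elicited judgements in AHP is conventionally measured through Saaty's consistency ratio; a minimal sketch of the computation (priority weights from the principal eigenvector, then CR = CI/RI) is given below with a made-up 3x3 comparison matrix.

```python
import numpy as np

# Saaty's random index values for matrices of order 3..9
RI = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def ahp_weights_and_cr(A):
    n = A.shape[0]
    vals, vecs = np.linalg.eig(A)
    k = np.argmax(vals.real)
    w = np.abs(vecs[:, k].real)
    w = w / w.sum()                                  # priority weights (principal eigenvector)
    ci = (vals[k].real - n) / (n - 1)                # consistency index
    return w, ci / RI[n]                             # consistency ratio

A = np.array([[1,   3,   5],
              [1/3, 1,   2],
              [1/5, 1/2, 1]], dtype=float)           # made-up pairwise comparison matrix
w, cr = ahp_weights_and_cr(A)
print(w, cr)                                         # CR well below 0.1 -> acceptable consistency
```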
We consider here fuzzy quantities, i.e. fuzzy sets without any assumption of normality or convexity. Two main topics are examined. The first consists in defining the evaluation of a fuzzy quantity in such a way that it may be applied both to ranking and to defuzzification problems. The definition is based on α-cuts and depends on two parameters: a coefficient connected with the optimistic or pessimistic attitude of the decision maker and a weighting function similar to a density function. The second aim is to show that the proposed definition is suitable for defuzzifying the output of a fuzzy expert system: we treat a classical example discussed in [1], using several t-norms and t-conorms in the aggregation procedures.
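A numerical sketch of an α-cut based evaluation of this general shape is given below; the exact definition, and in particular its treatment of non-normal and non-convex quantities, is the paper's, so the formula here is only an assumed illustrative form: a weighted average over α of λ·sup(A_α) + (1 − λ)·inf(A_α), with weights w(α) and the cuts taken up to the height of the set.

```python
import numpy as np

def evaluate_fuzzy(xs, mu, lam=0.5, w=lambda a: 2*a, n_alpha=200):
    """Assumed alpha-cut evaluation: weighted average of lam*sup(A_alpha) + (1-lam)*inf(A_alpha)."""
    alphas = np.linspace(1e-6, mu.max(), n_alpha)      # only cuts up to the height of the fuzzy set
    vals, weights = [], []
    for a in alphas:
        cut = xs[mu >= a]                              # alpha-cut (possibly empty for sub-normal sets)
        if cut.size == 0:
            continue
        vals.append(lam * cut.max() + (1 - lam) * cut.min())
        weights.append(w(a))
    vals, weights = np.array(vals), np.array(weights)
    return (vals * weights).sum() / weights.sum()

# Triangular, sub-normal fuzzy quantity on [0, 10] with height 0.8 and peak at x = 4
xs = np.linspace(0, 10, 1001)
mu = np.clip(0.8 - 0.4*np.abs(xs - 4), 0, None)
print(evaluate_fuzzy(xs, mu, lam=0.7))                 # mildly optimistic evaluation
```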