Ebook: Fuzzy Systems and Data Mining IV
Big Data analytics has been on the rise in the last years of the current decade. Data are overwhelming the computational capacity of high-performance servers. Cloud, grid, edge and fog computing are a few examples of the current hype. Computational Intelligence offers two ways of developing models: on the one hand, the crisp approach, which assigns an exact value to every variable and, on the other hand, the fuzzy approach, which copes with values between two boundaries.
This book presents 114 papers from the 4th International Conference on Fuzzy Systems and Data Mining (FSDM 2018), held in Bangkok, Thailand, from 16 to 19 November 2018. All papers were carefully reviewed by program committee members, who took into consideration the breadth and depth of the research topics that fall within the scope of FSDM. The acceptance rate was 32.85%.
Offering a state-of-the-art overview of fuzzy systems and data mining, the publication will be of interest to all those whose work involves data science.
Big Data analytics has been on the rise in the last years of the current decade. Data are overwhelming the computational capacity of high-performance servers. Cloud, grid, edge and fog computing are a few examples of the current hype. Computational Intelligence offers two ways of developing models: on the one hand, the crisp approach, which assigns an exact value to every variable and, on the other hand, the fuzzy approach, which copes with values between two boundaries and is often expressed in words that impose an order on the categories.
Fuzzy Systems and Data Mining (FSDM) is a consolidated international conference held yearly, comprising four main groups of topics: a) Fuzzy theory, algorithms and systems, covering issues like modelling, stability, concepts and formalization; b) Fuzzy applications, including different kinds of processing as well as hardware and architectures with applicability, among others, to recognition, the vehicle industry and multimedia; c) The interdisciplinary field of fuzzy logic and data mining, encompassing applications in any branch of engineering (electrical, manufacturing, industrial, chemical, biomedical and the health sector) as well as management and environment; and d) Data mining, providing new trends in scalable, parallel and distributed algorithms. Mining on graphs, the web and data streams is very frequent nowadays, without losing sight of visualization, privacy and security. For its part, sport data mining is gaining ground because of the importance not only of predicting the outcome of a sport event, but also of matching the exact result, which is undoubtedly essential to the betting system.
This edition is a notable one, since it marks the fourth anniversary of the conference and hence the consolidation of the FSDM series. The conference was first held in Shanghai in 2015 and has taken place in a different city every year. Up to now, two countries have hosted the conference. For the upcoming year Japan will be the destination. Following the great success of FSDM 2015, held in Shanghai (China), the second edition in the FSDM series took place in Macao (China), and the third edition was hosted by National Dong Hwa University in Hualien, Taiwan (China), attracting many people from all over the world and showing them different landscapes of the country. The fourth edition of the series (FSDM 2018) has come to Bangkok (Thailand), as a forum for experts, researchers, academics and industry people to introduce the latest advances in the field of Fuzzy Sets and Data Mining. This new venue represents a move from the Eastern part to the Southeast part of Asia. Thailand has five places declared as UNESCO World Heritage Sites from 1991 to 2005 according to the cultural criterion. In particular, the most recently recognized is the Dong Phayayen-Khao Yai Forest Complex. Khao Yai lies along the route from Bangkok to Nakhon Ratchasima (Korat). Thailand has Thai as its official language and four spoken languages.
This volume includes the papers accepted and presented at the 4th International Conference on Fuzzy Systems and Data Mining (FSDM 2018), held on 16–19 November 2018 in Bangkok, Thailand. All papers were carefully reviewed by program committee members, bearing in mind the quality, novelty, breadth and depth of the research themes which fall inside the FSDM scope. FSDM 2018 was a reference and outstanding conference which attracted three remarkable keynote speakers: Dr. A. Fazel Famili from Canada, Prof. Sheng-Lung Peng from Taiwan and Prof. Hari Mohan Srivastava from Canada who is chairing a position as Emeritus Professor. Hence, the conference has enjoyed three keynotes from overseas. For the first time, invited speakers have been arranged into two major groups depending on the field of interest such as Fuzzy Systems and Data Mining. Previous proceedings have been published in the prestigious book series Frontiers in Artificial Intelligence and Applications (FAIA) by IOS Press and are as follows: Vol. 281, Vol. 293 and Vol. 299 for FSDM 2015, FSDM 2016 and FSDM 2017, respectively. The current FAIA volume contains the selected contributions from FSDM 2018. If you have contributed to the FSDM conference before or even in this edition, we would like to see you on board again. Otherwise, if you have not submitted any paper to FSDM, we would like to invite you to prepare a good contribution and visit Japan where it is planned to hold FSDM 2019.
I am very glad to inform you that this year FSDM has received more than double the number of submissions of FSDM 2017, in total: 347. After an intense discussion stage, the committee, which included many experts, decided to accept 114 papers, which represents an acceptance rate of 32.85%. The profile of the authors is very remarkable and the number of full professors who have contributed is very high. As a follow-up of the conference, some special issues in well-regarded journals like Intelligent Data Analysis, Journal of Information Science and Engineering and Journal of Nonlinear and Convex Analysis are scheduled to be published; this is an important leap as the number of journal issues is increasing yearly. In previous years, a special issue with the Journal of Intelligent & Fuzzy Systems was published.
I am pleased to thank all the keynote and invited speakers, and the authors who made the effort to prepare at least one contribution for the conference. Furthermore, we are very grateful to everyone, especially the program committee members and reviewers, who devoted time to assessing the papers. It is an honour to continue with the publication of these proceedings in the outstanding series FAIA by IOS Press. Our particular thanks and regards also go to J. Breuker, N. Guarino, J.N. Kok, R. López de Mántaras, J. Liu, R. Mizoguchi, M. Musen, S.K. Pal and N. Zhong, who are the FAIA series editors, for supporting this conference.
Last but not least, I hope you enjoyed your stay in Bangkok, in the venue as well as its surroundings which are located in the heart of the city. The climate in Bangkok has three variations: cool, hot and rainy. Technically, November is not inside the rainy season, which ends in October. The rainy days in November are around five and the climate is changing to give place to the cool season from December. To sum up, November is a hybrid month combining slightly milder temperatures than October, a few days of rain although with many sunshine hours, and the average temperature, according to the historical series, is very likely to be between 24 and 32 degrees Celsius.
September 2018
Antonio J. Tallón-Ballesteros
University of Seville (Spain)
Seville city, Spain
Natural language plays a crucial role in many different and very specific scientific issues. It is also the main aspect of studying specific learning disabilities like dyslexia and helps to design adequate technologies aiming at the specific dyslexia needs. Although dyslexia affects up to 20 % of the worldwide population, no apparatus has been designed to build a user model or user categorization. Such individual categorization, describing individual reading problems and changes over time, may, however, be crucial for further psychological and neurological studies of this disorder.
Fuzzy principles are applicable in a wide range of areas, from economics, business, medical methods to natural science. In natural language, the fuzzy approach deals with linguistic variables, which are transformed from numbers to expressions on predefined scales. Given the complexity of dyslexia needs, it is very challenging to design a new and original fuzzy apparatus tailored specifically to individual needs.
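The mapping from numbers to linguistic expressions mentioned above can be illustrated with a minimal Python sketch. The 0-10 score, the three terms and their breakpoints are hypothetical choices made only for this illustration; they are not taken from the article and are not clinical values.

```python
def triangular(x, a, b, c):
    """Degree (0..1) to which x belongs to a triangular fuzzy set.

    a and c are the feet of the triangle and b is its peak (a < b < c).
    """
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical linguistic scale for a 0-10 reading-difficulty score.
# Terms and breakpoints are illustrative assumptions only.
SCALE = {
    "mild": (-5.0, 0.0, 5.0),
    "moderate": (0.0, 5.0, 10.0),
    "severe": (5.0, 10.0, 15.0),
}


def fuzzify(score):
    """Return the membership degree of the score in every linguistic term."""
    return {term: triangular(score, *abc) for term, abc in SCALE.items()}


def to_label(score):
    """Pick the linguistic term with the highest membership degree."""
    memberships = fuzzify(score)
    return max(memberships, key=memberships.get)
```

With this scale, a score of 7 belongs to "moderate" with degree 0.6 and to "severe" with degree 0.4, so `to_label(7.0)` returns "moderate" while the partial membership in "severe" remains available for tracking changes over time.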
This article introduces the idea and process of using the fuzzy approach for classifying dyslexia symptoms and their progression by obtaining information about individual problems that people with dyslexia deal with. A correct classification is important for the new emerging assistive technologies accommodating text for people with dyslexia and for further clinical and psychological studies of dyslexia and linguistic based problems. As reading problems are diverse, the clinical and psychological studies based on our approach may also be useful in studies of, for example, problems after a stroke, brain tumours, epilepsy, etc.
A subgroup algorithm composed of several intelligent algorithms is proposed for multi-task cooperative path planning with obstacle avoidance in complex environments. Five intelligent multi-task collaborative planning algorithms, namely the particle swarm optimization (PSO) algorithm, the eight-direction recursive algorithm, the artificial potential field algorithm, the ant colony algorithm and the A* algorithm, are integrated. The agents are then divided into subgroups, one per task, and each subgroup uses one of the intelligent algorithms to find the optimal path subject to the constraints. The resulting subgroup algorithm is suitable for multi-task collaborative planning in complex scenes. The experimental results verify the effectiveness and feasibility of multi-task cooperative path planning based on the subgroup algorithm.
First, a new definition of the conjugate mapping of a convex fuzzy mapping is given in this paper, which is more reasonable than the concept in the literature. Then, we prove that the secondary conjugate mapping of a convex fuzzy mapping is a convex fuzzy mapping, and we establish the relationship between two convex fuzzy mappings and the conjugate mapping of their infimal convolution. In addition, a definition of conjugate mapping for general fuzzy mappings is given, which extends the concept of the conjugate mapping of a convex fuzzy mapping. Moreover, we prove that the conjugate mapping is a convex mapping and that the secondary conjugate mapping is a convex fuzzy mapping. Finally, we discuss the relationship between the sub-differential of a fuzzy mapping and its conjugate mapping and prove some relational expressions.
In real international financial markets, past data cannot always effectively reflect the future, and people must estimate future values. In this paper the future security prices and foreign exchange rates are given by experts' evaluations rather than historical data. Regarding the future security prices and foreign exchange rates as uncertain variables, we discuss an international portfolio selection problem. An uncertain mean-chance model for international portfolio selection is proposed, and its equivalent model is given. Finally, a numerical example is provided.
Given the short life cycle and the disciplinary development curve of public opinion on food safety, it is feasible to predict the hot-degree of public opinion using an improved entropy method and a Markov model. The paper selects the 2016 food safety public event “dirty porterhouse of one well-known catering enterprise” and sets up a comprehensive evaluation index system by AHP, with the media, net citizens and government as the criterion level. Index weights are calculated with the improved entropy method. The state transition matrix is then constructed, and the change interval of food safety public opinion is predicted by the Markov model. The results show that the comprehensive construction of the index system and the improved entropy weight method improve the accuracy of the Markov model and effectively realize the prediction of food safety public opinion.
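The two building blocks named in this abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses the standard entropy weight method, not the paper's improved variant (which the abstract does not specify), and the indicator matrix and state sequence in the usage note are invented toy data.

```python
import math


def entropy_weights(matrix):
    """Standard entropy weight method.

    matrix: rows are observations, columns are non-negative indicators.
    Indicators whose values vary more across observations get larger weights.
    Assumes at least two rows and at least one indicator that is not constant.
    """
    m = len(matrix)
    n = len(matrix[0])
    k = 1.0 / math.log(m)
    divergences = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        entropy = 0.0
        for value in column:
            p = value / total
            if p > 0:
                entropy -= k * p * math.log(p)
        divergences.append(1.0 - entropy)
    total_divergence = sum(divergences)
    return [d / total_divergence for d in divergences]


def transition_matrix(states, n_states):
    """Estimate Markov transition probabilities from an observed state sequence."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def step(distribution, matrix):
    """Advance a state distribution by one step of the Markov chain."""
    n = len(matrix)
    return [sum(distribution[i] * matrix[i][j] for i in range(n)) for j in range(n)]
```

For example, `transition_matrix([0, 1, 1, 2, 1, 0], 3)` estimates that state 1 moves to each of the three states with probability 1/3, and `step([0.0, 1.0, 0.0], P)` then gives the predicted distribution over the next state.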
The objective of this paper is to build an empirical model to predict the NBA teams' winning percentages from data collected in the past. The raw data has been standardized through Z-transform to remove any large variance bias effects. A multiple linear regression model was derived to predict the winning percentages. After trimming insignificant regression terms, the derived model can predict the teams' winning percentages with an R-Squared greater than 95%. The multicollinearity issues were addressed by minimizing the variance inflation factors. The redundant terms were removed to avoid the risk of over-fitting. The model has identified that three-point percentage, turnovers per game, and points per game were most critical to the team offensive efficiency. The nonlinearity terms have identified the complexity of basketball team behaviors. Defensive field goal percentage and points per game allowed were identified as the most significant interaction terms. The model accuracy was proved to be within +/-5% winning percentage of the predicted target across all thirty NBA teams.
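Two of the steps mentioned here, standardization and the multicollinearity check, can be sketched in Python. The sketch assumes that "Z-transform" means z-score standardization and covers only the two-predictor case, where the variance inflation factor reduces to 1 / (1 - r^2) with r the Pearson correlation between the predictors. The numbers in the usage note are toy values, not NBA data.

```python
import math


def z_scores(values):
    """Standardize values to zero mean and unit sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / std for v in values]


def pearson(x, y):
    """Pearson correlation, computed from the standardized values."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


def vif_two_predictors(x1, x2):
    """Variance inflation factor when the model has exactly two predictors.

    Undefined (division by zero) if the predictors are perfectly correlated.
    """
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r * r)
```

Uncorrelated predictors such as `[1, 2, 3, 4]` and `[1, -1, -1, 1]` give a VIF of 1, the minimum possible value; the closer the VIF is to 1, the less the predictors overlap.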
This study aimed to demonstrate the evaluation of traffic emission control plans based on the group decision–making model. The model assigned values in the form of grey interval linguistic variables because of the uncertainty in their multiple attributes to create a group synthesis decision matrix. The optimal plan could then be determined by calculating the possible degrees of every plan that could define the ideal situation. The model could effectively deal with inaccuracy due to the uncertainty and multiple attributes of emission control plans compared with the existing studies. It ensured scientific and rational decision-making in the management and control of traffic-related emissions. Finally, four different emission control plans in a certain city were compared to verify the effectiveness of the model via a numerical example.
Remanufacturing is a systematic process that recovers a product to like-new condition with a matching warranty. It has environmental benefits as well as significant potential to influence product economy in reverse logistics. Remanufacturing begins with the identification and inspection of cores (scrap products), followed by disassembly, reconditioning, assembly and testing. Inspection of cores is the critical activity that determines the effectiveness of remanufacturing. The aim of this paper is to use fuzzy TOPSIS optimization for the selection of scrap pistons for remanufacturing. TOPSIS is based on the principle that the chosen alternative should have the shortest distance from the positive ideal solution and the longest distance from the negative ideal solution. The parameters considered for the selection are length, diameter, surface roughness, ovality, groove thickness, height and total deformation. Five pistons with different failure conditions are studied and compared with a new piston. Results show that the above-mentioned parameters contribute significantly to the remanufacturing process and improve its efficiency.
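The TOPSIS principle stated above can be sketched in Python as follows. This is the crisp form of the method, shown for illustration only; the paper uses a fuzzy variant whose details the abstract does not give. The decision matrix, weights and criterion directions in the usage note are hypothetical.

```python
import math


def topsis(matrix, weights, benefit):
    """Rank alternatives by closeness to the ideal solution (crisp TOPSIS).

    matrix:  rows are alternatives, columns are criteria.
    weights: one weight per criterion.
    benefit: True where larger is better, False where smaller is better.
    Returns closeness scores in [0, 1]; higher means closer to the ideal.
    Assumes the alternatives are not all identical.
    """
    n_criteria = len(weights)
    # Vector-normalize each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_criteria)]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_criteria)] for row in matrix
    ]
    columns = list(zip(*weighted))
    best = [max(col) if benefit[j] else min(col) for j, col in enumerate(columns)]
    worst = [min(col) if benefit[j] else max(col) for j, col in enumerate(columns)]

    def distance(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

    scores = []
    for row in weighted:
        d_best = distance(row, best)
        d_worst = distance(row, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores
```

As a usage example, `topsis([[10, 1], [5, 5], [1, 10]], [0.5, 0.5], [True, False])` treats the first criterion as a benefit and the second as a cost (as surface roughness or ovality would be). The first alternative coincides with the positive ideal solution and scores 1.0, the third coincides with the negative ideal solution and scores 0.0, and the second falls in between.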
This paper addresses two problems: the distribution of social network members is no longer confined to the two-dimensional plane, and the partitioning of a social network involves the privacy of group content. To solve them, it presents an improved CURE algorithm, named DRICURE, that uses principal component analysis to reduce the dimension of the social network partitioning problem. First, principal component analysis is used to reduce the dimension of the social network distribution and simplify the calculation. Second, the distance between nodes is measured by the closeness among members. Finally, the DRICURE algorithm clusters until the number of categories meets the requirements, and uses similarity to assign outliers to members of the social network. Experiments show that the algorithm reduces the dimension of the social network distribution, distinctly improves time and space efficiency, and does not involve the privacy of social network members. The results show that the quality of the community structure obtained is higher and that the separation of isolated members is handled effectively.
To address problems in MPA education such as being out of touch with actual practice and lacking applicability, the author carries out a survey with a self-made questionnaire and uses exploratory factor analysis to identify the four main factors affecting the quality of MPA education. A corresponding MPA education quality assessment index system is then built on these four dimensions: education environment, teachers and enrollment, process management and educational effect.
As a significant multi-granularity computing model in the rough set community, multigranulation rough sets (MGRSs) establish multi-dimension and multi-view problem solving methods through providing synthesis and analysis strategies for solution space in different granularity levels, which act as a powerful tool for coping with various complicated multi-attribute group decision making (MAGDM) problems. The main point of interest in this work is to apply the concept of bipolar-valued fuzzy sets (BVFSs) to MGRSs. Firstly, the definition of bipolar-valued fuzzy multigranulation rough sets (BVF MGRSs) over two universes is presented. Then, we construct a MAGDM method on the basis of BVF MGRSs over two universes. Finally, a case study is provided to show the validation and practicality of the newly constructed method.
This study aims to help We-Media increase user attention by using Big Data technology. A new TAM model is developed by optimizing the TAM model proposed by Shayne Bowman and Chris Willis. The model can better obtain users' data and optimize We-Media to attract more attention. The paper takes Guangdong University of Foreign Studies as an example and carries out experiments. We applied MapReduce and Hadoop to process the Big Data, such as the interests, hobbies and browsing habits of We-Media users, and analyzed it with SPSS. These processes were completed using literature research, empirical research, quantitative analysis and qualitative analysis. The experimental results can help We-Media improve its content. We-Media can optimize the released content according to the users' Big Data, so as to attract more user attention, help people to see things positively and guide the direction of We-Media.
In this work, we propose a model of the factors influencing the acceptability of O2O e-commerce in local services. This model is established on the basis of Equation Model Theory, the Technology Acceptance Model and Perceived Risk Theory, and can be used to help enterprises promote e-platforms. Using relevant domestic and international information, we perform case studies and questionnaire surveys. The model is developed in two steps. The first step is the development of an initial model, which takes eight latent variables of influential factors, including subjective perception, social group influences and perceived risk, as input variables. The second step is the development of the final model of the factors influencing acceptability, obtained by optimizing the initial model according to the results of the simulation experiments. These results reveal the positive effects of social group influences and subjective perception on willingness. All paths are significant.
Industrial processes and machines pose risks in terms of equipment failure and worker accidents. To prevent these unwanted occurrences, the associated risks must first be analyzed. Traditional fault tree analysis uses exact data values, but in real life these values are often not known precisely. There is therefore a degree of uncertainty associated with the data. Fuzzy numbers, expressed in this paper as triangular fuzzy numbers, provide a method for taking this uncertainty into account, and fuzzy fault tree analysis is therefore used. An example of a metal brake press demonstrates the approach. A fault tree for a particular accident scenario is built and the fuzzy probability of occurrence of the accident under consideration is evaluated. An interesting second problem consists of starting with this value and deducing from the fault tree which values of the occurrence probabilities of the contributing events minimize a function expressing the cost of work accidents. This problem is expressed mathematically and solved using a Matlab-based method over fuzzy numbers. The optimized contributing event probabilities are obtained along with the optimal cost function.
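How triangular fuzzy probabilities propagate through a fault tree can be sketched in Python. The sketch makes several assumptions of its own: basic events are independent, each probability is a triple (low, mode, high), and gate outputs are approximated as triangular by applying the crisp gate formula to each of the three components. The event values in the usage note are invented and do not come from the paper's brake press example.

```python
def _product(values):
    result = 1.0
    for v in values:
        result *= v
    return result


def and_gate(events):
    """Output of an AND gate: all independent input events must occur."""
    return tuple(_product(e[k] for e in events) for k in range(3))


def or_gate(events):
    """Output of an OR gate: at least one independent input event occurs."""
    return tuple(1.0 - _product(1.0 - e[k] for e in events) for k in range(3))


def centroid(fuzzy_number):
    """Defuzzify a triangular fuzzy number by its centroid."""
    low, mode, high = fuzzy_number
    return (low + mode + high) / 3.0
```

For two hypothetical events with probabilities (0.1, 0.2, 0.3) and (0.5, 0.5, 0.5), `and_gate` gives approximately (0.05, 0.10, 0.15) and `or_gate` gives approximately (0.55, 0.60, 0.65). Because both gate formulas increase with every input probability, the low and high components bound the support of the result exactly; only the shape between them is approximated.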
Fuzzy quantified queries have been studied over several types of data and make it possible to summarize large volumes of data in a very intuitive manner. However, little work has been done on Resource Description Framework (RDF) graph databases. In this paper, we introduce the notion of fuzzy quantified queries in a (fuzzy) RDF database context. We first show how these queries can be defined and implemented as quantified graph patterns, an extension of graph patterns that supports linguistic quantifiers on edges. Then, we develop an algorithm for evaluating quantified RDF graph patterns. It is based on a backtracking strategy that incrementally builds partial solutions by adding joinable candidate vertices and abandons them when it determines that they cannot be completed. Finally, we present experimental results on real-life and synthetic data that show the efficiency and scalability of our approach.
To achieve an optimal solution to the traveling salesman problem (TSP), an improved wolf pack algorithm (IWPA) that addresses this problem through migration, summoning and besiegement behavior is proposed. The proposal is based on the basic WPA concept. The algorithm retains the cooperative, division-of-labor search characteristics of WPA, adds negative feedback and sets adaptive parameters, which overcomes the slow convergence speed and the low-dimensional search efficiency of WPA. Finally, a path planning simulation of the IWPA was performed and the proposed approach was compared with ant colony optimization and particle swarm optimization. The simulation results indicate that IWPA has specific advantages over similar algorithms regarding feasibility, convergence and stability.
The control component of a generator excitation system is key to its smooth operation and ability to effectively resist unexpected situations. In this study, we design a control strategy for the excitation system of a high-temperature superconducting motor that combines fuzzy control with conventional PID control. The method is based on a conventional PID with a fuzzy inference algorithm and variable universe fuzzy control added. Fuzzy control is realized by the computer processor through the selection of the scalable factor function model and fuzzy control rules, further improving the precision of the control system. To facilitate its transplantation into other control strategies, we adopt the discrete time system for the control strategy design. We establish an excitation system simulation, through which we compare fuzzy PID control and variable universe fuzzy PID control, and find that variable universe fuzzy PID control has better dynamic and static performance.
Type-2 fuzzy sets are generalized “fuzzified” sets that can be used in a fuzzy system. Unlike type-1 fuzzy sets, type-2 sets allow the fuzzy sets themselves to be “fuzzy” rather than crisply defined. Although this improves the flexibility of inferring a decision, type-2 sets are more complex to implement than type-1 sets. Based on this principle, this paper proposes the mechanism of “Fuzzimetric Sets”, which is capable of defining a rigid fuzzy set as well as “fuzzy” fuzzy sets (type-2). It builds on the concept of Fuzzimetric Arcs, which is also reviewed in this paper. Most implementations of this type of fuzzification are suited to decision support systems, and an example of how to implement Fuzzimetric Sets in that setting is presented. The platform of Fuzzimetric Sets consists of an initial definition of the fuzzy sets within the context of Fuzzimetric Arcs, followed by mutation and crossover operations on the sets, which give the sets their “fuzziness” property. To control the level of fuzziness in such sets, a “Degree of Fuzziness” (DOF) factor is also proposed. The DOF has two dimensions: Vertical-DOF (V-DOF), which allows changes in the level of the fuzzy membership of the set (0-1), and Horizontal-DOF (H-DOF), which allows the fuzziness level to vary between the maximum and minimum tolerances of the fuzzy sets, causing the centroid to move between those tolerances. If V-DOF is set to zero and the H-DOF range is defined as [90-90], the set becomes type-1; otherwise, the fuzziness level of the fuzzy sets can be controlled via this factor.
Net present value (NPV) is the traditional approach to evaluating the financial viability of projects, including large ones. Unfortunately, the NPV may mislead decision makers for two reasons. First, it does not take into consideration managerial flexibility, which is the ability of decision makers to react to new information about uncertain events in a way that may change the project design and planning. Randomness based on standard deviation is usually used to accommodate such flexibility. Second, the NPV calculation relies on fuzzy variables that are subject to imprecision or ambiguity, such as the characteristics of the project cash flow and the investment cost. The variability of these errors can also be represented in the volatility of the model. In such a scenario, volatility would be based on randomness as well as fuzziness. Researchers in the field propose to use project investment and project cost as fuzzy variables. We argue in this paper that both of these factors can be embedded in the randomness-based volatility factor. This paper investigates the use of the volatility factor as a fuzzy parameter in a real options approach to evaluating projects. Hence, the overall ambiguity would be represented by a combination of randomness and fuzziness to achieve acceptable evaluations.
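The baseline that this abstract starts from, an NPV whose cash flows and investment cost are imprecise, can be sketched in Python. This illustrates only the conventional fuzzy-variable treatment that the paper argues can be folded into volatility; it is not the paper's real options model. The triangular (low, mode, high) representation, a crisp discount rate greater than -1, and all figures in the usage note are assumptions made for this example.

```python
def npv(investment, cash_flows, rate):
    """Crisp net present value: discounted cash flows minus the initial investment.

    cash_flows[0] is received at the end of period 1, cash_flows[1] at period 2, etc.
    """
    discounted = sum(
        cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows, start=1)
    )
    return discounted - investment


def fuzzy_npv(investment, cash_flows, rate):
    """NPV with triangular fuzzy inputs given as (low, mode, high) triples.

    The worst case pairs the lowest cash flows with the highest investment,
    and the best case pairs the highest cash flows with the lowest investment.
    """
    inv_low, inv_mode, inv_high = investment
    low = npv(inv_high, [cf[0] for cf in cash_flows], rate)
    mode = npv(inv_mode, [cf[1] for cf in cash_flows], rate)
    high = npv(inv_low, [cf[2] for cf in cash_flows], rate)
    return (low, mode, high)
```

For instance, an investment of (90, 100, 110) followed by a single cash flow of (99, 110, 121) at a 10% discount rate yields an NPV of about (-20, 0, 20): the most plausible outcome breaks even, but the spread shows how much the verdict depends on the imprecise inputs.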
This paper proposes an evaluation index system and constructs an evaluation model for loading and reinforcement plans of out-of-gauge and overweight cargoes in railway transportation. Based on the factors affecting plan quality, summarized from years of research and on-site experience, the evaluation indices, their calculation methods and the determination of index values are proposed. The fuzzy analytic hierarchy process (FAHP) is selected to choose among plans, as it is more objective than other methods. Finally, the method is tested on a case with three different plans. The proposed index system and model may serve as a guide in practice.
Covariance matrix estimation is an important problem in various fields of social science, including financial economics. In this paper, we consider the estimation problem in a regression framework in order to resolve the deficiencies of the traditional methods. In particular, we establish the regression framework using support vector regression for the in-sample-based and the shrinkage-based estimation methods. Empirical results indicate that our proposed covariance matrix estimation methods outperform the two traditional estimation methods.
The multi-tiered cross-market dependency structures among capital markets worldwide have a highly complex architecture, which makes international portfolio management a challenging exercise. The challenge has been magnified in recent times because the strengths of the linkage structures have increased over the past decades. Understanding the complex interdependencies on a temporal scale and identifying the backbone structure of this interdependent system will help to uncover the avenues from which diversification benefits may arise. With this objective in mind, the current study formulates cross-market connectivity mathematically as weighted network models and elucidates the backbone structures by deploying a global threshold filtering approach. The present work investigates the dependency structure based on weekly data series from forty-three global markets. The weighted networks depicting cross-market relationships are filtered and visually inspected to decipher the significant connectivity structures. The study finds that average cross-market linkage strengths increased during market stress conditions. It also identifies a disjoint set of markets to which one can direct investments to cushion oneself from systemic risk impacts.
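Global threshold filtering of a weighted network can be sketched in Python as follows. The sketch assumes an undirected network stored as a dictionary from market pairs to linkage strengths, and it treats markets left without any edge after filtering as the disconnected set. The market names, weights and threshold in the usage note are hypothetical; the abstract does not state how the study chose its threshold.

```python
def filter_edges(weights, threshold):
    """Keep only the links whose strength is at least the global threshold."""
    return {pair: w for pair, w in weights.items() if w >= threshold}


def isolated_nodes(nodes, edges):
    """Nodes that have no remaining link after filtering."""
    connected = {node for pair in edges for node in pair}
    return sorted(set(nodes) - connected)
```

With four hypothetical markets A, B, C and D and linkage strengths `{("A", "B"): 0.8, ("B", "C"): 0.6, ("C", "D"): 0.2}`, a threshold of 0.5 keeps the A-B and B-C links as the backbone and leaves D isolated, which marks it as a candidate for diversification in this toy setting.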
The paper introduces concepts of jumping fuzzy finite automata and jumping fuzzy grammars. Relations among some families of fuzzy languages accepted by the automata and generated by the grammars are described.