Ebook: Fuzzy System and Data Mining
Fuzzy logic is widely used in machine control. The term ‘fuzzy’ refers to the fact that the logic involved can deal with concepts that cannot be expressed as either ‘true’ or ‘false’, but rather as ‘partially true’. Fuzzy set theory is very suitable for modeling the uncertain duration in process simulation, as well as defining the fuzzy goals and fuzzy constraints of decision-making. It has many applications in industry, engineering and social sciences.
This book presents the proceedings of the 2015 International Conference on Fuzzy System and Data Mining (FSDM2015), held in Shanghai, China, in December 2015. The application domain covers geography, biology, economics, medicine, the energy industry, social science, logistics, transport, industrial and production engineering, and computer science. The papers presented at the conference focus on topics such as system diagnosis, rule induction, process simulation/control, and decision-making. They include papers on solving practical problems with intelligent algorithms; statistical analysis; classification and clustering; and association rule learning. They also reflect the frontier in data mining research and address the challenges posed to data analytics research by the increasingly large datasets yielded by many application domains, together with new types of unstructured data.
The book provides an overview of the ways in which fuzzy theory and data mining principles are applied in various fields, and will be of interest to all those who work in either the theory or practice of fuzzy systems and data mining.
I am very much honored to have been asked to write the preface for these conference proceedings, and I want to take this opportunity to summarize my thoughts and share them with readers.
These conference proceedings (FAIA) contain papers submitted to the 2015 International Conference on Fuzzy System and Data Mining (FSDM2015), held in Shanghai, December 12–15, 2015, a conference for which I helped review papers. The conference focuses on fuzzy theory and data mining research and their application in interdisciplinary fields.
All papers were carefully reviewed by program committee members, taking into account the breadth and depth of the research topics, which include Fuzzy Theory, Algorithm and System, and Data Mining. The proceedings also include applications of fuzzy logic and data mining in interdisciplinary fields. The application domain covers geography, biology, economics, medicine, the energy industry, social science, logistics, transportation, industrial and production engineering, and computer science.
Fuzzy set theory is well suited to modeling uncertain durations in process simulation, as well as to defining the fuzzy goals and fuzzy constraints of decision making. Here, fuzzy set theory and its application in industry, engineering and social science is one of the main topics, with an emphasis on system diagnosis, rule induction, process simulation/control and decision making. In addition, the increasingly large data sets from many application domains, together with new types of unstructured data, have posed new challenges to data analytics research. To tackle these challenges, the proceedings bring in papers on solving practical problems with intelligent algorithms (for example, evolutionary algorithms and neural networks), statistical analysis, classification and clustering, association rule learning, and so on. Moreover, these papers not only propose new algorithms or improve existing ones, but also reflect the frontier of data mining research. Another interesting observation is that the fast growth of data in economics and social science has also encouraged the development of data-driven algorithms.
Every paper offers its authors' unique insights from their research field. By collecting these insights and experiences in these proceedings, researchers can get an idea of how fuzzy theory and data mining principles are applied in various fields, and practitioners can gain a deeper understanding of the theory behind the algorithms. I truly believe that both researchers and practitioners will be inspired by the wide range of real-world solutions collected here.
Finally, we would like to thank all the reviewers for taking the time to review papers, and all the authors for their participation in FSDM2015 and their contributions to these proceedings.
Gang Chen
Samsung Electronics America,
Mountain View, CA, USA
Intelligent optimization control combines intelligent optimization and intelligent control in order to optimize the local or global performance of control systems; it provides an effective way to solve control problems in complex systems. In this paper, a fuzzy logic controller is employed based on the genetic algorithm for control rules in a water curtain cooling system. Simulation results and industrial experiments indicate that fuzzy logic control based on the genetic algorithm optimization for the water curtain cooling process of accelerating steel is effective using this control model.
This paper investigated the mathematical modelling and control scheme of a Static Synchronous Compensator (STATCOM) based on a fuzzy logic controller. First, based on switching functions, a mathematical model of the STATCOM in matrix form was presented. Then, to overcome the lack of an accurate model of a power system with a STATCOM, a fuzzy logic PI controller was established in the feedback loop circuit. The membership function of the fuzzy logic controller was obtained from collected experimental data. The proposed control scheme and the mathematical model were validated by MATLAB simulation, which showed that a STATCOM with the designed fuzzy logic controller was able to improve the transient stability and voltage stability of power systems effectively.
Brand competitiveness can be used to measure brand value, by building a model system that measures brand competitiveness and thereby reflects the brand's intrinsic value. Since the launch of China's power sector reforms, the market competition mechanism has been gradually introduced and the brand has become an important asset of power enterprises. To measure the brand value of power enterprises scientifically and achieve steady improvement of brand value, an evaluation model of the brand competitiveness of power enterprises is built in this paper along five dimensions: brand awareness, brand popularity, brand reputation, brand loyalty, and brand association. In addition, the fuzzy comprehensive evaluation method is used for comprehensive calculation of the evaluation results. The evaluation results have significant value for measuring the brand value of power enterprises and enhancing brand competitiveness.
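As a rough illustration of how a fuzzy comprehensive evaluation over the five brand dimensions might be computed, the sketch below composes a weight vector with a membership matrix using the weighted-average operator. The abstract does not state which composition operator, weights, or grades the authors used, so every number and name here is a hypothetical placeholder.

```python
def fuzzy_comprehensive_evaluation(weights, membership):
    """Compose a weight vector with a membership matrix.

    Uses the weighted-average operator: b_j = sum_i w_i * r_ij.
    This operator is an assumption; max-min composition is another
    common choice.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    n_grades = len(membership[0])
    return [
        sum(w * row[j] for w, row in zip(weights, membership))
        for j in range(n_grades)
    ]


# The five dimensions named in the abstract; weights are made up.
dimensions = ["awareness", "popularity", "reputation", "loyalty", "association"]
weights = [0.25, 0.20, 0.25, 0.20, 0.10]

# Hypothetical evaluation grades and membership degrees
# (one row per dimension, one column per grade).
grades = ["high", "medium", "low"]
membership = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.7, 0.2, 0.1],
    [0.4, 0.4, 0.2],
    [0.3, 0.5, 0.2],
]

result = fuzzy_comprehensive_evaluation(weights, membership)
# Maximum-membership principle: pick the grade with the largest degree.
best_grade = grades[result.index(max(result))]
print(dict(zip(grades, result)), best_grade)
```

With the weighted-average operator, each row of the matrix contributes in proportion to its weight, so no single dimension dominates the overall grade.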
This paper presents a decision support system (DSS) for developing consensus in group decision making. A fuzzy consensus index is proposed to evaluate group consensus in group decision making. This consensus index is incorporated into a DSS so that better consensus decisions can be made through interactive exchanges of information between decision makers and the DSS.
In this paper, an effective error checking and correction method for Chinese medical records recognized by OCR is proposed. In our research, an optimized N-gram language model based on vocabulary rather than words is adopted to correct errors, and supervised machine learning based on maximum entropy (MaxEnt) is deployed to build a model for tokenization and named entity recognition. A medical knowledge base (MKB) is established, including dictionaries of medicine, symptoms, diseases, etc., and the frequency of each word as it appeared in the study corpus. Furthermore, a Knowledge Base for Error correction (KBE) is built to automatically correct high-frequency errors. With the developed approach, the accuracy rate of the electronic medical record increases from 85.20% to 95.72%, indicating an error reduction of 71.08%.
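The reported error reduction follows from the two accuracy figures if "error" is taken to be 100% minus the accuracy rate. That reading is an assumption, but it reproduces the stated value. A minimal check:

```python
def relative_error_reduction(acc_before, acc_after):
    """Relative reduction in error rate, in percent.

    Accuracies are given in percent; the error rate is assumed to be
    100 minus the accuracy.
    """
    err_before = 100.0 - acc_before   # 14.80
    err_after = 100.0 - acc_after     # 4.28
    return (err_before - err_after) / err_before * 100.0


reduction = relative_error_reduction(85.20, 95.72)
print(f"{reduction:.2f}%")  # 71.08%
```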
A magneto-rheological damper is designed and tested to minimize the driver cab vibration of heavy-duty vehicles, and a model suitable for fuzzy PID control is proposed to analyze a dynamic model with three masses. With the comparison of passive and active control systems, the controllable magneto-rheological damping decreases with system acceleration and filters the high-frequency components with a reasonable base.
This paper focuses on further improved stability criteria for uncertain T-S fuzzy systems with time-varying delay, obtained by the delay-partitioning approach and a new integral inequality, i.e., the Free-Matrix-Based integral inequality. A modified augmented Lyapunov-Krasovskii functional (LKF) is established by partitioning the delay in all integral terms. Then, new results on tighter bounding inequalities such as Peng-Park's integral inequality (reciprocally convex approach) and the Free-Matrix-Based integral inequality (which yields less conservative stability criteria than the Wirtinger-based inequality) are introduced to reduce the enlargement in bounding the derivative of the LKF as much as possible; therefore, less conservative results can be expected in terms of es and MIs. Finally, a numerical example is included to show that the proposed conditions are less conservative than existing ones.
An aircraft pushback control scheme is designed with a fuzzy sliding mode controller. First, based on analysis of the departure process, the queuing theory is used to build an aircraft departure queuing model. Then, the queue length error and its variation are taken as inputs to design a fuzzy controller of dual-input, single-output structure to determine the quantity of aircraft to be pushed. The quantity reliability is then examined through the sliding mode controller and unreliable quantity is rectified. Finally, the overall aircraft pushback control scheme in Xian-yang International Airport, Xi'an, China, is simulated using Simulink. The results indicate that with the fuzzy controller to perform the main operation and the sliding mode controller to verify and rectify the output, the airport ground traffic congestion is effectively relieved, the operation efficiency increases, the fuel consumed by waiting decreases, and thus, the environmental pollution is reduced.
There are many risk factors in oil and gas pipeline engineering, such as corrosion, design defects, and third-party damage. It is necessary for the enterprise to adopt appropriate strategies to eliminate, control, transfer, or evade those risk factors. Because risk tolerance differs, different oil and gas pipeline enterprises may adopt different strategies for the same risk. At present, research on oil and gas pipeline risk focuses on existing pipelines, and less on the risk tolerance of oil and gas pipeline enterprises. Therefore, from the perspective of the enterprise's risk tolerance in risk decisions, a risk tolerance assessment model is established in this paper based on the AHP (Analytic Hierarchy Process), and the risk values are calculated by fuzzy comprehensive assessment. The model is then applied to an oil pipeline project: the risk factors are identified and assessed, and the risk tolerance decision value of each risk is calculated to determine the major risk factors of the pipeline engineering. Finally, the corresponding risk strategies are put forward. The practical application shows that the risk assessment model based on risk tolerance is comprehensive, reliable, and applicable, and can be used as an effective tool for risk evaluation and decision-making in oil and gas pipeline engineering.
In fault diagnosis of Flexible Manufacturing Systems (FMS), some problems are hard to tackle, such as fuzziness and polymorphism. This paper therefore proposes an improved Bayesian network (BN) approach. It first introduces BN and describes the transformation process from FT to BN. In addition, it applies fuzzy theory to set up the conditional probability table of the BN, and proposes observing nodes to describe symptom information. Finally, an analysis of faults in the numerical control processing unit of an FMS indicates that this approach can improve the efficiency and accuracy of reasoning for fault diagnosis.
Multi-hesitant fuzzy sets (MHFSs) can deal effectively with cases where some values are repeated more than once in hesitant fuzzy sets (HFSs). In this paper, a novel convex combination of multi-hesitant fuzzy elements (MHFEs) is introduced. The generalized multi-hesitant fuzzy weighted average (GMHFWA) operator based on the convex operation is developed, and the corresponding property is discussed. Then, based on the proposed aggregation operator, a novel approach to multi-criteria decision-making (MCDM) problems is proposed for ranking alternatives. Finally, an example is provided in order to verify the developed approach and demonstrate its validity and feasibility.
For the automatic drilling control of a geological winch motor, the paper proposes a novel control method for the winch motor drive based on fuzzy theory. The paper analyzes the operating principle of automatic drilling systems and designs a novel winch motor drive system based on a Permanent Magnet Motor (PMM). It proposes an intelligent fuzzy logic algorithm that can be applied to the winch motor automatic drilling system to replace the traditional PID control algorithm. The system based on fuzzy logic resolves the instability of the winch motor drilling pressure caused by the time delay, time variability and nonlinearity of the controlled objects. The results show that fuzzy logic can ensure the stability of the winch motor drilling pressure, and that the drilling system operates efficiently. The research is beneficial to geological winch motor automatic drilling control both in theory and in technology.
Data mining is a broad area that integrates research efforts from several fields with the aim of processing large volumes of data into knowledge bases for better decision making. Since numerical and nominal data are equally important in practical data mining applications, dealing with different types of data items is among the most important problems in data mining research and development. This paper introduces a new fuzzy rule induction algorithm, able to deal properly with either numerical or nominal attributes, for the creation of classification and predictive models. To better handle numerical data, fuzzy sets are used to represent intervals in the domains of numerical attributes. Experimental results have shown that the proposed algorithm produces robust and general models that can be used for prediction as well as for classification.
First, an indiscernibility relation between intuitionistic fuzzy sets is constructed by using a kind of Hausdorff measure on intuitionistic fuzzy sets. The operators of lower approximation with accuracy ε are then introduced through this indiscernibility relation, and their properties are discussed. Next, the intuitionistic fuzzy rough set model is defined based on the indiscernibility relation and lower approximate reductions, and an attribute reduction algorithm for this model is given. An example is employed to show its application.
The optimization of an Automated Storage and Retrieval System (AS/RS) can shorten the productive time and improve circulation efficiency, but the discreteness and complexity of the running system threaten the reliability of system operation. Therefore, this paper establishes a fault Petri net for the situation in which the AS/RS operates abnormally. An example is provided to demonstrate a fuzzy Petri net model of a primary circuit, using fuzzy type-2 sets and numbers to determine the fault cause. The matrix description method is employed to describe the probability of the failure phenomenon. Finally, by changing input conditions for different degrees of various fault simulations, Visual Object Net++ is used to verify the reliability and accuracy of the proposed diagnostic method, which increases warehouse management efficiency.
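To make the idea of representing intervals of a numerical attribute with fuzzy sets concrete, here is a minimal sketch using triangular membership functions. The abstract does not say which membership shape the algorithm uses; the triangular form, the attribute range, and the labels "low", "medium", and "high" are all illustrative assumptions.

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b.

    Assumes a < b < c. Returns a degree in [0, 1].
    """
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy partition of a numerical attribute on [0, 40].
fuzzy_sets = {
    "low": (0.0, 10.0, 20.0),
    "medium": (10.0, 20.0, 30.0),
    "high": (20.0, 30.0, 40.0),
}


def fuzzify(x):
    """Map a crisp value to its membership degree in each fuzzy set."""
    return {label: triangular(x, *abc) for label, abc in fuzzy_sets.items()}


m = fuzzify(22.0)
print(m)  # 22 is mostly "medium" and partly "high"
```

Unlike a crisp discretization, a value near an interval boundary belongs partly to both neighboring sets, so a rule whose antecedent is "medium" still fires, with reduced strength, for values that lean toward "high".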
In this paper, building a rule-based fuzzy hierarchical model with information granules is accomplished. In this scenario, a number of separate sources of data and the resulting individual fuzzy models formed on their basis are encountered. The ultimate objective is to realize rule-based fuzzy modeling globally on the basis of invoking some mechanisms of knowledge sharing and reconciliation. The underlying format of knowledge built by individual systems is that of information granules and fuzzy sets in particular. Thus, a mechanism of generating information granules named the principle of justifiable granularity is explicated.
Hand gesture recognition has become a major focus of research in the field of human-computer interaction (HCI). This work proposes a static hand gesture recognition system. The Histogram of Oriented Gradients (HOG) was used for feature extraction. The features are reduced by PCA and further reduced using the attribute reduction algorithm in the theory of the neighborhood rough set. Then, the weight of every feature is calculated using the attribute significance algorithm in the theory of neighborhood rough sets. The weighted features are applied as input to the fuzzy neural network to recognize static hand gestures. Experimental results on commonly-referred databases show that the proposed method based on neighborhood rough sets improves the recognition accuracy of fuzzy neural networks.
Multiple strategies and fuzzy comprehensive evaluation were used to recognize term similarity relation types. First, a variety of similarity algorithms were used to calculate the term similarities, and then the relations and intervals were identified by a continuous attribute discretization algorithm. The sample distribution probability was used to determine the membership degree of each interval to the relations, and the weights of the elements were determined by a particle swarm algorithm and cross-validation. Then, all the calculation results were combined using a fuzzy comprehensive evaluation method to recognize the term similarity relation types. Finally, precision, recall, and F value were used to evaluate the results, which were compared with the experimental results of an SVM to demonstrate the effectiveness of this method. The experiment used the Chinese scientific and technical vocabulary system (new energy vehicles) as the test set. The results showed that the method was able to recognize the term similarity relation types effectively.
This paper proposes a modified time series data mining framework that applies Reconstructed Phase Space to construct clusters from temporal patterns that are predictive of interesting events. The cluster objective function used in the presented technique is defined not only by the cluster's internal predictive patterns but also by an estimate of the cluster's efficacy in characterizing the predictive clusters. In the prediction stage, the framework uses both the initial information about predictive clusters and expert knowledge by applying Sugeno-type fuzzy inference. Experimental results demonstrate that the presented framework can achieve more effective results than existing algorithms that utilize reconstructed phase space.
The chlorophyll-a concentration of water from Nansihu Lake of Shandong Province, China was analyzed using hyperspectral data with a semi-analytic model. First, three bands of the hyperspectral data centered at 660 nm, 708 nm and 812 nm were chosen based on iterative optimization analysis. Then, the linear-TBA model, the poly-TBA model, and the BP-TBA model were used to determine chlorophyll concentration. Field data were used to validate the results and performances of three different models. The validation results indicated that the BP-TBA model achieved the highest accuracy with a mean relative error of 0.2268 in the training data set and 0.1983 in the validation data set. However, the BP-TBA model was not fit for turbid coastal water. When the chlorophyll-a concentration exceeded 0.03 mg/L, the error increased, demonstrating a mean of 0.364.
A novel harmonic current detection method based on adaptive neural networks is proposed in this paper to improve the power quality of power stations. The harmonic parameters are estimated through the adaptive measurement theorem. The method can predict future harmonic currents from the current data and historical data, achieving fast, real-time harmonic detection. Simulations are conducted on the PV power stations in a certain area. The results show that the algorithm achieves high accuracy and rapid convergence and is a good candidate for measuring harmonics with asynchronous sampling and short data in a grid-connected power plant.
To improve the efficiency of optimization via simulation, a new method called generalized regression neural network based optimization via simulation (GRNN-OvS) is put forward. This method takes advantage of GRNN's non-linear approximation, learning speed and network stability, which make it promising for predicting the simulation output once the neural network is fully trained on samples obtained from simulations. By substituting GRNN predictions for simulation output, the time spent in optimization via simulation can be greatly reduced. Specifically, a number of representative samples are first generated from the simulation model. Then, the GRNN is trained on these samples so that it forms a regression surface that closely approximates the input-output relationship of the simulation, which is treated as a black box. Based on the GRNN model, the optimization via simulation problem is transformed into optimizing the GRNN model, which is more efficient and time-saving. A numerical example shows that GRNN-OvS is effective and feasible, which is very helpful in optimization via simulation applications where the simulation task is time-consuming.
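The surrogate idea can be sketched in a few lines. A GRNN prediction is a Gaussian-kernel weighted average of the training outputs, so it can stand in for the simulation during the search. Everything below is an illustrative assumption rather than the authors' setup: the toy objective `expensive_simulation`, the sampling grid, the smoothing parameter `sigma`, and the grid search used as the optimizer.

```python
import math


def grnn_predict(x, samples, sigma=0.3):
    """GRNN output at x: kernel-weighted average of sampled outputs."""
    num = 0.0
    den = 0.0
    for xi, yi in samples:
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))
        w = math.exp(-d2 / (2.0 * sigma ** 2))  # pattern-layer activation
        num += w * yi
        den += w
    return num / den if den > 0.0 else float("nan")


def expensive_simulation(x):
    """Hypothetical stand-in for a costly simulation run (black box)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 0.5) ** 2


# Step 1: run the simulation on a coarse set of representative samples.
coarse = [i * 0.5 - 2.0 for i in range(9)]  # -2.0, -1.5, ..., 2.0
samples = [((a, b), expensive_simulation((a, b)))
           for a in coarse for b in coarse]  # 81 simulation runs

# Step 2: optimize the cheap GRNN surrogate instead of the simulation.
fine = [i * 0.1 - 2.0 for i in range(41)]
candidates = [(a, b) for a in fine for b in fine]  # 1681 surrogate calls
best_x = min(candidates, key=lambda p: grnn_predict(p, samples))

print(best_x)  # should land close to the toy optimum at (1.0, -0.5)
```

Here the simulation is called 81 times while the surrogate is queried 1681 times, which is where the time saving comes from when each simulation run is expensive. A GRNN has no iterative training: "training" amounts to storing the samples and choosing `sigma`.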