In this paper we discuss search spaces of fractional/fractal dimension and some of the problems in which such spaces may emerge. We also consider relationships between the Cantor set and probabilistic spaces, and the potential application of Cantor dust as a combination of probability trees to create hybrid models. We believe that such considerations can be useful when developing new optimisation models over spaces of standard or fractional dimension.
We propose mean-shift to detect outlier points. The method processes every point by computing its k-nearest neighbors (k-NN) and then shifting the point to the mean of its neighborhood. This is repeated three times. The larger the total movement, the more likely the point is an outlier: boundary points are expected to move more than inner points, and outliers more than boundary points. Outlier detection is then a simple thresholding based on the standard deviation of all movements; points that move more than the threshold are flagged as outliers. The method outperforms all compared outlier detection methods.
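The procedure above can be sketched in a few lines. The parameter names (`k`, `iterations`, `std_factor`) and the mean-plus-one-standard-deviation cutoff are assumptions for illustration, since the abstract only says the threshold is based on the standard deviation of all movements:

```python
import numpy as np

def knn_mean_shift_outliers(points, k=5, iterations=3, std_factor=1.0):
    """Score points by total k-NN mean-shift movement; threshold on std."""
    pts = np.asarray(points, dtype=float).copy()
    movement = np.zeros(len(pts))
    for _ in range(iterations):
        shifted = np.empty_like(pts)
        for i, p in enumerate(pts):
            # distances to all points; nearest k excluding the point itself
            d = np.linalg.norm(pts - p, axis=1)
            nn = np.argsort(d)[1:k + 1]
            shifted[i] = pts[nn].mean(axis=0)
        movement += np.linalg.norm(shifted - pts, axis=1)
        pts = shifted
    threshold = movement.mean() + std_factor * movement.std()
    return movement > threshold

# Dense cluster plus one far-away point: the isolated point moves the most
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0, 0.1, size=(30, 2)), [[5.0, 5.0]]])
flags = knn_mean_shift_outliers(data, k=5)
```

In this toy run only the isolated point is shifted far toward the cluster, so only it exceeds the threshold.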
In recent years, with the rapid development of location-based services (LBS) and the improvement of indoor localization technology, a large amount of valuable trajectory data can be collected in an effective manner. In this article, a probabilistic model for Point-Of-Interest (POI) recommendation in indoor space is proposed, which can automatically discover the categories of stores and the interests of users, and then recommend the stores that users may be interested in. It facilitates local business. Comprehensive experiments are conducted on a real shopping-mall trajectory data set. The results demonstrate that the proposed recommendation system achieves better performance than the state-of-the-art method.
This paper deals with one class of propositional fuzzy logics. Recently, fuzzy logic systems have been introduced as logics that are complete with respect to linearly ordered algebras, in particular algebras on the unit interval [0,1]. One of the important trends in this field is to introduce logic systems with more general structures. As part of this trend, we introduce implicational tonoid fuzzy logics as fuzzy logics with tonicity properties. To this end, we first define implicational tonoid fuzzy logics in general. We then introduce their corresponding ternary relational semantics, called Routley–Meyer-style semantics. Routley–Meyer semantics was first introduced as a semantics for relevance logics and has since been generalized to semantics for other non-classical logics. Finally, we prove that implicational tonoid fuzzy logics are sound and complete with respect to their corresponding Routley–Meyer-style semantics.
A heuristic algorithm is provided to trim many edges and thereby reduce the search space of the traveling salesman problem (TSP). The heuristic is designed according to a probability model, and fuzzy numbers play an important role in enhancing its performance. The heuristic may lose a few edges of some optimal solutions if the parameter N is too small or F is too big. Since we cannot be sure that all optimal solutions of the original graph survive the trimming, no optimality guarantee is proven for the best and worst solutions in the preserved graphs. The experimental results demonstrate that the heuristic computes a residual graph with fewer than n log₂ n edges for most of the TSP instances in TSPLIB. Thus, the computation times of TSP algorithms will be greatly reduced when these instances are solved on the resulting sparse graphs.
The ocular cup-to-disc ratio (CDR) is one of the routine screening measurements taken prior to the diagnosis of glaucoma. Glaucoma is one of the most prevalent eye disorders and leads to permanent vision loss, so retinal segmentation plays a critical role in locating the optic disc and cup. This paper presents a very simple and straightforward approach to suspecting glaucoma through k-means clustering, together with a meticulous study of CDR estimation. The optic disc and cup are segmented as clusters and the CDR is calculated through mathematical quantification. The results are tabulated and scrutinized, clearly showing the percentage and risk factor of glaucoma. The screened results can be further extended to clinical suites.
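As a hedged illustration of segmenting the disc and cup as intensity clusters, here is a minimal 1-D k-means over pixel intensities of a synthetic fundus-like image. The three-cluster assumption (background, disc rim, cup) and the vertical-extent CDR are simplifications for illustration, not the paper's exact pipeline:

```python
import numpy as np

def estimate_cdr(img, n_clusters=3, iters=20):
    # Plain k-means on pixel intensity (1-D); centers seeded evenly
    vals = img.reshape(-1, 1).astype(float)
    centers = np.linspace(vals.min(), vals.max(), n_clusters)
    for _ in range(iters):
        labels = np.argmin(np.abs(vals - centers), axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = vals[labels == c].mean()
    order = np.argsort(centers)              # dimmest -> brightest
    lab = labels.reshape(img.shape)
    cup = lab == order[-1]                   # brightest cluster = cup
    disc = np.isin(lab, order[-2:])          # two brightest = whole disc
    def extent(mask):                        # vertical extent in pixels
        rows = np.where(mask.any(axis=1))[0]
        return rows[-1] - rows[0] + 1 if rows.size else 0
    return extent(cup) / max(extent(disc), 1)

# Synthetic image: dark background, brighter disc, brightest cup
yy, xx = np.mgrid[:100, :100]
r = np.hypot(yy - 50, xx - 50)
img = np.where(r < 10, 0.9, np.where(r < 20, 0.6, 0.2))
cdr = estimate_cdr(img)   # cup radius / disc radius, so roughly 0.5 here
```

Real fundus images would of course need preprocessing (channel selection, localisation, noise handling) before such clustering is meaningful.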
In this paper, the authors propose a new formula π = ½e^θ, in which the new constant θ is a real number. The new formula is a perfect supplement to Euler's formula e^(πi) = −1; together, the two formulas reveal the complete relationship between π and e. The constant θ = 1 + γ + 2μ, where γ is Euler's constant, and the constant
A new triple correlation function–sparse autoencoder (TCF–SAE) algorithm based on SAE for m-sequence recognition is proposed. First, the peak characteristic of the TCF of m-sequences is introduced; this characteristic is found to be well preserved regardless of whether the m-sequence is periodic or aperiodic. Second, a method for constructing network input samples based on the TCF characteristic of m-sequences is proposed. Finally, a feature-learning network is constructed with an SAE, and the learned features are classified by softmax regression. A network model with optimal recognition performance is then obtained through simulation experiments with different numbers of hidden layers and hidden units. The results show that the proposed TCF–SAE algorithm is effective for m-sequence classification and displays good recognition performance at low signal-to-noise ratio.
In web service recommender systems, users are often asked to provide their observed QoS data to assist personalized QoS prediction for other users. Most existing approaches assume that all users provide real data to the system; however, dishonest users may appear in many recommender systems. Attracted by commercial benefit, some users may intentionally provide unfair feedback inconsistent with their real experience, which harms the robustness of the service recommender system. In this paper, we propose a clustering-based reputation evaluation approach to identify dishonest users. Firstly, we compute the trustworthy cluster on each service by clustering users' QoS feedback. Then users' feedback is classified according to its degree of deviation from the trustworthy cluster. Finally, based on users' aggregated feedback statistics, we apply the Beta reputation model to evaluate users' reputation dynamically. Experimental results demonstrate that this approach evaluates users' reputation more accurately than other state-of-the-art approaches.
To recommend the best web service to users, the QoS feedback provided by advisors is usually needed to predict the QoS of candidate services. Attracted by commercial benefit, some advisors may intentionally provide unfair feedback inconsistent with their real experience, which distorts the prediction results. To address this problem, an unfair QoS feedback filtering algorithm based on the Beta reputation model is proposed in this paper. Firstly, several certified center users are obtained to initialize the trustworthy user set. Then, we evaluate the deviation between each target user's feedback and the average value of the trustworthy user set. Finally, each user's reputation is calculated based on how much deviating feedback the user has submitted, and users whose reputation exceeds the trustworthy threshold are recognized as trustworthy. After several iterations, the majority of unfair users are filtered out. Experimental results demonstrate that our approach accurately filters out unfair users and improves the robustness of QoS prediction algorithms against unfair feedback attacks.
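The iterative filtering loop can be sketched as follows. The Beta reputation expectation (r+1)/(r+s+2) is the standard form of the model; the data layout, deviation tolerance and reputation threshold below are hypothetical parameters, not values from the paper:

```python
def beta_reputation(fair, unfair):
    # Expected value of the Beta(fair + 1, unfair + 1) distribution
    return (fair + 1) / (fair + unfair + 2)

def filter_unfair_users(feedback, seed_trusted, dev_tol=1.0,
                        rep_threshold=0.5, rounds=5):
    """feedback: {user: [QoS value per service]} (hypothetical layout);
    seed_trusted: certified center users initialising the trusted set."""
    trusted = set(seed_trusted)
    n_services = len(next(iter(feedback.values())))
    for _ in range(rounds):
        # Reference QoS = average over the current trustworthy set
        ref = [sum(feedback[u][i] for u in trusted) / len(trusted)
               for i in range(n_services)]
        new_trusted = set()
        for user, vals in feedback.items():
            fair = sum(abs(v - r) <= dev_tol for v, r in zip(vals, ref))
            if beta_reputation(fair, n_services - fair) >= rep_threshold:
                new_trusted.add(user)
        if new_trusted == trusted:   # converged
            break
        trusted = new_trusted
    return trusted

# Toy run: "c" reports QoS far from everyone else and is filtered out
feedback = {"a": [5.0, 5.0, 5.0, 5.0], "b": [5.2, 4.9, 5.1, 5.0],
            "c": [1.0, 1.0, 1.0, 1.0], "d": [4.8, 5.1, 5.0, 5.2]}
trusted = filter_unfair_users(feedback, seed_trusted=["a"])
```

After one iteration the honest users "b" and "d" join the trusted set, while "c" accumulates only deviating feedback and stays excluded.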
Homophily is the social phenomenon that people who are similar in some aspect interact at a significantly higher rate. We conducted web mining based on Twitter's application programming interfaces to investigate political homophily in the congressional Twitter community, examining the structure of public Twitter communications among active members of the US Congress. Our findings showed strong evidence of homophily with respect to party affiliation, with a significantly higher rate of communication between members of the same party. Along an entirely different dimension, we found moderate evidence of inverse homophily with respect to seniority, with a significantly higher rate of communication between members at different seniority levels.
The distribution method of regenerative braking force for a pure electric car is studied. According to vehicle speed and battery state of charge (SOC), a fuzzy-logic-based braking force distribution control strategy is proposed to achieve more effective recovery of braking energy. The control strategy is simulated in MATLAB. The simulation results show that the energy recovery efficiency of regenerative braking is increased by 12.03%, so the driving range of the pure electric car is greatly improved.
The Basis Pursuit Denoising (BPDN) problem is a variant of the well-known lasso method for obtaining sparse least-squares approximations, which can be represented as the minimization problem min_β ½‖Aβ − y‖₂² + λ‖β‖₁. In this paper the problem is treated as a parametric quadratic program with parameter λ ≥ 0. With this technique, very large scale problems can be solved essentially exactly for a range of values of λ, as long as the number of non-zero components of β is small to modest.
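For intuition about the parametric solution path, consider the special case A = I (an assumption made here for tractability, not the paper's setting): the BPDN minimizer decouples per coordinate into soft-thresholding, and each coordinate of β(λ) is piecewise linear in λ:

```python
import numpy as np

def soft_threshold(z, lam):
    # argmin_b 0.5*(z - b)**2 + lam*|b|  =  sign(z) * max(|z| - lam, 0)
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

y = np.array([3.0, -1.5, 0.4])
path = {lam: soft_threshold(y, lam) for lam in (0.0, 0.5, 2.0)}
# As lam grows, each coordinate shrinks linearly and then hits exact
# zero, so the solution becomes sparser along the path.
```

For general A the path is still piecewise linear but must be traced by a parametric QP solver, which is what makes solving for a whole range of λ values essentially as cheap as solving for one.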
An intelligent customer service platform can communicate and provide service automatically using natural language. It achieves intelligent interaction between human and machine, for example providing users with higher-quality services through instant messaging, the Internet, telephone, text messaging, etc. This paper explores the segmentation of airline customers based on text classification. First, we extract features from the text data of an airline customer service platform using TF-IDF. Second, naive Bayes, SVM, KNN, and logistic regression are used to train models. Third, a combined model based on the four algorithms is constructed. Finally, we use 10-fold cross-validation to verify the test results. Experiments show that the combined model performs better than the original methods.
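The described pipeline (TF-IDF features, four base classifiers, a combined model) can be sketched with scikit-learn's hard-voting ensemble. The toy airline-style texts and labels below are invented for illustration; the real system trains on actual platform data with 10-fold cross-validation:

```python
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Invented toy corpus standing in for airline customer-service messages
texts = ["refund my ticket", "change my flight date", "refund please",
         "reschedule the flight", "ticket refund request",
         "move flight to monday"]
labels = ["refund", "change", "refund", "change", "refund", "change"]

# TF-IDF features feeding a majority vote over the four classifiers
ensemble = make_pipeline(
    TfidfVectorizer(),
    VotingClassifier(
        estimators=[("nb", MultinomialNB()),
                    ("svm", LinearSVC()),
                    ("knn", KNeighborsClassifier(n_neighbors=3)),
                    ("lr", LogisticRegression(max_iter=1000))],
        voting="hard"))
ensemble.fit(texts, labels)
pred = ensemble.predict(["please refund the ticket"])
```

Hard voting simply takes the majority class across the four fitted models; with probability-capable estimators, soft voting over averaged class probabilities would be an alternative.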
Due to the characteristics of the data in public databases, it is difficult to clean redundant data. An ideal solution is to make full use of the features of the original data to perform technical data cleaning, and entity identification is one of the effective techniques. Based on the key technologies of entity identification, and combining the characteristics of the Chinese text data in Chinese public databases with the network relationships existing in the data, an entity identification framework based on text clustering and social network community division is proposed to address the challenge of different entities sharing the same representation. The framework was evaluated on China's library database, and the experimental results show its validity and accuracy.
Sentiment analysis is significant for social media. Although many achievements have been made, most focus on either the text modality or the audio modality alone. In this paper, we propose a multimodal sentiment analysis architecture based on an RNN and feature selection. It makes full use of a joint representation of textual, audio and video features to perform multimodal sentiment analysis. By designing a feature selection component, we can select informative features from the redundant and heterogeneous unimodal features to improve the performance of the sentiment analysis model. At the same time, the additional RNN architecture can capture the dependency and information flow among the utterances of a video within a single modality and perform modality fusion at every timestep at the feature level. The proposed method achieves better performance in sentiment prediction and improves over the baseline.
A multi-objective improved genetic algorithm is constructed to solve the train operation simulation model of an urban rail train and find the optimal operation curve. In the train control system, the switching points between operating modes form the basis of the gene encoding; a chromosome composed of multiple genes represents one control scheme, and the initial population is formed in this way. The fitness function is designed from the design requirements on stopping error, time error and energy consumption. The validity of new individuals is ensured by checking the validity of the original individuals during selection, crossover and mutation, and elitism is applied to all operators so that the new generation never eliminates the best individual of the previous generation. The simulation results show that, compared with the optimized multi-particle simulation model, the proposed genetic algorithm reduces energy consumption by more than 10%, provides a large number of sub-optimal solutions and has an obvious optimization effect.
Mining spatial co-location patterns is a challenging and essential task in spatial data mining, which aims to discover subsets of spatial features frequently observed together in adjacent geographic space. In this paper, we apply fuzzy set theory to high utility co-location pattern mining, which allows one to find all high utility co-location patterns in fuzzy datasets. Firstly, we define the related concepts, including the utility of a fuzzy pattern and its utility ratio. Secondly, we propose an efficient basic algorithm for fuzzy high utility co-location mining and an optimized version. The latter exploits a star row-instance weighted utility downward closure property, which significantly improves the efficiency of finding the patterns. Finally, the feasibility of the proposed algorithms is verified experimentally on synthetic and real datasets.
Quality is considered an essential determinant of the success of every type of software, so special attention should be paid to its evaluation regardless of the software development paradigm employed. Nevertheless, the social Web applications available today are often of poor quality, which is due to the lack of suitable quality models, methodologies, and measuring instruments. This paper presents a first step towards an evaluation methodology that generates a quality index as a single score, thus enabling analysis and comparison of social Web applications at various levels of the quality model. To determine the relevance of the criteria that constitute the quality model, an empirical study was carried out in which fuzzy AHP was applied for group weighting. The participants in the study were field experts from seven different countries. The findings revealed that pragmatic criteria are more relevant than hedonic ones when evaluating the quality of social Web applications.
Robot behaviors for the treatment and education of children with autism spectrum disorder (ASD), who present various characteristic symptoms, are designed with the help of an artificial neural network (ANN). The proposed method combines an ANN with robot motion based on existing field experiments to enhance the therapeutic effect. The newly designed motion classifies behavior patterns in ambiguous situations while performing basic specific actions. To implement this operation, the hardware structure and control algorithm of the robot are also described. The basic structure of a feed-forward neural network (FNN) is designed with consideration of how to construct the robot and improve its performance. This neural network architecture is verified by real robot experiments based on the actions used in actual treatment. In the future, we will extend the hardware and the ANN so that the robot can perform more intelligent actions in various situations, and evaluate the effectiveness of the designed motions by applying them in real educational settings for autistic children.
A content-based download cooperation scheme in an intermittently connected VANET not only satisfies vehicle users' needs for security information services but also facilitates the implementation of value-added network services, enabling users to download large data files or watch online video in a moving vehicle. Meanwhile, the performance of the VANET is improved: download throughput and spectral efficiency increase, while data transmission delay and the complexity of the cooperation scheme decrease. Although there is much research on current cooperation schemes, it lacks a systematic review. Therefore, cooperative communication schemes in VANETs are divided here into three categories and introduced in detail in terms of cooperation conditions, communication modes and cooperative strategies. Based on this classification, the paper studies, compares and analyses the latest cooperative communication schemes with respect to wireless access technology, cooperation direction, download data volume, etc. In addition, some open problems and future research directions for cooperation schemes in VANETs are proposed.
In rough set theory, the computation of approximations is widely used in knowledge discovery and data mining. As continuous data and noisy data exist extensively in practical applications, it is valuable to compute approximations on continuous data accompanied by noise. The three-way decision model is an important vehicle for handling noise, but it cannot directly process continuous data. To resolve this problem, this paper employs the neighborhood concept to compute approximations of continuous data. Three-way decision rules, with the fundamental notion of a tri-partition of a universal set, are redefined through these approximations, and a general theory of three-way decisions for continuous data with noise is built. The experimental results indicate that the proposed approach is effective and feasible.
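The neighborhood-based tri-partition can be sketched directly: each sample's neighborhood is the set of samples within a radius δ, and the positive, boundary and negative regions correspond to neighborhoods entirely inside, partially inside, and disjoint from the target decision class. The radius `delta` and the toy data below are illustrative assumptions:

```python
import numpy as np

def three_way_regions(X, y, target, delta=0.2):
    # Neighborhood of x = all samples within Euclidean distance delta.
    # POS: neighborhood lies entirely in the target class (lower approx.)
    # BND: neighborhood overlaps the class but also leaves it
    # NEG: neighborhood is disjoint from the class
    X, y = np.asarray(X, float), np.asarray(y)
    pos, bnd, neg = [], [], []
    for i, x in enumerate(X):
        nbr = y[np.linalg.norm(X - x, axis=1) <= delta]
        in_cls = nbr == target
        (pos if in_cls.all() else bnd if in_cls.any() else neg).append(i)
    return pos, bnd, neg

# 1-D toy data: class 1 around 1.0, class 0 around 0.0, a mixed region near 0.5
X = [[0.0], [0.1], [1.0], [1.1], [0.55], [0.5]]
y = [0, 0, 1, 1, 0, 1]
pos, bnd, neg = three_way_regions(X, y, target=1, delta=0.2)
```

Points in the mixed region fall into the boundary, which is exactly where the three-way model defers a decision instead of forcing acceptance or rejection.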
In data mining and machine learning applications, the cost of collecting features has to be taken into account in feature selection. Cost-sensitive feature selection has been widely discussed in the single-label setting; however, there is little theoretical analysis of cost-sensitive multi-label feature selection. To cope with this problem, we propose a novel cost-sensitive multi-label feature selection method based on positive approximation. In the proposed model, feature significance is redefined by combining the positive approximation with the feature cost on the basis of feature cores. Theoretical analysis and experiments on three real multi-label datasets demonstrate the effectiveness and efficiency of the proposed algorithm.
Cancer comprises a number of related yet highly heterogeneous diseases, and correct identification of cancer subtypes is critical for clinical decisions. Advances in sequencing technologies have made it possible to study cancer based on abundant genomic and transcriptomic (-omics) data. Such a data-driven approach is expected to address the limitations of traditional methods for identifying cancer subtypes. We evaluate the suitability of clustering, a data mining tool for studying heterogeneous data when understanding of the subject matter is insufficient, for the identification of cancer subtypes. A number of popular clustering algorithms and their consensus are explored, and we find that cancer subtypes identified by consensus clustering agree well with clinical studies.
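One common way to form a consensus, evidence accumulation, averages co-membership over repeated k-means runs and then groups points whose co-association exceeds a cut. The sketch below is a generic illustration of that idea, with invented parameters, not the paper's exact procedure:

```python
import numpy as np
from sklearn.cluster import KMeans

def consensus_cluster(X, k=2, runs=10, cut=0.5):
    n = len(X)
    coassoc = np.zeros((n, n))
    for r in range(runs):
        # differently seeded k-means runs vote on co-membership
        labels = KMeans(n_clusters=k, n_init=1, random_state=r).fit_predict(X)
        coassoc += labels[:, None] == labels[None, :]
    coassoc /= runs
    # connected components of the thresholded co-association graph
    parent = list(range(n))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    for i in range(n):
        for j in range(i + 1, n):
            if coassoc[i, j] >= cut:
                parent[find(i)] = find(j)
    roots = [find(i) for i in range(n)]
    ids = {r: t for t, r in enumerate(dict.fromkeys(roots))}
    return [ids[r] for r in roots]

# Two well-separated blobs: every run agrees, so the consensus recovers them
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (10, 2)), rng.normal(3, 0.1, (10, 2))])
labels = consensus_cluster(X, k=2)
```

On real -omics data the individual runs disagree, and it is precisely that disagreement, averaged into the co-association matrix, that makes the consensus more stable than any single clustering.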
Adjustable, time-based and non-linear systems are credibly handled by fuzzy logic. Many authors have emphasized the role of fuzzy logic in food quality control across varying applications in the food industry. In the reasoning process, the support offered by fuzzy logic is remarkable, particularly for language-based terminology used by operators and experts; in such cases the system is modeled with fuzzy logic by considering the values of linguistic variables and the vague relationships among them. Uncertain and complex bioprocesses can be studied, their states assessed and their future behavior forecast with the help of smart computer programs that play the role of software sensors. Imprecise and incomplete information can be handled accurately by fuzzy logic systems, and process models can be built precisely by integrating human expert knowledge into them. This paper presents a survey outlining the growth, advancement and current state of the art of fuzzy logic applications in food product quality control. Remarkable future expansions and trends in this domain are also highlighted.