Semi-supervised support vector machine (S3VM) algorithms are effective when only a few labeled instances and a large number of unlabeled instances are available. Existing S3VM algorithms, however, rely on many types of optimization strategies because they treat all the training data as parameters in an iterative optimization, which makes it difficult to process large-scale data efficiently. Simple random sampling is an effective way to make modeling efficient at the data preprocessing stage, but it requires the sample size to be fixed in advance, which is hard to do well given sampling randomness and differences between samples. To fully characterize the original unlabeled data and ensure the robustness of the model, we propose an adaptive sampling method that trains the model on the labeled set and a sampled unlabeled set. Fixed-size batches of unlabeled instances are repeatedly sampled from the original unlabeled set until a statistic computed on the obtained sample meets a stopping condition; both the statistic and the stopping condition are derived from density estimation. This method removes the need to determine the sample size subjectively in advance, and the robustness of the proposed algorithm is proved using probably approximately correct (PAC) learning theory.
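The sampling loop described above can be illustrated with a minimal sketch. The abstract does not specify the statistic or the stopping condition, so the code below substitutes a stand-in chosen purely for illustration: a normalized histogram as the density estimate, and the L1 change between successive estimates as the statistic. The names `histogram_density` and `adaptive_sample`, the one-dimensional data, and the parameters `batch_size`, `bins`, and `tol` are all hypothetical and are not taken from the paper.

```python
import random


def histogram_density(values, lo, hi, bins):
    """Normalized histogram: fraction of values in each equal-width bin."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp so that v == hi falls into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    n = len(values)
    return [c / n for c in counts]


def adaptive_sample(unlabeled, batch_size=50, bins=10, tol=0.05, seed=0):
    """Draw fixed-size batches without replacement until the density
    estimate of the accumulated sample stops changing.

    ASSUMPTION: the stopping statistic (L1 change between successive
    histogram estimates) is a stand-in, not the paper's statistic.
    """
    rng = random.Random(seed)
    pool = list(unlabeled)
    rng.shuffle(pool)  # taking consecutive slices now equals random sampling
    lo, hi = min(pool), max(pool)
    if lo == hi:
        # Degenerate data: every value is identical, one batch suffices.
        return pool[:batch_size]

    sample, prev = [], None
    while pool:
        batch, pool = pool[:batch_size], pool[batch_size:]
        sample.extend(batch)
        cur = histogram_density(sample, lo, hi, bins)
        if prev is not None:
            change = sum(abs(a - b) for a, b in zip(cur, prev))
            if change < tol:  # stand-in stopping condition
                break
        prev = cur
    return sample


if __name__ == "__main__":
    gen = random.Random(1)
    data = [gen.gauss(0.0, 1.0) for _ in range(5000)]
    subset = adaptive_sample(data)
    print(len(subset), "of", len(data), "unlabeled instances selected")
```

In this sketch the sample size is an outcome of the loop rather than an input: only the batch size and the tolerance are chosen in advance. The selected subset would then be combined with the labeled set to train the S3VM, a step that is omitted here.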