In this paper we study a new active learning method using statistical asymptotic theory. Active learning is a framework in which the learner selects the input distribution presented to the system, so it is important to determine which input distribution should be selected. When estimating the system we often use a parametric model, but in general the model does not contain the true system. In that case the maximum likelihood estimator (MLE) is not consistent under active learning, so the system cannot be estimated appropriately. To recover consistency we propose using the maximum weighted likelihood estimator (MWLE). We construct an active learning algorithm based on the MWLE, evaluate its generalization error, and verify the validity of the result by computer simulation.
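The abstract does not spell out the form of the MWLE. The sketch below is not the authors' algorithm but a common construction of a weighted likelihood for a misspecified model: each log-likelihood term is weighted by the density ratio q(x)/p(x) between the test input density q and the actively chosen sampling density p. The sine system, the linear Gaussian model, and all densities and constants are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# True system: y = sin(x) + noise.  Fitted model: misspecified linear y = a*x + b.
def true_system(x):
    return np.sin(x) + 0.1 * rng.standard_normal(x.shape)

# Active learning: inputs are drawn from a chosen density p(x),
# while generalization error is measured under the test density q(x).
n = 2000
x = rng.uniform(-2.0, 2.0, n)           # sampling density p: uniform on [-2, 2]
y = true_system(x)

p_density = 1.0 / 4.0                                  # p(x) on [-2, 2]
q_density = np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)   # test density q: N(0, 1)
w = q_density / p_density                              # importance weights q(x)/p(x)

# MWLE for the Gaussian linear model reduces to weighted least squares:
# maximize sum_i w_i * log f(y_i | x_i; a, b).
X = np.column_stack([x, np.ones_like(x)])
a_w, b_w = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))

# Ordinary (unweighted) MLE for comparison.
a_m, b_m = np.linalg.solve(X.T @ X, X.T @ y)

# The weighted fit targets the best linear approximation under q,
# not under the sampling density p; the two slopes differ.
print("MWLE slope:", a_w)
print("MLE  slope:", a_m)
```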
Key words: Optimal experimental design, Kullback-Leibler divergence, Generalization error, Statistical asymptotic theory

Four nonparametric methods are proposed for testing the latent vector in principal component analysis. They are constructed by computing the principal component scores and applying the Moses rank-like test for equality of variances. The accuracy and power of the proposed methods and of Anderson's statistic were examined by Monte Carlo simulation under non-normal distributions such as the lognormal distribution. The results show that Anderson's statistic is not applicable under non-normality, whereas the nonparametric methods proposed in this paper are accurate and practical.
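The Moses rank-like test mentioned above can be sketched as follows: each sample is randomly split into small subsets, a dispersion measure is computed within each subset, and a rank-sum test is applied to the two sets of dispersion values. This is a generic sketch applied to two samples standing in for principal component scores; the subset size k, the sample sizes, and the lognormal example are assumptions, not the authors' exact construction.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def moses_test(x, y, k=3, rng=None):
    """Moses rank-like test for equal dispersion of two samples.

    Each sample is randomly partitioned into subsets of size k; the sum
    of squared deviations about the subset mean is computed for each
    subset, and a two-sided rank-sum test is applied to the two sets of
    dispersion values."""
    rng = np.random.default_rng(rng)

    def dispersions(s):
        s = rng.permutation(s)
        m = len(s) // k
        groups = s[: m * k].reshape(m, k)
        return ((groups - groups.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)

    return mannwhitneyu(dispersions(x), dispersions(y),
                        alternative="two-sided").pvalue

# Hypothetical example: scores from a lognormal population, where the
# second group has twice the spread (four times the variance).
rng = np.random.default_rng(1)
scores_a = rng.lognormal(sigma=0.5, size=120)
scores_b = 2.0 * rng.lognormal(sigma=0.5, size=120)
p = moses_test(scores_a, scores_b, k=3, rng=2)
print("p-value:", p)
```

Because the test ranks subset dispersions rather than assuming a normal sampling distribution for the variance, it stays valid under non-normal parent distributions such as the lognormal.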
Key words: Moses rank-like test, Nonnormality, Simulation

Discriminant analysis is as important a method of data analysis as regression analysis. For regression, the least squares algorithm is the most popular, and LAV (least absolute value) regression using linear programming is also well known, as is the algorithm for Fisher's linear discriminant function. On the other hand, algorithms for the linear discriminant function based on mathematical programming have not been offered until now. In this paper, we introduce two optimal linear discriminant functions, one using linear programming (LP) and one using integer programming (IP). To evaluate these new methods, we apply them, together with Fisher's linear discriminant function and the quadratic discriminant function, to two data sets: Fisher's iris data and a medical data set concerning CPD (cephalopelvic disproportion) collected by Dr. Suzumura et al. From Fisher's iris data, two groups, virginica and versicolor, are used, with 50 cases each. The CPD data consist of measurements of 19 variables for each of two groups, natural delivery (180 cases) and Caesarean operation (60 cases), and exhibit multicollinearity. We obtain two results from the CPD data. First, the number of misclassifications by the IP linear discriminant function decreases as the number of explanatory variables increases, whereas the results of the other methods are very poor. Second, we compare the full set of 19 variables with the 16 variables remaining after removing the 3 variables involved in the multicollinearity. From 1 variable up to 16 variables, the misclassification counts for the 16 variables under backward selection are smaller than those for the 16 variables under forward selection and for the 19 variables under either backward or forward selection. On the other hand, beyond 6 variables the R-squared values of the latter selection sequences are superior to those of the former sequence.
These results suggest that the IP model is useful when the two groups are not normally distributed or when their variance-covariance matrices are not equal, and that avoiding multicollinearity leads to good results.
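The paper's exact LP and IP formulations are not reproduced in the abstract. The following is a common LP formulation of a linear discriminant function, offered as a sketch: it minimizes the total margin violation over a separating hyperplane (the IP variant would instead minimize the number of misclassifications using binary indicator variables). The two-cluster data set is synthetic.

```python
import numpy as np
from scipy.optimize import linprog

def lp_discriminant(X, y):
    """LP linear discriminant function.

    Solves  min sum_i d_i  s.t.  y_i (w.x_i + b) >= 1 - d_i,  d_i >= 0,
    a standard linear-programming formulation of a separating
    hyperplane.  Labels y must be +1 / -1."""
    n, p = X.shape
    # Decision vector layout: [w (p entries), b (1 entry), d (n entries)].
    c = np.concatenate([np.zeros(p + 1), np.ones(n)])
    # Margin constraints rewritten as A_ub @ z <= b_ub.
    A_ub = np.hstack([-y[:, None] * X, -y[:, None], -np.eye(n)])
    b_ub = -np.ones(n)
    bounds = [(None, None)] * (p + 1) + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:p], res.x[p]

# Two well-separated Gaussian clusters (synthetic illustration).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
y = np.concatenate([np.ones(50), -np.ones(50)])
w, b = lp_discriminant(X, y)
errors = int(np.sum(np.sign(X @ w + b) != y))
print("misclassifications:", errors)
```

Unlike Fisher's linear discriminant function, this formulation makes no normality or equal-covariance assumption, which is consistent with the situations where the abstract reports the mathematical-programming discriminants performing well.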
Key words: Fisher's linear discriminant function, Integer programming, IP linear discriminant function, LP linear discriminant function, Multicollinearity

The fundamental and generalized theory of rough sets developed by Pawlak, and its relationship to modal logic, are outlined. An application of rough sets and modal logic to association rules, one of the important areas in data mining, is then sketched by example. Finally, the paper concludes by discussing the significance of rough set theory.
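The core construction of Pawlak's rough set theory can be sketched in a few lines: objects indiscernible on the chosen attributes form equivalence classes, and a target set is approximated from below by the classes wholly inside it and from above by the classes meeting it. The decision table below is a hypothetical example, not taken from the paper.

```python
from collections import defaultdict

def approximations(universe, attrs, target):
    """Pawlak's lower and upper approximations of `target` with respect
    to the indiscernibility relation induced by the attribute map
    `attrs` (object -> tuple of attribute values)."""
    classes = defaultdict(set)
    for obj in universe:
        classes[attrs[obj]].add(obj)
    lower, upper = set(), set()
    for cls in classes.values():
        if cls <= target:       # class lies certainly inside the target
            lower |= cls
        if cls & target:        # class possibly belongs to the target
            upper |= cls
    return lower, upper

# Hypothetical decision table: patients described by (fever, cough).
attrs = {
    "p1": ("high", "yes"),
    "p2": ("high", "yes"),   # indiscernible from p1
    "p3": ("low", "no"),
    "p4": ("low", "no"),
    "p5": ("high", "no"),
}
flu = {"p1", "p5"}           # objects carrying the decision "flu"
lower, upper = approximations(attrs.keys(), attrs, flu)
# lower = {"p5"}: only p5's class lies wholly inside `flu`
# upper = {"p1", "p2", "p5"}: {p1, p2} forms the boundary region
```

The boundary region `upper - lower` contains exactly the objects about which the attributes cannot decide, which is the situation the modal-logic reading (necessity for the lower approximation, possibility for the upper) formalizes.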
Key words: Classification, Approximation, Kripke semantics, Association rules