Arne Schulz
2009-Oct-30 21:33 UTC
[R] Obtaining illogical results from posterior LDA-classification because of "too good" data?
Dear list,

my problem seems to be primarily a statistical one, but maybe there is a misspecification within R (and hopefully a solution).

I have two groups with two measured variables as training data. The groups are completely separated on both variables. I know that this is a very easy situation, but the later analysis will use the same principle (aside from more groups and more possible values). The example should be enough to illustrate my problem:

library(MASS)
matrix <- matrix(rep(c(0,0,0,0,0,1,1,1,1,1),3), ncol = 3, byrow = FALSE)
matrix[,2:3] <- jitter(matrix[,2:3], .001)
lda <- lda(matrix[,2:3],matrix[,1], prior = c(5,5)/10)

I added some jitter to obtain a little within-group variance. The LDA would fail otherwise. When trying to predict the probability of new values, I get some strange results:

testmatrix <- matrix(c(0,0,1,1,0,1,1,0), ncol = 2, byrow = TRUE)

> predict(lda,testmatrix)$posterior
     0 1
[1,] 1 0
[2,] 0 1
[3,] 0 1
[4,] 1 0

Rows 1 and 2 are quite right, although the probability should not be equal to 1, but rather close to 1. But rows 3 and 4 really bother me. The probabilities should be .5 for every value. Additionally, the coefficients seem to be way too high:

> lda[["scaling"]]
          LD1
[1,] 5835.805
[2,] 7000.393

When I insert 1 error per group, the results are quite right (jitter is not needed in this case):

matrix <- matrix(rep(c(0,0,0,0,0,1,1,1,1,1),3), ncol = 3, byrow = FALSE)
matrix[3,2] <- c(1)
matrix[8,3] <- c(0)
lda <- lda(matrix[,2:3],matrix[,1], prior = c(5,5)/10)

> predict(lda,testmatrix)$posterior
                0            1
[1,] 0.9996646499 0.0003353501
[2,] 0.0003353501 0.9996646499
[3,] 0.5000000000 0.5000000000
[4,] 0.5000000000 0.5000000000

My question is now: Is my data "too good" or did I make a mistake in my code?

Best regards,
Arne Schulz
Uwe Ligges
2009-Nov-01 17:58 UTC
[R] Obtaining illogical results from posterior LDA-classification because of "too good" data?
Arne Schulz wrote:
> [...]
> My question is now: Is my data "too good" or did I make a mistake in my
> code?

Your learning data has an intra-group variance close to 0, and hence the pooled variance is also almost 0. As a result, a minimal deviation from the center makes the posterior almost 1 in the corresponding direction. In your second example you are increasing the variance by orders of magnitude.

Best,
Uwe Ligges
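The explanation above can be checked numerically. What follows is only a minimal sketch under stated assumptions, not code from the thread: the names `grp`, `x_jit`, `x_err`, `pooled_var` and `test_pts` are made up for illustration, the seed is arbitrary, and `jitter()` is called with `amount = 0.001`, which is somewhat larger than the jitter in the original post, so that the within-group variance check in `lda()` (default `tol = 1e-4`) is not tripped by chance. It compares the pooled within-group variance, the LD1 coefficients and the posteriors for the two situations described in the question.

```r
library(MASS)  # provides lda()

set.seed(1)  # assumption: any seed works; fixed only for repeatability
grp <- rep(c(0, 1), each = 5)

# Case 1: perfectly separated groups plus a tiny perturbation.
# Assumption: amount = 0.001 is larger than the jitter in the original post.
x_jit <- jitter(cbind(x1 = grp, x2 = grp), amount = 0.001)

# Case 2: one misplaced observation per group, no jitter.
x_err <- cbind(x1 = grp, x2 = grp)
x_err[3, "x1"] <- 1
x_err[8, "x2"] <- 0

# Pooled within-group variance of each predictor
# (hypothetical helper, not part of MASS).
pooled_var <- function(x, g) {
  centered <- x - apply(x, 2, function(col) ave(col, g))
  colSums(centered^2) / (nrow(x) - length(unique(g)))
}

test_pts <- matrix(c(0, 0, 1, 1, 0, 1, 1, 0), ncol = 2, byrow = TRUE)

fit_jit <- lda(x_jit, grp, prior = c(0.5, 0.5))
fit_err <- lda(x_err, grp, prior = c(0.5, 0.5))

print(pooled_var(x_jit, grp))  # very small
print(pooled_var(x_err, grp))  # 0.1 for both predictors

print(fit_jit$scaling)  # large coefficients
print(fit_err$scaling)  # moderate coefficients

post_jit <- predict(fit_jit, test_pts)$posterior
post_err <- predict(fit_err, test_pts)$posterior
print(post_jit)  # rows 3 and 4 are pushed to one group or the other
print(post_err)  # rows 3 and 4 sit at 0.5
```

With equal priors, the two-class log posterior odds are (x - (m0 + m1)/2)' S^-1 (m1 - m0), where m0 and m1 are the group means and S is the pooled covariance. When S is nearly zero, S^-1 is enormous, so the small random asymmetry that the jitter introduces between the two predictors is enough to push the points (0,1) and (1,0) all the way to one group. In the second case S is 0.1 on the diagonal and symmetric in both predictors, so those two points stay exactly on the boundary.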