Roberto
2012-Aug-01 22:19 UTC
[R] Different results between lda(mass) and spss discriminant analysis
Hi all,

I obtained a strange result with the lda() function (MASS package) in R on NIR data. I tried both leave-one-out cross-validation (CV) and splitting my data into odd (training) and even (prediction) sets. In all cases the minimum error was close to 0.

Because of this strange result, I tried IBM SPSS, and it gives me a minimum error of around 11%, with and without leave-one-out cross-validation.

Maybe the problem is an error in my script? Could someone check it, please?

library(MASS)  # provides lda()

data <- "data\\raw_data.csv"

r <- read.csv(data, header = TRUE)
sound <- r[1:844,]
unsound <- r[845:2195,]

# odd rows of each class go to training (t), even rows to prediction (p)
even_s <- seq(nrow(sound)) %% 2
even_u <- seq(nrow(unsound)) %% 2
t <- rbind(sound[even_s == 1,], unsound[even_u == 1,])
p <- rbind(sound[even_s != 1,], unsound[even_u != 1,])

# train/prediction split: per-class and overall error
fit <- lda(samples ~., data = t)
ct <- table(p[, 1], predict(fit, p[,-1])$class)
errors <- 1-diag(prop.table(ct, 1))
min.err <- 1-sum(diag(prop.table(ct)))

# leave-one-out cross-validation on the full data set
fit_cv <- lda(samples ~., data = r, CV = TRUE)
ct_cv <- table(r[,1], fit_cv$class)
errors_cv <- 1-diag(prop.table(ct_cv, 1))
min.err_cv <- 1-sum(diag(prop.table(ct_cv)))

Thank you for your help!

Best regards,
Roberto

--
View this message in context: http://r.789695.n4.nabble.com/Different-results-between-lda-mass-and-spss-discriminant-analysis-tp4638763.html
Sent from the R help mailing list archive at Nabble.com.
David L Carlson
2012-Aug-02 04:23 UTC
[R] Different results between lda(mass) and spss discriminant analysis
I don't see anything obviously wrong, but I seem to remember that SPSS uses equal prior probabilities as the default whereas lda() uses "the class proportions for the training set." That might explain the difference.

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352
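
One way to test the prior-probability suggestion is to refit with equal priors via the prior argument of lda() and compare the leave-one-out error with the default fit. The following is only a sketch: the built-in iris data, cut down to two classes of unequal size, stands in for the NIR data, and the names d, err, fit_prop and fit_equal are made up for illustration. Whether the priors actually account for the gap between R and SPSS is not established here.

```r
library(MASS)  # provides lda()

# Illustrative data only (assumption): iris stands in for the NIR data,
# reduced to two classes of unequal size to mimic an unbalanced design.
d <- droplevels(iris[c(51:100, 101:120), ])  # 50 versicolor, 20 virginica

# Default: priors are the class proportions of the training data.
fit_prop <- lda(Species ~ ., data = d, CV = TRUE)

# Equal priors, as SPSS is recalled to use by default.
# Priors follow the order of the factor levels and must sum to 1.
k <- nlevels(d$Species)
fit_equal <- lda(Species ~ ., data = d, prior = rep(1 / k, k), CV = TRUE)

# Overall leave-one-out error under each choice of prior.
err <- function(truth, pred) 1 - mean(truth == pred)
err_prop  <- err(d$Species, fit_prop$class)
err_equal <- err(d$Species, fit_equal$class)
print(c(proportional = err_prop, equal = err_equal))
```

For the script in the original post, the corresponding change would be adding prior = c(0.5, 0.5) to both lda() calls, assuming samples has exactly two levels.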