Tony Breyal
2009-Feb-18 11:40 UTC
[R] Training nnet in two ways, trying to understand the performance difference - with (I hope!) commented, minimal, self-contained, reproducible code
Dear all,

Objective: I am trying to learn about neural networks. I want to see if I can train an artificial neural network model to discriminate between spam and nonspam emails.

Problem: I created my own model (Example 1 below) and got an error of about 7.7%. I created the same model using the Rattle package (Example 2 below, based on Rattle's log script) and got a much better error of about 0.073%.

Question 1: I don't understand why the Rattle script gives a better result. I must therefore be doing something wrong in my own script (Example 1) and would appreciate some insight :-)

Question 2: As Rattle gives a much better result, I would be happy to use its R code instead of my own. How can I interpret its predictions as being either 'spam' or 'nonspam'? I have looked at the type = 'class' parameter in ?predict.nnet, but I believe it doesn't apply in this situation.

Below I give commented, minimal, self-contained and reproducible code. (If you ignore the output, it really is very few lines of code, and therefore minimal, I believe.)

## load library
> library(nnet)

## Load in spam dataset from package kernlab
> data(list = "spam", package = "kernlab")
> set.seed(42)
> my.sample <- sample(nrow(spam), 3221)
> spam.train <- spam[my.sample, ]
> spam.test <- spam[-my.sample, ]

## Example 1 - my own code
# train artificial neural network (nn1)
> ( nn1 <- nnet(type ~ ., data = spam.train, size = 3, decay = 0.1, maxit = 1000) )

# predict spam.test dataset on nn1
> ( nn1.pr.test <- predict(nn1, spam.test, type = 'class') )
   [1] "spam"    "spam"    "spam"    "spam"    "nonspam" "spam"    "spam"
   [etc...]

# error matrix
> ( nn1.test.tab <- table(spam.test$type, nn1.pr.test, dnn = c('Actual', 'Predicted')) )
           Predicted
  Actual    nonspam spam
    nonspam     778   43
    spam         63  496

# Calculate overall error percentage ~ 7.68%
> ( nn1.test.perf <- 100 * (nn1.test.tab[2] + nn1.test.tab[3]) / sum(nn1.test.tab) )
[1] 7.68116

## Example 2 - code based on Rattle's log script
# train artificial neural network
> nn2 <- nnet(as.numeric(type) - 1 ~ ., data = spam.train, size = 3, decay = 0.1, maxit = 1000)

# predict spam.test dataset on nn2.
# ?predict.nnet does have the parameter type = 'class', but I can't use that as an option here
> nn2.pr.test <- predict(nn2, spam.test)
               [,1]
3    0.984972396013
4    0.931149225918
10   0.930001139978
13   0.923271300707
21   0.102282256315
[etc...]

# error matrix
> ( nn2.test.tab <- round(100 * table(nn2.pr.test, spam.test$type,
                                      dnn = c("Predicted", "Actual")) / length(nn2.pr.test)) )
                       Actual
  Predicted             nonspam spam
    -0.741896935969825        0    0
    -0.706473834678304        0    0
    -0.595327594045746        0    0
    [etc...]

# Calculate overall error percentage. I am not sure how this line works, to be honest,
# and I think it should be multiplied by 100. I got this from Rattle's log script.
> (function(x){ return((x[1,2] + x[2,1]) / sum(x)) })(table(nn2.pr.test, spam.test$type, dnn = c("Predicted", "Actual")))
[1] 0.0007246377
# I'm guessing the above should be ~0.072%

I know the above probably seems complicated, but any help that can be offered would be much appreciated.

Thank you kindly in advance,
Tony

OS = Windows Vista Ultimate, running R in admin mode

> sessionInfo()
R version 2.8.1 (2008-12-22)
i386-pc-mingw32

locale:
LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.1252;LC_MONETARY=English_United Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] RGtk2_2.12.8     vcd_1.2-2        colorspace_1.0-0 MASS_7.2-45      rattle_2.4.8     nnet_7.2-45

loaded via a namespace (and not attached):
[1] tools_2.8.1
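Regarding Question 2 above: because the response in Example 2 is coded as as.numeric(type) - 1, i.e. 0 for 'nonspam' and 1 for 'spam', one possible way to recover class labels is to threshold the numeric predictions at 0.5 and then tabulate them exactly as in Example 1. The following is only a sketch of that idea (the names nn2.pr.class and nn2.class.tab are made up for illustration; this is not taken from Rattle's log):

# turn nn2's numeric output into class labels, assuming the 0/1 coding above
nn2.pr.class <- ifelse(predict(nn2, spam.test)[, 1] >= 0.5, "spam", "nonspam")

# confusion table on the test set, laid out as in Example 1
nn2.class.tab <- table(spam.test$type, nn2.pr.class, dnn = c("Actual", "Predicted"))

# the off-diagonal cells are the misclassified emails; on a 2 x 2 confusion table
# the anonymous function from Rattle's log, (x[1,2] + x[2,1]) / sum(x), returns the
# same quantity as a proportion, so multiplying by 100 gives a percentage
100 * (sum(nn2.class.tab) - sum(diag(nn2.class.tab))) / sum(nn2.class.tab)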
Tony Breyal
2009-Feb-18 14:32 UTC
[R] Training nnet in two ways, trying to understand the performance difference - with (I hope!) commented, minimal, self-contained, reproducible code
Hmm, further investigation shows that two different fits are used. Why did nnet decide to use different fits when the data are basically the same (a two-level factor in nn1 and a binary 0/1 response in nn2)?

# uses an entropy fit (maximum conditional likelihood)
> nn1
a 57-3-1 network with 178 weights
inputs: make address all num3d our over [etc...]
output(s): type
options were - entropy fitting  decay=0.1

# uses the default least squares fit
> nn2
a 57-3-1 network with 178 weights
inputs: make address all num3d our over [etc...]
output(s): as.numeric(type) - 1
options were - decay=0.1

Again, many thanks for any help.
Tony
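If the aim is to put the two models on the same footing, one option (a sketch only, not run here) is to ask nnet() explicitly for entropy fitting on the 0/1 response via its entropy argument, and then threshold the predictions at 0.5 as before:

# refit the numeric-response model with entropy (maximum conditional likelihood) fitting
nn2e <- nnet(as.numeric(type) - 1 ~ ., data = spam.train,
             size = 3, decay = 0.1, maxit = 1000, entropy = TRUE)

# recover class labels by thresholding at 0.5 and build the confusion table
nn2e.class <- ifelse(predict(nn2e, spam.test)[, 1] >= 0.5, "spam", "nonspam")
table(spam.test$type, nn2e.class, dnn = c("Actual", "Predicted"))

The object names nn2e and nn2e.class are just placeholders for this sketch.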