Tony Breyal
2009-Feb-18 11:40 UTC
[R] Training nnet in two ways, trying to understand the performance difference - with (I hope!) commented, minimal, self-contained, reproducible code
Dear all,

Objective: I am trying to learn about neural networks. I want to see if I can train an artificial neural network model to discriminate between spam and nonspam emails.

Problem: I created my own model (Example 1 below) and got an error of about 7.7%. I created the same model using the Rattle package (Example 2 below, based on Rattle's log script) and got a much better error of about 0.073%.

Question 1: I don't understand why the Rattle script gives a better result. I must therefore be doing something wrong in my own script (Example 1) and would appreciate some insight :-)

Question 2: As Rattle gives a much better result, I would be happy to use its R code instead of my own. How can I interpret its predictions as being either 'spam' or 'nonspam'? I have looked at the type = 'class' parameter in ?predict.nnet, but I believe it doesn't apply in this situation.

Below I give commented, minimal, self-contained and reproducible code. (If you ignore the output, it really is very few lines of code, and therefore minimal, I believe.)

## load library
> library(nnet)

## Load in spam dataset from package kernlab
> data(list = "spam", package = "kernlab")
> set.seed(42)
> my.sample <- sample(nrow(spam), 3221)
> spam.train <- spam[my.sample, ]
> spam.test <- spam[-my.sample, ]

## Example 1 - my own code
# train artificial neural network (nn1)
> ( nn1 <- nnet(type ~ ., data = spam.train, size = 3, decay = 0.1, maxit = 1000) )

# predict spam.test dataset on nn1
> ( nn1.pr.test <- predict(nn1, spam.test, type = 'class') )
   [1] "spam"    "spam"    "spam"    "spam"    "nonspam" "spam"    "spam"
   [etc...]

# error matrix
> ( nn1.test.tab <- table(spam.test$type, nn1.pr.test, dnn = c('Actual', 'Predicted')) )
           Predicted
  Actual    nonspam spam
    nonspam     778   43
    spam         63  496

# Calculate overall error percentage ~ 7.68%
> ( nn1.test.perf <- 100 * (nn1.test.tab[2] + nn1.test.tab[3]) / sum(nn1.test.tab) )
[1] 7.68116

## Example 2 - code based on Rattle's log script
# train artificial neural network
> nn2 <- nnet(as.numeric(type) - 1 ~ ., data = spam.train, size = 3, decay = 0.1, maxit = 1000)

# predict spam.test dataset on nn2.
# ?predict.nnet does have the parameter type = 'class', but I can't use that as an option here
> nn2.pr.test <- predict(nn2, spam.test)
               [,1]
3    0.984972396013
4    0.931149225918
10   0.930001139978
13   0.923271300707
21   0.102282256315
[etc...]

# error matrix
> ( nn2.test.tab <- round(100 * table(nn2.pr.test, spam.test$type,
                                      dnn = c("Predicted", "Actual")) / length(nn2.pr.test)) )
                       Actual
  Predicted             nonspam spam
    -0.741896935969825        0    0
    -0.706473834678304        0    0
    -0.595327594045746        0    0
    [etc...]

# Calculate overall error percentage. I am not sure how this line works, to be honest,
# and I think it should be multiplied by 100. I got this from Rattle's log script.
> (function(x){ return((x[1,2] + x[2,1]) / sum(x)) })(table(nn2.pr.test, spam.test$type, dnn = c("Predicted", "Actual")))
[1] 0.0007246377
# I'm guessing the above should be ~0.072%

I know the above probably seems complicated, but any help that can be offered would be much appreciated.

Thank you kindly in advance,
Tony

OS = Windows Vista Ultimate, running R in admin mode

> sessionInfo()
R version 2.8.1 (2008-12-22)
i386-pc-mingw32

locale:
LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.1252;LC_MONETARY=English_United Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] RGtk2_2.12.8     vcd_1.2-2        colorspace_1.0-0 MASS_7.2-45      rattle_2.4.8     nnet_7.2-45

loaded via a namespace (and not attached):
[1] tools_2.8.1
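Regarding Question 2 above: because the response in Example 2 is coded as as.numeric(type) - 1, i.e. 0 for 'nonspam' and 1 for 'spam', one possible way to recover class labels is to threshold the numeric predictions at 0.5 and then tabulate them exactly as in Example 1. The following is only a sketch of that idea (the names nn2.pr.class and nn2.class.tab are made up for illustration; this is not taken from Rattle's log):

# turn nn2's numeric output into class labels, assuming the 0/1 coding above
nn2.pr.class <- ifelse(predict(nn2, spam.test)[, 1] >= 0.5, "spam", "nonspam")

# confusion table on the test set, laid out as in Example 1
nn2.class.tab <- table(spam.test$type, nn2.pr.class, dnn = c("Actual", "Predicted"))

# the off-diagonal cells are the misclassified emails; on a 2 x 2 confusion table
# the anonymous function from Rattle's log, (x[1,2] + x[2,1]) / sum(x), returns the
# same quantity as a proportion, so multiplying by 100 gives a percentage
100 * (sum(nn2.class.tab) - sum(diag(nn2.class.tab))) / sum(nn2.class.tab)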
Tony Breyal
2009-Feb-18 14:32 UTC
[R] Training nnet in two ways, trying to understand the performance difference - with (I hope!) commented, minimal, self-contained, reproducible code
Hmm, further investigation shows that two different fits are used. Why did nnet decide to use different fits when the data are basically the same (a two-level factor in nn1 and a binary 0/1 response in nn2)?

# uses an entropy fit (maximum conditional likelihood)
> nn1
a 57-3-1 network with 178 weights
inputs: make address all num3d our over [etc...]
output(s): type
options were - entropy fitting  decay=0.1

# uses the default least squares fit
> nn2
a 57-3-1 network with 178 weights
inputs: make address all num3d our over [etc...]
output(s): as.numeric(type) - 1
options were - decay=0.1

Again, many thanks for any help.
Tony
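If the aim is to put the two models on the same footing, one option (a sketch only, not run here) is to ask nnet() explicitly for entropy fitting on the 0/1 response via its entropy argument, and then threshold the predictions at 0.5 as before:

# refit the numeric-response model with entropy (maximum conditional likelihood) fitting
nn2e <- nnet(as.numeric(type) - 1 ~ ., data = spam.train,
             size = 3, decay = 0.1, maxit = 1000, entropy = TRUE)

# recover class labels by thresholding at 0.5 and build the confusion table
nn2e.class <- ifelse(predict(nn2e, spam.test)[, 1] >= 0.5, "spam", "nonspam")
table(spam.test$type, nn2e.class, dnn = c("Actual", "Predicted"))

The object names nn2e and nn2e.class are just placeholders for this sketch.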