I am trying to understand "deviance" in the classification tree output
from the tree package.
library(tree)
set.seed(911)
mydf <- data.frame(
name = as.factor(rep(c("A", "B"), c(10, 10))),
x = c(rnorm(10, -1), rnorm(10, 1)),
y = c(rnorm(10, 1), rnorm(10, -1)))
mytree <- tree(name ~ ., data = mydf)
mytree
# node), split, n, deviance, yval, (yprob)
# * denotes terminal node
# 1) root 20 27.730 A ( 0.5 0.5 )
# 2) y < -0.00467067 10 6.502 B ( 0.1 0.9 )
# 4) x < 1.50596 5 5.004 B ( 0.2 0.8 ) *
# 5) x > 1.50596 5 0.000 B ( 0.0 1.0 ) *
# 3) y > -0.00467067 10 6.502 A ( 0.9 0.1 )
# 6) x < -0.578851 5 0.000 A ( 1.0 0.0 ) *
# 7) x > -0.578851 5 5.004 A ( 0.8 0.2 ) *
# Replicate results for node 2
# Probabilities tie out
with(subset(mydf, y < -0.00467067), table(name))
# name
# A B
# 1 9
# Cannot replicate the deviance using -1 * sum(p_mk * log(p_mk))
-1 * (0.1 * log(0.1) + 0.9 * log(0.9))
# [1] 0.325083
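One possible explanation, stated as an assumption rather than something taken from the tree documentation: the printed value may be the count-weighted multinomial deviance, -2 * sum(n_mk * log(p_mk)), and not the per-observation entropy computed above. A minimal sketch using only the class counts shown in the output (node_deviance is a hypothetical helper, not a function from the tree package):

```r
# Hypothetical helper: multinomial deviance of a node from its class counts.
# Assumes deviance = -2 * sum(n_mk * log(p_mk)), with p_mk = n_mk / n_m.
node_deviance <- function(counts) {
  counts <- counts[counts > 0]          # drop empty classes; 0 * log(0) counts as 0
  -2 * sum(counts * log(counts / sum(counts)))
}

round(node_deviance(c(A = 1, B = 9)), 3)    # node 2: 1 A, 9 B
# [1] 6.502
round(node_deviance(c(A = 10, B = 10)), 3)  # root: 10 A, 10 B
# [1] 27.726
```

Under that assumption the node 2 value agrees with the printed 6.502, and the root value agrees with the printed 27.730 to the four significant digits shown.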
1. Is the definition of deviance given anywhere in the documentation?
2. Is it possible to see the code that calculates the deviance?
Thanks,
Naresh