2011 Jun 21
How does rpart compute "improve" for split="information"? (It seems to differ from the "gini" case.)
...impurity is still a mystery to me. Could you help explain it? Below is some R code showing how the gini is computed (and how the information case is not as clear):

    # creating data
    set.seed(1324)
    y <- sample(c(0,1), 20, T)
    x <- y
    x[1:5] <- 0

    # manually making the first split
    obs_L <- y[x<.5]
    obs_R <- y[x>.5]
    n_L <- sum(x<.5)
    n_R <- sum(x>.5)
    n <- length(x)

    calc.impurity <- function(func = gini) {
      impurity_root <- func(prop.table(table(y)))
      impurity_L <- func(prop.table(table(obs_L)))
      impurity_R <- func(prop.table(table(obs_R)))
      imp &l...
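Since the snippet above is cut off before the impurity functions appear, here is a minimal, self-contained sketch of the textbook definitions it seems to be building toward. Everything in it is an assumption for illustration: the names `gini`, `entropy` and `impurity_decrease` are hypothetical helpers, the natural log is an arbitrary choice of base, and the toy data is hand-made rather than the poster's. It is not claimed to reproduce the "improve" value that rpart reports for either splitting method.

```r
# Textbook impurity functions on a vector of class proportions p.
# (Hypothetical helpers; not taken from rpart's source.)
gini <- function(p) sum(p * (1 - p))

entropy <- function(p) {
  p <- p[p > 0]        # treat 0 * log(0) as 0
  -sum(p * log(p))     # natural log; the base is an assumption
}

# Weighted impurity decrease for a binary split of the labels y,
# where `left` is a logical vector marking the left child.
impurity_decrease <- function(y, left, func = gini) {
  props <- function(v) as.numeric(prop.table(table(v)))
  n   <- length(y)
  n_L <- sum(left)
  n_R <- n - n_L
  func(props(y)) -
    (n_L / n) * func(props(y[left])) -
    (n_R / n) * func(props(y[!left]))
}

# Small hand-made example (not the data from the post).
y    <- c(0, 0, 0, 1, 1, 1, 1, 1)
x    <- c(0, 0, 0, 0, 1, 1, 1, 1)
left <- x < 0.5

impurity_decrease(y, left, gini)
impurity_decrease(y, left, entropy)
```

Passing the impurity function as an argument mirrors the `func = gini` parameter in the poster's `calc.impurity`, so the same split can be scored under both measures and the two results compared with whatever rpart prints.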