Dear r-help list,

I have a question about how the "improve" value of a split is computed. The following is an extract from the summary output of a tree:

Node number 1: 600 observations,    complexity param=0.007272727
  predicted class=0  expected loss=0.1666667
    class counts:   500   100
   probabilities: 0.833 0.167
  left son=2 (211 obs) right son=3 (389 obs)
  Primary splits:
      x4  < 0.5 to the left,  improve=1.2284910, (0 missing)
      x1  < 1.5 to the left,  improve=0.9729730, (0 missing)
      x10 < 1.5 to the right, improve=0.8371014, (0 missing)

Node number 2: 211 observations,    complexity param=0.006666667
  predicted class=0  expected loss=0.1232227
    class counts:   185    26
   probabilities: 0.877 0.123
  left son=4 (123 obs) right son=5 (88 obs)
  Primary splits:
      x6  < 0.5 to the right, improve=1.0366150, (0 missing)
      x1  < 1.5 to the left,  improve=0.7918369, (0 missing)
      x11 < 0.5 to the right, improve=0.5032110, (0 missing)

Node number 3: 389 observations,    complexity param=0.007272727
  predicted class=0  expected loss=0.1902314
    class counts:   315    74
   probabilities: 0.810 0.190
  left son=6 (209 obs) right son=7 (180 obs)
  Primary splits:
      x7  < 0.5 to the right, improve=1.2448010, (0 missing)
      x10 < 1.5 to the right, improve=1.2076890, (0 missing)
      x9  < 1.5 to the right, improve=0.8054428, (0 missing)

I used the default values for the "parms" argument. So the loss matrix is the default one (zeros on the diagonal, ones elsewhere), the priors are estimated from the class counts as (5/6, 1/6), and the splitting criterion is "gini".

Why is the improve of the first split 1.228? My calculation:

  Impurity at the root node: 1/6 * 5/6 = 5/36
  Node 2: 185/211 * 26/211, weight 211/600
  Node 3: 315/389 * 74/389, weight 389/600

  improve = 5/36 - 211/600 * 185/211 * 26/211 - 389/600 * 315/389 * 74/389
          = 0.001023743

Is there some normalisation involved?

If I use matrix(c(0,3,3,0),nrow=2) as the loss matrix, I get the same values as above. Shouldn't I simply get three times the improve of the default case? Or is there again some normalisation?

If I use matrix(c(0,1,5,0),nrow=2) as the loss matrix, I get different values.
Shouldn't I simply get the same improve as in the case matrix(c(0,3,3,0),nrow=2), because of the symmetrization of the loss matrix in the case of two classes and the use of the Gini criterion?

Thank you very much for your help!

Henri
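
For anyone who wants to check the arithmetic, the hand calculation for the first split can be reproduced with a minimal R sketch like the one below. The variable names and the helper pq() are made up for illustration and are not part of rpart. The last line is only a numerical observation: scaling the hand-computed value by 2 times the number of observations in the node gives a number very close to the reported improve. Whether rpart really uses this scaling is an assumption here and has not been checked against its source.

```r
# Class counts copied from the summary output (class 0, class 1)
n_root  <- c(500, 100)   # node 1
n_left  <- c(185, 26)    # node 2 (left son)
n_right <- c(315, 74)    # node 3 (right son)

# Hypothetical helper: p * (1 - p) for a two-class node,
# i.e. the impurity measure used in the hand calculation
pq <- function(counts) prod(counts / sum(counts))

n <- sum(n_root)         # 600 observations at the root

# Root impurity minus the weighted impurities of the two sons
delta <- pq(n_root) -
  sum(n_left)  / n * pq(n_left) -
  sum(n_right) / n * pq(n_right)

delta           # about 0.001023743, the hand-computed value
2 * n * delta   # about 1.228491 (assumed scaling, not confirmed)
```

The factor 2 would correspond to writing the two-class Gini index as the sum of p_i * (1 - p_i) over both classes, and the factor n to reporting the decrease without dividing by the node size.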