You only have 43 cases. After one split, the groups are too small
to split again with the default settings. See ?rpart.control.
On Fri, 25 May 2007, Silvia Lomascolo wrote:
>
> I work on Windows, R version 2.4.1. I'm very new with R!
>
> I am trying to build a classification tree using rpart but, although the
> matrix has 108 variables, the program builds a tree with only one split
> using one variable! I know it is probable that only one variable is
> informative, but I think it's unlikely. I was wondering if someone can
help
> me identify if I'm doing something wrong because I can't see it,
nor could I
> find it in the help or in this forum.
>
> I want to see whether I can predict disperser type (5 categories) of a
> species given the volatile compounds that the fruits emit (108 volatiles)
> I am writing:
>
>> dispvol.x<- read.table ('C:\\Documents and
> Settings\\silvia\\...\\volatile_disperser_matrix.txt', header=T)
>> dispvol.df<- as.data.frame (dispvol.x)
>> attach (dispvol.df) #I think I need to do this so the variables are
> identified when I write the regression equation
>> dispvol.ctree <- rpart (disperser~ P3.70 +P4.29 +P5.05 +... +P30.99
+P32.25
> +TotArea, >data= dispvol.df, method='class')
>
> and I get the following output:
>
> n= 28
>
> node), split, n, loss, yval, (yprob)
> * denotes terminal node
>
> 1) root 28 15 non (0.036 0.32 0.071 0.11 0.46)
> 2) P10.01>=1.185 10 4 bat (0.1 0.6 0.2 0 0.1) *
> 3) P10.01< 1.185 18 6 non (0 0.17 0 0.17 0.67) *
>
> There is nothing special about P10.01 that I can see in my data and I
don't
> know why it chooses that variables and stops there!
>
> My matrix looks something like this (except, with a lot more variables)
>
> disperser P3.70 P4.29 P6.45 P6.55 P10.01 P10.15 P10.18 TotArea
> ban 0.00 0.00 1.34 0.00 1.49 0.00 0.00 2.83
> non 0.00 0.00 0.00 152.80 0.00 14.31 0.00 167.11
> bat 0.00 0.00 0.00 131.56 0.65 0.00 0.00 132.21
> bat 0.00 0.00 5.05 0.00 13.01 6.85 0.00 24.90
> non 0.00 0.00 72.65 103.26 4.10 0.00 0.00 180.02
> non 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> bat 1.23 0.00 0.48 0.89 0.25 0.00 0.00 2.85
> bat 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> non 0.00 0.00 0.00 0.00 1.06 0.00 0.00 1.06
> bat 0.00 0.00 0.00 0.00 28.69 0.00 21.33 50.02
> mix 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> non 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> non 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> non 0.00 0.00 0.00 0.00 1.15 0.00 0.00 1.15
> non 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> non 0.00 0.82 0.00 1.65 0.00 0.00 0.00 2.47
> bat 0.00 0.00 133.24 0.00 3.13 0.00 0.00 136.37
> bir 0.00 0.00 11.08 3.16 1.79 2.09 0.48 18.61
> non 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> mix 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> bat 0.00 0.00 0.00 0.00 1.31 0.00 0.00 1.31
> non 0.00 0.00 0.00 0.00 0.00 0.00 1.23 1.23
> bat 0.00 0.00 1.81 0.00 2.84 0.00 0.00 4.65
> non 0.00 0.00 1.18 0.00 0.73 0.00 0.00 1.91
> bir 0.00 0.00 0.00 0.00 1.40 0.00 0.00 1.40
> bat 0.00 0.00 8.16 1.50 1.22 0.00 0.00 10.88
> mix 0.00 0.55 0.00 0.00 0.00 0.00 0.00 0.55
> non 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
>
> Thanks! Silvia.
>
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595