murphyk@cs.berkeley.edu
2001-Jul-13 22:10 UTC
[Rd] ordered factors in tree package - bug? (PR#1025)
I am new to R, and didn't know which list to send this to, since it is a
bug report about a package, not about core R...
I have created a regression tree using 4 predictors: 3 are unordered
(binary) predictors, and the last is a date (integer), which I declare
to be an ordered factor. However, the tree treats the date as if it were
un-ordered, splitting into non-consecutive subsets.
My code
library(tree)
dat <- read.table("/home/cs/murphyk/R/Eugene/102.dat", header=TRUE)
dat$machine <- factor(dat$machine)
dat$TIM <- factor(dat$TIM)
dat$lid <- factor(dat$lid)
#dat$date <- factor(dat$date, ordered=TRUE)
dat$date <- ordered(dat$date)
tr <- tree(TRES ~ ., dat)
produces
1) root 100 0.149300 0.2345
  2) TIM: 111 47 0.020650 0.2011
    4) date: 9,11 11 0.002673 0.1845 *
    5) date: 6,7,8,10,13,14,15,16,17,18,19 36 0.014060 0.2061 *
  3) TIM: 222 53 0.029490 0.2642
    6) date: 6,7,11,12,14,15,18,19 28 0.010670 0.2539
      12) lid: A 6 0.001350 0.2350 *
      13) lid: B 22 0.006582 0.2591 *
    7) date: 8,9,10,13,16,17 25 0.012620 0.2756
      14) machine: 101 14 0.005750 0.2850 *
      15) machine: 102 11 0.004055 0.2636 *
and yet
> dat$date
[1] 6 6 6 6 6 7 7 7 7 7 7 7 8 8 8 9 9 9 9 9 9 9 9 9 10
[26] 10 10 10 10 10 11 11 11 11 11 11 11 11 12 12 12 13 13 13 13 13 13 13 13 13
[51] 13 13 14 14 14 14 14 14 14 15 15 15 15 15 15 15 15 15 15 16 16 16 16 16 16
[76] 16 16 17 17 17 17 17 17 17 17 17 17 17 17 17 18 18 18 18 18 19 19 19 19 19
Levels: 6 < 7 < 8 < 9 < 10 < 11 < 12 < 13 < 14 < 15 < 16 < 17 < 18 < 19
> is.ordered(dat$date)
[1] TRUE
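A possible workaround, sketched under the assumption that tree splits numeric predictors by threshold (X < a versus X > a), would be to leave the date as a number instead of an ordered factor. The vector date_ord below is a made-up stand-in for dat$date, not data from 102.dat.

```r
# Hypothetical stand-in for dat$date; the real values come from 102.dat.
date_ord <- ordered(c(6, 7, 10, 19, 9))

# as.numeric() on a factor returns the internal level codes, not the labels,
# so convert through the character labels to recover the original numbers.
date_codes <- as.numeric(date_ord)
date_num   <- as.numeric(as.character(date_ord))

print(date_codes)
print(date_num)

# With the real data one would then refit, for example:
# dat$date <- as.numeric(as.character(dat$date))
# tr <- tree(TRES ~ ., dat)
```

This does not address whether the ordered-factor behaviour is a bug; it only avoids the factor code path.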
Also, how do I deal with dates of the form dd/mm/yy, instead of just
integers? (In the above file, I used perl to extract the day, since I
knew month and year were constant.)
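For the dd/mm/yy question, one approach might be to parse the strings into dates and then turn them into a day count, which can be used as a numeric predictor. This is only a sketch: it assumes an R version that provides as.Date(), and the sample strings in raw are invented for illustration, not taken from 102.dat.

```r
# Hypothetical dd/mm/yy strings; not values from 102.dat.
raw <- c("06/07/01", "13/07/01", "19/07/01")

# %d = day, %m = month, %y = two-digit year.
d <- as.Date(raw, format = "%d/%m/%y")

# Days elapsed since the earliest date, as a plain numeric vector.
day_num <- as.numeric(d - min(d))

print(format(d))
print(day_num)
```

A day count keeps the spacing between dates, which integer day-of-month values lose as soon as the data span more than one month.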
I have attached the text file 102.dat, so you can easily reproduce the
above bug. I am using R 1.2.3 on linux. For future reference, is it
considered bad form to send attachments?
Kevin
