thr3ads.net - R help - [R] using xval in mvpart to specify cross validation groups [Mar 2010]

Andrew Dolman

2010-Mar-12 12:15 UTC

[R] using xval in mvpart to specify cross validation groups

Dear R's

I'm trying to use specific rather than random cross-validation groups
in mvpart.

The man page says:
xval Number of cross-validations or vector defining cross-validation groups.


And I found this reply to the list by Terry Therneau from 2006

The rpart function allows one to give the cross-validation groups explicitly.
So if the number of observations was 10, you could use
   > rpart( y ~ x1 + x2, data=mydata, xval=c(1,1,2,2,3,3,1,3,2,1))
which causes observations 1,2,7, and 10 to be left out of the first xval
sample, 3,4, and 9 out of the second, etc.

        Terry Therneau


I can't see how this string of values, c(1,1,2,2,3,3,1,3,2,1), codes
for observations 1,2,7,10 being left out of the 1st and so on.

Can anyone fill me in please?

Thanks,

andydolman at gmail.com

Andrew Dolman

2010-Mar-12 22:05 UTC

[R] using xval in mvpart to specify cross validation groups

Thank you Dennis, I've got the idea now.

However, a followup question to make sure I'm not wasting my time.

If I specify the precise CV folds to use, should I not get the same
tree every time?


e.g. here I have a hypothetical time sequence, observed with error,
from 3 sites 's'

If I specify to leave out 1 site each time in a 3-fold CV (leaving
aside that 3-fold cv might not be a good idea)

Should I not get the same tree each time?


library(mvpart)
library(lattice)

y <- rep(sin(seq(0.1,6, 0.1)),3)
y1 <- y+rnorm(length(y), sd=0.5)
x <- rep(1:(length(y)/3),3)
s <- rep(1:3, each=(length(y)/3))

dat <- data.frame(x,y1,s)

xyplot(y1~x|s, data=dat)


(mvpart(y1~x, data=dat, xv="1se", xval=s))
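
A minimal sketch of the reproducibility question, with two assumptions stated up front. First, it swaps in rpart for mvpart, on the assumption that mvpart treats a vector `xval` the same way rpart does. Second, the seed value is arbitrary and not part of the original example. Note that the code above redraws `y1` with `rnorm()` on every run, so the data themselves change between runs unless a seed is set.

```r
library(rpart)   # stand-in for mvpart (assumption: same handling of a vector xval)

set.seed(1)      # arbitrary seed, only so the simulated noise is repeatable

y   <- rep(sin(seq(0.1, 6, 0.1)), 3)
y1  <- y + rnorm(length(y), sd = 0.5)
x   <- rep(1:(length(y) / 3), 3)
s   <- rep(1:3, each = length(y) / 3)   # one CV group per site
dat <- data.frame(x, y1, s)

# Same data, same explicit folds: fit twice and compare the CV results
fit1 <- rpart(y1 ~ x, data = dat, xval = s)
fit2 <- rpart(y1 ~ x, data = dat, xval = s)

identical(fit1$cptable, fit2$cptable)
```

With the data held fixed and the folds given explicitly, nothing random is left in the rpart fit, so the two cross-validation tables should agree. Whether mvpart's "1se" pruning then picks the same tree every time is not confirmed by this sketch.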




Thank you for your help.



andydolman at gmail.com



On 12 March 2010 18:03, Dennis Murphy <djmuser at gmail.com> wrote:
> Hi:
>
> See inline...
>
> On Fri, Mar 12, 2010 at 4:15 AM, Andrew Dolman <andydolman at gmail.com> wrote:
>>
>> Dear R's
>>
>> I'm trying to use specific rather than random cross-validation groups
>> in mvpart.
>>
>> The man page says:
>> xval Number of cross-validations or vector defining cross-validation
>> groups.
>>
>>
>> And I found this reply to the list by Terry Therneau from 2006
>>
>> The rpart function allows one to give the cross-validation groups
>> explicitly.
>> So if the number of observations was 10, you could use
>>    > rpart( y ~ x1 + x2, data=mydata, xval=c(1,1,2,2,3,3,1,3,2,1))
>> which causes observations 1,2,7, and 10 to be left out of the first xval
>> sample, 3,4, and 9 out of the second, etc.
>>
>>         Terry Therneau
>>
>>
>> I can't see how this string of values, c(1,1,2,2,3,3,1,3,2,1), codes
>> for observations 1,2,7,10 being left out of the 1st and so on.
>
>
> > x <- c(1,1,2,2,3,3,1,3,2,1)
> > which(x == 1)       # elements left out of the first xval sample
> [1]  1  2  7 10
> > which(x == 2)       # elements left out of the second xval sample
> [1] 3 4 9
> > which(x == 3)       # elements left out of the third xval sample
> [1] 5 6 8
>
> This vector is used to index a response vector/model matrix.
>
> To see how this is applied, consider the following. y is a vector of
> length 10, the same as x:
> > y <- rpois(10, 15)
> > y
>  [1] 12 15 17 11 14 14 12 12 16 16
> > y[x != 1]       # first xval sample (y[1], y[2], y[7], y[10] removed)
> [1] 17 11 14 14 12 16
> > y[x != 2]       # second xval sample (y[3], y[4], y[9] removed)
> [1] 12 15 14 14 12 12 16
> > y[x != 3]       # third xval sample (y[5], y[6], y[8] removed)
> [1] 12 15 17 11 12 16 16
>
> Indexing is one of the most important and powerful features of R.
>
> HTH,
> Dennis
>
>> Can anyone fill me in please?
>>
>> Thanks,
>>
>> andydolman at gmail.com
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
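
The group-to-observation mapping explained in the quoted reply can also be listed in one step. This is a small base-R sketch, not part of the original thread; the names `held_out` and `kept` are made up for illustration.

```r
# Fold labels from Terry Therneau's example: element i is the CV group of observation i
x <- c(1, 1, 2, 2, 3, 3, 1, 3, 2, 1)

# Observations left out in each fold, keyed by group label
held_out <- split(seq_along(x), x)

# Observations kept (used for fitting) in each fold
kept <- lapply(held_out, function(i) setdiff(seq_along(x), i))

held_out
kept
```

Each element of `held_out` matches the corresponding `which(x == k)` call in the reply, and each element of `kept` matches the indices selected by `x != k`.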
