jshuter at uoguelph.ca
2008-Feb-08 21:07 UTC
[R] Using cv.tree to assign cases to specific cv-groups
Hello,

I would like to use cv.tree to run a 10-fold cross-validation experiment on a tree object to help me choose a tree size. Many users seem to allow their cases to be assigned to CV groups randomly, but I have assigned each case to one of 10 cv groups, such that the data from each of my experimental units is included in only one cv-group.

According to the manual for the tree package (Ripley 2007), the cv.tree argument "rand" [cv.tree(object, rand, FUN = prune.tree, K = 10)] gives the user the option to specify an "integer vector of the length the number of cases used to create object, assigning the cases to different groups for cross-validation" (Ripley 2007). However, after searching the R-archives and various online sources, I have been unable to find an example of code in which someone has exercised this option, so I am unsure how to proceed.

Specifically, should I:

1. Create a one-column data frame, with each case containing a number from 1-10, with the order corresponding to the order of cases in the original dataset used to generate the tree object.
2. Call that dataset using the "rand" argument when I run the full syntax for cv.tree

OR should I:

1. List the integers used for case assignment directly in the syntax for cv.tree, following the "rand" argument?

If anyone has any experience using cv.tree (or another function) to assign specific cv-groups, any advice would be greatly appreciated!

Jen Shuter
University of Guelph
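
For illustration, here is a minimal sketch of the second option, in which a plain integer vector (one entry per case, in the same row order as the data used to fit the tree) is passed as "rand". The data frame `dat`, its columns `unit`, `x1`, `x2` and `y`, and the objects `unit_fold` and `folds` are all hypothetical names made up for this example; only `tree()` and `cv.tree()` with its `rand` argument come from the tree package. It assumes that no rows were dropped when the tree was fitted, so that the vector length matches the number of cases used.

```r
# Hypothetical data: 20 experimental units with 5 cases each.
set.seed(1)
n_units <- 20
dat <- data.frame(
  unit = rep(seq_len(n_units), each = 5),
  x1   = rnorm(n_units * 5),
  x2   = rnorm(n_units * 5)
)
dat$y <- 2 * (dat$x1 > 0) + dat$x2 + rnorm(nrow(dat), sd = 0.5)

# Assign each unit (not each case) to one of K groups, then expand
# the unit-level assignment to one integer per row of dat.
K <- 10
unit_fold <- sample(rep(seq_len(K), length.out = n_units))
folds <- unit_fold[dat$unit]

# Check that all cases from a given unit share a single cv-group.
stopifnot(all(tapply(folds, dat$unit, function(f) length(unique(f))) == 1))

# Fit and cross-validate only if the tree package is installed.
if (requireNamespace("tree", quietly = TRUE)) {
  library(tree)
  fit <- tree(y ~ x1 + x2, data = dat)
  # rand is a plain integer vector, not a data frame.
  cv <- cv.tree(fit, rand = folds)
  print(cbind(size = cv$size, dev = cv$dev))
}
```

If the group assignments are already stored as a column of a data frame, extracting that column as a vector (for example `mydata$cvgroup`, again a hypothetical name) should serve the same purpose as `folds` above. As far as I can tell, K only matters when "rand" is not supplied, since the groups are then taken from the distinct values in the vector.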