MarkBeauchene
2012-Jul-06 21:38 UTC
[R] Plotting rpart trees with long list of class members
I have a class with 732 members, so using rpart.plot is giving me a tiny plot in the middle of the window. Is there a good way to modify the plot, or replace the long list with something like "group1"?
Jean V Adams
2012-Jul-09 13:20 UTC
[R] Plotting rpart trees with long list of class members
If you provide a simple example, with actual data and code so that we can reproduce your issue, it would make it easier for readers of this list to help.

Jean

MarkBeauchene <MarkBeauchene@hotmail.com> wrote on 07/06/2012 04:38:32 PM:

> I have a class with 732 members, so using rpart.plot is giving me a tiny plot
> in the middle of the window. Is there a good way to modify the plot, or
> replace the long list with something like "group1"?
Jean V Adams
2012-Jul-12 21:27 UTC
[R] Plotting rpart trees with long list of class members
The example you gave had only one split. If your real situation has three splits, you'll have to take a look at the testtree$csplit matrix and decide how you want to define the new grouping variable. Here's one way to do it ...

Jean

library(rpart)
library(rpart.plot)

# stringsAsFactors=TRUE matches the data.frame() default at the time of posting
# (the default changed in R 4.0.0), so list_var is stored as a factor
test_set <- data.frame(
  list_var=paste("A", (1:1000)%/%25, sep=''),
  list_val=c(runif(250, 1, 4), runif(250, 3, 5), runif(250, 4, 6), runif(250, 5, 7)),
  stringsAsFactors=TRUE
)

# a preliminary tree, to get the splits (not plotted)
testtree <- rpart(list_val ~ list_var, minbucket=100, data=test_set)

# a vector of the unique values of list_var, sorted
suvar <- sort(unique(test_set$list_var))

# one string per level of list_var, recording its direction at every split in testtree
splitz <- apply(testtree$csplit, 2, paste, collapse="-")

# define a new variable to represent all combinations of splits in testtree
groups <- factor(splitz, labels=seq(table(splitz)))

# expand this new variable to the length of the original data frame
test_set$var_grp <- as.factor(groups[match(test_set$list_var, suvar)])

# fit another tree, using the grouping variable, for plotting purposes
testtree2 <- rpart(list_val ~ var_grp, data=test_set)
rpart.plot(testtree2, type=3)

Mark Beauchene <markbeauchene@hotmail.com> wrote on 07/11/2012 02:34:52 PM:

> Thank you, it works very well.
>
> Could you help me out by explaining a little bit of how it works?
> In my actual plot I have 3 splits on the same long list class
> variable, and I don't completely follow your code.
>
> Mark Beauchene
>
> To: MarkBeauchene@hotmail.com
> CC: r-help@r-project.org
> Subject: Re: [R] Plotting rpart trees with long list of class members
> From: jvadams@usgs.gov
> Date: Tue, 10 Jul 2012 09:10:05 -0500
>
> Thanks. Very helpful.
>
> You can use the information from the splits in the first tree, to
> define a new grouping variable, which will simplify the plot:
> suvar <- sort(unique(test_set$list_var))
> test_set$var_grp <- as.factor(testtree$csplit[match(test_set$list_var, suvar)])
> testtree2 <- rpart ( list_val ~ var_grp, data = test_set )
> rpart.plot(testtree2, type=3)
>
> Note to other readers, you will need to load these packages, before
> running the code:
> library(rpart)
> library(rpart.plot)
>
> Jean
>
> MarkBeauchene <MarkBeauchene@hotmail.com> wrote on 07/09/2012 03:42:32 PM:
> > Here is some sample code. It generates a class (list_var) that is used in
> > rpart. list_val is the dependent variable.
> >
> > The plot shows all the values of the class, which is a mess and makes the
> > plot unusable. I'd like to either suppress the list entirely or replace it
> > with something like "Group 1", "Group 2", etc.
> >
> > list_var <- rep(NA,2000)
> > list_val <- rep(NA,2000)
> > for (i in 1:1000) {
> > list_var[i] <- paste("A",i%/%25,sep='')
> > list_val[i] <- runif(1,0,1) }
> > test_set <- data.frame(list_var, list_val )
> >
> > testtree <- rpart ( list_val ~ list_var, data = test_set )
> > rpart.plot(testtree, type=3)
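For readers following the grouping step with several splits, here is a minimal base-R sketch of the idea that runs without fitting a tree. The matrix is a made-up stand-in for testtree$csplit, not output from the thread's data, and the names lev, obs and obs_grp are hypothetical. It relies on the documented csplit encoding: one row per categorical split, one column per factor level, with 1 = level goes left, 3 = level goes right, 2 = level not present at that node. The "Group" labels are an illustrative variation on the numeric labels used in the reply above.

```r
# Hypothetical factor levels and a made-up csplit-style matrix
# (3 categorical splits x 6 levels); values are for illustration only.
lev <- c("A0", "A1", "A2", "A3", "A4", "A5")
csplit <- matrix(c(1, 1, 1, 3, 3, 3,   # split 1: A0-A2 left, A3-A5 right
                   1, 3, 3, 2, 2, 2,   # split 2: only A0-A2 reach this node
                   2, 2, 2, 1, 1, 3),  # split 3: only A3-A5 reach this node
                 nrow = 3, byrow = TRUE, dimnames = list(NULL, lev))

# One signature string per level: its direction at every split, top to bottom
splitz <- apply(csplit, 2, paste, collapse = "-")

# Levels that share a signature always travel together, so they form one group
groups <- factor(splitz, labels = paste0("Group", seq_along(unique(splitz))))

# Expand from "one entry per level" to "one entry per observation"
obs <- c("A3", "A0", "A2", "A5", "A1")
obs_grp <- groups[match(obs, lev)]

print(data.frame(level = lev, signature = splitz, group = groups))
print(data.frame(obs = obs, group = obs_grp))
```

Levels with identical signatures (here A1 and A2, and A3 and A4) end up in the same leaf, which is why collapsing them into one label loses nothing for plotting purposes.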