MarkBeauchene
2012-Jul-06 21:38 UTC
[R] Plotting rpart trees with long list of class members
I have a class with 732 members, so using rpart.plot is giving me a tiny plot in the middle of the window. Is there a good way to modify the plot, or replace the long list with something like "group1"?
Jean V Adams
2012-Jul-09 13:20 UTC
[R] Plotting rpart trees with long list of class members
If you provide a simple example, with actual data and code so that we can reproduce your issue, it would make it easier for readers of this list to help.
Jean
MarkBeauchene <MarkBeauchene@hotmail.com> wrote on 07/06/2012 04:38:32 PM:
> I have a class with 732 members, so using rpart.plot is giving me a tiny plot
> in the middle of the window. Is there a good way to modify the plot, or
> replace the long list with something like "group1"?
Jean V Adams
2012-Jul-12 21:27 UTC
[R] Plotting rpart trees with long list of class members
The example you gave had only one split. If your real situation has three
splits, you'll have to take a look at the testtree$csplit matrix and decide
how you want to define the new grouping variable. Here's one way to do it
...
Jean
library(rpart)
library(rpart.plot)
test_set <- data.frame(
list_var=paste("A", (1:1000)%/%25, sep=''),
list_val=c(runif(250, 1, 4), runif(250, 3, 5), runif(250, 4, 6),
runif(250, 5, 7))
)
# a preliminary tree, to get the splits (not plotted)
testtree <- rpart(list_val ~ list_var, minbucket=100, data=test_set)
# a vector of the unique values of list_var, sorted
suvar <- sort(unique(test_set$list_var))
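# define a new variable to represent all combinations of splits in testtree
# (csplit has one row per split and one column per level of list_var)
# one string per level of list_var, encoding its direction at every split
splitz <- apply(testtree$csplit, 2, paste, collapse="-")
groups <- factor(splitz, labels=seq_along(unique(splitz)))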
# expand this new variable to the length of the original data frame
test_set$var_grp <- as.factor(groups[match(test_set$list_var, suvar)])
# fit another tree, using the grouping variable, for plotting purposes
testtree2 <- rpart(list_val ~ var_grp, data=test_set)
rpart.plot(testtree2, type=3)
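To see what the grouping step does without fitting a tree, here is a minimal sketch on a made-up csplit matrix. The matrix values, the level names in lvls, the observations in obs, and the "group" label prefix are all assumptions for illustration; a real testtree$csplit will look different. It relies only on the documented coding of csplit (1 = level goes left, 3 = level goes right, 2 = level not present at that node).

```r
# Hypothetical csplit: one row per split, one column per factor level.
# 1 = level goes left, 3 = level goes right, 2 = level absent at that node.
csplit <- matrix(c(1, 1, 3, 3, 3,
                   2, 2, 1, 1, 3),
                 nrow = 2, byrow = TRUE)

# Hypothetical factor levels, in the same order as the columns of csplit.
lvls <- c("A0", "A1", "A2", "A3", "A4")

# Paste each column into a signature such as "1-2". Levels that share a
# signature take the same direction at every split, so they can share a label.
signature <- apply(csplit, 2, paste, collapse = "-")
groups <- factor(signature,
                 labels = paste0("group", seq_along(unique(signature))))

# Map each observation to its group by looking up its level's column.
obs <- c("A4", "A0", "A2", "A1")
obs_group <- groups[match(obs, lvls)]
print(data.frame(obs, obs_group))
```

Here A0 and A1 share a signature, as do A2 and A3, so five levels collapse into three groups. The same idea is what shortens the node labels in the second tree above.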
Mark Beauchene <markbeauchene@hotmail.com> wrote on 07/11/2012 02:34:52 PM:
> Thank you, it works very well.
>
> Could you help me out by explaining a little bit of how it works?
> In my actual plot I have 3 splits on the same long list class
> variable, and I don't completely follow your code.
>
> Mark Beauchene
>
> To: MarkBeauchene@hotmail.com
> CC: r-help@r-project.org
> Subject: Re: [R] Plotting rpart trees with long list of class members
> From: jvadams@usgs.gov
> Date: Tue, 10 Jul 2012 09:10:05 -0500
>
> Thanks. Very helpful.
>
> You can use the information from the splits in the first tree, to
> define a new grouping variable, which will simplify the plot:
> suvar <- sort(unique(test_set$list_var))
> test_set$var_grp <- as.factor(testtree$csplit[match(test_set$list_var, suvar)])
> testtree2 <- rpart ( list_val ~ var_grp, data = test_set )
> rpart.plot(testtree2, type=3)
>
> Note to other readers: you will need to load these packages before
> running the code:
> library(rpart)
> library(rpart.plot)
>
> Jean
>
>
> MarkBeauchene <MarkBeauchene@hotmail.com> wrote on 07/09/2012 03:42:32 PM:
> > Here is some sample code. It generates a class (list_var) that is used in
> > rpart. list_val is the dependent variable.
> >
> > The plot shows all the values of the class, which is a mess and makes the
> > plot unusable. I'd like to either suppress the list entirely or replace it
> > with something like "Group 1", "Group 2", etc.
> >
> > list_var <- rep(NA,1000)
> > list_val <- rep(NA,1000)
> > for (i in 1:1000) {
> > list_var[i] <- paste("A",i%/%25,sep='')
> > list_val[i] <- runif(1,0,1) }
> > test_set <- data.frame(list_var, list_val )
> >
> > testtree <- rpart ( list_val ~ list_var, data = test_set )
> > rpart.plot(testtree, type=3)