similar to: Surrogate splits for decision trees

Displaying 20 results from an estimated 2000 matches similar to: "Surrogate splits for decision trees"

2001 Nov 08
2
programming question
Dear r-help, I am trying to build a new function (to process rpart objects) that will output a matrix that has a row for each node and a column for each feature. Each entry in the table is a numerical property at that node for that feature (e.g. surrogate split agreement, improvement). My current trouble is that the only clue to the identity of the feature is stored as the *name* of the
2001 Nov 12
0
Additional Documentation for rpart?
Dear r-help, I am looking for additional documentation on the "adj" column in rpart's splits matrix. The help says: "adj gives the adjusted concordance for surrogate splits". I am looking for info about "adjusted concordance". I cannot find this phrase in either Therneau & Atkinson's original RPART documentation or the CART book. This question came up in the
2002 Jan 22
1
documentation and plotting with lqs
Dear r-help, Is there any available description of the components of lqs objects found in the package "lqs"? > names(slts) [1] "crit" "sing" "coefficients" "bestone" [5] "fitted.values" "residuals" "scale" "terms" [9] "call"
2001 Sep 04
1
searching the r-help list
Have those who have had trouble finding stuff in the R-archive used the search engine? http://www.scirus.com/ And as far as threading goes, my mailer "mutt" threads the mailing list quite nicely. --Clayton -- Clayton Springer, Ph. D. Sandia National Laboratories csprin at ca.sandia.gov Biosystems Research Department (925) 294-2143 P.O. Box
2002 Feb 26
1
Logistic Regression woes
Hi All, When I tried to do logistic regression, I got the following messages: > SampledW.glm.ALL <- glm (V1 ~ ., family = binomial, data = SampledW) Warning messages: 1: Algorithm did not converge in: (if (is.empty.model(mt)) glm.fit.null else glm.fit)(x = X, y = Y, 2: fitted probabilities numerically 0 or 1 occurred in: (if (is.empty.model(mt)) glm.fit.null else glm.fit)(x = X, y =
2002 Apr 25
1
understanding and resolving seg faults
Dear r-devel, I am mutating rpart to do calculations on trees. I am trying to extract information from the tree. However, I got a seg. fault. This is the offending line in "rpmatrix.c": deltaI[0][0] = spl->improve; (Commenting it out cures the seg fault) I would like some advice on how to debug this. I have allocated memory with calloc and deltaI[0][0] should be
2007 Oct 23
1
Multivariate regression tree: problems with surrogate splits
R helpers, I am working with the R program performing multivariate regression trees (MRT). I have a matrix with species and environmental variables saved as a CSV file (sprot_matrix.csv); I have 42 species and 8 environmental variables (SECCHI+PH+TA+PTOT+NTOT+CHLA+AREA+MEANDEP) for 104 samples Title SpA SpB SpC SpD Variable1 Variable2 Variable3 Sample1 Sample 2
2010 May 18
1
proportion of treatment effect by a surrogate (fitting multivariate survival model)
Dear R-help, I would like to compute the variance for the proportion of treatment effect by a surrogate in a survival model (Lin, Fleming, and De Gruttola 1997 in Statistics in Medicine). The paper mentioned that the covariance matrix matches that of the covariance matrix estimator for the marginal hazard modelling of multiple events data (Wei, Lin, and Weissfeld 1989 JASA), and is implemented
2004 Jun 04
1
rpart
Hello everyone, I'm a newbie to R and to CART so I hope my questions don't seem too stupid. 1.) My first question concerns the rpart() method. Which method does rpart use in order to get the best split - entropy impurity, Bayes error (min. error) or Gini index? Is there a way to make it use the entropy impurity? The second and third question concern the output of the printcp() function.
2011 Jan 24
1
How to measure/rank variable importance when using rpart?
--- included message ---- Thus, my question is: *What common measures exist for ranking/measuring variable importance of participating variables in a CART model? And how can this be computed using R (for example, when using the rpart package)* ---end ---- Consider the following printout from rpart summary(rpart(time ~ age + ph.ecog + pat.karno, data=lung)) Node number 1: 228 observations,
1999 May 04
1
surrogate poisson models
Dear R-help, I'm applying the surrogate Poisson glm, by following Venables & Ripley (7.3 pp238-42). >overall_cbind(expand.grid(treatment=c("Pema","control"),age=c("young","adult","old"),repair=c("excellent","good","poor")),Fr=c(8,0,7,1,2,0,2,7,1,4,7,1, 0,3,2,5,1,9))
2010 Feb 28
1
Gradient Boosting Trees with correlated predictors in gbm
Dear R users, I’m trying to understand how correlated predictors impact the Relative Importance measure in Stochastic Boosting Trees (J. Friedman). As Friedman described “…with single decision trees (referring to Breiman’s CART algorithm), the relative importance measure is augmented by a strategy involving surrogate splits intended to uncover the masking of influential variables by others
2007 Jan 04
3
randomForest and missing data
Does anyone know a reason why, in principle, a call to randomForest cannot accept a data frame with missing predictor values? If each individual tree is built using CART, then it seems like this should be possible. (I understand that one may impute missing values using rfImpute or some other method, but I would like to avoid doing that.) If this functionality were available, then when the trees
1999 May 05
1
Ordered factors , was: surrogate poisson models
For ordered factors the natural contrast coding would be to parametrize by the successive differences between levels, which does not assume equal spacing of factor levels as do the polynomial contrasts (implicitly at least). This requires the contr.cum, which could be: contr.cum <- function (n, contrasts = TRUE) { if (is.numeric(n) && length(n) == 1) levs <- 1:n
2012 Jan 17
0
RTisean generating multivariate surrogates;
I have a question on generating multivariate time series surrogates using the "surrogates" function in the RTisean library. The surrogate data matrices are always much shorter than the input matrices. FYI, I'm using R version 2.12.2 on Windows XP RTisean library v 3.0.14 Tisean algorithms v 3.0.13 Creating a surrogate univariate time series returns a time series with the
2011 Jun 13
1
In rpart, how is "improve" calculated? (in the "class" case)
Hi all, I apologize in advance if I am missing something very simple here, but since I failed at resolving this myself, I'm sending this question to the list. I would appreciate any help in understanding how the rpart function is (exactly) computing the "improve" (which is given in fit$split), and how it differs when using the split='information' vs split='gini'
2007 Aug 16
1
Regression tree: labels in the terminal nodes
Dear everybody, I'm a new user of R 2.4.1 and I'm searching for information on improving the output of regression tree graphs. In the terminal nodes I am up to now able to indicate the number of values (n) and the mean of all values in this terminal node by the command > text(tree, use.n=T, xpd=T) Yet I would like to indicate automatically in the output graph of the tree some
2007 Oct 25
1
problems with the last version of R
R helpers, I would like to know if it is possible that the latest version of R is not giving the surrogate splits when you perform a multivariate regression tree analysis? I installed the program on different computers and I ran the same matrix and it didn't give me this information. With a previous version, R 2.1.1, I do get the information for the surrogates. Please let me know how to get the
2013 Jan 27
2
rpart
Hi, When I look at the summary of an rpart object run on my data, I get 7 nodes but when I plot the rpart object, I get only 3 nodes. Should the number of nodes not match in the results of the 2 functions (summary and plot) or it is not always the same? Look forward to your reply, Carol -------------------------------------------- ?summary(rpart.res) Call: rpart(formula = mydata$class ~ ., data
2010 Oct 13
5
Regular expression to find value between brackets
Hi, this should be an easy one, but I can't figure it out. I have a vector of tests, with their units between brackets (if they have units), e.g. tests <- c("pH", "Assay (%)", "Impurity A(%)", "content (mg/ml)") Now I would like to have a function where I use a test as input, and which returns the units like: f <- function (x) sub("\\)",