similar to: Random Forests regression by strata

2008 Mar 09
1
sampsize in Random Forests
Hi all, I have a dataset where each point is assigned to a class A, B, C, or D. Each point is also assigned to a study site. Each study site is coded with a number ranging between 1-100. This information is stored in the vector studySites. I want to run randomForests using stratified sampling, so I chose the option strata = factor(studySites) But I am not sure how to control the number of
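For readers following this thread, a minimal sketch of the setup the post describes, assuming the randomForest package is installed. The data, the number of sites, and the per-site draw of 20 are all hypothetical and chosen only for illustration; the post does not say how the original call was written beyond `strata = factor(studySites)`. The sketch relies on the documented behaviour for classification that a `sampsize` vector with one element per stratum gives the number of cases drawn from each stratum.

```r
library(randomForest)  # assumption: the randomForest package is installed

set.seed(1)
n <- 300

# Hypothetical data: two predictors, four classes (A-D), three study sites
dat <- data.frame(
  x1 = rnorm(n),
  x2 = rnorm(n),
  class = factor(sample(c("A", "B", "C", "D"), n, replace = TRUE))
)
studySites <- sample(1:3, n, replace = TRUE)
site <- factor(studySites)

# One sampsize entry per stratum: draw 20 cases from each site for every tree
fit <- randomForest(
  x = dat[, c("x1", "x2")],
  y = dat$class,
  strata = site,
  sampsize = rep(20, nlevels(site)),
  ntree = 100
)
```

Each element of `sampsize` must not exceed the number of cases in its stratum, so with many small sites the per-site draw has to be chosen accordingly.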
2010 Jul 20
1
Random Forest - Strata
Hi all, I have been struggling to get "strata" in randomForest to work for this. Can I make randomForest, for each tree, use ALL samples from some strata to build the tree while leaving other strata TOTALLY untouched as oob? E.g., in the example below, how can I tell RF, for tree 1 in the forest, to use only Site A and B to build the tree, while using the WHOLE Site C data for the oob error
2010 Apr 09
1
Question on implementing Random Forests scoring
So I've been working with Random Forests (the R library is randomForest) and I am curious whether Random Forests could be applied to classifying on a real-time basis. For instance, let's say I've scored fraud from a group of transactions. If I want to score any new incoming transactions for fraud, could Random Forests be used in that context? Linear Regression is nice in that it is very easy to
2012 May 11
2
Random forests prediction
Hi all, I have a strange problem when applying RF in R. I have a set of variables with which I obtain an AUC of 0.67. I do have a second set of variables that have an AUC of 0.57. When I merge the first and second set of variables, the AUC becomes 0.64. I would expect the prediction to become better as I add variables that do have some predictive power? This is even more strange as the AUC
2009 Apr 10
1
Random Forests: Question about R^2
Dear Random Forests gurus, I have a question about the R^2 provided by randomForest (for regression). I have not been able to find this information. In the help file for randomForest under "Value" it says: rsq: (regression only) - "pseudo R-squared": 1 - mse / Var(y). Could someone please explain in somewhat more detail how exactly R^2 is calculated? Is "mse"
2009 Apr 20
1
Random Forests: Predictor importance for Regression Trees
Hello! I think I am relatively clear on how predictor importance (the first one) is calculated by Random Forests for a Classification tree: Importance of predictor P1 when the response variable is categorical: 1. For out-of-bag (oob) cases, randomly permute their values on predictor P1 and then put them down the tree 2. For a given tree, subtract the number of votes for the correct class in the
2008 Dec 04
1
Comparing survival curves with "survdiff" "strata" help
ExpeRts, I'm trying to compare three survival curves using the function "survdiff" in the survival package. Following is my code and corresponding error message. > survdiff(Surv(st_months, status) ~ strata(BOR), data=mydata) Error in survdiff(Surv(st_months, status) ~ strata(BOR), data = mydata) : No groups to test When I check the "strata" of the variable. I get .
2013 Jan 31
1
obtain survival curves for single strata
Dear useRs, What is the syntax to obtain survival curves for single strata on many subjects? I have a model based on Surv(time,response) object, so there is a single row per subject and no start,stop and no switching of strata. The newdata has many subjects and each subject has a strata and the survival based on the subject risk and the subject strata is needed. If I do newpred <-
2011 May 06
2
coxph and survfit issue - strata
Dear users, In a study with recurrent events: My objective is to get estimates of survival (obtained through a Cox model) by rank of recurrence and by treatment group. With the following code (corresponding to a model with a global effect of the treatment=rx), I get no error and manage to obtain what I want : data<-(bladder)
2009 Apr 13
2
Random Forests Variable Importance Question
I am trying to use the random forests package for classification in R. The Variable Importance Measures listed are: -mean raw importance score of variable x for class 0 -mean raw importance score of variable x for class 1 -MeanDecreaseAccuracy -MeanDecreaseGini Now I know what these "mean" as in I know their definitions. What I want to know is how to use them. What I am trying to
2013 Dec 07
1
combine glmnet and coxph (and survfit) with strata()
Dear All, I want to generate survival curve with cox model but I want to estimate the coefficients using glmnet. However, I also want to include a strata() term in the model. Could anyone please tell me how to have this strata() effect in the model in glmnet? I tried converting a formula with strata() to a design matrix and feeding to glmnet, but glmnet just treats the strata() term with one
2012 Feb 01
1
package sampling, function strata
Dear all, I have to select 122 stratified random samples from a population of >3900 cells. I have 41 strata and I have to draw a different number of samples from each of them (between 2 and 8). I have tried to apply the function strata following the instructions in the manual: strata(dataframe, stratanames=NULL, size, method=c("srswor"), pik,description=TRUE) but I get the error
2015 Feb 04
2
Interpretation of coefficients in a Cox proportional hazards model with a strata variable
Hello. Below I include the output of a Cox model in which I have stratified by a country variable (Countryb) and by another variable (Q6). There is also an interaction between the variable mobilityPDurG2 (a 0/1 variable, with 0 as the reference category) and country. The reference category for country is "united kingdom". My question arises when I want to compute the hazard ratio for those who have a 1
2006 Feb 07
2
getting strata/cluster level values with survey package?
First, I apologize for the rookie question, but... I'm trying to obtain standard errors, confidence intervals, etc. from a sample design and have been having trouble getting the results for anything other than the basic total or mean for the overall survey from the survey package. For example, using the following dataset, strata,cluster,vol A,1,18.58556192 A,1,12.55175443 A,1,21.65882438
2010 Jun 26
1
boot with strata: strata argument ignored?
Hello All. I must be missing the really obvious here: mm <- function(d, i) median(d[i]) b1 <- boot(gravity$g, mm, R = 1000) b1 b2 <- boot(gravity$g, mm, R = 1000, strata = gravity$series) b2 Both b1 and b2 seem to have done (almost) the same thing, but it looks like the strata argument in b2 has been ignored. However, str(b1) vs str(b2) does show that the strata have been noted
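One way to check whether the strata argument took effect is to look at the resampled indices rather than the summary statistics. The sketch below reuses the two calls from the post; the seed, the smaller R = 200, and the helper `same_counts` are additions for illustration only. It assumes the boot package and its `gravity` data set are available, as in the post.

```r
library(boot)  # ships with standard R distributions as a recommended package

set.seed(42)
mm <- function(d, i) median(d[i])

# Same calls as in the post, with fewer replicates to keep the check quick
b1 <- boot(gravity$g, mm, R = 200)
b2 <- boot(gravity$g, mm, R = 200, strata = gravity$series)

# Row r of each matrix holds the observation indices used in replicate r
idx1 <- boot.array(b1, indices = TRUE)
idx2 <- boot.array(b2, indices = TRUE)

# Does a replicate reproduce the original number of observations per series?
same_counts <- function(idx) {
  all(table(gravity$series[idx]) == table(gravity$series))
}
keeps1 <- apply(idx1, 1, same_counts)
keeps2 <- apply(idx2, 1, same_counts)

mean(keeps1)  # share of unstratified replicates that happen to keep the counts
all(keeps2)   # stratified replicates resample within each series
```

If the stratified run preserves the per-series counts in every replicate while the unstratified run does not, the argument was honoured even when the printed bias and standard error look similar.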
2017 Dec 28
2
Why aov() with Error() gives three strata?
Dear list users, I am trying to learn Repeated measures ANOVA using the aov() interface, but I'm struggling to understand its output. According to tutorials on the web, formula for a repeated measures design is: aov(Y ~ IV+ Error(SUBJECT/IV) ) This formula does work but it returns three strata (Error:SUBJECT, Error: SUBJECT:IV, Error: Within), when I would expect two strata (Within and
2008 Jun 18
1
How to create strata out of the data.frame table
My data.frame table consists of 3 variables (x, y and z), where each variable has 1000 units. I need to create 5 equal-size strata according to one of the variables (let's say x), whereas units of the x variable with a higher value have a higher probability of being selected into a stratum with a higher number (the max stratum number is 5). I've been trying different things so far and since I am fairly new to
2017 Dec 28
0
Why aov() with Error() gives three strata?
Jorge: FYI, *generally speaking,* queries that are mostly statistical in nature, such as yours, are off topic here -- this list is about R programming help, not statistical help. Having said that, you still may get a useful response here -- the r-help/statistics intersection *is* nonempty. However, if not, 2.5 suggestions: 1. Try posting to r-sig-mixed-models instead. Repeated measures are a
2006 Nov 13
1
random forest regression
Dear all, I am doing a regression in randomForest, using the option "sampsize" to reduce the number of records used to produce the randomForest object. The manual says "For classification, if sampsize is a vector of the length the number of strata, then sampling is stratified by strata, and the elements of sampsize indicate the numbers to be drawn from the strata". I need my
2009 Jun 20
1
Plotting Cumulative Hazard Functions with Strata
Hello: So I've fit a hazard function to a set of data using kmfit<-survfit(Surv(int, event)~factor(cohort)). This factor variable, "cohort", has four levels, so naturally the strata variable has 4 values. I can use this data to estimate the hazard rate haz<-n.event/n.risk and calculate the cumulative hazard function by H<--log(haz) Now, I would like to plot this