Displaying 20 results from an estimated 11000 matches similar to: "Random Forests regression by strata"
2008 Mar 09
1
sampsize in Random Forests
Hi all,
I have a dataset where each point is assigned to a class A, B, C, or
D. Each point is also assigned to a study site. Each study site is
coded with a number ranging between 1-100. This information is stored
in the vector studySites.
I want to run randomForest using stratified sampling, so I chose the option
strata = factor(studySites)
But I am not sure how to control the number of
2010 Jul 20
1
Random Forest - Strata
Hi all,
I have been struggling to get "strata" in randomForest to work for this.
Can I make randomForest, for each of its trees, use ALL samples from some
strata to build the tree, while leaving other strata TOTALLY untouched as OOB?
E.g., in the example below, how can I tell RF,
- for tree 1 in the forest, to use only Site A and B to build the tree,
while using the WHOLE Site C data for the OOB error
2010 Apr 09
1
Question on implementing Random Forests scoring
So I've been working with Random Forests (the R library is randomForest) and I
am curious whether Random Forests could be applied to classifying on a real-time
basis. For instance, let's say I've scored fraud from a group of
transactions. If I want to score any new incoming transactions for fraud,
could Random Forests be used in that context? Linear Regression is nice in
that it is very easy to
2012 May 11
2
Random forests prediction
Hi all,
I have a strange problem when applying RF in R.
I have a set of variables with which I obtain an AUC of 0.67.
I do have a second set of variables that have an AUC of 0.57.
When I merge the first and second set of variables, the AUC becomes 0.64.
I would expect the prediction to become better as I add variables that do
have some predictive power?
This is even more strange as the AUC
2009 Apr 10
1
Random Forests: Question about R^2
Dear Random Forests gurus,
I have a question about R^2 provided by randomForest (for regression).
I have not been able to find this information.
In the help file for randomForest under "Value" it says:
rsq: (regression only) "pseudo R-squared": 1 - mse / Var(y).
Could someone please explain in somewhat more detail how exactly R^2
is calculated?
Is "mse"
2009 Apr 20
1
Random Forests: Predictor importance for Regression Trees
Hello!
I think I am relatively clear on how predictor importance (the first
one) is calculated by Random Forests for a Classification tree:
Importance of predictor P1 when the response variable is categorical:
1. For out-of-bag (oob) cases, randomly permute their values on
predictor P1 and then put them down the tree
2. For a given tree, subtract the number of votes for the correct
class in the
2008 Dec 04
1
Comparing survival curves with "survdiff" "strata" help
ExpeRts,
I'm trying to compare three survival curves using the function "survdiff" in the survival package. Following is my code and corresponding error message.
> survdiff(Surv(st_months, status) ~ strata(BOR), data=mydata)
Error in survdiff(Surv(st_months, status) ~ strata(BOR), data = mydata) :
No groups to test
When I check the "strata" of the variable, I get
2013 Jan 31
1
obtain survival curves for single strata
Dear useRs,
What is the syntax to obtain survival curves for single strata on many subjects?
I have a model based on a Surv(time,response) object, so there is a single row per subject, with no start/stop intervals and no switching of strata.
The newdata has many subjects, each belonging to one stratum, and I need the survival for each subject based on that subject's risk and stratum.
If I do
newpred <-
2011 May 06
2
coxph and survfit issue - strata
Dear users,
In a study with recurrent events:
My objective is to get estimates of survival (obtained through a Cox model) by rank of recurrence and by treatment group.
With the following code (corresponding to a model with a global effect of the treatment=rx), I get no error and manage to obtain what I want :
data<-(bladder)
2009 Apr 13
2
Random Forests Variable Importance Question
I am trying to use the random forests package for classification in R.
The Variable Importance Measures listed are:
-mean raw importance score of variable x for class 0
-mean raw importance score of variable x for class 1
-MeanDecreaseAccuracy
-MeanDecreaseGini
Now I know what these "mean" as in I know their definitions. What I
want to know is how to use them.
What I am trying to
2013 Dec 07
1
combine glmnet and coxph (and survfit) with strata()
Dear All,
I want to generate a survival curve with a Cox model, but I want to estimate the
coefficients using glmnet. However, I also want to include a strata() term
in the model. Could anyone please tell me how to have this strata() effect
in the model in glmnet? I tried converting a formula with strata() to a
design matrix and feeding to glmnet, but glmnet just treats the strata()
term with one
2012 Feb 01
1
package sampling, function strata
Dear all,
I have to select 122 stratified random samples from a population of
>3900 cells. I have 41 strata and I have to draw a different number of
samples from them(between 2 and 8).
I have tried to apply the function strata following the instructions in
the manual:
strata(dataframe, stratanames=NULL, size, method=c("srswor"),
pik,description=TRUE)
but I get the error
2015 Feb 04
2
Interpretation of coefficients in a Cox proportional hazards model with a strata variable
Hello.
Below is the output of a Cox model in which I have stratified by
a country variable (Countryb) and by another one (Q6). There is also an
interaction between the variable mobilityPDurG2 (a 0/1 variable, with 0 as
the reference category) and country.
The reference category for country is "united kingdom".
My question arises when I want to compute the hazard ratio for those who have a
1
2006 Feb 07
2
getting strata/cluster level values with survey package?
First, I apologise for the rookie question, but...
I'm trying to obtain standard errors, confidence intervals, etc. from a
sample design and have been having trouble getting results for anything other
than the basic total or mean for the overall survey from the survey
package.
For example, using the following dataset,
strata,cluster,vol
A,1,18.58556192
A,1,12.55175443
A,1,21.65882438
2010 Jun 26
1
boot with strata: strata argument ignored?
Hello All. I must be missing the really obvious here:
mm <- function(d, i) median(d[i])
b1 <- boot(gravity$g, mm, R = 1000)
b1
b2 <- boot(gravity$g, mm, R = 1000, strata = gravity$series)
b2
Both b1 and b2 seem to have done (almost) the same thing, but it looks like
the strata argument in b2 has been ignored. However, str(b1) vs str(b2)
does show that the strata have been noted
2017 Dec 28
2
Why aov() with Error() gives three strata?
Dear list users,
I am trying to learn Repeated measures ANOVA using the aov() interface, but
I'm struggling to understand its output.
According to tutorials on the web, the formula for a repeated measures design
is:
aov(Y ~ IV+ Error(SUBJECT/IV) )
This formula does work but it returns three strata (Error:SUBJECT, Error:
SUBJECT:IV, Error: Within), when I would expect two strata (Within and
2008 Jun 18
1
How to create strata out of the data.frame table
My data.frame table consists of 3 variables (x, y and z), each with
1000 units. I need to create 5 equal-size strata according to one of the
variables (let's say x), such that units with a higher value of x have a
higher probability of being assigned to a stratum with a higher number (the max
stratum number is 5).
I've been trying different things so far and since I am fairly new to
2017 Dec 28
0
Why aov() with Error() gives three strata?
Jorge:
FYI, *generally speaking,* queries that are mostly statistical in
nature, such as yours, are off topic here -- this list is about R
programming help, not statistical help. Having said that, you still
may get a useful response here -- the r-help/statistics intersection
*is* nonempty. However, if not, 2.5 suggestions:
1. Try posting to r-sig-mixed-models instead. Repeated measures are a
2006 Nov 13
1
random forest regression
Dear all,
I am doing a regression in randomForest, using the option "sampsize" to reduce
the number of records used to produce the randomForest object.
The manual says "For classification, if sampsize is a vector of the length
the number of strata, then sampling is stratified by strata, and the
elements of sampsize indicate the numbers to be drawn from the strata". I
need my
2009 Jun 20
1
Plotting Cumulative Hazard Functions with Strata
Hello:
So I've fit a hazard function to a set of data using
kmfit<-survfit(Surv(int, event)~factor(cohort))
this factor variable, "cohort" has four levels so naturally the strata
variable has 4 values.
I can use this data to estimate the hazard rate
haz<-n.event/n.risk
and calculate the cumulative hazard function by
H <- -log(haz)
Now, I would like to plot this