Displaying 20 results from an estimated 2000 matches similar to: "non-parametric sample size calculation"
2005 Oct 27
1
Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?
"classwt" in the current version of the randomForest package doesn't work
too well. (It's what was in version 3.x of the original Fortran code by
Breiman and Cutler, not the one in the new Fortran code.) I'd advise
against using it.
"sampsize" and "strata" can be used in conjunction. If "strata" is not
specified, the class labels will be used.
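A minimal sketch of using the two arguments together, assuming the randomForest package is installed. The data are simulated and hypothetical; the per-class draw of 20 is chosen only for illustration and is not a recommendation from the original post.

```r
library(randomForest)  # assumes the randomForest package is installed

set.seed(1)
# hypothetical imbalanced two-class data: 180 cases of "0", 20 cases of "1"
n0 <- 180
n1 <- 20
x <- data.frame(a = c(rnorm(n0, 0), rnorm(n1, 1.5)),
                b = rnorm(n0 + n1))
y <- factor(c(rep("0", n0), rep("1", n1)))

# stratify on the class labels and draw 20 cases per class for every tree;
# sampsize has one element per stratum, in the order of levels(y)
rf <- randomForest(x, y, ntree = 200, strata = y, sampsize = c(20, 20))
print(rf$confusion)
```

Passing strata = y explicitly is equivalent to leaving it out here, since the class labels are the default strata for classification.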
2006 Jan 25
1
imbalanced classes
Hi Andy,
I know this topic has been discussed before on the R-help, but I was
wondering if you could offer some advice specific to my application.
I'm using the R random forest package to compare two classes of data;
the number of cases in each class is relatively low, 28 in class 1 and 9
in class 2. I'd really like to use the R environment to analyze this data,
however I'm finding it
2008 Mar 09
1
sampsize in Random Forests
Hi all,
I have a dataset where each point is assigned to a class A, B, C, or
D. Each point is also assigned to a study site. Each study site is
coded with a number ranging between 1-100. This information is stored
in the vector studySites.
I want to run randomForests using stratified sampling, so I chose the option
strata = factor(studySites)
But I am not sure how to control the number of
2006 Nov 13
1
random forest regression
Dear all,
I am doing a regression in randomForest, using the option "sampsize" to reduce
the number of records used to produce the randomForest object.
The manual says "For classification, if sampsize is a vector of the length
the number of strata, then sampling is stratified by strata, and the
elements of sampsize indicate the numbers to be drawn from the strata". I
need my
2005 Oct 27
1
Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?
Sorry for the repost, but I've really been looking, and can't find any
syntax direction on this issue...
Just browsing the documentation, and searching the list came up short... I
have some unbalanced data and was wondering if, in a "0" v "1"
classification forest, some combo of these options might yield better
predictions when the proportion of one class is low (less
2009 May 21
1
Need help on ploting Histograms
This is the command I made for a normal distribution, but when I try to plot
the histograms, I don't know why the bars don't stick on the line...
nsamples <- 1000
sampsize <- 15
# each row is one sample of size sampsize from N(0, 1)
Samples <- matrix(rnorm(nsamples * sampsize, 0, 1), nrow = nsamples)
a <- apply(Samples, 1, var)  # sample variance of each row
NC14 <- a * 14               # (n - 1) * s^2 with n = 15
x <- 0:40
plot(x, dchisq(x, 14), type = 'h')
hist(NC14, freq = FALSE, add = TRUE)
2013 Feb 13
2
CARET: Any way to access other tuning parameters?
The documentation for caret::train shows a list of parameters that one can
tune for each classification/regression method. For example, for
the method randomForest one can tune mtry in the call to train. But the
function call to train random forests in the original package has many
other parameters, e.g. sampsize, maxnodes, etc.
Is there **any** way to access these parameters using train
2009 Sep 24
3
pipe data from plot(). was: ROCR.plot methods, cross validation averaging
All,
I'm trying again with a slightly more generic version of my first question. I can extract the
plotted values from hist(), boxplot(), and even plot.randomForest(). Observe:
# get some data
dat <- rnorm(100)
# grab histogram data
hdat <- hist(dat)
hdat #provides details of the hist output
#grab boxplot data
bdat <- boxplot(dat)
bdat #provides details of the boxplot
2012 Apr 05
4
Appropriate method for sharing data across functions
In trying to streamline various optimization functions, I would like to have a scratch pad
of working data that is shared across a number of functions. These can be called from
different levels within some wrapper functions for maximum likelihood and other such
computations. I'm sure there are other applications that could benefit from this.
Below are two approaches. One uses the <<-
2011 Sep 13
1
class weights with Random Forest
Hi All,
I am looking for a reference that explains how the randomForest function in
the randomForest package uses the classwt parameter. Here:
http://tolstoy.newcastle.edu.au/R/e4/help/08/05/12088.html
Andy Liaw suggests not using classwt. And according to:
http://r.789695.n4.nabble.com/R-help-with-RandomForest-classwt-option-td817149.html
it has "not been implemented" as of 2007.
2010 Jul 20
1
Random Forest - Strata
Hi all,
I have struggled to get "Strata" in randomForest to work on this.
Can I get randomForest, for each of its trees, to use ALL samples from some
strata to build the tree, while leaving some strata TOTALLY untouched as oob?
e.g. in the example below, how can I tell RF to,
- for tree 1 in the forest, to use only Site A and B to build the tree,
while using the WHOLE Site C data for the oob error
2003 Jul 18
3
question about formulating a nls optimization
Dear list,
I'm migrating a project from Matlab to R, and I'm
facing a relatively complicated problem for nls. My
objective function is below:
>> objFun <- function(yEx,xEx,tEx,gamma,theta,kappa){
yTh <- pdfDY(xEx,tEx,gamma,theta,kappa)
sum(log(yEx/yTh)^2)
}
The equation is yTh=P(xEx,tEx) + noise.
I collect my data in:
>> data <-
2008 Aug 20
3
bug in lme4?
Dear all,
I found a problem with 'lme4'. Basically, once you load the package 'aod' (Analysis of Overdispersed Data), the functions 'lmer' and 'glmer' don't work anymore:
library(lme4)
(fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy))
(gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
family = binomial, data
2011 Nov 01
1
Sample size calculations for one sided binomial exact test
I'm trying to compute sample size requirements for a binomial exact test.
We want to show that the proportion is at least 90%, assuming that it is
95%, with 80% power, so any asymptotic approximations are out of the
question. I was planning on using binom.test to perform the simple test
against a prespecified value, but cannot find any functions for computing
sample size. Do any exist?
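One way to approach this with base R alone is a direct search over n using the exact binomial distribution. The sketch below is not taken from the thread: the helper name exact_power, the significance level of 0.05 (the post does not state one), and the search range 1:1000 are all assumptions made for illustration.

```r
# Exact power of the one-sided binomial test of H0: p <= p0 against p > p0.
# exact_power is a hypothetical helper; alpha = 0.05 is an assumed level.
exact_power <- function(n, p0, p1, alpha) {
  # smallest critical count k with P(X >= k | p0) <= alpha
  k <- qbinom(1 - alpha, n, p0) + 1
  # power is P(X >= k | p1)
  pbinom(k - 1, n, p1, lower.tail = FALSE)
}

ns <- 1:1000  # assumed search range
pw <- exact_power(ns, p0 = 0.90, p1 = 0.95, alpha = 0.05)

# first sample size whose exact power reaches 80%
n_min <- ns[which(pw >= 0.80)[1]]
n_min
```

Exact power is not monotone in n because the critical count jumps in whole steps, so a slightly larger n can dip back below 80%. Inspecting pw just above n_min is a reasonable check before fixing a design.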
2004 Oct 31
2
Obtaining fitted model information
Dear list,
I am brand new to R and using Dalgaard's (2002) book Introductory Statistics with R (thus, some of my terminology may be incorrect).
I am fitting regression models and I want to use Hurvich and Tsai's AICC statistic to examine my regression models. This penalty can be expressed as: 2*npar * (n/(n-npar-1)).
While you can obtain AIC, BIC, and logLik, I want to impose the AICC
2003 May 20
3
a quick Q about memory limit in R
Hello, there,
I got this error when I tried to run " data.kr <- surf.gls(2, expcov,
data, d=0.7);"
"Error: cannot allocate vector of size 382890 Kb
Execution halted"
My data is 100x100 grid.
the following is the summary of "data":
> summary(data);
x y z
Min. : 1.00 Min. : 1.00 Min. :-1.0172
1st Qu.: 26.00
2012 Jan 09
2
Unexpected results using the oneway_test in the coin package
Dear fellow R users,
Keywords: Kruskal-Wallis, Post-Hoc, pair-wise comparisons, Nemenyi-Damico-Wolfe-Dunn test, coin package, oneway_test
I am using the "oneway_test" function in the R package "coin" and I am obtaining results which I cannot believe are accurate. I do not wish to waste anyone's time and so if the following problem is rather trivial, I apologize, however I
2010 Sep 15
1
optim with BFGS--what may lead to this, a strange thing happened
Dear R Users
In a self-written function for calculating maximum likelihood probability (please
check the function code at the bottom of this message), one value, wden, suddenly
jumps to zero. Details follow:
w[11]=2.14
lnw =2.37 2.90 3.76 ...
regw =1.96 1.77 1.82 ....
wden=0.182 0.178 0.179...
w[11]=2.14
lnw=2.37 2.90 3.76 ...
regw =1.96 1.77 1.82 ....
wden=0.182
2008 Mar 27
1
A faster way to compute finite-difference gradient of a scalar function of a large number of variables
Hi All,
I would like to compute the simple finite-difference approximation to the
gradient of a scalar function of a large number of variables (on the order
of 1000). Although a one-time computation using the following function
grad() is fast and simple enough, the overhead for repeated evaluation of
gradient in iterative schemes is quite significant. I was wondering whether
there are
2010 Sep 07
5
question on "optim"
Hey, R users
I do not know how to describe my question. I am a new user for R and write the
following code for a dynamic labor economics model and use OPTIM to get
optimizations and parameter values. the following code does not work due to
the equation:
   wden[,i]<-dnorm((1-regw[,i])/w[5])/w[5]
where w[5] is one of the parameters (together with vector a, b and other
elements in vector