thr3ads.net - similar to: "non-parametric sample size calculation"

Displaying 20 results from an estimated 2000 matches similar to: "non-parametric sample size calculation"

Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?

2005 Oct 27

"classwt" in the current version of the randomForest package doesn't work too well. (It's what was in version 3.x of the original Fortran code by Breiman and Cutler, not the one in the new Fortran code.) I'd advise against using it. "sampsize" and "strata" can be used in conjunction. If "strata" is not specified, the class labels will be used.

imbalanced classes

2006 Jan 25

Hi Andy, I know this topic has been discussed before on R-help, but I was wondering if you could offer some advice specific to my application. I'm using the R random forest package to compare two classes of data. The number of cases in each class is relatively low: 28 in class 1 and 9 in class 2. I'd really like to use the R environment to analyze this data, however I'm finding it

sampsize in Random Forests

2008 Mar 09

Hi all, I have a dataset where each point is assigned to a class A, B, C, or D. Each point is also assigned to a study site. Each study site is coded with a number ranging between 1-100. This information is stored in the vector studySites. I want to run randomForest using stratified sampling, so I chose the option strata = factor(studySites). But I am not sure how to control the number of

random forest regression

2006 Nov 13

Dear all, I am doing a regression in randomForest, using the option "sampsize" to reduce the number of records used to produce the randomForest object. The manual says "For classification, if sampsize is a vector of the length the number of strata, then sampling is stratified by strata, and the elements of sampsize indicate the numbers to be drawn from the strata". I need my

Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?

2005 Oct 27

Sorry for the repost, but I've really been looking, and can't find any syntax direction on this issue... Just browsing the documentation, and searching the list came up short... I have some unbalanced data and was wondering if, in a "0" v "1" classification forest, some combo of these options might yield better predictions when the proportion of one class is low (less

Need help on plotting Histograms

2009 May 21

This is the command I made for a normal distribution, but when I try to plot the histograms, I don't know why the bars don't stick to the line...
nsamples <- 1000
sampsize <- 15
Samples <- matrix(rnorm(nsamples*sampsize, 0, 1), nrow=nsamples)
a <- apply(Samples, 1, var)
NC14 <- a*14
x <- 0:40
plot(x, dchisq(x, 14), type='h')
hist(NC14, freq=F, add=T)

CARET: Any way to access other tuning parameters?

2013 Feb 13

The documentation for caret::train shows a list of parameters that one can tune for each classification/regression method. For example, for the method randomForest one can tune mtry in the call to train. But the function call to train random forests in the original package has many other parameters, e.g. sampsize, maxnodes, etc. Is there **any** way to access these parameters using train

pipe data from plot(). was: ROCR.plot methods, cross validation averaging

2009 Sep 24

All, I'm trying again with a slightly more generic version of my first question. I can extract the plotted values from hist(), boxplot(), and even plot.randomForest(). Observe:
# get some data
dat <- rnorm(100)
# grab histogram data
hdat <- hist(dat)
hdat # provides details of the hist output
# grab boxplot data
bdat <- boxplot(dat)
bdat # provides details of the boxplot

Appropriate method for sharing data across functions

2012 Apr 05

In trying to streamline various optimization functions, I would like to have a scratch pad of working data that is shared across a number of functions. These can be called from different levels within some wrapper functions for maximum likelihood and other such computations. I'm sure there are other applications that could benefit from this. Below are two approaches. One uses the <<-

class weights with Random Forest

2011 Sep 13

Hi All, I am looking for a reference that explains how the randomForest function in the randomForest package uses the classwt parameter. Here: http://tolstoy.newcastle.edu.au/R/e4/help/08/05/12088.html Andy Liaw suggests not using classwt. And according to: http://r.789695.n4.nabble.com/R-help-with-RandomForest-classwt-option-td817149.html it has "not been implemented" as of 2007.

Random Forest - Strata

2010 Jul 20

Hi all, I have struggled to get "strata" in randomForest to work for this. Can I make randomForest, for each of its trees, use ALL samples from some strata to build the tree, while leaving other strata TOTALLY untouched as oob? E.g., in the example below, how can I tell RF, for tree 1 in the forest, to use only Site A and B to build the tree, while using the WHOLE Site C data for the oob error

question about formulating a nls optimization

2003 Jul 18

Dear list, I'm migrating a project from Matlab to R, and I'm facing a relatively complicated problem for nls. My objective function is below: >> objFun <- function(yEx,xEx,tEx,gamma,theta,kappa){ yTh <- pdfDY(xEx,tEx,gamma,theta,kappa) sum(log(yEx/yTh)^2) } The equation is yTh=P(xEx,tEx) + noise. I collect my data in: >> data <-

bug in lme4?

2008 Aug 20

Dear all, I found a problem with 'lme4'. Basically, once you load the package 'aod' (Analysis of Overdispersed Data), the functions 'lmer' and 'glmer' don't work anymore: library(lme4) (fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)) (gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd), family = binomial, data

Sample size calculations for one sided binomial exact test

2011 Nov 01

I'm trying to compute sample size requirements for a binomial exact test. We want to show that the proportion is at least 90%, assuming that it is 95%, with 80% power, so any asymptotic approximations are out of the question. I was planning on using binom.test to perform the simple test against a prespecified value, but cannot find any functions for computing sample size. Do any exist?
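The calculation asked about here can be sketched with base R alone by searching over n. This is a minimal sketch, not an answer taken from the thread: the function name exact_binom_n, the significance level alpha = 0.05, and the search limit max_n are all assumptions; only the 90%, 95%, and 80% figures come from the post.

```r
# Smallest n for a one-sided exact binomial test of H0: p <= p0 vs H1: p > p0.
# alpha = 0.05 and max_n are assumptions; the post does not state them.
exact_binom_n <- function(p0 = 0.90, p1 = 0.95, power = 0.80,
                          alpha = 0.05, max_n = 2000) {
  for (n in seq_len(max_n)) {
    # Reject H0 when X >= k, with k the smallest count such that
    # P(X >= k | p0) <= alpha.
    k <- qbinom(1 - alpha, n, p0) + 1
    size     <- pbinom(k - 1, n, p0, lower.tail = FALSE)  # actual type I error
    achieved <- pbinom(k - 1, n, p1, lower.tail = FALSE)  # exact power at p1
    if (achieved >= power) {
      return(list(n = n, critical = k, size = size, power = achieved))
    }
  }
  stop("no n <= max_n reaches the requested power")
}

res <- exact_binom_n()
print(res)
```

Because the binomial distribution is discrete, exact power is not monotone in n, so a slightly larger n than the one returned can fall back below the target. A more conservative search would look for the smallest n beyond which power stays above the target.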

Obtaining fitted model information

2004 Oct 31

Dear list, I am brand new to R and using Dalgaard's (2002) book Introductory Statistics with R (thus, some of my terminology may be incorrect). I am fitting regression models and I want to use Hurvich and Tsai's AICC statistic to examine my regression models. This penalty can be expressed as: 2*npar * (n/(n-npar-1)). While you can obtain AIC, BIC, and logLik, I want to impose the AICC
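The penalty quoted above can be applied to a fitted model with a few lines of base R. This is a sketch under stated assumptions: the helper name aicc and the use of the built-in cars data are illustrative and not from the post, and the parameter count is taken from logLik(), which for lm fits also counts the residual variance.

```r
# AICC-style criterion using the penalty quoted in the post:
#   2 * npar * (n / (n - npar - 1))
aicc <- function(fit) {
  ll   <- logLik(fit)
  npar <- attr(ll, "df")  # estimated parameters (includes sigma for lm fits)
  n    <- nobs(fit)       # number of observations used in the fit
  -2 * as.numeric(ll) + 2 * npar * (n / (n - npar - 1))
}

fit <- lm(dist ~ speed, data = cars)  # built-in data, used only as an example
print(c(AIC = AIC(fit), AICC = aicc(fit)))
```

Since n / (n - npar - 1) is greater than 1, this value always exceeds the ordinary AIC, and the gap shrinks as n grows relative to npar.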

a quick Q about memory limit in R

2003 May 20

Hello there, I got this error when I tried to run "data.kr <- surf.gls(2, expcov, data, d=0.7);":
"Error: cannot allocate vector of size 382890 Kb Execution halted"
My data is a 100x100 grid. The following is the summary of "data":
> summary(data);
       x               y               z
 Min.   : 1.00   Min.   : 1.00   Min.   :-1.0172
 1st Qu.: 26.00

Unexpected results using the oneway_test in the coin package

2012 Jan 09

Dear fellow R users, Keywords: Kruskal-Wallis, Post-Hoc, pair-wise comparisons, Nemenyi-Damico-Wolfe-Dunn test, coin package, oneway_test I am using the "oneway_test" function in the R package "coin" and I am obtaining results which I cannot believe are accurate. I do not wish to waste anyone's time and so if the following problem is rather trivial, I apologize, however I

optim with BFGS--what may lead to this, a strange thing happened

2010 Sep 15

Dear R users, in a self-written function for calculating maximum likelihood probability (please check the function code at the bottom of this message), one value, wden, suddenly jumps to zero. Details are as follows: w[11]=2.14 lnw =2.37 2.90 3.76 ... regw =1.96 1.77 1.82 .... wden=0.182 0.178 0.179... w[11]=2.14 lnw=2.37 2.90 3.76 ... regw =1.96 1.77 1.82 .... wden=0.182

A faster way to compute finite-difference gradient of a scalar function of a large number of variables

2008 Mar 27

Hi All, I would like to compute the simple finite-difference approximation to the gradient of a scalar function of a large number of variables (on the order of 1000). Although a one-time computation using the following function grad() is fast and simple enough, the overhead for repeated evaluation of gradient in iterative schemes is quite significant. I was wondering whether there are

question on "optim"

2010 Sep 07

Hey, R users, I do not know how to describe my question. I am a new user of R and wrote the following code for a dynamic labor economics model, and I use OPTIM to get optimizations and parameter values. The following code does not work due to the equation: wden[,i]<-dnorm((1-regw[,i])/w[5])/w[5] where w[5] is one of the parameters (together with vector a, b and other elements in vector
