thr3ads.net - similar to: "comparing classification methods: 10-fold cv or leaving-one-out ?"

Displaying 20 results from an estimated 6000 matches similar to: "comparing classification methods: 10-fold cv or leaving-one-out ?"

classification with nnet: handling unequal class sizes

2004 Mar 30

I hope this question is adequate for this list. I use the nnet code from V&R p. 348: the very nice and general function CVnn2(), which chooses the number of hidden units and the amount of weight decay by an inner cross-validation, with a slight modification to use it for classification (see below). My data has 2 classes of unequal size: 45 observations for class I and 116 obs. for class II. With

comparing random forests and classification trees

2007 Jan 29

Hi, I have done an analysis using 'rpart' to construct a Classification Tree. I want to retain the output in tree form so that it is easily interpretable. However, I also want to compare the 'accuracy' of the tree to a Random Forest to estimate how much predictive ability is lost by using one simple tree. My understanding is that the error automatically displayed by the two

lda() called with data=subset() command

2004 Jan 05

Hi, I have a data.frame with a grouping variable having the levels C, mild AD, mod AD, O and S. Since I want to compute an lda only for the two groups 'C' and 'mod AD', I call lda with data=subset(mydata.pca, GROUP == 'mod AD' | GROUP == 'C'):
my.lda <- lda(GROUP ~ Comp.1 + Comp.2 + Comp.3 + Comp.4 + Comp.5 + Comp.6 + Comp.7 + Comp.8 ,

compositional data: percent values sum up to 1

2003 Jun 01

Again, under another subject: sorry, maybe an all too trivial question. We have power data from J frequency spectra, and to have the same range for the data of all our subjects, we just transformed them into % values. Pseudo-code: power[i,j] = power[i,j]/sum(power[i,1:J]). Of course, now we have a perfect linear relationship in our x design matrix, since all power values for each subject sum up
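A minimal sketch of the collinearity this post describes, using simulated data. The matrix `power`, its dimensions, and the "drop one band" workaround are assumptions for illustration, not taken from the thread:

```r
# Hypothetical data: 10 subjects, J = 4 frequency bands (simulated values)
set.seed(1)
power <- matrix(runif(40, min = 1, max = 5), nrow = 10, ncol = 4)

# Vectorised form of power[i, j] / sum(power[i, 1:J])
prop <- power / rowSums(power)

# Every row now sums to 1 ...
rowSums(prop)

# ... so the columns are linearly dependent on an intercept column:
# the design matrix has 5 columns but lower rank
X <- cbind(1, prop)
qr(X)$rank

# One possible workaround (an assumption, not the poster's method):
# drop one band so the remaining columns are no longer constrained
X_reduced <- cbind(1, prop[, -ncol(prop)])
qr(X_reduced)$rank
```

Comparing `qr(X)$rank` with `ncol(X)` is a quick way to confirm the rank deficiency before fitting a model.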

randomforest and AUC using 10 fold CV - Plotting results

2011 Dec 22

Here is a snippet to show what I'm trying to do.
library(randomForest)
library(ROCR)
library(caret)
data(iris)
iris <- iris[(iris$Species != "setosa"),]
fit <- randomForest(factor(Species) ~ ., data=iris, ntree=50)
train.predict <- predict(fit,iris,type="prob")[,2]

Base R wilcox.test gives incorrect answers, has been fixed in DescTools, solution can likely be ported to Base R

2023 Dec 11

While using the Hodges Lehmann Mean in DescTools (DescTools::HodgesLehmann), I found that it generated incorrect answers (see https://github.com/AndriSignorell/DescTools/issues/97). The error is driven by the existence of tied values forcing wilcox.test in Base R to switch to an approximate algorithm that returns incorrect results - see

c() question

2004 Mar 29

Hi, I need to define the following:
c("one group" = class.weight[2], "other group" = class.weight[1]) #class.weight = c(1,2)
but I don't like the hard-coded way and would like to use
my.group <- array(c("one group", "other group"))
but now
c(my.group[1] = class.weight[2], my.group[2] = class.weight[1])
gives an error. How can I solve this?
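One way to get computed names onto a vector is `setNames()` from base R. This is a sketch of a possible answer, not the reply given in the thread; `w` and `w2` are names introduced here for illustration:

```r
class.weight <- c(1, 2)
my.group <- c("one group", "other group")

# Inside c(), the left side of `=` must be a literal name, not an expression
# such as my.group[1]. setNames() attaches names that are computed at run time.
w <- setNames(class.weight[2:1], my.group)
w

# Equivalent two-step form
w2 <- class.weight[2:1]
names(w2) <- my.group
```

Both forms give a named numeric vector in which "one group" maps to `class.weight[2]` and "other group" to `class.weight[1]`.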

lda: how to get the eigenvalues

2003 Jun 03

Dear R-users How can I get the eigenvalues out of an lda analysis? thanks a lot christoph -- Christoph Lehmann <christoph.lehmann at gmx.ch>

overlay two pixmap

2003 Sep 26

Hi, I need to overlay two pixmaps (library(pixmap)). One, a pixmapGrey, is the base, and on it I need to overlay a pixmapIndexed. BUT: the pixmapIndexed has only some of its "pixels" set to an indexed color; its other pixels should not cover the underlying pixmapGrey pixels, i.e. pixels not defined in the pixmapIndexed should be transparent. What would you

logistic regression for a data set with perfect separation

2003 Sep 09

Dear R experts, I have the following data:
          V1 V2
1 -5.8000000  0
2 -4.8000000  0
3 -2.8666667  0
4 -0.8666667  0
5 -0.7333333  0
6 -1.6666667  0
7 -0.1333333  1
8  1.2000000  1
9  1.3333333  1
and I want to know whether V1 can predict V2. Of course it can, since there is a perfect separation between cases 1..6 and 7..9. How can I test whether this conclusion (being able to assign an
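A small sketch, using the nine observations quoted in the post, of how the separation shows up in practice. The data frame name `d` and the checks are illustrative assumptions; this does not answer the poster's question about a formal test:

```r
# The nine observations from the post
d <- data.frame(
  V1 = c(-5.8, -4.8, -2.8666667, -0.8666667, -0.7333333,
         -1.6666667, -0.1333333, 1.2, 1.3333333),
  V2 = c(0, 0, 0, 0, 0, 0, 1, 1, 1)
)

# Direct check for perfect separation: every V1 in class 0 lies below
# every V1 in class 1
separated <- max(d$V1[d$V2 == 0]) < min(d$V1[d$V2 == 1])
separated

# An ordinary logistic fit typically emits warnings on such data, because the
# maximum-likelihood slope is not finite; warnings are suppressed here
fit <- suppressWarnings(glm(V2 ~ V1, family = binomial, data = d))

# Fitted probabilities sit essentially at 0 or 1
round(fitted(fit), 3)
```

With separated data the reported coefficients and standard errors from `glm()` are not meaningful, which is why the usual Wald tests cannot be used here.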

calculating condition numbers

2003 Feb 26

Am I right in assuming that, to calculate the condition numbers, I have to use the correlation matrix of X and not t(x) %*% x?
> e <- eigen(t(x) %*% x)
better (x must not have a first column of ones):
> e <- eigen(cor(x))
> e$val
[1] 6.6653e+07 2.0907e+05 1.0536e+05 1.8040e+04 2.4557e+01 2.0151e+00
> sqrt(e$val[1]/e$val)
[1] 1.000 17.855 25.153 60.785 1647.478
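The correlation-matrix variant from the post can be tried on simulated data. In this sketch the predictors `x1`, `x2`, `x3` are made up, with `x3` built to be nearly collinear with the other two; it illustrates the computation only and does not settle which scaling is the right one:

```r
set.seed(42)
x1 <- rnorm(50)
x2 <- rnorm(50)
x3 <- x1 + x2 + rnorm(50, sd = 0.01)  # nearly a linear combination of x1 and x2
x <- cbind(x1, x2, x3)                # no column of ones

# Eigenvalues of the correlation matrix, returned in decreasing order
e <- eigen(cor(x))

# Condition indices: sqrt(largest eigenvalue / each eigenvalue)
cond_index <- sqrt(e$values[1] / e$values)
cond_index
```

The first index is always 1, and a large last index signals the near-collinearity that was built into `x3`.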

partial proportional odds model (PPO)

2003 Dec 13

Hi, since the 'equal slope' assumption doesn't hold in my data, I cannot use a proportional odds model ('Design' library, together with 'Hmisc'). I would therefore like to try a partial proportional odds model. Please, could anybody tell me where to find the code and how to specify such a model, or suggest any potential alternatives? Many thanks for your kind help, Christoph

my own function given to lapply

2004 Feb 26

Hi, it seems I just miss something. I defined
treshold <- function(pred) {
  if (pred < 0.5) pred <- 0 else pred <- 1
  return(pred)
}
and want to apply it to a vector:
sapply(mylist[,,3],threshold)
but I get:
Error in match.fun(FUN) : Object "threshold" not found
Thanks for help, cheers, Chris -- Christoph Lehmann <christoph.lehmann at gmx.ch>
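As quoted, the function is defined as `treshold` but called as `threshold`, which matches the "not found" error. A minimal sketch with consistent naming follows; the vector `preds` is hypothetical, and the vectorised alternative is a suggestion rather than something from the thread:

```r
# Definition and call must use the same name
threshold <- function(pred) {
  if (pred < 0.5) 0 else 1
}

preds <- c(0.1, 0.5, 0.73, 0.49)  # hypothetical predicted probabilities

# Element-wise application, as in the post
by_sapply <- sapply(preds, threshold)

# Vectorised alternative that needs no sapply()
by_vector <- as.numeric(preds >= 0.5)
```

The vectorised form also works directly on a matrix or array slice, since the comparison is applied to every element.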

R^2 analogue in polr() and prerequisites for polr()

2003 Dec 08

Hi, (1) In polr(), is there any way to calculate a pseudo analogue to R^2, just for use as a purely descriptive statistic of the goodness of fit? (2) And: what are the assumptions that must be fulfilled so that the results of polr() (t-values, etc.) are valid? How can I test these prerequisites most easily? I have a three-level (ordered factor) response and four metric variables. many

Error from gls call (package nlme)

2003 Sep 25

Hi, I have a huge array with series of data. For each cell in the array I fit a linear model, using either lm() or gls(). With lm() there is no problem, but with gls() I get an error:
Error in glsEstimate(glsSt, control = glsEstControl) : computed gls fit is singular, rank 2
as soon as there are data like this:
> y1 <- c(0,0,0,0)
> x1 <- c(0,1,1.3,0)
> gls(y1~x1)

sweave: graphics not at the expected location in the pdf

2004 Jun 25

Hi, I use Sweave for excellent PDF output (thank you, Friedrich Leisch). I have just one problem: quite often the graphics are not where I expect them but appear later in the PDF (often on a separate page). How can I fix this, i.e. how can I specify that I want a graphic exactly here in the document? Many thanks and best regards, Christoph -- Christoph

Problem with SIP-Phones and * audio-files

2003 Nov 28

Hi All, I am a newbie to Asterisk, and here is my first problem that I cannot get past. I have two Grandstream BT100 phones connected to Asterisk. They work fine for calling each other and for calling the outside via an IAX link. If I try to call the initial demo from the samples.extensions.conf, I hear nothing. The CLI reports: -- Executing

Hodges-Lehmann EXACT confidence interval for small dataset with ties

2010 Feb 05

Dear r-helpers, I have a small dataset (n<50), and I want to compute the Hodges Lehmann exact confidence interval. So far, I know that "pairwiseCI" has the function "HL.diff". The description is as follows : HL.diff calculates the Hodges-Lehmann confidence interval for the difference of locations by calling wilcox.exact in package exactRankTests ; But when I check

criterion for variable selection in LDA

2003 Nov 10

Hi, since a stepwise procedure for variable selection (as e.g. in SPSS) for an LDA is not implemented in R, and since I cannot be sure that all the assumptions required by, e.g., a procedure using a statistic based on Wilks' lambda hold (such as normality and variance homogeneity), I would like to ask what you would recommend: shall I e.g. define a criterion such as the error-rate

wilcox.test point estimates perverse (PR#1150)

2001 Oct 26

The point estimates produced by wilcox.test are perverse (not wrong, just brain damaged). The Hodges-Lehmann estimator that goes with the signed rank test is the median of the Walsh averages. The Hodges-Lehmann estimator that goes with the rank sum test is the median of the pairwise differences. wilcox.test agrees except that it uses the following very peculiar definition of "sample