thr3ads.net - similar to: "lm/lme cross-validation"

Displaying 20 results from an estimated 20000 matches similar to: "lm/lme cross-validation"

2004 May 13

storage of lm objects in a database

Hello, I'd like to use DBI to store lm objects in a database. I have to analyze many linear models and I cannot store them in a single R session (not enough memory). Also, it'd be nice to have them persistent. Maybe it's possible to create a compact binary representation of the object (the kind of format created by "save"), so that one doesn't need to write

2006 Sep 15

graphics and 'layout' question

Hello, I got stuck with a graphics question: I have 3 figures that I present on a single page (window) via 'layout'. The layout is layout(matrix(c(1,1,2,3), 2, 2, byrow=TRUE)), so that the first plot spans both columns in row one. Now I'd like to magnify the first figure so that it takes 20% more vertical space (i.e. more space for the y-axis). How would I do this in R?

2006 Mar 08

data import problem

Dear All, I'm trying to read a text data file that contains several records separated by a blank line. Each record starts with a row that contains its ID and the number of rows for the record (two columns), then the data table itself, e.g.

    123 5
    89.1791 1.1024
    90.5735 1.1024
    92.5666 1.1024
    95.0725 1.1024
    101.2070 1.1024

    321 3
    60.1601 1.1024
    64.8023 1.1024
    70.0593

2005 Aug 26

basic anova and t-test question

Hello, I'm posting this to receive some comments/hints about a statistical rather than R-technical question. In an anova of an lme, the factor SSPos11 shows up as non-significant, but in the t-tests of the summary 2 of the 4 levels (one for contrast) are significant. See below for some truncated output. I realize that the two tests are different (F-test/t-test), but I'm looking for a

2005 Jul 21

RandomForest question

Hello, I'm trying to find out the optimal number of splits (mtry parameter) for a randomForest classification. The classification is binary and there are 32 explanatory variables (mostly factors with up to 4 levels each, but also some numeric variables) and 575 cases. I've seen that, although there are only 32 explanatory variables, the best classification performance is reached when

2006 Nov 01

splitting very long character string

Hello, I have a very long character array (>500k characters) that I need to split by '\n', resulting in an array of about 60k numbers. The help on strsplit says to use perl=TRUE to get better performance, but it still takes several minutes to split this string. The massive string is the return value of a call to xmlElementsByTagName from the XML library and looks like this: ... 12345

2009 Jun 04

Small mystery : passing a "subset=" argument to lme|lm through "..."

Dear list, I have problems involving passing a "subset=" argument through "...". I'm trying to augment the set of defined analyses for mice (homonymous package) with a call to lme. This package creates multiple imputations of missing data in a "mids" object; each completed data set may be obtained through the complete(data, set) function. > sessionInfo() R

2004 Jun 28

unbalanced design for anova with low number of replicates

Hello, I'm wondering what's the best way to analyse an unbalanced design with a low number of replicates. I'm not a statistician, and I'm looking for some direction for this problem. I have a 2-factor design: factor batch with 3 levels, and factor dose within each batch with 5 levels. Dose level 1 in batch one is replicated 4 times, level 3 is replicated only 2 times. all

2005 Jun 28

svm and scaling input

Dear All, I have a question about scaling the input variables for an analysis with svm (package e1071). Most of my variables are factors with 4 to 6 levels, but there are also some numeric variables. I'm not familiar with the math behind SVMs, so my assumptions may be completely wrong ... or obvious. Will the svm automatically expand the factors into a binary matrix? If I add numeric

2006 Feb 02

calculating IC50

Hello, I was wondering if there is an R package to automatically calculate the IC50 value (concentration of a substance that inhibits cell growth to 50%) for some measurements. kind regards, Arne

2008 Aug 21

cross validation for lme

Hello, We would like to perform a cross validation on a linear mixed model (lme) and wonder if anyone has found something analogous to cv.glm for such models? Thanks, Mark

2004 May 10

R versus SAS: lm performance

Hello, A colleague of mine has compared the runtime of a linear model + anova in SAS and S+. He got the same results, but SAS took a bit more than a minute whereas S+ took 17 minutes. I've tried it in R (1.9.0) and it took 15 min. None of the machines ran out of memory, and I assume that all machines have similar hardware, but the S+ and SAS machines are on Windows whereas the R machine is Redhat

2005 Jul 01

p-values for classification

Dear All, I'm classifying some data with various methods (binary classification). I'm interpreting the results via a confusion matrix from which I calculate the sensitivity and the fdr. The classifiers are trained on 575 data points and my test set has 50 data points. I'd like to calculate p-values for obtaining <=fdr and >=sensitivity for each classifier. I was thinking about

2005 Jul 07

randomForest

> From: Weiwei Shi
>
> it works.
> thanks,
>
> but: (just curious)
> why i tried previously and i got
>
> > is.vector(sample.size)
> [1] TRUE

Because a list is also a vector:

    > a <- c(list(1), list(2))
    > a
    [[1]]
    [1] 1

    [[2]]
    [1] 2

    > is.vector(a)
    [1] TRUE
    > is.numeric(a)
    [1] FALSE

Actually, the way I initialize a list of known length is by

2008 Jan 10

general linear hypothesis glht() to work with lme()

Hi, I am trying to test some contrasts, using glht() in the multcomp package, on fixed effects in a linear mixed model fitted with lme() in the nlme package. The command I used is:

    ## a simple randomized block design,
    ## type is fixed effect
    ## batch is random effect
    ## model with interaction
    dat.lme<-lme(info.index~type, random=~1|batch/type, data=dat)
    glht(dat.lme, linfct = mcp(type

2009 Apr 21

broken example: lme() + multcomp() Tukey on repeated measures design

I am trying to do Tukey HSD comparisons on a repeated measures experiment. I found the following example on r-help, and it is quoted approvingly elsewhere. It is broken. Can anyone please tell me how to get it to work? I am using R 2.4.1.

    > require(MASS) ## for oats data set
    > require(nlme) ## for lme()
    > require(multcomp) ## for multiple comparison stuff
    > Aov.mod <- aov(Y ~ N + V +

2005 Jun 08

bug in predict.lme?

Dear All, I've come across a problem in predict.lme. Assigning a model formula to a variable and then using this variable in lme (instead of typing the formula into the formula part of lme) works as expected. However, when performing a predict on the fitted model I get an error message - predict.lme (but not predict.lm) seems to expect a 'properly' typed-in formula and cannot extract

2005 May 13

error in plot.lmList

Hello, in R-2.1.0 I'm trying to produce trellis plots from an lmList object as described in the help for plot.lmList. I can generate the plots from the help, but on my own data plotting fails with an error message that I cannot interpret (please see below). Any hints are greatly appreciated. kind regards, Arne

    > dim(d)
    [1] 575 4
    > d[1:3,]
    Level_of_Expression SSPos1 SSPos19

2005 Jun 30

randomForest error

Hello, I'm using the random forest package. One of my factors in the data set contains 41 levels (I can't code this as a numeric value - in terms of linear models this would be a random factor). The randomForest call comes back with an error telling me that the limit is 32 categories. Is there any reason for this particular limit? Maybe it's possible to recompile the module with a

2004 May 14

help with memory greedy storage

Hello, I have a problem with a self-written routine taking a lot of memory (>1.2 GB). Maybe you can suggest some enhancements; I'm pretty sure that my implementation is not optimal ... I'm creating many linear models and storing coefficients, anova p-values ... all I need in different lists, which are then finally returned in a list (list of lists). The input is a matrix with 84 rows
