Displaying 20 results from an estimated 600 matches similar to: "Logistic Regression - Variable Selection Methods With Prediction"
2011 May 10
1
Filtering out bad data points
Hi,
I have long wondered how best to do this in R. I have a data
frame and a set of criteria for filtering points out. My procedure is
always to locate the indices of those points, check whether the index
vector has length greater than 0, and then remove them. Meaning
dftest <- data.frame(x=rnorm(100),y=rnorm(100));
qtile <- quantile(dftest$x,probs=c(0.05,0.95));
badIdx <- which((dftest$x
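A minimal sketch of the filtering described above. The original condition is cut off, so the criterion used here (x outside its 5% and 95% quantiles) is an assumption. A logical mask makes the length check unnecessary, whereas negative indexing with an empty which() result drops every row.

```r
set.seed(1)  # reproducible illustration only
dftest <- data.frame(x = rnorm(100), y = rnorm(100))
qtile <- quantile(dftest$x, probs = c(0.05, 0.95))

# Assumed criterion: a point is "bad" if x falls outside the 5%-95% range.
bad <- dftest$x < qtile[1] | dftest$x > qtile[2]

# Logical indexing works even when no row is flagged (all FALSE keeps all rows).
dfclean <- dftest[!bad, ]

# Pitfall: negative indexing with an empty index vector selects nothing.
none_bad <- rep(FALSE, nrow(dftest))
nrow(dftest[-which(none_bad), ])  # 0 rows
nrow(dftest[!none_bad, ])         # 100 rows
```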
2012 May 01
3
Data frame vs matrix quirk: Hinky error message?
AdvisoRs:
Is the following a bug, feature, hinky error message, or dumb Bert?
> mtest <- matrix(1:12,nr=4)
> dftest <- data.frame(mtest)
> ix <- cbind(1:2,2:3)
> mtest[ix] <- NA
> mtest
[,1] [,2] [,3]
[1,] 1 NA 9
[2,] 2 6 NA
[3,] 3 7 11
[4,] 4 8 12
## But ...
> dftest[ix] <- NA
Error in `[<-.data.frame`(`*tmp*`, ix, value
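Two workarounds that do not depend on whether `[<-.data.frame` accepts a two-column index matrix in a given R version, sketched under the assumption that all columns share one type (so the matrix round trip loses nothing). The names df1, df2 and m are illustrative only.

```r
mtest <- matrix(1:12, nrow = 4)
dftest <- data.frame(mtest)
ix <- cbind(1:2, 2:3)   # (row, column) pairs: (1,2) and (2,3)

# Option 1: assign one cell per (row, column) pair.
df1 <- dftest
for (k in seq_len(nrow(ix))) {
  df1[ix[k, 1], ix[k, 2]] <- NA
}

# Option 2: round-trip through a matrix (safe here because every column is integer).
m <- as.matrix(dftest)
m[ix] <- NA
df2 <- as.data.frame(m)
```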
2012 Jun 01
4
regsubsets (Leaps)
Hi
I need to create a model from 250+ variables with high collinearity, and
only 17 data points (p = 250, n = 750). I would prefer to use Cp, AIC,
and/or BIC to narrow down the number of variables, and then use VIF to
choose a model without collinearity (if possible). I realize that having a
huge p and small n is going to give me extreme linear dependency problems,
but I *think* these model
2008 May 07
1
help with regsubsets
Hi,
I'm new to R and this mailing list, so I will attempt to state my question as appropriately as possible.
I am running R version 2.7 with Windows XP and have recently been exploring the use of the function regsubsets in the leaps package in order to perform all-subsets regression.
So, I'm calling the function as:
2009 Mar 11
1
regsubsets() [leaps package] - please share some good examples of use
Hello dear R-help members,
I recently became interested in using biglm with leaps, and found myself
somewhat confused as to how to use the two together, in different settings.
I couldn't find any example code for the leaps package (except in the
help file, and the examples there are not as rich as they could be). That
is why I turn to you in case you could share some good tips and
2009 May 20
1
Error with regsubset in leaps package - vcov and all.best option (plus calculating VIFs for subsets)
Hi all
I am hoping this is just a minor problem. I am trying to implement a best-subsets regression procedure on some ecological datasets using the regsubsets function in the leaps package. The dataset contains 43 predictor variables plus the response (logcount), all in a data frame called environment. I am implementing it as follows:
library(leaps)
2012 Sep 25
3
Plotting of regsubsets adjr2 values not correct
Hi,
I want to do model selection with regsubsets. My code is:
a<-regsubsets(Gesamt ~ CommunistSocialist + CountrySize + GNI + Lifeexp +
Schoolyears + ExpMilitary + Mortality +
PopPoverty + PopTotal + ExpEdu + ExpHealth, data=olympiadaten, nbest=2)
summary(a)
plot(a,scale="adjr2")
(output attached)
The problem now is that I want to fit the best model again "manually"
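One possible way to refit the best adjusted-R² model by hand is to read the selected predictors from the summary and pass them to lm(). This is only a sketch: the built-in mtcars data stands in for olympiadaten, which is not available here, and the approach assumes all predictors are numeric so that the column names of the selection matrix match the variable names.

```r
if (requireNamespace("leaps", quietly = TRUE)) {
  # mtcars stands in for the poster's data (assumption for illustration).
  fit <- leaps::regsubsets(mpg ~ ., data = mtcars, nbest = 2)
  s <- summary(fit)

  best <- which.max(s$adjr2)                  # row with the highest adjusted R^2
  vars <- colnames(s$which)[s$which[best, ]]  # terms included in that row
  vars <- setdiff(vars, "(Intercept)")

  # Refit "manually" with lm() on the selected predictors.
  refit <- lm(reformulate(vars, response = "mpg"), data = mtcars)

  # The two adjusted R^2 values should agree up to numerical tolerance.
  all.equal(summary(refit)$adj.r.squared, s$adjr2[best])
}
```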
2007 May 11
2
PRESS criterion in leaps
I'm interested in writing some model selection functions (for linear
regression models, as a start), which incorporate the PRESS criterion since
it, to my knowledge, is not currently implemented in any available model
selection procedure.
I thought it would be simplest to build on already existing functions like
regsubsets in package leaps. It's easy enough to calculate the PRESS
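For a single linear model, PRESS can be computed without refitting, because each leave-one-out prediction error equals the ordinary residual divided by one minus its leverage. A minimal base-R sketch; the helper name press() and the mtcars model are illustrative assumptions, not part of leaps.

```r
# PRESS = sum of squared leave-one-out prediction errors.
# For a linear model each such error equals e_i / (1 - h_ii).
press <- function(fit) {
  e <- residuals(fit)
  h <- hatvalues(fit)
  sum((e / (1 - h))^2)
}

fit <- lm(mpg ~ wt + hp, data = mtcars)
press(fit)

# Cross-check against explicit leave-one-out refits.
loo <- sapply(seq_len(nrow(mtcars)), function(i) {
  f <- lm(mpg ~ wt + hp, data = mtcars[-i, ])
  mtcars$mpg[i] - predict(f, newdata = mtcars[i, ])
})
all.equal(press(fit), sum(loo^2))
```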
2007 Aug 08
1
Regsubsets statistics
Dear R-help,
I have used the regsubsets function from the leaps package to do subset
selection of a logistic regression model with 6 independent variables and
all possible ^2 interactions. As I want to get information about the
statistics behind the selection output, I've intensively searched the
mailing list to find answers to the following questions:
1. What should I do to get the statistics
2010 Dec 26
1
Calculation of BIC done by leaps-package
Hi Folks,
I've got a question concerning the calculation of the Schwarz-Criterion
(BIC) done by summary.regsubsets() of the leaps-package:
Using regsubsets() to perform subset-selection I receive a regsubsets
object that can be summarized by summary.regsubsets(). After this
operation the resulting summary contains a vector of BIC-values
representing models of size i=1,...,K.
My problem
2005 Sep 27
4
regsubsets selection criterion
Hello,
I am using the 'regsubsets' function (from the leaps package) to get the
best linear models to explain 1 variable from 1 to 5 explanatory
variables (exhaustive search).
Can anyone tell me which criterion the 'regsubsets' function is based on?
Thank you.
samuel
Samuel BERTRAND
Doctorant
Laboratoire de Biomecanique
LBM - ENSAM - CNRS UMR 8005
2012 Sep 25
2
Regsubsets model selection
Hi,
I have 12 independent variables and one dependent variable. Now I want to
select the best adj. R squared model by using the regsubsets command, so I
code:
> plot(regsubsets(Gesamt ~ CommunistSocialist + CountrySize + GNI + Lifeexp
+ Schoolyears + ExpMilitary + Mortality +
+ PopPoverty + PopTotal + ExpEdu + ExpHealth, data=olympiadaten, nbest=1,
nvmax=12), scale='adjr2')
Then I
2013 Jun 04
1
How to write a loop in R to select multiple regression model and validate it ?
I would like to run a loop in R. I have never done this before, so I would be
very grateful for your help !
1. I have a sample set: 25 objects. I would like to draw 1 object from it
and use it as a test set for my future external validation. The remaining 24
objects I would like to use as a training set (to select a model). I would
like to repeat this process until all 25 objects are used as a
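The leave-one-out procedure described above could be sketched as a simple loop. The data (25 rows of mtcars) and the selection method (backward elimination with step()) are stand-ins chosen for illustration, since the post does not say which selection method is intended.

```r
dat <- mtcars[1:25, c("mpg", "wt", "hp", "qsec", "drat")]  # stand-in for the 25 objects
n <- nrow(dat)
pred <- numeric(n)

for (i in seq_len(n)) {
  train <- dat[-i, ]               # 24 objects used to select and fit the model
  test  <- dat[i, , drop = FALSE]  # 1 held-out object

  # Hypothetical selection step: backward elimination by AIC.
  full <- lm(mpg ~ ., data = train)
  sel  <- step(full, direction = "backward", trace = 0)

  pred[i] <- predict(sel, newdata = test)
}

# Leave-one-out root mean squared prediction error.
rmse <- sqrt(mean((dat$mpg - pred)^2))
rmse
```

Because selection is repeated inside the loop, the error estimate reflects the whole select-then-fit procedure rather than one fixed model.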
2005 Mar 02
1
Leaps & regsubsets
Hello
I am trying to use all subsets regression on a test dataset consisting
of 11 trials and 46 potential predictor variables.
I would like to use Mallows' Cp as a selection criterion.
The leaps function would provide the required output but does not work
with this many variables (see below).
The alternative function regsubsets should be used, but I am not able to
define the function in
2011 Feb 22
1
regsubsets {leaps}
Hi,
I'd like to run regsubsets for model selection by exhaustive search. I have
a list with 20 potential explanatory variables, which represent the real and
the imaginary parts of 10 "kinds" of complex numbers:
x <- list(r1=r1, r2=r2, r3=r3, ..., r10=r10, i1=i1, i2=i2, i3=i3, ...,
i10=i10)
Is there an easy way to constrain the model search so that "r"s and
2004 Jan 29
1
a question regarding leaps
Hi,
I'm using regsubsets from the leaps package to select subsets of
variables. I'm calling the function as
lp <- regsubsets(x,y,nbest=5,nvmax=9)
Then I call plot to see which variables turned up in the models. I use
the R^2 scale and see my best model had a R^2 of 0.62.
However when I make a linear model using lm() with the same x my R^2 is
0.45. Shouldn't I be seeing the
2005 May 11
2
Regsubsets()
Dear List members
I am using the regsubsets function to select a few predictor variables
using Mallows' Cp:
> sel.proc.regsub.full <- regsubsets(CO2 ~ v + log(v) + v.max + sd.v +
tad + no.stops.km + av.stop.T + a + sd.a + a.max + d + sd.d + d.max +
RPA + P + perc.stop.T + perc.a.T + perc.d.T + RPS + RPSS + sd.P.acc +
P.dec + da.acc.1 + RMSACC + RDI + RPSI + P.acc + cov.v + cov.a +
2011 Dec 29
1
How would I rewrite my code so that I can implement the use of multicore on an Rstudio server to run regsubsets using the "exhaustive" method? The data has 1200 variables and 9000 obs so the code has been shortened here:
How would I rewrite my code so that I can implement the use of multicore on
an Rstudio server to run regsubsets using the "exhaustive" method? The data
has 1200 variables and 9000 obs so the code has been shortened here:
model<-regsubsets(price~x + y + z + a + b + ...., data=sample,
nvmax=500, method=c("exhaustive"))
Our server is a quad-core machine with 7.5 GB of RAM, is that
2010 Nov 10
1
p-value from regsubsets
Hi, does anyone know if there is a way to easily extract p-values from
the regsubsets() function?
Thanks,
James Stegen
--
James C. Stegen
NSF Postdoctoral Fellow in Bioinformatics
University of North Carolina
Chapel Hill, NC
919-962-8795
stegen at email.unc.edu
http://www.unc.edu/~stegen/index.html
2008 Mar 10
2
write.table with row.names=FALSE unnecessarily slow?
write.table with large data frames takes quite a long time
> system.time({
+ write.table(df, '/tmp/dftest.txt', row.names=FALSE)
+ }, gcFirst=TRUE)
user system elapsed
97.302 1.532 98.837
One reason is that dimnames is always called, causing 'anonymous' row
names to be created as character vectors. Avoiding this in
src/library/utils, along the lines of
Index: