similar to: Logistic Regression - Variable Selection Methods With Prediction

Displaying 20 results from an estimated 600 matches similar to: "Logistic Regression - Variable Selection Methods With Prediction"

2011 May 10
1
Filtering out bad data points
Hi, I always have a question about how to do this best in R. I have a data frame and a set of criteria to filter points out. My procedure is to always locate indices of those points, check if index vector length is greater than 0 or not and then remove them. Meaning dftest <- data.frame(x=rnorm(100),y=rnorm(100)); qtile <- quantile(dftest$x,probs=c(0.05,0.95)); badIdx <- which((dftest$x
2012 May 01
3
Data frame vs matrix quirk: Hinky error message?
AdvisoRs: Is the following a bug, feature, hinky error message, or dumb Bert? > mtest <- matrix(1:12,nr=4) > dftest <- data.frame(mtest) > ix <- cbind(1:2,2:3) > mtest[ix] <- NA > mtest [,1] [,2] [,3] [1,] 1 NA 9 [2,] 2 6 NA [3,] 3 7 11 [4,] 4 8 12 ## But ... > dftest[ix] <- NA Error in `[<-.data.frame`(`*tmp*`, ix, value
2012 Jun 01
4
regsubsets (Leaps)
Hi, I need to create a model from 250+ variables with high collinearity, and only 17 data points (p = 250, n = 750). I would prefer to use Cp, AIC, and/or BIC to narrow down the number of variables, and then use VIF to choose a model without collinearity (if possible). I realize that having a huge p and small n is going to give me extreme linear dependency problems, but I *think* these model
2008 May 07
1
help with regsubsets
Hi, I'm new to R and this mailing list, so I will attempt to state my question as appropriately as possible. I am running R version 2.7 with Windows XP and have recently been exploring the use of the function regsubsets in the leaps package in order to perform all-subsets regression. So, I'm calling the function as:
2009 Mar 11
1
regsubsets() [leaps package] - please share some good examples of use
Hello dear R-help members, I recently became interested in using biglm with leaps, and found myself somewhat confused as to how to use the two together, in different settings. I couldn't find any example codes for the leaps() package (except for in the help file, and the examples there are not as rich as they could be). That is why I turn to you in case you could share some good tips and
2009 May 20
1
Error with regsubset in leaps package - vcov and all.best option (plus calculating VIFs for subsets)
Hi all I am hoping this is just a minor problem, I am trying to implement a best subsets regression procedure on some ecological datasets using the regsubsets function in the leaps package. The dataset contains 43 predictor variables plus the response (logcount) all in a dataframe called environment. I am implementing it as follows: library(leaps)
2012 Sep 25
3
Plotting of regsubsets adjr2 values not correct
Hi, I want to make model selection with regsubsets. My code is: a<-regsubsets(Gesamt ~ CommunistSocialist + CountrySize + GNI + Lifeexp + Schoolyears + ExpMilitary + Mortality + PopPoverty + PopTotal + ExpEdu + ExpHealth, data=olympiadaten, nbest=2) summary(a) plot(a,scale="adjr2") (output attached) The problem is now, that I want to fit the best model again "manually"
2007 May 11
2
PRESS criterion in leaps
I'm interested in writing some model selection functions (for linear regression models, as a start), which incorporate the PRESS criterion since it, to my knowledge, is not currently implemented in any available model selection procedure. I thought it would be simplest to build on already existing functions like regsubsets in package leaps. It's easy enough to calculate the PRESS
2007 Aug 08
1
Regsubsets statistics
Dear R-help, I have used the regsubsets function from the leaps package to do subset selection of a logistic regression model with 6 independent variables and all possible ^2 interactions. As I want to get information about the statistics behind the selection output, I've intensively searched the mailing list to find answers to the following questions: 1. What should I do to get the statistics
2010 Dec 26
1
Calculation of BIC done by leaps-package
Hi Folks, I've got a question concerning the calculation of the Schwarz-Criterion (BIC) done by summary.regsubsets() of the leaps-package: Using regsubsets() to perform subset-selection I receive a regsubsets object that can be summarized by summary.regsubsets(). After this operation the resulting summary contains a vector of BIC-values representing models of size i=1,...,K. My problem
2005 Sep 27
4
regsubsets selection criterion
Hello, I am using the 'regsubsets' function (from leaps package) to get the best linear models to explain 1 variable from 1 to 5 explanatory variables (exhaustive search). Can anyone tell me which criterion the 'regsubsets' function is based on? Thank you. samuel Samuel BERTRAND Doctorant Laboratoire de Biomecanique LBM - ENSAM - CNRS UMR 8005
2012 Sep 25
2
Regsubsets model selection
Hi, I have 12 independent variables and one dependent variable. Now I want to select the best adj. R squared model by using the regsubsets command, so I code: > plot(regsubsets(Gesamt ~ CommunistSocialist + CountrySize + GNI + Lifeexp + Schoolyears + ExpMilitary + Mortality + + PopPoverty + PopTotal + ExpEdu + ExpHealth, data=olympiadaten, nbest=1, nvmax=12), scale='adjr2') Then I
2013 Jun 04
1
How to write a loop in R to select multiple regression model and validate it ?
I would like to run a loop in R. I have never done this before, so I would be very grateful for your help! 1. I have a sample set: 25 objects. I would like to draw 1 object from it and use it as a test set for my future external validation. The remaining 24 objects I would like to use as a training set (to select a model). I would like to repeat this process until all 25 objects are used as a
2005 Mar 02
1
Leaps & regsubsets
Hello, I am trying to use all subsets regression on a test dataset consisting of 11 trails and 46 potential predictor variables. I would like to use Mallows' Cp as a selection criterion. The leaps function would provide the required output but does not work with this many variables (see below). The alternative function regsubsets should be used, but I am not able to define the function in
2011 Feb 22
1
regsubsets {leaps}
Hi, I'd like to run regsubsets for model selection by exhaustive search. I have a list with 20 potential explanatory variables, which represent the real and the imaginary parts of 10 "kinds" of complex numbers: x <- list(r1=r1, r2=r2, r3=r3, ..., r10=r10, i1=i1, i2=i2, i3=i3, ..., i10=i10) Is there an easy way to constrain the model search so that "r"s and
2004 Jan 29
1
a question regarding leaps
Hi, I'm using regsubsets from the leaps package to select subsets of variables. I'm calling the function as lp <- regsubsets(x,y,nbest=5,nvmax=9) Then I call plot to see which variables turned up in the models. I use the R^2 scale and see my best model had an R^2 of 0.62. However when I make a linear model using lm() with the same x my R^2 is 0.45. Shouldn't I be seeing the
2005 May 11
2
Regsubsets()
Dear List members, I am using the regsubsets function to select a few predictor variables using Mallows' Cp: > sel.proc.regsub.full <- regsubsets(CO2 ~ v + log(v) + v.max + sd.v + tad + no.stops.km + av.stop.T + a + sd.a + a.max + d + sd.d + d.max + RPA + P + perc.stop.T + perc.a.T + perc.d.T + RPS + RPSS + sd.P.acc + P.dec + da.acc.1 + RMSACC + RDI + RPSI + P.acc + cov.v + cov.a +
2011 Dec 29
1
How would I rewrite my code so that I can implement the use of multicore on an Rstudio server to run regsubsets using the "exhaustive" method? The data has 1200 variables and 9000 obs so the code has been shortened here:
How would I rewrite my code so that I can implement the use of multicore on an Rstudio server to run regsubsets using the "exhaustive" method? The data has 1200 variables and 9000 obs so the code has been shortened here: model<-regsubsets(price~x + y + z + a + b + ...., data=sample, nvmax=500, method=c("exhaustive")) Our server is a quad core 7.5 gb ram, is that
2010 Nov 10
1
p-value from regsubsets
Hi, does anyone know if there is a way to easily extract p-values from the regsubsets() function? Thanks, James Stegen -- James C. Stegen NSF Postdoctoral Fellow in Bioinformatics University of North Carolina Chapel Hill, NC 919-962-8795 stegen at email.unc.edu http://www.unc.edu/~stegen/index.html
2008 Mar 10
2
write.table with row.names=FALSE unnecessarily slow?
write.table with large data frames takes quite a long time > system.time({ + write.table(df, '/tmp/dftest.txt', row.names=FALSE) + }, gcFirst=TRUE) user system elapsed 97.302 1.532 98.837 A reason is because dimnames is always called, causing 'anonymous' row names to be created as character vectors. Avoiding this in src/library/utils, along the lines of Index: