similar to: model selection using ANOVA

Displaying 20 results from an estimated 9000 matches similar to: "model selection using ANOVA"

2008 Nov 17
2
looking for matches
My question is probably pretty basic, but since I'm really new to R, here goes. I have two separate data frames that include class names and various other information on classes. I'm trying to create a match based on class names and, if a match exists, to create a third data frame with the class name. I was hoping to accomplish this with sapply, but I can't figure out if I can
2011 Jun 20
3
importing a file
I haven't used R in a couple of years, and now am trying something as simple as importing a csv file and am running into problems right away. mydata <- read.csv (Wordata1.csv, sep="") Error in read.table(file = file, header = header, sep = sep, quote = quote, : object 'Wordata1.csv' not found. I've tried both read.csv and read.table and still get the
2009 Jan 21
2
Replacing dates with consecutive observations
I am working with a list of dates and I would like to replace each date with the one that comes after, i.e. 1/1/07 will become 1/5/07, 1/5/07 will become 1/7/07, etc. The number of days between my dates always varies, so I can't just increase each one by 5 days or so. Does anyone know of a way I can do this in R? Thank you.
2011 Jun 21
5
converting character to numeric
I'm trying to convert data from character to numeric. I've imported data as a csv file, and I'm assuming that the import is a database. Are all the columns in a database considered "vectors", and can they be operated on individually? Therefore I've tried the following: mydata <- as.numeric(mydata$apples) When I then look at mydata again the named column is still
2009 Feb 19
2
colored maps again
I'm trying to create a colored map that would show the number of students per state. My data frame consists of two columns - state and count. I'm using the following code library(maps) map("usa") library(plotrix) state.col<-color.scale(gre$count,0,0,c(0,1)) map("state",fill=TRUE,col=state.col) I'm getting a map, but the values are not being mapped to correct
2009 Mar 31
1
target of assignment expands to non-language object
I'm running the following code: numbers <- 1:50 for (i in 1:50) assign(paste("model",numbers[i]),i)<-(lm(temp$Overall.Scaled.Score~temp$raw.score)) where I want R to create 50 different models-1:50, but I get the following error message: "target of assignment expands to non-language object". I've tried it with
2009 Feb 17
2
creating a map
I'm trying to create a fairly basic map using R. What I want to get is the map of the country with circles representing a count of students in each state. What I've done so far is as follows: map("state") symbols(data1$count,circles=log(data1$count)*3,fg=col,bg=col,add=T,inches=F) This gives me the map of the country, but one that's not populated by my counts. Does
2010 Jul 05
2
Can anybody help me understand AIC and BIC and devise a new metric?
Hi all, Could anybody please help me understand AIC and BIC, and especially why they make sense? Furthermore, I am trying to devise a new metric related to model selection in the financial asset management industry. As you know, the industry uses the Sharpe Ratio as the main performance benchmark, which is the annualized mean of returns divided by the annualized standard deviation of returns.
2009 Mar 04
1
mapping lat and long with maps package
I am trying to overlay a data frame with latitude and longitude (which refer to zip codes) on the map of the US that I get by using map ("states"). Is there any way to do this, or do I have to resort to using maptools? Thank you.
2008 Jun 21
1
stepAIC {MASS}
In a generalized linear model with k covariates, there are 2^k - 1 possible models (excluding interactions). A while ago, a posting to R-help suggested "Model Selection and Multimodel Inference", 2nd ed., by Burnham and Anderson as a good source for understanding model selection. They recommend (page 71) computing AIC differences over all candidate models in the set of possible models. After
2006 Jan 30
4
Logistic regression model selection with overdispersed/autocorrelated data
I am creating habitat selection models for caribou and other species with data collected from GPS collars. In my current situation the radio-collars recorded the locations of 30 caribou every 6 hours. I am then comparing resources used at caribou locations to random locations using logistic regression (standard habitat analysis). The data is therefore highly autocorrelated and this causes Type
2010 Jun 29
1
Model validation and penalization with rms package
I've been using Frank Harrell's rms package to do bootstrap model validation. Is it the case that the optimum penalization may still give a model which is substantially overfitted? I calculated corrected R^2, optimism in R^2, and corrected slope for various penalties for a simple example: x1 <- rnorm(45) x2 <- rnorm(45) x3 <- rnorm(45) y <- x1 + 2*x2 + rnorm(45,0,3) ols0 <- ols(y
2004 Jun 01
1
multi-model inference
Hello, I've been investigating using multi-model inference, based on calculating AIC and AIC weights, using the techniques outlined in Burnham and Anderson's (2002) book. However, I notice a couple of emails in the R-help archive stating that there are errors in the technique. Are these errors associated with the particular implementation that B & A propose in their text, or is the
2006 Feb 20
1
Nested AIC
Greetings, I have recently come into some confusion over whether or not AIC results for comparing among models require that they be nested. Reading Burnham & Anderson (2002), they are explicit that nested models are not required, but other respected statisticians have suggested that nesting is a prerequisite for comparison. Could anyone who feels strongly regarding either position
2007 Dec 04
1
Best forecasting methods with Time Series ?
Hello, In order to do a future forecast based on my past Time Series data sets (salespricesproduct1, salespricesproduct2, etc.), I used arima() functions with different parameter combinations which give the smallest AIC. I also used auto.arima() which finds the parameters with the smallest AICs. But unfortunately I could not get satisfactory forecast() results, even sometimes catastrophic
2002 Mar 01
2
step, leaps, lasso, LSE or what?
Hi, I am trying to understand the alternative methods that are available for selecting variables in a regression without simply imposing my own bias (having "good judgement"). The methods implemented in leaps and step and stepAIC seem to fall into the general class of stepwise procedures. But these are commonly condemned for inducing overfitting. In Hastie, Tibshirani and Friedman
2007 Mar 09
4
Reg. strings and numeric data in matrix.
Hi All, Sorry for this basic question, as I am new to R. I would like to know: is it possible to have a matrix with some columns holding numeric data and others holding character (string) data? How do I get this type of data from a flat file? Thanks very much, mallika Mallika Veeramalai, Ph.D.,
2004 Dec 22
2
GAM: Overfitting
I am analyzing particulate matter data (PM10) on a small data set (147 observations). I fitted a semi-parametric model and am worried about overfitting. How can one check for model fit in GAM? Jean G. Orelien
2003 Jul 04
1
Quasi AIC
Dear all, Using the quasibinomial and quasipoisson families results in no AIC being calculated. However, a quasi AIC has actually been defined by Lebreton et al. (1992). In the (in my opinion, at least) very interesting book by Burnham and Anderson (1998, 2002) this QAIC (and also QAICc) is covered. Maybe this is something that could be implemented in R. Take a look at page 23 in this pdf:
2003 Sep 16
1
simplifying randomForest(s)
Dear All, I have been using the randomForest package for a couple of difficult prediction problems (which also share p >> n). The performance is good, but since all the variables in the data set are used, interpretation of what is going on is not easy, even after looking at variable importance as produced by the randomForest run. I have tried a simple "variable selection"