similar to: GAM: Overfitting

Displaying 20 results from an estimated 600 matches similar to: "GAM: Overfitting"

2007 Apr 04
3
Power analysis and mixed model
An embedded text encoded in an unknown character set was scrubbed... Name: not available URL: https://stat.ethz.ch/pipermail/r-help/attachments/20070404/0f61f54a/attachment.pl
2011 Aug 13
2
linear regression
dear R users, my data looks like this PM10 Ref UZ JZ WT RH FT WR 1 10.973195 4.338874 nein Winter Dienstag ja nein West 2 6.381684 2.250446 nein Sommer Sonntag nein ja Süd 3 62.586512 66.304869 ja Sommer Sonntag nein nein Ost 4 5.590101 8.526152 ja Sommer Donnerstag nein nein Nord 5 30.925054 16.073091 nein Winter Sonntag nein
2013 Feb 17
3
Select components of a list
Hi Gustav, Try this: lapply(1:length(models),function(i) lapply(models[[i]],function(x) summary(x)$coef[2,]))[[1]] #1st list component [[1]] #    Estimate   Std. Error      z value     Pr(>|z|) # pm10 #5.999185e-04 1.486195e-04 4.036606e+00 5.423004e-05 #[[2]] #    Estimate   Std. Error      z value     Pr(>|z|) #ozone #0.0010117294 0.0003792739 2.6675428048 0.0076408155 #[[3]] #
2005 Jan 13
2
GAM: Remedial measures
I fitted a GAM model with Poisson distribution to a data with about 200 observations. I noticed that the plot of the residuals versus fitted values show a trend. Residuals tend to be lower for higher fitted values. Because, I'm dealing with count data, I'm thinking that this might be due to overdispersion. Is there a way to account for overdispersion in any of the packages MGCV or GAM?
2008 May 21
2
an unknown error message when using gamm function
Dear everyone, I'm encountering an unknown error message when using gamm function: > fitoutput <- gamm(cvd~as.factor(dow)+pm10+s(time,bs="cr",k=15,fx=TRUE)+s(tmean,bs="cr",k=7,fx=TRUE) + ,correlation=corAR1(form=~1|city),family=poisson,random=list(city=~pm10),data=mimp) Maximum number of PQL iterations: 20 iteration 1 iteration 2 iteration 3 iteration 4
2006 May 08
1
Help on zoo and datetime series
Hello, i would like to import this txt file: Giorno;PM10 2006-01-01 10:10;10.3 2006-02-02 20:22;50.3 2006-03-03 23:33;20.1 ......... As it's an irregular time series i use zoo as follow: require(zoo) z <- read.table("c:\\1.csv", sep=";", na.strings="-999", header=TRUE) q <- zoo(z$PM10, strptime(as.character(z$Giorno),"%Y-%m-%d %H:%M")) At this
2007 Oct 03
1
How to avoid overfitting in gam(mgcv)
Dear listers, I'm using gam(from mgcv) for semi-parametric regression on small and noisy datasets(10 to 200 observations), and facing a problem of overfitting. According to the book(Simon N. Wood / Generalized Additive Models: An Introduction with R), it is suggested to avoid overfitting by inflating the effective degrees of freedom in GCV evaluation with increased "gamma"
2008 Aug 26
1
lattice: plotting an arbitrary number of panels, defining arbitrary groups
R Friends, I'm running R2.7.1 on Windows XP. I'm trying to get some lattice functionality which I have not seen previously documented--I'd like to plot the exact same data in multiple panels but changing the grouping variable each time so that each panel highlights a different feature of the data set. The following code does exactly that with a simple and fabricated air quality data
2004 Dec 22
2
GAM: Getting standard errors from the parametric terms in a GAM model
I am new to R. I'm using the function GAM and wanted to get standard errors and p-values for the parametric terms (I fitted a semi-parametric models). Using the function anova() on the object from GAM, I only get p-values for the nonparametric terms. Does anyone know if and how to get standard errors for the parametric terms? Thanks. Jean G. Orelien
2008 Feb 16
2
Possible overfitting of a GAM
The subject is a Generalized Additive Model. Experts caution us against overfitting the data, which can cause inaccurate results. I am not a statistician (my background is in Computer Science). Perhaps some kind soul would take a look and vet the model for overfitting the data. The study estimated the ebb and flow of traffic through a voting place. Just one voting place was studied; the
2011 Jun 21
5
please help for mgcv package
i read a book from WOOD, there's an example which is talking about the pollutant. library(gamair) library(mgcv) y<-gam(death~s(time,bs="cr",k=200)+s(pm10median,bs="cr")+s(so2median,bs="cr")+s(o3median,bs="cr")+s(tmpd,bs="cr"),data=chicago,family=Possion) lag.sum<-function(a,10,11) {n<-length(a) b<-rep(0,n-11) for(i in 0:(11-10))
2012 Aug 07
2
Passing arguments to a function within a function ...
Hallo Everybody How do you specify arguments for a function used within another function? Here is my problem: I am reconstructing a calculator for the burden of disease due to air pollution from publications and tools published by the WHO. The calculations make use of published dose-response relationships for particular health end-points. This is then applied to populations with known or
2010 Apr 08
2
Overfitting/Calibration plots (Statistics question)
This isn't a question about R, but I'm hoping someone will be willing to help. I've been looking at calibration plots in multiple regression (plotting observed response Y on the vertical axis versus predicted response [Y hat] on the horizontal axis). According to Frank Harrell's "Regression Modeling Strategies" book (pp. 61-63), when making such a plot on new data
2012 Feb 13
2
finding and describing missing data runs in a time series
Hi - I am trying to find and describe missing data in a time series. For instance, in the library openair, there is a data frame called "mydata": library(openair) head(mydata) date ws wd nox no2 o3 pm10 so2 co pm25 1 1998-01-01 00:00:00 0.60 280 285 39 1 29 4.7225 3.3725 NA 2 1998-01-01 01:00:00 2.16 230 NA NA NA 37 NA NA NA 3 1998-01-01 02:00:00
2008 Oct 17
1
ggplot2 scale relation free
I don't know if there is a way to use the scale relation free argument in ggplot2 like in lattice. I have a feeling that there is not, but I would like to make a plea for this feature. It would be nice to be able to plot Total Inorganic Nitrogen Total Phosphorus and the ratio of the two- the numbers on the axis are not related, but the previous two are surely related to the last (this ratio
2011 Aug 14
2
Central limit theorem
my data looks like this: PM10 Ref UZ JZ WT RH FT WR 1 10.973195 4.338874 nein Winter Dienstag ja nein West 2 6.381684 2.250446 nein Sommer Sonntag nein ja Süd 3 62.586512 66.304869 ja Sommer Sonntag nein nein Ost 4 5.590101 8.526152 ja Sommer Donnerstag nein nein Nord 5 30.925054 16.073091 nein Winter Sonntag nein nein Ost 6
2008 Mar 07
0
How to do a time-stratified case-crossover analysis for air pollution data?
Dear Experts, I am trying to do a time-stratified case-crossover analysis on air pollution data and number of myocardial infarctions. In order to avoid model selection bias, I started with a simple simulation. I'm still not sure if my simulation is right. But the results I get from the "ts-case-crossover" are much more variable than those from a glm. Is this: a. Due to
2008 Mar 07
0
How to do a time-stratified case-crossover analysis for air pollution data? Unformatted text-version, with an additional note
Dear Experts, I am trying to do a time-stratified case-crossover analysis on air pollution data and number of myocardial infarctions. In order to avoid model selection bias, I started with a simple simulation. I'm still not sure if my simulation is right. But the results I get from the "ts-case-crossover" are much more variable than those from a glm. Is this: a. Due to the simple
2013 Jan 15
0
e1071 SVM, cross-validation and overfitting
I am accustomed to the LIBSVM package, which provides cross-validation on training with the -v option % svm-train -v 5 ... This does 5 fold cross validation while building the model and avoids over-fitting. But I don't see how to accomplish that in the e1071 package. (I learned that svm(... cross=5 ...) only _tests_ using cross-validation -- it doesn't affect the training.) Can
2017 Nov 21
0
Do I need to transform backtest returns before using pbo (probability of backtest overfitting) package functions?
Hello, I'm trying to understand how to use the pbo package by looking at a vignette. I'm curious about a part of the vignette that creates simulated returns data. The package author transforms his simulated returns in a way that I'm unfamiliar with, and that I haven't been able to find an explanation for after searching around. I'm curious if I need to replicate the