thr3ads.net - similar to: "GAM: Overfitting"

Displaying 20 results from an estimated 600 matches similar to: "GAM: Overfitting"

2007 Apr 04

Power analysis and mixed model

An embedded text encoded in an unknown character set was scrubbed... Name: not available URL: https://stat.ethz.ch/pipermail/r-help/attachments/20070404/0f61f54a/attachment.pl

linear regression

2011 Aug 13

Dear R users, my data looks like this:
       PM10       Ref   UZ     JZ         WT   RH   FT   WR
1 10.973195  4.338874 nein Winter   Dienstag   ja nein West
2  6.381684  2.250446 nein Sommer    Sonntag nein   ja  Süd
3 62.586512 66.304869   ja Sommer    Sonntag nein nein  Ost
4  5.590101  8.526152   ja Sommer Donnerstag nein nein Nord
5 30.925054 16.073091 nein Winter    Sonntag nein

Select components of a list

2013 Feb 17

Hi Gustav, Try this:
lapply(1:length(models),function(i) lapply(models[[i]],function(x) summary(x)$coef[2,]))[[1]] #1st list component
[[1]]
#    Estimate   Std. Error      z value     Pr(>|z|)
# pm10
#5.999185e-04 1.486195e-04 4.036606e+00 5.423004e-05
#[[2]]
#    Estimate   Std. Error      z value     Pr(>|z|)
#ozone
#0.0010117294 0.0003792739 2.6675428048 0.0076408155
#[[3]]
#

GAM: Remedial measures

2005 Jan 13

I fitted a GAM with a Poisson distribution to a data set with about 200 observations. I noticed that the plot of the residuals versus fitted values shows a trend: residuals tend to be lower for higher fitted values. Because I'm dealing with count data, I'm thinking that this might be due to overdispersion. Is there a way to account for overdispersion in either of the packages mgcv or gam?

an unknown error message when using gamm function

2008 May 21

Dear everyone, I'm encountering an unknown error message when using the gamm function:
> fitoutput <- gamm(cvd~as.factor(dow)+pm10+s(time,bs="cr",k=15,fx=TRUE)+s(tmean,bs="cr",k=7,fx=TRUE)
+ ,correlation=corAR1(form=~1|city),family=poisson,random=list(city=~pm10),data=mimp)
Maximum number of PQL iterations: 20
iteration 1
iteration 2
iteration 3
iteration 4

Help on zoo and datetime series

2006 May 08

Hello, I would like to import this txt file:
Giorno;PM10
2006-01-01 10:10;10.3
2006-02-02 20:22;50.3
2006-03-03 23:33;20.1
.........
As it's an irregular time series, I use zoo as follows:
require(zoo)
z <- read.table("c:\\1.csv", sep=";", na.strings="-999", header=TRUE)
q <- zoo(z$PM10, strptime(as.character(z$Giorno),"%Y-%m-%d %H:%M"))
At this

How to avoid overfitting in gam(mgcv)

2007 Oct 03

Dear listers, I'm using gam (from mgcv) for semi-parametric regression on small and noisy datasets (10 to 200 observations), and I'm facing a problem of overfitting. The book (Simon N. Wood, Generalized Additive Models: An Introduction with R) suggests avoiding overfitting by inflating the effective degrees of freedom in GCV evaluation with increased "gamma"
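For reference, the "gamma" adjustment described in this post can be written out as a formula. This is only a sketch of the GCV score in the form it is usually given for mgcv; the symbols (n, D, A, γ, λ) are notation chosen here for illustration and are not quoted from the post:

```latex
\mathcal{V}_g(\lambda) \;=\; \frac{n \, D(\hat{\beta}_\lambda)}{\left[\, n - \gamma \, \operatorname{tr}(A_\lambda) \,\right]^{2}}, \qquad \gamma \ge 1
```

Here n is the number of observations, D the deviance of the fitted model, and tr(A) the effective degrees of freedom (the trace of the influence matrix) at smoothing parameter λ. With γ = 1 this is ordinary GCV; a larger γ charges more for each effective degree of freedom, so the minimizer tends toward smoother fits. In mgcv this should correspond to the `gamma` argument of `gam()`, and a value around 1.4 is the figure commonly suggested, though the right choice depends on the data.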

lattice: plotting an arbitrary number of panels, defining arbitrary groups

2008 Aug 26

R Friends, I'm running R2.7.1 on Windows XP. I'm trying to get some lattice functionality which I have not seen previously documented--I'd like to plot the exact same data in multiple panels but changing the grouping variable each time so that each panel highlights a different feature of the data set. The following code does exactly that with a simple and fabricated air quality data

GAM: Getting standard errors from the parametric terms in a GAM model

2004 Dec 22

I am new to R. I'm using the function GAM and want to get standard errors and p-values for the parametric terms (I fitted a semi-parametric model). Using the function anova() on the object from GAM, I only get p-values for the nonparametric terms. Does anyone know if and how to get standard errors for the parametric terms? Thanks. Jean G. Orelien

Possible overfitting of a GAM

2008 Feb 16

The subject is a Generalized Additive Model. Experts caution us against overfitting the data, which can cause inaccurate results. I am not a statistician (my background is in Computer Science). Perhaps some kind soul would take a look and vet the model for overfitting the data. The study estimated the ebb and flow of traffic through a voting place. Just one voting place was studied; the

please help for mgcv package

2011 Jun 21

I read a book by Wood; there's an example in it about pollutants. library(gamair) library(mgcv) y<-gam(death~s(time,bs="cr",k=200)+s(pm10median,bs="cr")+s(so2median,bs="cr")+s(o3median,bs="cr")+s(tmpd,bs="cr"),data=chicago,family=Possion) lag.sum<-function(a,10,11) {n<-length(a) b<-rep(0,n-11) for(i in 0:(11-10))

Passing arguments to a function within a function ...

2012 Aug 07

Hello everybody, how do you specify arguments for a function used within another function? Here is my problem: I am reconstructing a calculator for the burden of disease due to air pollution from publications and tools published by the WHO. The calculations make use of published dose-response relationships for particular health end-points. This is then applied to populations with known or

Overfitting/Calibration plots (Statistics question)

2010 Apr 08

This isn't a question about R, but I'm hoping someone will be willing to help. I've been looking at calibration plots in multiple regression (plotting observed response Y on the vertical axis versus predicted response [Y hat] on the horizontal axis). According to Frank Harrell's "Regression Modeling Strategies" book (pp. 61-63), when making such a plot on new data

finding and describing missing data runs in a time series

2012 Feb 13

Hi - I am trying to find and describe missing data in a time series. For instance, in the library openair, there is a data frame called "mydata":
library(openair)
head(mydata)
                 date   ws  wd nox no2 o3 pm10    so2     co pm25
1 1998-01-01 00:00:00 0.60 280 285  39  1   29 4.7225 3.3725   NA
2 1998-01-01 01:00:00 2.16 230  NA  NA NA   37     NA     NA   NA
3 1998-01-01 02:00:00

ggplot2 scale relation free

2008 Oct 17

I don't know if there is a way to use the "scale relation free" argument in ggplot2 as in lattice. I have a feeling that there is not, but I would like to make a plea for this feature. It would be nice to be able to plot Total Inorganic Nitrogen, Total Phosphorus, and the ratio of the two. The numbers on the axes are not related, but the first two are surely related to the last (this ratio

Central limit theorem

2011 Aug 14

My data looks like this:
       PM10       Ref   UZ     JZ         WT   RH   FT   WR
1 10.973195  4.338874 nein Winter   Dienstag   ja nein West
2  6.381684  2.250446 nein Sommer    Sonntag nein   ja  Süd
3 62.586512 66.304869   ja Sommer    Sonntag nein nein  Ost
4  5.590101  8.526152   ja Sommer Donnerstag nein nein Nord
5 30.925054 16.073091 nein Winter    Sonntag nein nein  Ost
6

How to do a time-stratified case-crossover analysis for air pollution data?

2008 Mar 07

Dear Experts, I am trying to do a time-stratified case-crossover analysis on air pollution data and number of myocardial infarctions. In order to avoid model selection bias, I started with a simple simulation. I'm still not sure if my simulation is right. But the results I get from the "ts-case-crossover" are much more variable than those from a glm. Is this: a. Due to

How to do a time-stratified case-crossover analysis for air pollution data? Unformatted text-version, with an additional note

2008 Mar 07

e1071 SVM, cross-validation and overfitting

2013 Jan 15

I am accustomed to the LIBSVM package, which provides cross-validation on training with the -v option % svm-train -v 5 ... This does 5 fold cross validation while building the model and avoids over-fitting. But I don't see how to accomplish that in the e1071 package. (I learned that svm(... cross=5 ...) only _tests_ using cross-validation -- it doesn't affect the training.) Can

Do I need to transform backtest returns before using pbo (probability of backtest overfitting) package functions?

2017 Nov 21

Hello, I'm trying to understand how to use the pbo package by looking at a vignette. I'm curious about a part of the vignette that creates simulated returns data. The package author transforms his simulated returns in a way that I'm unfamiliar with, and that I haven't been able to find an explanation for after searching around. I'm curious if I need to replicate the
