Hi

I have the following code, which I would like to simplify. It does linear regressions and returns the R-squared values and the coefficients. It runs slowly, as it redoes the regression for each value separately. Is it possible to get the values in a data frame that looks as follows:

expert | xx | seeds | r.squared | slope | intercept

Thanks in advance,

Rainer

    library(reshape)

    # R-squared of the fit for each expert / xx / seeds combination
    rsqs <- as.data.frame(
        stamp(tc.long, expert * xx * seeds ~ .,
              function(df) try(summary(lm(distance ~ generation, data=df))$r.squared,
                               silent=TRUE)))

    # coefficients[2]: slope estimate
    slope <- as.data.frame(
        stamp(tc.long, expert * xx * seeds ~ .,
              function(df) try(summary(lm(distance ~ generation, data=df))$coefficients[2],
                               silent=TRUE)))

    # coefficients[4]: standard error of the slope
    d.slope <- as.data.frame(
        stamp(tc.long, expert * xx * seeds ~ .,
              function(df) try(summary(lm(distance ~ generation, data=df))$coefficients[4],
                               silent=TRUE)))

    # coefficients[1]: intercept estimate
    intercept <- as.data.frame(
        stamp(tc.long, expert * xx * seeds ~ .,
              function(df) try(summary(lm(distance ~ generation, data=df))$coefficients[1],
                               silent=TRUE)))

    # coefficients[3]: standard error of the intercept
    d.intercept <- as.data.frame(
        stamp(tc.long, expert * xx * seeds ~ .,
              function(df) try(summary(lm(distance ~ generation, data=df))$coefficients[3],
                               silent=TRUE)))

--
Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation Biology (UCT)
Department of Conservation Ecology and Entomology
University of Stellenbosch
Matieland 7602
South Africa
Tel: +27 - (0)72 808 2975 (w)
Fax: +27 - (0)21 808 3304
Cell: +27 - (0)83 9479 042
email: RKrug at sun.ac.za
       Rainer at krugs.de

I think your script is slow because it has to recalculate the same model five times. I've tried to avoid this by rewriting your function(df).

    function(df){
      fit <- summary(lm(distance ~ generation, data=df))
      result <- c(fit$$r.squared, $coefficients[2], $coefficients[4],
                  $coefficients[1], $coefficients[3])
      names(result) <- c("rsqs", "slope", "d.slope", "intercept", "d.intercept"),
    }

Cheers,

Thierry

----------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be

Do not put your faith in what statistics say until you have carefully considered what they do not say. ~William W. Watt
A statistical analysis, properly conducted, is a delicate dissection of uncertainties, a surgery of suppositions. ~M.J.Moroney

-----Original message-----
From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On behalf of Rainer M Krug
Sent: Wednesday 25 October 2006 11:22
To: r-help at stat.math.ethz.ch
Subject: [R] simplification of code using stamp?

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Resending the function because of a typo in the result vector.

    function(df){
      fit <- summary(lm(distance ~ generation, data=df))
      result <- c(fit$r.squared, fit$coefficients[2], fit$coefficients[4],
                  fit$coefficients[1], fit$coefficients[3])
      names(result) <- c("rsqs", "slope", "d.slope", "intercept", "d.intercept")
    }

Cheers,

Thierry

-----Original message-----
From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On behalf of ONKELINX, Thierry
Sent: Wednesday 25 October 2006 11:36
To: RKrug at sun.ac.za; r-help at stat.math.ethz.ch
Subject: Re: [R] simplification of code using stamp?
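
For reference, a small sketch of what the numeric indices in the function above pick out. The coefficients element of summary(lm(...)) is a matrix with one row per model term and the columns Estimate, Std. Error, t value and Pr(>|t|). Indexing it with a single number runs down the columns, so with one predictor positions 1 and 2 are the intercept and slope estimates and positions 3 and 4 are their standard errors. The data here are made up for illustration; only the column names distance and generation come from the thread.

```r
# Made-up data; only the column names come from the thread.
set.seed(1)
df <- data.frame(generation = 1:10)
df$distance <- 2 + 0.5 * df$generation + rnorm(10, sd = 0.1)

fit <- summary(lm(distance ~ generation, data = df))
cf <- fit$coefficients   # 2 x 4 matrix: rows are terms, columns are statistics

# Single-number indexing runs down the columns (column-major order),
# so the positions used in the thread correspond to these named cells.
by_position <- c(cf[2], cf[4], cf[1], cf[3])
by_name <- c(cf["generation", "Estimate"],
             cf["generation", "Std. Error"],
             cf["(Intercept)", "Estimate"],
             cf["(Intercept)", "Std. Error"])

print(cf)
print(isTRUE(all.equal(by_position, by_name)))
```

Indexing by row and column name is longer to type, but it keeps working if a second predictor is added to the model, whereas the positional indices would then point at different cells.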

ONKELINX, Thierry wrote:
> Resending the function because of a typo in the result vector.

You are right - the problem is that it is calculating the regressions five times. I implemented your code, and when I print the resulting data.frame from

    test <- as.data.frame(
        stamp(tc.long, expert * xx * seeds * run ~ .,
              function(df) {
                  try({
                      fit <- summary( lm(distance ~ generation, data=df) )
                      result <- c(fit$r.squared, fit$coefficients[2], fit$coefficients[4],
                                  fit$coefficients[1], fit$coefficients[3])
                      names(result) <- c("rsqs", "slope", "d.slope", "intercept", "d.intercept")
                  }, silent=TRUE)
              }))

I get the following, which is not what I am looking for.

    > test[1:10,]
       expert    xx seeds run                                        value
    1      BW x0010    25   1 rsqs, slope, d.slope, intercept, d.intercept
    2      BW x0010    25   2 rsqs, slope, d.slope, intercept, d.intercept
    3      BW x0010    25   3 rsqs, slope, d.slope, intercept, d.intercept
    4      BW x0010    25   4 rsqs, slope, d.slope, intercept, d.intercept
    5      BW x0010    25   5 rsqs, slope, d.slope, intercept, d.intercept
    6      BW x0010    28   1 rsqs, slope, d.slope, intercept, d.intercept
    7      BW x0010    28   2 rsqs, slope, d.slope, intercept, d.intercept
    8      BW x0010    28   3 rsqs, slope, d.slope, intercept, d.intercept
    9      BW x0010    28   4 rsqs, slope, d.slope, intercept, d.intercept
    10     BW x0010    28   5 rsqs, slope, d.slope, intercept, d.intercept

--
Rainer M. Krug
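
A likely reason the value column shows the names rather than the numbers: an R function (and a braced block passed to try()) returns the value of its last expression, and here the last expression is the assignment names(result) <- c(...), whose value is the right-hand side, that is, the character vector of names. Ending the block with result returns the named numeric vector instead. The sketch below shows the difference on made-up data. It does not call stamp(), because the thread does not show how stamp() arranges a vector-valued result; as a stand-in it builds the requested data frame with base R (split(), lapply() and rbind()). The names fit_names, fit_one and stat_names and the simulated tc.long are hypothetical and only for illustration.

```r
# Made-up stand-in for tc.long; the column names follow the thread.
set.seed(42)
tc.long <- expand.grid(generation = 1:10,
                       seeds = c(25, 28),
                       xx = c("x0010", "x0020"),
                       expert = c("BW", "XY"),
                       stringsAsFactors = FALSE)
tc.long$distance <- 1 + 0.3 * tc.long$generation + rnorm(nrow(tc.long))

stat_names <- c("rsqs", "slope", "d.slope", "intercept", "d.intercept")

# As in the thread: the last expression is the names<- assignment,
# so the function returns the character vector of names.
fit_names <- function(df) {
  fit <- summary(lm(distance ~ generation, data = df))
  result <- c(fit$r.squared, fit$coefficients[2], fit$coefficients[4],
              fit$coefficients[1], fit$coefficients[3])
  names(result) <- stat_names
}

# Same body, but ending with `result`, so the numbers are returned.
fit_one <- function(df) {
  fit <- summary(lm(distance ~ generation, data = df))
  result <- c(fit$r.squared, fit$coefficients[2], fit$coefficients[4],
              fit$coefficients[1], fit$coefficients[3])
  names(result) <- stat_names
  result
}

# Base-R stand-in for the grouping step: one model fit per
# expert / xx / seeds combination, one output row per fit.
keys <- c("expert", "xx", "seeds")
groups <- split(tc.long, tc.long[keys], drop = TRUE)
rows <- lapply(groups, function(df) {
  est <- try(fit_one(df), silent = TRUE)
  if (inherits(est, "try-error")) {
    # Keep the row, with missing values, when a fit fails.
    est <- setNames(rep(NA_real_, length(stat_names)), stat_names)
  }
  cbind(df[1, keys], as.data.frame(as.list(est)))
})
res <- do.call(rbind, rows)
rownames(res) <- NULL

print(fit_names(tc.long))   # the names, as in the output above
print(res)
```

Each model is fitted once per group, and the result has the columns expert, xx and seeds followed by one column per statistic.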