C.Rosa at lse.ac.uk
2007-Jan-29 15:06 UTC
[R] Loop with string variable AND customizable "summary" output
Dear All,

I am using R for my research and I have two questions about it:

1) Is it possible to create a loop using a string, instead of a numeric vector? I have in mind a specific problem:

Suppose you have 2 countries, UK and USA, with one dependent variable (y) and one independent variable (x) for each country (that is: yUK, xUK, yUSA, xUSA), and you want to run the following regressions automatically:

    for (i in c("UK","USA"))
        output{i}<-summary(lm(y{i} ~ x{i}))

In other words, at the end I would like to have two objects as output, "outputUK" and "outputUSA", which contain the results of the first and second regression respectively (yUK on xUK and yUSA on xUSA).

2) In STATA there is a very nice command ("outreg") that displays your regression results nicely, and in the way the user wants.

Is there anything similar in R or in the R contributed packages? More precisely, I am thinking of something that is close in spirit to "summary" but is also customizable. For example, suppose you want different significance codes than

    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

or a different display format (e.g. without the "t value" column), applied automatically rather than by manually editing the output every time.

Alternatively, if I were able to see it, I could modify the source code of the function "summary", but I am not able to see its (line by line) code. Any ideas?

Or maybe a customizable regression output already exists?

Thanks a lot!

Carlo
Roger Bivand
2007-Jan-29 15:31 UTC
[R] Loop with string variable AND customizable "summary" output
On Mon, 29 Jan 2007 C.Rosa at lse.ac.uk wrote:

> 1) is it possible to create a loop using a string, instead of a numeric
> vector? [...]
>
> for (i in c("UK","USA"))
>     output{i}<-summary(lm(y{i} ~ x{i}))

The input data could be reshaped as y, x, country, with subset= used in the lm() call. To assign to named objects see assign(), but consider using a named list instead: create a list of the required length, assign to its elements in turn, and take the names from the defining vector. Then you'd get output$UK, etc.

> 2) in STATA there is a very nice code ("outreg") to display nicely (and
> as the user wants to) your regression results. [...]
>
> In alternative, if I was able to see it, I could modify the source code
> of the function "summary", but I am not able to see its (line by line)
> code. Any idea?

Write a custom function that works on the object returned by the summary() method for the lm object (that is, on the summary.lm object). Use str() to look at the summary.lm object and see which components you want.

-- 
Roger Bivand
Economic Geography Section, Department of Economics,
Norwegian School of Economics and Business Administration,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no
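The named-list suggestion above (data reshaped as y, x, country, with subset= in the lm() call) might look roughly like the following. This is only a sketch: the data frame dat and its simulated values are made up for illustration and are not from the original post.

```r
# Hypothetical long-format data: one row per observation, plus a country column.
set.seed(1)
dat <- data.frame(
  country = rep(c("UK", "USA"), each = 20),
  x = rnorm(40)
)
dat$y <- 1 + 2 * dat$x + rnorm(40)

countries <- c("UK", "USA")

# A list of the required length, with names taken from the defining vector.
output <- vector("list", length(countries))
names(output) <- countries

for (i in countries) {
  # subset= is evaluated inside dat, so `country` refers to dat$country
  output[[i]] <- summary(lm(y ~ x, data = dat, subset = country == i))
}

output$UK   # summary of the UK regression
```

Keeping the results in one list means they can be processed together later, for example with lapply(output, coef).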
Wensui Liu
2007-Jan-29 15:39 UTC
[R] Loop with string variable AND customizable "summary" output
Carlo,

try something like:

    # assumes the data are stacked into vectors y, x and country
    # (one element per observation)
    for (i in c("UK", "USA")) {
      # subset= belongs inside lm(), and the comparison needs ==
      summ <- summary(lm(y ~ x, subset = (country == i)))
      # create outputUK, outputUSA in the current environment
      assign(paste('output', i, sep = ''), summ)
    }

(note: it is untested, sorry).

-- 
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)
Vladimir Eremeev
2007-Jan-29 15:50 UTC
[R] Loop with string variable AND customizable "summary" output
C.Rosa wrote:
> 1) is it possible to create a loop using a string, instead of a numeric
> vector? [...]
>
> for (i in c("UK","USA"))
>     output{i}<-summary(lm(y{i} ~ x{i}))

Consider the R functions bquote, substitute, eval and parse. Several examples are given somewhere in R News (http://cran.r-project.org/doc/Rnews/). Unfortunately I don't remember exactly which issue; a list member sent me a link to the article several years ago, when I was studying a similar question.

C.Rosa wrote:
> 2) I am thinking of something that is close in spirit to "summary" but it
> is also customizable. [...]
>
> In alternative, if I was able to see it, I could modify the source code of
> the function "summary", but I am not able to see its (line by line) code.
> Any idea?

Stars and significance codes are produced by the symnum function.

To customize the summary, explore the object returned by lm. For example, run

    str(outputUK)

and you will see that it is a list. You can then refer to its elements with $ (say, outputUK$coeff).

R is an object-oriented language, and calling the same generic function on objects of different classes usually invokes different functions (methods), provided a suitable method is defined for the class. The R manuals describe this mechanism very well.

The function methods() lists all the methods defined for a generic. For example:

    > methods(summary)
    > methods(print)

If you are working with lm results, you need to explore the function print.summary.lm:

    > summary(outputUK)

invokes the summary.lm function, because outputUK is an object of class "lm". That function produces an object of class "summary.lm", which is then printed by the method print.summary.lm.
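As a rough sketch of the approach described above (work on the summary.lm object and call symnum yourself), the helper below builds a coefficient table without the "t value" column and with a different set of significance codes. The function name my_coef_table, its arguments and the chosen cutpoints are assumptions made for illustration, and the built-in cars data stands in for real data.

```r
# Hypothetical helper (not part of R): a coefficient table built from a
# summary.lm object, without the "t value" column and with custom stars.
my_coef_table <- function(fit,
                          cutpoints = c(0, 0.01, 0.05, 0.1, 1),
                          symbols = c("***", "**", "*", " ")) {
  s <- summary(fit)                 # object of class "summary.lm"
  co <- s$coefficients              # matrix: Estimate, Std. Error, t value, Pr(>|t|)
  p <- co[, "Pr(>|t|)"]
  # symnum() maps each p-value to the symbol of the interval it falls in
  stars <- symnum(p, corr = FALSE, na = FALSE,
                  cutpoints = cutpoints, symbols = symbols)
  data.frame(
    Estimate = co[, "Estimate"],
    Std.Error = co[, "Std. Error"],
    p.value = p,
    signif = as.character(stars),
    row.names = rownames(co),
    stringsAsFactors = FALSE
  )
}

fit <- lm(dist ~ speed, data = cars)
tab <- my_coef_table(fit)
print(tab)

# The source of the default printing method can be inspected with:
#   stats:::print.summary.lm
#   getAnywhere("print.summary.lm")
```

Returning a data frame rather than printing directly leaves the formatting (rounding, column order, export) to the caller.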
Vladimir Eremeev
2007-Jan-29 16:07 UTC
[R] Loop with string variable AND customizable "summary" output
That is,

C.Rosa wrote:
> for (i in c("UK","USA"))
>     output{i}<-summary(lm(y{i} ~ x{i}))

    for (i in c("UK","USA")) {
      # 1. produce a character string containing the needed expression
      lm.txt <- paste("output", i, "<-", "lm(", "y", i, "~", "x", i, ")", sep="")
      # 2. parse and evaluate it
      eval(parse(text=lm.txt))
    }
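To make the two steps concrete, here is the same loop run on simulated data, printing the string that is built on each pass. The variables yUK, xUK, yUSA and xUSA are named as in the original question, but their values are invented for this sketch.

```r
# Hypothetical sample data, named the way the original question describes.
set.seed(42)
xUK  <- rnorm(30); yUK  <- 1 + 2 * xUK  + rnorm(30)
xUSA <- rnorm(30); yUSA <- 3 - 1 * xUSA + rnorm(30)

for (i in c("UK", "USA")) {
  # Step 1: build the expression as text, e.g. "outputUK<-lm(yUK~xUK)"
  lm.txt <- paste("output", i, "<-", "lm(", "y", i, "~", "x", i, ")", sep = "")
  print(lm.txt)
  # Step 2: parse the text into an expression and evaluate it,
  # which creates outputUK and outputUSA
  eval(parse(text = lm.txt))
}

coef(outputUK)
coef(outputUSA)
```

Because the call is parsed from text, the Call: stored in each fit names the actual variables, for example lm(formula = yUK ~ xUK).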
Gabor Grothendieck
2007-Jan-29 16:27 UTC
[R] Loop with string variable AND customizable "summary" output
Often you will find that if you arrange your data in a suitable way in the first place, everything becomes easier. What you really want is a data frame such as the last three columns of the built-in data frame CO2, where Treatment corresponds to country and the two numeric variables correspond to your y and x. Then it's easy:

    lapply(levels(CO2$Treatment), function(lev)
      lm(uptake ~ conc, CO2, subset = Treatment == lev))

The only problem with the above is that the Call: in the output does not tell you which level of Treatment is being used, since it literally shows "lm(uptake ~ conc, CO2, subset = Treatment == lev)" each time. To get around this, substitute the value of lev in. Because R uses delayed evaluation, you also need to force the evaluation of lev prior to substituting it in:

    lapply(levels(CO2$Treatment), function(lev) {
      lev <- force(lev)
      eval(substitute(lm(uptake ~ conc, CO2, subset = Treatment == lev)),
           list(lev = lev))
    })

Now if you really want to do it the way you originally specified, try this. Suppose we use attach to grab the variables x1, x2, x3, x4, y1, y2, y3, y4 out of the built-in anscombe data frame, just to get our hands on some sample data. In your case the variables would already be in the workspace, so the attach is not needed. Then simply reconstruct the formula in fo. You could simply use lm(fo), but then the Call: in the output of lm would literally read lm(fo), so it's better to use do.call:

    # next line gives the variables x1, x2, x3, x4, y1, y2, y3, y4
    # from the built-in anscombe data set.
    # In your case such variables would already exist.
    attach(anscombe)
    lapply(1:4, function(i) {
      ynm <- paste("y", i, sep = "")
      xnm <- paste("x", i, sep = "")
      fo <- as.formula(paste(ynm, "~", xnm))
      do.call("lm", list(fo))
    })
    detach(anscombe)

Actually this is not really a recommended way of proceeding. You would be better off putting all your variables in a data frame and using that. If all the variables have the same length, you could use a form such as anscombe in the first place:

    lapply(1:4, function(i) {
      fo <- as.formula(paste(names(anscombe)[i+4], "~", names(anscombe)[i]))
      do.call("lm", list(fo, data = quote(anscombe)))
    })

or

    lapply(1:4, function(i) {
      fo <- y ~ x
      fo[[2]] <- as.name(names(anscombe)[i+4])
      fo[[3]] <- as.name(names(anscombe)[i])
      do.call("lm", list(fo, data = quote(anscombe)))
    })
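The point about the Call: can be checked directly by comparing the call stored in a fit made with lm(fo) against one made through do.call. This is a small sketch on the built-in anscombe data; the object names fit_plain and fit_docall are made up for illustration.

```r
fo <- as.formula("y1 ~ x1")

# Calling lm() directly records the symbol `fo` in the call
fit_plain <- lm(fo, data = anscombe)

# do.call() passes the formula itself, so the call records y1 ~ x1;
# quote(anscombe) keeps the data argument as a name rather than the full data
fit_docall <- do.call("lm", list(fo, data = quote(anscombe)))

call_plain  <- deparse(fit_plain$call)
call_docall <- paste(deparse(fit_docall$call), collapse = "")

print(call_plain)
print(call_docall)

# Both fits estimate the same model; only the recorded call differs
all.equal(coef(fit_plain), coef(fit_docall))
```

Without quote(), the whole data frame would be embedded in the stored call, which makes the printed Call: unreadable.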