I think I'm missing something. I have a data frame that looks like the one below.

sample.df<-data.frame(var1=rbinom(50, size=1, prob=0.5), var2=rbinom(50, size=2, prob=0.5), var3=rbinom(50, size=3, prob=0.5), var4=rbinom(50, size=2, prob=0.5), var5=rbinom(50, size=2, prob=0.5))

I'd like to run a series of univariate generalized linear models where var1 is always the dependent variable and each of the other variables is the independent variable. Then I'd like to summarize each in a table.
I've tried:

sample.formula=list(var1~var2, var1 ~var3, var1 ~var4, var1~var5)
mapply(glm, formula=sample.formula, data=list(sample.df), family='binomial')

And that works pretty well, except that I'm left with a matrix that contains all the information I need. I can't figure out how to use summary() properly on this matrix to report that information usefully.

Thank you for any suggestions.

*********************************
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
N3T 2C9

On Dec 17, 2013, at 5:53 PM, Simon Kiss wrote:

> I think I'm missing something. I have a data frame that looks like the one below.
> sample.df<-data.frame(var1=rbinom(50, size=1, prob=0.5), var2=rbinom(50, size=2, prob=0.5), var3=rbinom(50, size=3, prob=0.5), var4=rbinom(50, size=2, prob=0.5), var5=rbinom(50, size=2, prob=0.5))
>
> I'd like to run a series of univariate generalized linear models where var1 is always the dependent variable and each of the other variables is the independent variable. Then I'd like to summarize each in a table.
> I've tried:
>
> sample.formula=list(var1~var2, var1 ~var3, var1 ~var4, var1~var5)
> mapply(glm, formula=sample.formula, data=list(sample.df), family='binomial')
>
> And that works pretty well, except that I'm left with a matrix that contains all the information I need. I can't figure out how to use summary() properly on this matrix to report that information usefully.

The default for mapply's SIMPLIFY argument is TRUE. If you do not want a matrix, then set it to FALSE and the list items will retain their glm-object status. (The summary function applied to the resulting list is still a bit strange, but it is recognizable as having class 'glm' at the end. You should be able to extract the bits that you want and ignore the strange $call item.)

--
David Winsemius
Alameda, CA, USA
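
For readers following along, here is a minimal sketch of the SIMPLIFY = FALSE suggestion, using the data and formulas from the original post. The set.seed() call and the object names models and coef.tables are assumptions added for illustration; they do not appear in the thread.

```r
# Simulated data, as in the original post; the seed is an added assumption
# so that the run is reproducible.
set.seed(2013)
sample.df <- data.frame(var1 = rbinom(50, size = 1, prob = 0.5),
                        var2 = rbinom(50, size = 2, prob = 0.5),
                        var3 = rbinom(50, size = 3, prob = 0.5),
                        var4 = rbinom(50, size = 2, prob = 0.5),
                        var5 = rbinom(50, size = 2, prob = 0.5))

sample.formula <- list(var1 ~ var2, var1 ~ var3, var1 ~ var4, var1 ~ var5)

# SIMPLIFY = FALSE returns a plain list, so each fit stays a glm object
models <- mapply(glm, formula = sample.formula, data = list(sample.df),
                 family = "binomial", SIMPLIFY = FALSE)

# Check that every element is still a fitted model
sapply(models, inherits, what = "glm")

# Coefficient table (estimate, standard error, z value, p-value) per model
coef.tables <- lapply(models, function(m) coef(summary(m)))
coef.tables[[1]]
```

Pulling out coef(summary(m)) rather than printing the whole summary sidesteps the $call item mentioned above, since only the coefficient matrix is kept.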

Thanks! That works, more or less, although the wonky behaviour of mapply that David pointed out is irritating. I tried deleting the $call item from the models produced and passing them to stargazer for reporting the results, but stargazer won't recognize the results even though the class is explicitly "glm lm". Does anyone know why mapply produces such weird results?

On 2013-12-18, at 3:29 AM, Dennis Murphy <djmuser at gmail.com> wrote:

> Hi:
>
> Here's a way to generate a list of model objects. Once you have the
> list, you can write or call functions to extract useful pieces of
> information from each model object and use lapply() to apply them to
> each list component in turn.
>
> sample.df<-data.frame(var1=rbinom(50, size=1, prob=0.5),
>                       var2=rbinom(50, size=2, prob=0.5),
>                       var3=rbinom(50, size=3, prob=0.5),
>                       var4=rbinom(50, size=2, prob=0.5),
>                       var5=rbinom(50, size=2, prob=0.5))
>
> # vector of x-variable names
> xvars <- names(sample.df)[-1]
>
> # function to paste a variable x into a formula object and
> # then pass it to glm()
> f <- function(x)
> {
>    form <- as.formula(paste("var1", x, sep = " ~ "))
>    glm(form, data = sample.df)
> }
>
> # Apply the function f to each variable in xvars
> modlist <- lapply(xvars, f)
>
> To give you an idea of some of the things you can do with the list:
>
> sapply(modlist, class)     # return class of each component
> lapply(modlist, summary)   # return the summary of each model
>
> # combine the model coefficients into a two-column matrix
> do.call(rbind, lapply(modlist, coef))
>
> You'd probably want to rename the second column since the slopes are
> associated with different x variables.
>
> Dennis
>
> On Tue, Dec 17, 2013 at 5:53 PM, Simon Kiss <sjkiss at gmail.com> wrote:
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
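
One way to turn the list approach above into the single summary table the original post asked for is sketched below. It is only an illustration under stated assumptions: the helper names fit_one and slope_row, the column names of results, and the set.seed() call are hypothetical, and family = binomial is added to match the original question (the function as posted above leaves glm() at its default gaussian family).

```r
# Simulated data, as in the thread; the seed is an added assumption
set.seed(42)
sample.df <- data.frame(var1 = rbinom(50, size = 1, prob = 0.5),
                        var2 = rbinom(50, size = 2, prob = 0.5),
                        var3 = rbinom(50, size = 3, prob = 0.5),
                        var4 = rbinom(50, size = 2, prob = 0.5),
                        var5 = rbinom(50, size = 2, prob = 0.5))

# vector of x-variable names
xvars <- names(sample.df)[-1]

# Fit var1 ~ x as a logistic regression (hypothetical helper)
fit_one <- function(x) {
  form <- as.formula(paste("var1", x, sep = " ~ "))
  glm(form, data = sample.df, family = binomial)
}

modlist <- lapply(xvars, fit_one)
names(modlist) <- xvars   # label each model by its predictor

# Pull the slope row out of one model's coefficient table (hypothetical helper)
slope_row <- function(x) {
  tab <- coef(summary(modlist[[x]]))
  data.frame(term      = x,
             estimate  = tab[x, "Estimate"],
             std.error = tab[x, "Std. Error"],
             p.value   = tab[x, "Pr(>|z|)"])
}

# One row per univariate model
results <- do.call(rbind, lapply(xvars, slope_row))
results
```

Labelling each row with its predictor name addresses the point above about the slopes belonging to different x variables.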

Dennis, how would your function be modified to make it more flexible in future? I'm thinking of something like:

f <- function(x='Dependent variable', y='List of Independent Variables', z='Data Frame')
{
   form <- as.formula(paste(x, y, sep = " ~ "))
   glm(form, data =z)
}

I tried that and then ran modlist <- lapply(xvars, f), but it didn't work.
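
A likely reason the attempt above fails is that lapply() passes each element of xvars as the first argument, x, while y and z keep their placeholder default strings, so the pasted formula cannot be parsed and no data frame reaches glm(). One possible rewrite is sketched below; the function name fit_univariate and its argument names are hypothetical, and the binomial default is an assumption carried over from the original question.

```r
# Simulated data, as in the thread; the seed is an added assumption
set.seed(7)
sample.df <- data.frame(var1 = rbinom(50, size = 1, prob = 0.5),
                        var2 = rbinom(50, size = 2, prob = 0.5),
                        var3 = rbinom(50, size = 3, prob = 0.5),
                        var4 = rbinom(50, size = 2, prob = 0.5),
                        var5 = rbinom(50, size = 2, prob = 0.5))

# The predictor comes first because lapply() fills the first argument;
# response, data and family are passed by name (hypothetical helper).
fit_univariate <- function(predictor, response, data, family = binomial) {
  form <- as.formula(paste(response, predictor, sep = " ~ "))
  glm(form, data = data, family = family)
}

xvars <- setdiff(names(sample.df), "var1")

# Extra named arguments after the function are forwarded to every call
modlist <- lapply(xvars, fit_univariate, response = "var1", data = sample.df)
names(modlist) <- xvars

lapply(modlist, coef)
```

The same function can then be reused with a different response or data frame by changing only the named arguments in the lapply() call.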