Can someone send me something I can read about passing parameters, so I can
understand how lm manages to have a dataframe passed to it and use columns from
the dataframe to set up a regression? I have looked at the code for lm and
don't understand what I am reading. What I want to do is something like the
following:
myfunction <- function(y,x,dataframe){
  fit0 <- lm(y~x,data=dataframe)
  print(summary(fit0))
}
# Run the function using dep and ind as dependent and independent variables.
mydata <- data.frame(dep=c(1,2,3,4,5),ind=c(1,2,4,5,7))
myfunction(dep,ind)
# Run the function using outcome and predictor as dependent and independent variables.
newdata <- data.frame(outcome=c(1,2,3,4,5),predictor=c(1,2,4,5,7))
myfunction(outcome,predictor)
John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric
Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
Hi,
I'm not sure if this is what you are after, but instead of defining
arguments for elements of the formula why not simply pass your desired formula
to your function?
Cheers,
Ben
myfunction <- function(frmla, dataframe){
  fit0 <- lm(frmla, data=dataframe)
  print(summary(fit0))
}
# Run the function on mydata. Note that ind ~ dep makes ind the response and dep the predictor.
mydata <- data.frame(dep=c(1,2,3,4,5),ind=c(1,2,4,5,7))
myfunction(ind ~ dep, mydata)
# Call:
# lm(formula = frmla, data = dataframe)
#
# Residuals:
#    1    2    3    4    5
#  0.2 -0.3  0.2 -0.3  0.2
#
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)  -0.7000     0.3317  -2.111 0.125298
# dep           1.5000     0.1000  15.000 0.000643 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 0.3162 on 3 degrees of freedom
# Multiple R-squared:  0.9868,  Adjusted R-squared:  0.9825
# F-statistic:   225 on 1 and 3 DF,  p-value: 0.0006431
# Run the function on newdata. Here predictor ~ outcome makes predictor the response and outcome the predictor.
newdata <- data.frame(outcome=c(1,2,3,4,5),predictor=c(1,2,4,5,7))
myfunction(predictor ~ outcome, newdata)
# Call:
# lm(formula = frmla, data = dataframe)
#
# Residuals:
#    1    2    3    4    5
#  0.2 -0.3  0.2 -0.3  0.2
#
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)  -0.7000     0.3317  -2.111 0.125298
# outcome       1.5000     0.1000  15.000 0.000643 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 0.3162 on 3 degrees of freedom
# Multiple R-squared:  0.9868,  Adjusted R-squared:  0.9825
# F-statistic:   225 on 1 and 3 DF,  p-value: 0.0006431
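A related variant, offered only as a sketch and not something proposed in the thread: if the column names are available as character strings, the formula can be assembled inside the function with reformulate() from the stats package. The helper name fit_by_name and its argument names are hypothetical.

```r
# Hypothetical helper (not from the thread): column names arrive as strings.
fit_by_name <- function(response, predictor, dataframe) {
  # reformulate() builds the formula `response ~ predictor` from strings
  frmla <- reformulate(predictor, response = response)
  lm(frmla, data = dataframe)
}

mydata <- data.frame(dep = c(1, 2, 3, 4, 5), ind = c(1, 2, 4, 5, 7))
fit <- fit_by_name("dep", "ind", mydata)

# The formula recovered from the fitted model's terms
print(formula(fit))
```

Passing strings avoids non-standard evaluation altogether, at the cost of the caller having to quote the column names.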
> On May 8, 2019, at 9:22 PM, Sorkin, John <jsorkin at som.umaryland.edu> wrote:
> [...]
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org
Ecological Forecasting: https://eco.bigelow.org/
Hello,
There is a "standard" deparse/substitute trick that gets the names of
the variables passed to a function. There are more sophisticated ways
but maybe that is what you are looking for.
myfunction <- function(y, x, dataframe){
  y <- deparse(substitute(y))
  x <- deparse(substitute(x))
  fmla <- as.formula(paste(y, '~', x))
  fit0 <- lm(fmla, data = dataframe)
  summary(fit0)
}
# Run the function using dep and ind as dependent and independent variables.
mydata <- data.frame(dep = c(1,2,3,4,5),ind=c(1,2,4,5,7))
myfunction(dep, ind, mydata)
# Run the function using outcome and predictor as dependent and independent variables.
newdata <- data.frame(outcome=c(1,2,3,4,5),predictor=c(1,2,4,5,7))
myfunction(outcome, predictor, newdata)
Note: your function has an argument 'dataframe' that you didn't pass in
either of the two calls.
Hope this helps,
Rui Barradas
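To make the deparse/substitute trick above a little more concrete, here is a small sketch of what each step returns. The helper name show_arg is hypothetical and exists only for illustration.

```r
# Hypothetical helper, for illustration only.
show_arg <- function(arg) {
  expr <- substitute(arg)  # the unevaluated expression the caller typed
  deparse(expr)            # that expression converted to a character string
}

# No object called 'dep' has to exist: the argument is never evaluated.
name <- show_arg(dep)
print(name)

# The strings can then be pasted together and turned into a formula.
fmla <- as.formula(paste(show_arg(dep), "~", show_arg(ind)))
print(fmla)
```

Because the argument is captured rather than evaluated, the function sees the symbol the caller wrote, which is why bare column names work in the call.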
At 02:22 on 09/05/19, Sorkin, John wrote:
> [...]
Hello John,
Others have commented on the first half of your question, but the second half
of your question looks very much like R's built-in predict() functions:
> ?predict
> ?predict.lm
Best Regards,
Bill.
W. Michels, Ph.D.
On Wed, May 8, 2019 at 6:23 PM Sorkin, John <jsorkin at som.umaryland.edu> wrote:
> [...]
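For readers who have not used predict() before, a minimal sketch of predict() on a fitted lm object follows. The data frame new_obs and its values are made up for illustration; the only requirement is that it contains a column named like the predictor in the formula.

```r
mydata <- data.frame(dep = c(1, 2, 3, 4, 5), ind = c(1, 2, 4, 5, 7))
fit <- lm(dep ~ ind, data = mydata)

# Hypothetical new observations: the column name must match the predictor.
new_obs <- data.frame(ind = c(3, 6))

# predict() dispatches to predict.lm() and looks up 'ind' in newdata
pred <- predict(fit, newdata = new_obs)
print(pred)
```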
I don't think previous responses have addressed the question, which appears to
be: "How does R know to look in the "data" object for the variable names in the
formula?" And, of course, I could be wrong -- in which case ignore all the
following.
My answer to that question is: it's quite complicated. I think you have to know
about calls, function closures, evaluation environments, and the details of
model.frame.lm -- and perhaps more.
The following **might** be a start:
> dat <- data.frame(x = 1:10, y = rnorm(10))
>
> ## substitute() is used to return the unevaluated expression for the call
> mc <- match.call(lm, call = substitute(lm(y~x,data = dat)))
> class(mc)
[1] "call"
> as.list(mc)
[[1]]
lm

$formula
y ~ x

$data
dat

Cheers,
Bert Gunter
On Thu, May 9, 2019 at 10:01 AM William Michels via R-help <r-help at r-project.org> wrote:
> [...]
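As a rough illustration of the mechanism hinted at above, the sketch below imitates the opening lines of lm(): it captures its own call, swaps the function name for stats::model.frame, and evaluates the modified call in the caller's environment. The function my_model_frame is hypothetical and much simpler than what lm() really does (lm() also handles subset, weights, na.action and offset).

```r
# Hypothetical function, loosely modelled on the first lines of lm().
my_model_frame <- function(formula, data) {
  mf <- match.call()                     # the call as typed, unevaluated
  mf[[1L]] <- quote(stats::model.frame)  # swap the function being called
  eval(mf, parent.frame())               # evaluate in the caller's environment
}

dat <- data.frame(x = 1:10, y = (1:10)^2)
mf <- my_model_frame(y ~ x, data = dat)

# model.frame() resolves y and x as columns of 'dat'
print(names(mf))
print(nrow(mf))
```

model.frame() evaluates the variables named in the formula in the data argument first and falls back to the formula's environment for anything not found there, which is how columns can be named without quoting them.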