dear R experts---I was programming a fama-macbeth panel regression (a fama-macbeth regression is essentially T cross-sectional regressions, with statistics then obtained from the time-series of coefficients), partly because I wanted faster speed than plm, partly because I wanted some additional features.

my function starts as

fama.macbeth <- function( formula, din ) {
  names <- terms( formula )
  ## omitted : I want an immediate check that the formula refers to
  ## existing variables in the data frame with English error messages
  monthly.regressions <- by( din, as.factor(din$month), function(dd)
      coef(lm(model.frame( formula, data=dd ))) )
  as.m <- do.call("rbind", monthly.regressions)
  colMeans(as.m)  ## or something like this.
}

say my data frame mydata has columns named month, r, laggedx and ... . I can call this function

fama.macbeth( r ~ laggedx, din=mydata )

but it fails if I want to compute my x variables. for example,

myx <- d[,"laggedx"]
fama.macbeth( r ~ myx)

I also wish that the computed myx still remembered that it was really laggedx. it's almost as if I should not create a vector myx but a data frame myx to avoid losing the column name. I wonder why such vectors don't keep a name attribute of some sort.

there is probably an "R way" of doing this. is there?

/iaw
----
Ivo Welch (ivo.welch@gmail.com)

[[alternative HTML version deleted]]
Ivo:

I may not get your question, but you seem to be confusing the name of an object, which is essentially a pointer into memory and a language construct (correction requested if I have misstated!), and the "names" attribute of (some) objects. You can, of course, attach a "lab" (or whatever) attribute to an object that gives it a label and that will be carried around with it. But without special code (if it's at all possible, even) the label will know nothing about the name assigned to the object -- why should it?! i.e.

> y <- structure(1:3,lab = "y")
> y
[1] 1 2 3
attr(,"lab")
[1] "y"
> z <- y
> z
[1] 1 2 3
attr(,"lab")
[1] "y"

Feel free to ignore without response if my comment is irrelevant.

Cheers,
Bert

On Mon, Aug 19, 2013 at 9:45 AM, ivo welch <ivo.welch at anderson.ucla.edu> wrote:
> dear R experts---I was programming a fama-macbeth panel regression (a
> fama-macbeth regression is essentially T cross-sectional regressions, with
> statistics then obtained from the time-series of coefficients), partly
> because I wanted faster speed than plm, partly because I wanted some
> additional features.
>
> my function starts as
>
> fama.macbeth <- function( formula, din ) {
>   names <- terms( formula )
>   ## omitted : I want an immediate check that the formula refers to
>   ## existing variables in the data frame with English error messages
>   monthly.regressions <- by( din, as.factor(din$month), function(dd)
>       coef(lm(model.frame( formula, data=dd ))) )
>   as.m <- do.call("rbind", monthly.regressions)
>   colMeans(as.m)  ## or something like this.
> }
>
> say my data frame mydata has columns named month, r, laggedx and ... . I
> can call this function
>
> fama.macbeth( r ~ laggedx, din=mydata )
>
> but it fails if I want to compute my x variables. for example,
>
> myx <- d[,"laggedx"]
> fama.macbeth( r ~ myx)
>
> I also wish that the computed myx still remembered that it was really
> laggedx. it's almost as if I should not create a vector myx but a data
> frame myx to avoid losing the column name. I wonder why such vectors don't
> keep a name attribute of some sort.
>
> there is probably an "R way" of doing this. is there?
>
> /iaw
>
> ----
> Ivo Welch (ivo.welch at gmail.com)
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
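The distinction between an object's name and its attributes can be checked directly. The following is only a minimal sketch: the toy data frame is made up for illustration (column names follow the original post), and the names myx_vec and myx_df are hypothetical. It shows that extracting one column as a vector loses the column name, that drop = FALSE keeps it, and that an attribute travels with the value rather than with the variable name.

```r
# Hypothetical toy data; only the column names come from the original post.
mydata <- data.frame(month   = rep(1:3, each = 4),
                     r       = rnorm(12),
                     laggedx = rnorm(12))

# Extracting a single column as a vector drops the column name.
myx_vec <- mydata[, "laggedx"]
names(myx_vec)          # NULL: a plain numeric vector carries no names

# drop = FALSE returns a one-column data frame, so the name survives.
myx_df <- mydata[, "laggedx", drop = FALSE]
names(myx_df)           # "laggedx"

# An attribute is copied along with the value on assignment,
# but it knows nothing about the name of the new variable.
y <- structure(1:3, lab = "y")
z <- y
attr(z, "lab")          # "y", although the object is now called z
```

Keeping the one-column data frame is probably the simplest way to retain the name "laggedx" for later use in a formula with a data argument.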
On Aug 19, 2013, at 9:45 AM, ivo welch wrote:

> dear R experts---I was programming a fama-macbeth panel regression (a
> fama-macbeth regression is essentially T cross-sectional regressions, with
> statistics then obtained from the time-series of coefficients), partly
> because I wanted faster speed than plm, partly because I wanted some
> additional features.
>
> my function starts as
>
> fama.macbeth <- function( formula, din ) {
>   names <- terms( formula )
>   ## omitted : I want an immediate check that the formula refers to
>   ## existing variables in the data frame with English error messages
>

Look at the structure of a terms result from a formula argument with str():

fama.macbeth <- function( formula, din ) {
  fnames <- terms( formula ) ; str(fnames)
}

> fama.macbeth( x ~ y, data.frame(x=rnorm(10), y=rnorm(10) ) )
Classes 'terms', 'formula' length 3 x ~ y
  ..- attr(*, "variables")= language list(x, y)
  ..- attr(*, "factors")= int [1:2, 1] 0 1
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : chr [1:2] "x" "y"
  .. .. ..$ : chr "y"
  ..- attr(*, "term.labels")= chr "y"
  ..- attr(*, "order")= int 1
  ..- attr(*, "intercept")= int 1
  ..- attr(*, "response")= int 1
  ..- attr(*, ".Environment")=<environment: R_GlobalEnv>

Then extract the dimnames from the "factors" attribute to compare to the names in the data-object:

> fama.macbeth <- function( formula, din ) {
    fnames <- terms( formula ) ; dnames <- names( din)
    dimnames(attr(fnames, "factors"))[[1]] %in% dnames
  }
#[1] TRUE TRUE

I couldn't tell if this was the main thrust of your question. It seems to meander a bit.

-- David.

>   monthly.regressions <- by( din, as.factor(din$month), function(dd)
>       coef(lm(model.frame( formula, data=dd ))) )
>   as.m <- do.call("rbind", monthly.regressions)
>   colMeans(as.m)  ## or something like this.
> }
>
> say my data frame mydata has columns named month, r, laggedx and ... . I
> can call this function
>
> fama.macbeth( r ~ laggedx, din=mydata )
>
> but it fails

What fails?

> if I want to compute my x variables. for example,
>
> myx <- d[,"laggedx"]
> fama.macbeth( r ~ myx)
>
> I also wish that the computed myx still remembered that it was really
> laggedx. it's almost as if I should not create a vector myx but a data
> frame myx to avoid losing the column name.

I wouldn't say "almost"... rather that is exactly what you should do. R regression methods almost always work better when formulas are interpreted in the environment of the data argument.

> I wonder why such vectors don't
> keep a name attribute of some sort.
>
> there is probably an "R way" of doing this. is there?
>
> /iaw
>
> ----
> Ivo Welch (ivo.welch at gmail.com)
>
> [[alternative HTML version deleted]]

Still posting HTML?

>
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

And do explain what the goal is.

--
David Winsemius
Alameda, CA, USA
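Putting the two suggestions in this thread together (check the formula's variables against the data frame, and keep computed variables as columns of the data frame), the function might be sketched as below. This is an illustration under assumptions, not the original poster's final code: it uses all.vars() rather than the "factors" dimnames for the check, the wording of the error message is invented, and the simulated mydata is hypothetical. It also does not handle a "." on the right-hand side of the formula.

```r
# A sketch only: argument names follow the original post,
# the error text and the test data are made up for illustration.
fama.macbeth <- function(formula, din) {
  # Immediate check: every variable named in the formula must be a column of din.
  needed <- all.vars(formula)
  absent <- setdiff(needed, names(din))
  if (length(absent) > 0)
    stop("formula refers to variables not in the data frame: ",
         paste(absent, collapse = ", "))

  # One cross-sectional regression per month; keep only the coefficients.
  monthly.regressions <- by(din, as.factor(din$month),
                            function(dd) coef(lm(formula, data = dd)))

  # Stack the coefficient vectors (one row per month), then average over months.
  as.m <- do.call("rbind", monthly.regressions)
  colMeans(as.m)
}

# Hypothetical simulated panel: 5 months, 20 observations each.
set.seed(1)
mydata <- data.frame(month   = rep(1:5, each = 20),
                     r       = rnorm(100),
                     laggedx = rnorm(100))

res <- fama.macbeth(r ~ laggedx, din = mydata)
res

# A computed regressor is added as a column rather than left as a free vector.
mydata$myx <- mydata[, "laggedx"]
res2 <- fama.macbeth(r ~ myx, din = mydata)
```

With this check, a call such as fama.macbeth(r ~ notthere, din = mydata) stops at once with a readable message instead of failing inside lm() for the first month.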