Warnes, Gregory R
2001-Aug-24 18:18 UTC
[Rd] RE: Variable lables (was Re: [R] Reading SAS version 8 data into
[Moved from R-help]

> From: fharrell@virginia.edu [mailto:fharrell@virginia.edu]
> I store variable labels as "label" attributes of vectors
> and use them in various plotting functions as well as the
> describe() function.

I would like to see general support for label attributes in the R plotting and modeling functions. One possible way of implementing this is to create a replacement for the standard "deparse(substitute(blah))" idiom. This function, getlabel(), checks for a label attribute and returns that if present. Otherwise it returns the variable's name as a string.

Here's some code I've put together:

    label <- function(x) attr(x,"label")

    "label<-" <- function(x, value )
    {
      # Rebuild the call as `attr<-`(x, which="label", value=value)
      m <- match.call()
      m[[1]] <- as.name("attr<-")
      m$value <- NULL
      m$which <- "label"
      m$value <- value
      eval(m)
    }

    getlabel <- function(x)
    {
      tmp <- attr(x,"label")
      if(is.null(tmp) || tmp=="")
      {
        # No usable label: turn the call into a call to substitute()
        # and evaluate it in the caller's frame
        m <- match.call()
        m[[1]] <- as.name('substitute')
        tmp <- deparse(eval(m,envir=parent.frame()))
      }
      return(tmp)
    }

I've done some testing, and getlabel seems to work fine as a substitute for "deparse(substitute(x))" in the plot commands.

There are a couple of problems. First, attributes are carried along in sometimes unexpected ways. For example, attributes are carried along by all of the arithmetic operations I tried:

    > x <- rnorm(1)
    > label(x) <- "x label"
    >
    > sqrt(x)
    [1] 0.8888801
    attr(,"label")
    [1] "x label"
    > x+1
    [1] 1.8888801
    attr(,"label")
    [1] "x label"

Ideally, performing an operation that creates a new variable should mask off the label attribute (what about other attributes?). I recognize that this would require changes to R. Would this be a big task?

Second, unless one bounds the length of the labels, it can get pretty messy to use them in some places (e.g. the coefficients table reported by print.summary). I can see a couple of solutions for this problem:

A) Truncate labels when necessary.
B) Have 2 attributes: one short 'label' that has a fixed length (say 30 characters), and one long 'description' that has no length limit.
C) Continue to use the variable name given in the call for places where length is a problem, but show a translation between the variable name and the label somewhere else as part of the output.

Except for the problem of the label attribute getting 'carried along' when it is not desirable, I think that it would be straightforward and 'backwards compatible' to add general support for variable labels.

I am willing to submit patches for functions that I regularly use. Would others be willing to contribute? Would the patches be accepted?

-Greg

LEGAL NOTICE
Unless expressly stated otherwise, this message is confidential and may be privileged. It is intended for the addressee(s) only. Access to this E-mail by anyone else is unauthorized. If you are not an addressee, any disclosure or copying of the contents of this E-mail or any action taken (or not taken) in reliance on it is unauthorized and may be unlawful. If you are not an addressee, please inform the sender immediately.

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
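The attribute behaviour described in the message above, and option (A) for long labels, can be sketched in a few lines of R. This is only an illustration under stated assumptions: the helper names set_label, get_label, drop_label and short_label are hypothetical (chosen so they do not clash with the label<- that Hmisc defines) and are not part of R or of any package.

```r
# Hypothetical helpers for illustration only; not part of R or Hmisc.

# Attach a label and return the modified object.
set_label <- function(x, value) {
  attr(x, "label") <- value
  x
}

# Read the label (NULL if none is set).
get_label <- function(x) attr(x, "label", exact = TRUE)

# Explicitly remove the label, e.g. from a derived variable.
drop_label <- function(x) {
  attr(x, "label") <- NULL
  x
}

# Option (A): truncate the label to at most `width` characters.
short_label <- function(x, width = 30) {
  lab <- get_label(x)
  if (is.null(lab)) return(NULL)
  substr(lab, 1, width)
}

x <- set_label(4, "x label")
y <- sqrt(x)             # the "label" attribute is carried along
z <- drop_label(x + 1)   # the label is masked off by hand

long <- set_label(1, "a fairly long descriptive label for a variable")

print(get_label(y))
print(get_label(z))
print(short_label(long, width = 10))
```

Here the label has to be dropped by hand after each derived computation, which is the inconvenience the message points out; doing it automatically would need support inside R itself.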
fharrell@virginia.edu
2001-Aug-25 13:42 UTC
[Rd] Re: Variable lables (was Re: [R] Reading SAS version 8 data into
Dear Greg,

I too would like to see labels be more a part of R. In Hmisc I allow labels to be any length, but plotting and table-making functions have options to abbreviate() them or use variable names instead of labels.

I think your code is more complex than is really needed. The problem with defaulting to deparse(...) is that multiple function pass-throughs return the wrong result:

    > f <- function(w)getlabel(w)
    > g <- function(z)f(z)
    > g(y)
    [1] "z"

So I don't see a large role for the deparse(...) method.

The Hmisc library already defines label<-, so if you are willing to use another name for your version, that would prevent confusion for users of Hmisc.

The problem of labels being retained after you do arithmetic on the variable is a real one, and one I've put up with for a long time with S-Plus. It would be nice if R could prevent that, but that is getting tricky. What I've wanted more generally is the ability for the user to specify a vector of attribute names in options() that would be preserved upon subsetting. That way I wouldn't have to go to the trouble of writing local versions of [.factor, etc. that carry the 'label' attribute. In my usage, 'label's are always logically carried forward for subsetting.

Frank

"Warnes, Gregory R" wrote:
>
> [Moved from R-help]
>
> > From: fharrell@virginia.edu [mailto:fharrell@virginia.edu]
> > I store variable labels as "label" attributes of vectors
> > and use them in various plotting functions as well as the
> > describe() function.
>
> I would like to see general support for label attributes in the R plotting
> and modeling functions. One possible way of implementing this is to create
> a replacement for the standard "deparse(substitute(blah))" idiom. This
> function, getlabel(), checks for a label attribute and returns that if
> present. Otherwise it returns the variable's name as a string.
-- 
Frank E Harrell Jr              Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine  http://hesweb1.med.virginia.edu/biostat
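The two points raised in this reply, the pass-through problem with a deparse(...) fallback and the wish to keep labels on subsetting, can be illustrated with a short R sketch. It is not the Hmisc implementation: get_label, as_lab_vec and the S3 class "lab_vec" are hypothetical names invented for this example. The fallback is a simplified variant that calls deparse(substitute(x)) directly, so it reports the innermost argument name rather than the "z" shown above.

```r
# Hypothetical names throughout; a sketch, not Hmisc's code.

# Return the label, or fall back to the text of the argument expression.
get_label <- function(x) {
  lab <- attr(x, "label", exact = TRUE)
  if (is.null(lab) || !nzchar(lab)) lab <- deparse(substitute(x))
  lab
}

f <- function(w) get_label(w)
g <- function(z) f(z)

y <- 1:3
unlabelled <- g(y)        # fallback sees f's argument name, not "y"

attr(y, "label") <- "Age (years)"
labelled <- g(y)          # the attribute travels with the value

# A small S3 class whose `[` method re-attaches the label after subsetting.
as_lab_vec <- function(x, label) {
  attr(x, "label") <- label
  class(x) <- c("lab_vec", oldClass(x))
  x
}

"[.lab_vec" <- function(x, ...) {
  out <- NextMethod("[")                       # default subsetting drops the label
  attr(out, "label") <- attr(x, "label", exact = TRUE)
  oldClass(out) <- oldClass(x)                 # restore the class as well
  out
}

age <- as_lab_vec(c(31, 45, 52, 67), "Age (years)")
older <- age[3:4]

print(unlabelled)
print(labelled)
print(attr(older, "label"))
```

A per-class method like this is the kind of local workaround the reply mentions having to write; the options()-based list of preserved attribute names is a wish in the message, not an existing R feature.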