Warnes, Gregory R
2001-Aug-26 05:15 UTC
[Rd] Re: Variable labels (was Re: [R] Reading SAS version 8 data into
> From: fharrell@virginia.edu [mailto:fharrell@virginia.edu]
>
> [snip]
> I think your code is more complex than is really needed.
>
> The problem with defaulting to deparse(...) is that
> multiple function pass-throughs return the wrong result:
>
> [snip]
> So I don't see a large role for the deparse(...) method.

Actually, one of the reasons that I included the deparse(...) method was to create a 'drop-in' substitute for the current calls to deparse(...) that are found throughout the code and that would be backward-compatible. Having such a call will immensely simplify changing existing code.

It's true that the variable names don't get correctly handled once you go down a layer of function calls, but that applies AFAIK to the current deparse(...) method as well.

> The Hmisc library already defines label<- so if you
> are willing to use another name for your version that
> would prevent confusion from users of Hmisc.

I don't think there will be a problem as long as the functions do exactly the same thing. To that end, perhaps we should agree on a common set of functions and keep them in sync. I expect that your functions are better tested than mine, since they've been available for some time.

> The problem of labels being retained after you do
> arithmetic on the variable is a real one, and one
> I've put up with for a long time with S-Plus. It would
> be nice if R could prevent that but that is getting tricky.
>
> What I've wanted more generally is the ability for the
> user to specify a vector of attribute names in options()
> that would be preserved upon subsetting. That way I
> wouldn't have to go to trouble to write local versions
> of [.factor, etc. that carry the 'label' attribute.
> In my usage, 'label's are always logically carried
> forward for subsetting.

It seems to be a good idea to preserve the labels during subset operations. What are the possible cons?

-Greg

LEGAL NOTICE
Unless expressly stated otherwise, this message is confidential and may be privileged. It is intended for the addressee(s) only. Access to this E-mail by anyone else is unauthorized. If you are not an addressee, any disclosure or copying of the contents of this E-mail or any action taken (or not taken) in reliance on it is unauthorized and may be unlawful. If you are not an addressee, please inform the sender immediately.

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
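The pass-through problem discussed in this message can be illustrated with a minimal sketch. The function names `show_name` and `wrapper` are hypothetical and were not part of the code under discussion; they only show how deparse(substitute(...)) reports the name used at the innermost call rather than the name the user typed.

```r
# Hypothetical helpers for illustration only (not from the thread).

# Returns the expression supplied for x, as a character string.
show_name <- function(x) deparse(substitute(x))

# Passes its argument through one extra layer of function call.
wrapper <- function(y) show_name(y)

age <- c(21, 35, 48)

direct   <- show_name(age)  # "age": the name the caller typed
indirect <- wrapper(age)    # "y":   the name used inside wrapper()

print(direct)
print(indirect)
```

This is the reason an explicit 'label' attribute set on the data is more robust than a name recovered by deparse(...): the attribute travels with the object, while the deparsed name depends on where the call is made.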
fharrell@virginia.edu
2001-Aug-26 12:45 UTC
[Rd] Re: Variable labels (was Re: [R] Reading SAS version 8 data into
"Warnes, Gregory R" wrote:
>
> > From: fharrell@virginia.edu [mailto:fharrell@virginia.edu]
> >
> > [snip]
> > I think your code is more complex than is really needed.
> >
> > The problem with defaulting to deparse(...) is that
> > multiple function pass-throughs return the wrong result:
> >
> > [snip]
> > So I don't see a large role for the deparse(...) method.
>
> Actually one of the reasons that I included the deparse(...) method was to
> create a 'drop-in' substitute for the current calls to deparse(...) that are
> found throughout the code and that would be backward-compatible. Having
> such a call will immensely simplify changing existing code.

Thanks for your reply Greg. The "complex" code I was referring to was the eval() and as.name() parts of your code. The deparse can be handy, although I have done that on a case-by-case basis. For example, in a high-level plotting function I'll retrieve label(an argument) and if that is empty I'll use deparse(substitute(argument)).

> It's true that the variable names don't get correctly handled once you go down
> a layer of function calls, but that applies AFAIK to the current
> deparse(...) method as well.

Right. That's why I try to define labels early, when a data frame is being created (e.g., in sas.get).

> > The Hmisc library already defines label<- so if you
> > are willing to use another name for your version that
> > would prevent confusion from users of Hmisc.
>
> I don't think there will be a problem as long as the functions do exactly
> the same thing. To that end, perhaps we should agree on a common set of
> functions and keep them in sync. I expect that your functions are better
> tested than mine, since they've been available for some time.

Mine are simple:

    label <- function(x) {
      lab <- attr(x, "label")
      if(is.null(lab)) lab <- ""
      lab
    }

    #From Bill Dunlap, StatSci 15Mar95:
    "label<-" <- if(!.SV4.) function(x, value)
      structure(x, label=value,
                class=c('labelled',
                        attr(x,'class')[attr(x,'class')!='labelled'])) else
      function(x, value) {   # 1Nov00 for Splus 5.x, 6.x
        attr(x,'label') <- value
        x
      }

For non-SV4 systems (which include R) you see above that when a label is put on a variable, a class "labelled" is added. This is really to handle the subsetting problem, but I would rather get rid of it if subsetting can respect selected attributes.

> > The problem of labels being retained after you do
> > arithmetic on the variable is a real one, and one
> > I've put up with for a long time with S-Plus. It would
> > be nice if R could prevent that but that is getting tricky.
> >
> > What I've wanted more generally is the ability for the
> > user to specify a vector of attribute names in options()
> > that would be preserved upon subsetting. That way I
> > wouldn't have to go to trouble to write local versions
> > of [.factor, etc. that carry the 'label' attribute.
> > In my usage, 'label's are always logically carried
> > forward for subsetting.
>
> It seems to be a good idea to preserve the labels during subset operations.
> What are the possible cons?

If the user or a package specifies the list of attribute names to preserve (I can only think of 'label' and 'units' right now) I don't see a downside.

-Frank

> -Greg

--
Frank E Harrell Jr
Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem.
Dept. of Health Evaluation Sciences
U. Virginia School of Medicine
http://hesweb1.med.virginia.edu/biostat
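The approach described in this message can be sketched roughly as follows. This is an illustration under stated assumptions, not the Hmisc source: `label` and `label<-` follow the R (non-SV4) branch quoted in the message, while the `[.labelled` method and the `axis_title` helper are hypothetical and written only to show how a "labelled" class lets subsetting carry the label forward and how a plotting function might fall back to deparse(substitute(...)).

```r
# Accessor as described in the thread: empty string when no label is set.
label <- function(x) {
  lab <- attr(x, "label")
  if (is.null(lab)) "" else lab
}

# Setter following the R (non-SV4) branch: store the label and put
# "labelled" first in the class vector, without duplicating it.
"label<-" <- function(x, value) {
  cls <- attr(x, "class")
  structure(x, label = value,
            class = c("labelled", cls[cls != "labelled"]))
}

# Hypothetical subsetting method (an assumption, not the Hmisc code):
# the default `[` drops most attributes, so restore label and class.
"[.labelled" <- function(x, ...) {
  out <- NextMethod("[")
  attr(out, "label") <- attr(x, "label")
  class(out) <- attr(x, "class")
  out
}

# Hypothetical helper showing the case-by-case fallback: use the label
# when present, otherwise the deparsed argument expression.
axis_title <- function(x) {
  lab <- label(x)
  if (identical(lab, "")) deparse(substitute(x)) else lab
}

age <- c(21, 35, 48)
label(age) <- "Age in years"
weight <- c(60, 72, 81)

sub_age   <- age[2:3]
kept      <- label(sub_age)               # "Age in years"
dropped   <- label(unclass(age)[2:3])     # "": default `[` loses the label
title_age <- axis_title(age)              # "Age in years"
title_wt  <- axis_title(weight)           # "weight"
```

An options()-based list of attributes to preserve, as proposed in the message, would make the extra class unnecessary; the S3 method above is only the workaround the message describes.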