Dear All,

after searching on CRAN I got the impression that there is no standard way in R to label the values of a numerical variable. Since this would be useful for me, I intend to create such an attribute, at the moment for my personal use. Still, I would like to choose a name that does not conflict with the names of commonly used attributes. Would value.labels or vallabs create conflicts?

The attribute should be structured as a data.frame with two columns, levels (numeric) and labels (character). These could then also be used to transform from numeric to factor. If the attribute is copied to the factor variable, it could also serve to retransform the factor to the original numerical variable.

Comments? Ideas?

Thanks,
Heinz Tüchler
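To make the proposal concrete, here is a minimal sketch of such an attribute and the round trip between numeric and factor. The attribute name "value.labels" follows the post; the helper names labelled_to_factor() and factor_to_labelled() and the example labels are assumptions made only for illustration, not existing R functions.

```r
# Hypothetical sketch: "value.labels" is the attribute name proposed above;
# the helper functions below are illustrative and not part of base R.
x <- c(1, 2, 3, 3, 2, 3, 1)
attr(x, "value.labels") <- data.frame(
  levels = c(1, 2, 3),
  labels = c("apple", "banana", "pear"),
  stringsAsFactors = FALSE
)

# numeric -> factor, copying the label table to the factor
labelled_to_factor <- function(x) {
  vl <- attr(x, "value.labels")
  f <- factor(as.vector(x), levels = vl$levels, labels = vl$labels)
  attr(f, "value.labels") <- vl
  f
}

# factor -> original numeric codes, using the copied label table
factor_to_labelled <- function(f) {
  vl <- attr(f, "value.labels")
  x <- vl$levels[match(as.character(f), vl$labels)]
  attr(x, "value.labels") <- vl
  x
}

f  <- labelled_to_factor(x)
x2 <- factor_to_labelled(f)
```

Note that a plain attribute like this is dropped by most subsetting operations, so it only survives as long as the vector is passed around unchanged.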
>>>>> "Heinz" == Heinz Tuechler <tuechler at gmx.at>
>>>>>     on Tue, 23 May 2006 01:17:21 +0100 writes:

    Heinz> Dear All, after searching on CRAN I got the
    Heinz> impression that there is no standard way in R to
    Heinz> label values of a numerical variable.

Hmm, there's names(.) and "names(.) <- .."
Why are those not sufficient?

x <- 1:3
names(x) <- c("apple", "banana", NA)

    Heinz> Since this
    Heinz> would be useful for me I intend to create such an
    Heinz> attribute, at the moment for my personal use. Still
    Heinz> I would like to choose a name which does not conflict
    Heinz> with names of commonly used attributes.

    Heinz> Would value.labels or vallabs create conflicts?

    Heinz> The attribute should be structured as data.frame with
    Heinz> two columns, levels (numeric) and labels
    Heinz> (character). These could then also be used to
    Heinz> transform from numeric to factor. If the attribute is
    Heinz> copied to the factor variable it could also serve to
    Heinz> retransform the factor to the original numerical
    Heinz> variable.

    Heinz> Comments? Ideas?

    Heinz> Thanks

    Heinz> Heinz Tüchler

    Heinz> ______________________________________________
    Heinz> R-help at stat.math.ethz.ch mailing list
    Heinz> https://stat.ethz.ch/mailman/listinfo/r-help PLEASE
    Heinz> do read the posting guide!
    Heinz> http://www.R-project.org/posting-guide.html
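As a rough illustration of the names() suggestion applied to a variable with repeated values: names() labels each element rather than each distinct value, so a set of labels has to be expanded with match(). The named vector labs and its labels are assumptions introduced only for this sketch.

```r
# Sketch only: 'labs' is a hypothetical named vector, one label per value.
labs <- c(apple = 1, banana = 2, pear = 3)
x <- c(1, 2, 3, 3, 2, 3, 1)

# names() works per element, so expand the label set to the length of x
names(x) <- names(labs)[match(x, labs)]
x

# the names follow the elements when the vector is subset ...
x[x > 1]

# ... but they are also kept by arithmetic, even though the values
# no longer correspond to the labels
x + 1
```

The last line shows why names alone do not tie a label to a value: they travel with positions in the vector, not with the codes themselves.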
At 14:12 03.06.2006 +0200, Martin Maechler wrote:
>>>>>> "Heinz" == Heinz Tuechler <tuechler at gmx.at>
>>>>>>     on Tue, 23 May 2006 01:17:21 +0100 writes:
>
>    Heinz> Dear All, after searching on CRAN I got the
>    Heinz> impression that there is no standard way in R to
>    Heinz> label values of a numerical variable.
>
>Hmm, there's names(.) and "names(.) <- .."
>Why are those not sufficient?
>
>x <- 1:3
>names(x) <- c("apple", "banana", NA)

Martin,

I will consider this. For now I am using an attribute value.labels and a corresponding class to preserve this and other attributes after inclusion in a data.frame and after indexing/subsetting, but using names should do as well.

My idea was more like defining a set of value labels for a variable and applying it to the whole variable, as e.g. in the following _pseudocode_:

### not run
### pseudocode
x <- c(1, 2, 3, 3, 2, 3, 1)
value.labels(x) <- c(apple=1, banana=2, NA=3)
x
### desired result
 apple banana     NA     NA banana     NA  apple
     1      2      3      3      2      3      1

value.labels(x) <- c(Apfel=1, Banane=2, Birne=3) # redefine labels
x
### desired result
 Apfel Banane  Birne  Birne Banane  Birne  Apfel
     1      2      3      3      2      3      1

value.labels(x) # inspect labels
### desired result
 Apfel Banane  Birne
     1      2      3

These value.labels should persist even after inclusion in a data.frame and after indexing/subsetting. I have not yet tried your idea with respect to these aspects, but I will do so.

My final goal is to do all the data handling on numerically coded variables and to transform them to factors "on the fly" when needed for statistical procedures. Given the presence of value.labels, a factor function could use them for the conversion.

I described my motivation for all this in a previous post, titled: How to represent a metric categorical variable?
There was no response at all, and I wonder if this is such a rare problem.

Thanks,
Heinz
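One way the pseudocode above could be approximated is with a replacement function plus a small S3 class whose "[" method re-attaches the labels after subsetting. This is only a sketch under assumptions: the class name "labelled" and the helper labelled_factor() are invented here, none of these functions exist in base R, and inclusion in a data.frame is not covered.

```r
# Hypothetical sketch; the class name "labelled" is an assumption.
`value.labels<-` <- function(x, value) {
  attr(x, "value.labels") <- value
  class(x) <- unique(c("labelled", oldClass(x)))
  x
}

value.labels <- function(x) attr(x, "value.labels")

# keep the labels and the class when the vector is indexed
`[.labelled` <- function(x, ...) {
  out <- NextMethod("[")
  attr(out, "value.labels") <- attr(x, "value.labels")
  oldClass(out) <- oldClass(x)
  out
}

# print the values with their labels as names
print.labelled <- function(x, ...) {
  vl <- value.labels(x)
  out <- as.numeric(x)
  names(out) <- names(vl)[match(out, vl)]
  print(out)
  invisible(x)
}

# convert to a factor "on the fly", using the stored labels
labelled_factor <- function(x) {
  vl <- value.labels(x)
  factor(as.numeric(x), levels = unname(vl), labels = names(vl))
}

x <- c(1, 2, 3, 3, 2, 3, 1)
value.labels(x) <- c(Apfel = 1, Banane = 2, Birne = 3)
sub <- x[x > 1]        # labels survive subsetting
labelled_factor(sub)   # factor with levels Apfel, Banane, Birne
```

Storing the labels as a named numeric vector, as in the pseudocode, keeps the lookup to a single match() call; a two-column data.frame would work equally well.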
Aha! Thank you for the more detailed example.

My solution for that situation is an attribute "position" and a function as.position(). I use this in my book

    Statistical Analysis and Data Display
    Richard M. Heiberger and Burt Holland

The online files for the book are available at
    http://springeronline.com/0-387-40270-5

For this example, you need the function as.position() included in this email.

### example ########
## define as.position() (see as.position.s below) before running;
## in R, xyplot() also needs library(lattice)
x <- ordered(c(1,2,3,2,4,3,1,2,4,3,2,1,3),
             labels=c("small", "medium", "large", "very.large"))
x
attr(x, "position") <- c(1,2,4,8)
x
as.position(x)

y <- rnorm(length(x))
y

xyplot(y ~ x)
source("~/h2/library/code/as.position.s")
xyplot(y ~ as.position(x))
xyplot(y ~ as.position(x),
       scale=list(x=list(at=attr(x,"position"), labels=levels(x))))
xyplot(y ~ as.position(x),
       scale=list(x=list(at=attr(x,"position"), labels=levels(x))),
       xlab="x")
### end example ########

### as.position.s #########
as.position <- function(x) {
  if (is.numeric(x))
    x
  else {
    if (!is.factor(x)) stop("x must be either numeric or factor.")

    ## an explicit "position" attribute takes precedence
    if (!is.null(attr(x, "position")))
      x <- attr(x, "position")[x]
    else {
      lev.x <- levels(x)
      if (inherits(x, "ordered")) {
        ## suppress coercion warnings for non-numeric level names
        on.exit(options(old.warn))
        old.warn <- options(warn=-1)
        if (!any(is.na(as.numeric(lev.x))))
          x <- as.numeric(lev.x)[x]
        else
          x <- as.numeric(ordered(lev.x, lev.x))[x]
      }
      else
        x <- as.numeric(x)
    }
  }
  x
}

## tmp <- ordered(c("c","b","f","f","c","b"), c("c","b","f"))
## as.numeric(tmp)
## as.position(tmp)
##
## tmp <- factor(c("c","b","f","f","c","b"))
## as.numeric(tmp)
## as.position(tmp)
##
## tmp <- factor(c(1,3,5,3,5,1))
## as.numeric(tmp)
## as.position(tmp)
##
## tmp <- ordered(c(1,3,5,3,5,1))
## as.numeric(tmp)
## as.position(tmp)
##
## tmp <- c(1,3,5,3,5,1)
## as.numeric(tmp)
## as.position(tmp)
### end as.position.s #########
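For readers who want to try the "position" idea without lattice, the following sketch uses a simplified stand-in for as.position(). The function position_of() and the variable size are assumptions made for this illustration; the stand-in handles only the explicit "position" attribute and otherwise falls back to the integer codes. It also shows a point relevant to the earlier persistence question: base R's factor subsetting keeps levels and class but not a user-defined attribute.

```r
# Simplified, hypothetical stand-in for as.position(): uses the "position"
# attribute when present, otherwise the integer codes of the factor.
position_of <- function(f) {
  pos <- attr(f, "position")
  if (is.null(pos)) as.numeric(f) else pos[as.integer(f)]
}

size <- ordered(c(1, 2, 3, 2, 4),
                labels = c("small", "medium", "large", "very.large"))
attr(size, "position") <- c(1, 2, 4, 8)

position_of(size)   # positions looked up through the factor codes

# subsetting a factor keeps its levels and class, but the user-defined
# "position" attribute is not carried over
part <- size[2:3]
is.null(attr(part, "position"))
position_of(part)   # falls back to the integer codes
```

If the attribute has to survive subsetting, it would need to be re-attached afterwards or handled by a dedicated class with its own "[" method.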
Thank you, Richard. As soon as I find time I will carefully look at your solution and your book.

Heinz

At 10:01 05.06.2006 -0400, Richard M. Heiberger wrote:
>Aha! Thank you for the more detailed example.
>
>My solution for that situation is an attribute "position" and function
>as.position(). I use this in my book
>
>    Statistical Analysis and Data Display
>    Richard M. Heiberger and Burt Holland