Dear list,

Here's a little problem I already solved with my own coding style, but I feel there must be a more efficient and cleaner way to write it; I had no success finding the "clever" solution.

I want to produce a factor from a subset of the combinations of two vectors. I have the vectors a and b in a data frame:

> df <- expand.grid(a=c(0, 5, 10, 25, 50), b=c(0, 25, 50, 100, 200))
> df
   a  b
1  0  0
2  5  0
3 10  0
4 25  0
5 50  0
6  0 25
7  5 25
<snip>

and want to create a factor whose levels correspond to particular combinations of a and b (let's say Low for a=0 & b=0, Medium for a=10 & b=50, High for a=50 & b=200, all other combinations set to NA), reading them from a data frame which describes the desired subset and the corresponding levels.

Here's my own solution (inputs are the data frames df and cas, output is the factor sub):

> cas <- as.data.frame(matrix(c(0, 10, 50, 0, 50, 200), 3, 2,
+     dimnames=list(c("Low", "Medium", "High"), c("a", "b"))))
> cas
        a   b
Low     0   0
Medium 10  50
High   50 200
> sub <- character(length(df$a))
> for (i in 1:length(df$a)) {
+   temp <- rownames(cas)[cas$a==df$a[i] & cas$b==df$b[i]]
+   sub[i] <- ifelse(length(temp)>0, temp, NA)
+ }
> sub <- ordered(sub, levels=c("Low", "Medium", "High"))
> sub
 [1] Low    <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   Medium <NA>   <NA>   <NA>   <NA>
[18] <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   High
Levels: Low < Medium < High

I was looking for a vectorized solution (apply style) binding the data frames df and cas, but didn't succeed in avoiding the for loop. Could anybody shed some light on the darkness of my ignorance? Thank you very much in advance.

--
Ir. Yves BROSTAUX
Unité de Statistique et Informatique
Faculté universitaire des Sciences agronomiques de Gembloux (FUSAGx)
8, avenue de la Faculté
B-5030 Gembloux
Belgique
Tél: +32 81 62 24 69
Email: brostaux.y at fsagx.ac.be
Hi Yves,

Using your objects, here is a way:

> cascombo <- do.call("paste", c(cas, sep="."))
> factor(do.call("paste", c(df, sep=".")), levels=cascombo, labels=rownames(cas))
 [1] Low    <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   Medium <NA>   <NA>
[16] <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   High
Levels: Low Medium High

It uses:
- paste (sep=".") to create the combinations, i.e. 0.0, 10.50, etc.
- do.call to invoke paste on the columns of the data frames
- factor, specifying the existing levels (only those defined by the cas data frame) and the labels

Eric

At 10:12 30/11/2004, Yves Brostaux wrote:
>Dear list,
>
>Here's a little problem I already solved with my own coding style, but I
>feel there is a more efficient and cleaner way to write it, but had no
>success finding the "clever" solution.
><snip>
>
>______________________________________________
>R-help at stat.math.ethz.ch mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Eric Lecoutre
UCL / Institut de Statistique
Voie du Roman Pays, 20
1348 Louvain-la-Neuve
Belgium
tel: (+32)(0)10473050
lecoutre at stat.ucl.ac.be
http://www.stat.ucl.ac.be/ISpersonnel/lecoutre

If the statistics are boring, then you've got the wrong numbers. -Edward Tufte
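Eric's paste() lookup can also be written with match(), which makes the "find the row of cas for each row of df" step explicit. The following is only a sketch under stated assumptions: the names key_df, key_cas and idx are made up for illustration, and the "|" separator is my own choice (with "." as separator, decimal values such as 1.5 and 2 versus 1 and 5.2 would produce the same key).

```r
# Recreate the objects from the thread.
df  <- expand.grid(a = c(0, 5, 10, 25, 50), b = c(0, 25, 50, 100, 200))
cas <- data.frame(a = c(0, 10, 50), b = c(0, 50, 200),
                  row.names = c("Low", "Medium", "High"))

# One text key per row. "|" is an assumed separator that cannot occur
# inside a formatted number, so keys stay unambiguous for decimals too.
key_df  <- paste(df$a,  df$b,  sep = "|")
key_cas <- paste(cas$a, cas$b, sep = "|")

# match() gives, for each row of df, the matching row of cas (NA if none).
idx <- match(key_df, key_cas)

# Indexing the row names with NA yields NA, so unmatched rows become NA.
sub <- ordered(rownames(cas)[idx], levels = rownames(cas))
```

Because the levels are taken from rownames(cas), the order of the rows in cas determines the ordering of the factor.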
Gabor Grothendieck
2004-Nov-30 11:55 UTC
[R] Creating a factor from a combination of vectors
Yves Brostaux <brostaux.y <at> fsagx.ac.be> writes:

: Dear list,
:
: Here's a little problem I already solved with my own coding style, but I
: feel there is a more efficient and cleaner way to write it, but had no
: success finding the "clever" solution.
: <snip>

Use interaction() and factor() like this:

factor(interaction(df),
       levels = c("0.0", "10.50", "50.200"),
       labels = c("Low", "Medium", "High"),
       ordered = TRUE)
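The level strings in the interaction() call above are typed by hand. Since the original question asked for the combinations to be read from the cas data frame, one possible variation is to build them from cas with interaction() as well. This is a sketch, not part of the original reply; it assumes that cas has the same columns, in the same order, as df, and the name keep is hypothetical.

```r
# Recreate the objects from the thread.
df  <- expand.grid(a = c(0, 5, 10, 25, 50), b = c(0, 25, 50, 100, 200))
cas <- data.frame(a = c(0, 10, 50), b = c(0, 50, 200),
                  row.names = c("Low", "Medium", "High"))

# interaction() labels every row as "a.b". Taking as.character() of the
# result for cas keeps only the three combinations that actually occur,
# in row order: "0.0", "10.50", "50.200".
keep <- as.character(interaction(cas))

# Rows of df whose label is not in keep become NA.
sub <- factor(interaction(df),
              levels = keep,
              labels = rownames(cas),
              ordered = TRUE)
```

With this form, adding a fourth named combination only requires a new row in cas.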
Richard A. O'Keefe
2004-Dec-01 01:37 UTC
[R] Creating a factor from a combination of vectors
Yves Brostaux <brostaux.y at fsagx.ac.be> wrote:

    I want to produce a factor from a subset of the combination of two
    vectors. I have the vectors a and b in a data-frame :

    > df <- expand.grid(a=c(0, 5, 10, 25, 50), b=c(0, 25, 50, 100, 200))
    ...
    and want to create a factor which levels correspond to particular
    combinations of a and b (let's say Low for a=0 & b=0, Medium for a=10 &
    b=50, High for a=50 & b=200, others levels set to NA), reading them from
    a data-frame which describes the desired subset and corresponding levels.

    Here's my own solution (inputs are data-frames df and cas, output is the

Why not do it the obvious way?

    ifelse(a == 0 & b == 0, "Low",
    ifelse(a == 10 & b == 50, "Medium",
    ifelse(a == 50 & b == 200, "High", "Other")))

gives you the mapping from vectors a and b to the strings you want.

To get at the vectors locally, you need

    with(df, ...)

To convert the vector of strings you get to an ordered factor, with "Other" mapped to NA, just do

    ordered(..., levels = c("Low","Medium","High"))

because any string not listed in levels= will be mapped to NA.

Put these pieces together, and you get

    output <- ordered(with(df,
        ifelse(a == 0 & b == 0, "Low",
        ifelse(a == 10 & b == 50, "Medium",
        ifelse(a == 50 & b == 200, "High", "Other")))),
        levels = c("Low","Medium","High"))
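One way to gain confidence in the nested ifelse() form is to run it next to the loop from the original post and compare the two results. The sketch below does that; the names loop_chr, loop_fac, vec_fac and same are hypothetical, and the loop is lightly adapted (seq_len() and a plain if/else instead of 1:length() and ifelse()).

```r
# Recreate the objects from the thread.
df  <- expand.grid(a = c(0, 5, 10, 25, 50), b = c(0, 25, 50, 100, 200))
cas <- data.frame(a = c(0, 10, 50), b = c(0, 50, 200),
                  row.names = c("Low", "Medium", "High"))

# Loop version: look up each row of df in cas, one row at a time.
loop_chr <- character(nrow(df))
for (i in seq_len(nrow(df))) {
  temp <- rownames(cas)[cas$a == df$a[i] & cas$b == df$b[i]]
  loop_chr[i] <- if (length(temp) > 0) temp[1] else NA
}
loop_fac <- ordered(loop_chr, levels = c("Low", "Medium", "High"))

# Vectorized version: nested ifelse(), with "Other" dropped to NA because
# it is not listed in levels=.
vec_fac <- ordered(with(df,
    ifelse(a == 0 & b == 0, "Low",
    ifelse(a == 10 & b == 50, "Medium",
    ifelse(a == 50 & b == 200, "High", "Other")))),
    levels = c("Low", "Medium", "High"))

# Compare the labels position by position, treating NA as equal to NA.
same <- identical(as.character(loop_fac), as.character(vec_fac))
```

Note that the ifelse() form hard-codes the three combinations, so it does not read them from cas; it trades that flexibility for being easy to read when there are only a few cases.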