Honza Hucin
2008-Sep-25 13:42 UTC
[R] Repeated factor levels - inconsistency of factor and levels<- functions?
Hello,

I have a vector x containing letters ("a", "b", etc.). I want to convert it to a factor and group some letters into one common level. If I do this with the factor function, giving the same label to all the values I want to group, it doesn't work:

> x <- letters[1:5]
> x
[1] "a" "b" "c" "d" "e"
> f <- factor(x, levels=letters[1:5], labels=c("vowel","consonant","consonant","consonant","vowel"))
> levels(f)
[1] "vowel"     "consonant" "consonant" "consonant" "vowel"

But afterwards, if I update the level names with a single assignment, levels with the same name are grouped, even though I don't change all of them:

> levels(f)[1] <- "vowel"   # changing only one element makes ALL levels group
> levels(f)
[1] "vowel"     "consonant"

I'm rather confused! I think this behavior is doubly inconsistent. First, labeling in the factor function should work the same way as in levels<-, i.e. levels with the same name should be grouped either by BOTH or by NEITHER. Second, if I change only one element of the levels vector, nothing else should change, and in particular there should be no "invisible" grouping.

Or am I wrong? Or is it a bug?

Jan Hucin
Peter Dalgaard
2008-Sep-25 13:59 UTC
[R] Repeated factor levels - inconsistency of factor and levels<- functions?
Honza Hucin wrote:
> Hello,
>
> I have a vector x containing letters ("a", "b", etc.). I want to
> convert it to a factor and group some letters into one common level. If I
> do this with the factor function, giving the same label to all the values
> I want to group, it doesn't work:
>
> > x <- letters[1:5]
> > x
> [1] "a" "b" "c" "d" "e"
> > f <- factor(x, levels=letters[1:5],
> +     labels=c("vowel","consonant","consonant","consonant","vowel"))
> > levels(f)
> [1] "vowel"     "consonant" "consonant" "consonant" "vowel"
>
> But afterwards, if I update the level names with a single assignment,
> levels with the same name are grouped, even though I don't change all of
> them:
>
> > levels(f)[1] <- "vowel"   # changing only one element makes ALL levels group
> > levels(f)
> [1] "vowel"     "consonant"
>
> I'm rather confused! I think this behavior is doubly inconsistent. First,
> labeling in the factor function should work the same way as in levels<-,
> i.e. levels with the same name should be grouped either by BOTH or by
> NEITHER. Second, if I change only one element of the levels vector, nothing
> else should change, and in particular there should be no "invisible"
> grouping.
>
> Or am I wrong? Or is it a bug?

I asked Brian Ripley the same thing half a year ago and his answer was: "Back compatibility ...."

I'm at a loss trying to figure out what kind of code would depend on the current behaviour, but the workaround is rather obvious, so the motivation for fixing (changing!) it is not too great.

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907
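
The reply does not spell out the workaround. One plausible reading, offered here as an assumption rather than as what Peter necessarily had in mind, is to build the factor with unique levels first and do the grouping in a separate levels<- assignment. A minimal sketch using the named-list form of levels<- documented in ?levels (the level names and the grouping are the ones from the question):

```r
# Workaround sketch (an assumption about what "rather obvious" refers to):
# create the factor with one level per letter, then merge levels afterwards.
x <- letters[1:5]
f <- factor(x, levels = letters[1:5])

# Named-list form of levels<-: each name is a new level,
# each value lists the old levels that it absorbs.
levels(f) <- list(vowel = c("a", "e"), consonant = c("b", "c", "d"))

print(levels(f))        # the two merged levels
print(as.character(f))  # each letter mapped to its group
```

Assigning a plain character vector with repeated names, as in levels(f) <- c("vowel", "consonant", "consonant", "consonant", "vowel"), should merge the levels in the same way; the list form just makes the grouping explicit.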