Hu Chen
2006-Oct-24 04:41 UTC
[R] why it returns list level number instead of its content?
For example, I have a data frame. data$V4 returns:

.....
[6936] P05796 P11096 P76174 P04475 P18775
[6941] P33225 P76387 P76388 P76388 P09375
[6946] P15300 P15723
1375 Levels: O50190 O65938 O69415 P00274 P00363 P00364 P00370 P00373 ... Q9AJ15

data$V5 returns something similar. I want to cbind these two columns, so I use

new <- cbind(data$V4,data$V5)

I expected it to return something like:

[1] P05761 P11986
[2] .......

However, it returns

[1] 215 434
[2] 134 213
.............

It uses the level numbers instead of the contents, such as "P05761". What is wrong? How can I get the contents instead of the level numbers? I could work around it in some dirty way, but I don't understand why this happens.
Richard M. Heiberger
2006-Oct-24 05:14 UTC
[R] why it returns list level number instead of its content?
The class "factor" is defined for vectors, not matrices. The attempt to use a factor in a matrix setting coerces it to numeric. See the documentation for factor where it says "In particular, as.numeric applied to a factor is meaningless, and may happen by implicit coercion."

> tmp <- matrix(1:8, 4)
> tmp
     [,1] [,2]
[1,]    1    5
[2,]    2    6
[3,]    3    7
[4,]    4    8
> factor(tmp)
[1] 1 2 3 4 5 6 7 8
Levels: 1 2 3 4 5 6 7 8
>
> ## A simplified example based on yours:
> tmp <- data.frame(a=factor(letters[1:4]), b=factor(letters[5:8]))
> tmp
  a b
1 a e
2 b f
3 c g
4 d h
> cbind(tmp$a, tmp$b)  ## coerced to numeric, both factors went to 1:4
     [,1] [,2]
[1,]    1    1
[2,]    2    2
[3,]    3    3
[4,]    4    4
> tmp[, c("a","b")]  ## in your example, subscripting is probably the right method
  a b
1 a e
2 b f
3 c g
4 d h
> cbind.data.frame(tmp$a, tmp$b)  ## explicit use of cbind.data.frame works.
  tmp$a tmp$b
1     a     e
2     b     f
3     c     g
4     d     h

Rich
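To make the coercion described above concrete, here is a minimal sketch of what a factor stores internally. The toy values and the names f, codes, labs and m are made up for illustration and are not taken from the original data.

```r
# Toy factor; the values are made up for illustration.
f <- factor(c("P05796", "P11096", "P05796"))

codes <- as.integer(f)   # integer codes stored inside the factor
labs  <- levels(f)       # sorted unique labels that the codes index into

codes                    # 1 2 1
labs                     # "P05796" "P11096"

# cbind() on bare factor vectors builds a matrix and keeps only the codes.
m <- cbind(f, f)
is.numeric(m)            # TRUE

# Indexing the levels by the codes recovers the original labels.
labs[codes]
```

The numbers printed by cbind are therefore positions in the levels vector, which is why they look unrelated to the identifiers themselves.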
John Kane
2006-Oct-24 14:19 UTC
[R] why it returns list level number instead of its content?
You have read in the data as factors. Try class(data$V4) to see this. When you do something like a cbind, R treats the factors as numbers (their underlying integer level codes). You need to convert the factors back to character data. Try something like

d1 <- as.character(data$V4)
d2 <- as.character(data$V5)

and then do a cbind to see what happens. This is not the proper way to do things, I'm sure, but it should help you see what is happening.
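The conversion suggested in this reply can be sketched end to end as follows. The data frame here is a hypothetical stand-in for the poster's data, with made-up values, and the names new and sub are only illustrative.

```r
# Hypothetical stand-in for the poster's data frame; values are made up.
data <- data.frame(
  V4 = factor(c("P05796", "P11096", "P76174")),
  V5 = factor(c("P33225", "P76387", "P76388"))
)

# Convert each factor to character first, then bind: a character matrix.
new <- cbind(as.character(data$V4), as.character(data$V5))
new[1, ]                 # "P05796" "P33225"

# Alternative: subset the data frame, which keeps both columns as factors.
sub <- data[, c("V4", "V5")]
class(sub$V4)            # "factor"
```

Which form is preferable probably depends on what happens next: the character matrix suits code that expects a plain matrix, while the subset keeps the data frame structure. If the columns were never meant to be factors, reading the file with as.is = TRUE in read.table should avoid the conversion in the first place.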