Robert Brown FM CEFAS
2004-Nov-10 12:03 UTC
[R] Basic Q on coercing factors in data frames to numeric
Hi there,

I'm running R 2.0.0 on Windows 95. I'm trying to coerce a column of factors within a data frame to numeric. This is not a problem with a vector, but I can't find a way to index a column within a data frame to achieve this. All the examples from 'An introduction to R', 'S-plus 6 programmers guide', etc., use simple vectors. I'm sure I'm missing something obvious, but I can't find a way to change a single column in a data frame. I've tried several approaches; here is one:

> summary(test)
      age        yrclass        weight              year
 1      :10   1992   :10   Min.   :0.005333   Min.   :1993
 2      :10   1989   : 9   1st Qu.:0.221790   1st Qu.:1995
 3      :10   1990   : 9   Median :0.413411   Median :1997
 4      :10   1991   : 9   Mean   :0.420119   Mean   :1997
 5      :10   1988   : 8   3rd Qu.:0.559819   3rd Qu.:2000
 6      :10   1993   : 8   Max.   :1.189000   Max.   :2002
 (Other):82   (Other):89

> contents(test)

Data frame:test    142 observations and 4 variables    Maximum # NAs:0

        Levels Storage
age         23 integer
yrclass     28 integer
weight          double
year            double

+--------+---------------------------------------------------------------------+
|Variable|Levels                                                               |
+--------+---------------------------------------------------------------------+
|  age   |0,1,10,11,12,13,14,15,16,18,19,2,20,21,24,25,3,4,5,6,7,8,9           |
+--------+---------------------------------------------------------------------+
| yrclass|1969,1974,1975,1978,1979,1980,1981,1982,1983,1984,1985,1986,1987,1988|
|        |1989,1990,1991,1992,1993,1994,1995,1996,1997,1998,1999,2000,2001,2002|
+--------+---------------------------------------------------------------------+

> is(test$yrclass,"factor")
[1] TRUE

> is(test$yrclass,"numeric")
[1] FALSE

> as(test[,2],"numeric")
  [1]  1  1  2  3  3  4  5  6  6  6  7  7  8  8  8  8  9  9  9  9 10 10 10 10 10
 [26] 10 10 11 11 11 11 11 12 12 12 12 12 12 12 13 13 13 13 13 13 13 14 14 14 14
 [51] 14 14 14 14 15 15 15 15 15 15 15 15 15 16 16 16 16 16 16 16 16 16 17 17 17
 [76] 17 17 17 17 17 17 18 18 18 18 18 18 18 18 18 18 19 19 19 19 19 19 19 19 20
[101] 20 20 20 20 20 20 20 21 21 21 21 21 21 21 21 22 22 22 22 22 22 22 23 23 23
[126] 23 23 23 24 24 24 24 24 25 25 25 25 26 26 27 27 28

> is(test$yrclass,"numeric")
[1] FALSE

Regards,

Robert Brown
Roger D. Peng
2004-Nov-10 13:20 UTC
[R] Basic Q on coercing factors in data frames to numeric
I believe this is a FAQ. See
http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-do-I-convert-factors-to-numeric_003f

-roger

--
Roger D. Peng
http://www.biostat.jhsph.edu/~rpeng/
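
For readers following the thread: the values 1 to 28 printed by as(test[,2],"numeric") are the factor's internal level codes, not the year labels, and the result was never assigned back to the data frame, which is why is(test$yrclass,"numeric") stays FALSE. The approach described in the FAQ entry is to convert via the labels and assign the result to the column. Below is a minimal sketch of that idea; the small data frame is a hypothetical stand-in for the poster's `test`, and its values are made up for illustration.

```r
# Hypothetical stand-in for the poster's data frame: 'yrclass' is a
# factor whose labels happen to look like years (values are made up).
test <- data.frame(
  yrclass = factor(c("1992", "1989", "1990", "1992")),
  weight  = c(0.41, 0.22, 0.56, 1.19)
)

# as.numeric() applied directly to a factor returns the internal
# level codes, not the labels.
codes <- as.numeric(test$yrclass)

# Convert through the character labels and assign back to the column,
# so the data frame itself is changed.
test$yrclass <- as.numeric(as.character(test$yrclass))

# Alternative form on a standalone factor: convert each level once,
# then index the converted levels by the integer codes.
f <- factor(c("1992", "1989", "1990", "1992"))
years <- as.numeric(levels(f))[as.integer(f)]
```

Either form gives the years themselves. The second converts each distinct level only once, which can matter for long factors with few levels.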