John Coulthard
2012-Apr-13 13:08 UTC
[R] How do I convert factors to numeric? It's an FAQ but...
Dear R list people

I loaded a file of numbers into R and got a dataframe of factors. So I tried to convert it to numeric as per the FAQ using as.numeric(). But I'm getting errors (please see example), so what am I getting wrong?

Thanks for your time.
John

Example...

#my data object
> f
   GSM187153 GSM187154 GSM187155 GSM187156 GSM187157 GSM187158 GSM187159
13  7.199346  7.394519  7.466155  8.035864  7.438536  7.308401  7.707994
14  6.910426  6.360291  6.228221   7.42918  7.120322  6.108129  7.201477
15   8.85921  9.152096  9.125067    6.4458  8.600319   8.97577  9.691167
16  5.851665  5.621529  5.673689  6.331274  6.160159   5.65945  5.595156
17  9.905257  8.596643   9.11741  9.872789  8.909299  9.104171  9.158998
18  6.176691  6.429807  6.418132  6.849236  6.162308  6.432743  6.444664
19  7.599871  8.795133  8.382509  5.887119  7.941895  7.666692  8.170374
20  9.458262   8.39701  8.402015    9.0859  8.995632  8.427601  8.265105
21  8.179803  9.868286 10.570601  4.905013  9.488779  9.148336  9.654022
22  7.456822  8.037138  7.953766  6.666418  7.674927  7.995109  7.635158
   GSM187160 GSM187161 GSM187162
13  7.269558  7.537711  7.099806
14   6.61534  7.125821  6.413295
15   8.64715  8.252031  9.445682
16  5.639816    5.9257  5.752994
17  8.856829  9.043991  8.839183
18    6.4307   6.71052    6.5269
19  7.674577  7.390617  8.638025
20  8.132649  8.755642  8.137992
21  9.897561  7.619129 10.242096
22  7.836658  7.297986  8.679438
> class(f)
[1] "data.frame"

#all the columns in the dataframe are of class 'factor'
> for(i in 1:ncol(f)){if(class(f[,i])!="factor"){print(class(f[,i]))}}
>

#but it won't convert to numeric
> g<-as.numeric(as.character(f))
Warning message:
NAs introduced by coercion
> g
 [1] NA NA NA NA NA NA NA NA NA NA
> g<-as.numeric(levels(f))[as.integer(f)]
Error: (list) object cannot be coerced to type 'integer'
>

R version 2.14.1 (2011-12-22)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: i386-redhat-linux-gnu (32-bit)
Milan Bouchet-Valat
2012-Apr-13 13:28 UTC
[R] How do I convert factors to numeric? It's an FAQ but...
On Friday 13 April 2012 at 13:08 +0000, John Coulthard wrote:
> [...]
> #but it won't convert to numeric
> > g<-as.numeric(as.character(f))
> Warning message:
> NAs introduced by coercion
> > g
>  [1] NA NA NA NA NA NA NA NA NA NA
> > g<-as.numeric(levels(f))[as.integer(f)]
> Error: (list) object cannot be coerced to type 'integer'

That's because you're trying to convert the whole data frame, which is a list of vectors, instead of converting the vectors individually. You can use:

    g <- sapply(f, function(x) as.numeric(as.character(x)))

But it would probably be better to fix the import step so that you get numeric vectors in the first place. ;-)

Cheers
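
Editor's note on "fix the import step": the thread never says why the import produced factors, so the following is only a sketch under an assumed cause. A common one is a stray non-numeric token in the file, which makes read.table() treat the column as text (and, before R 4.0.0, turn it into a factor by default). The file contents and the token "null" below are hypothetical.

```r
# Hypothetical file contents: one stray token ("null") in a column of numbers.
txt <- paste(
  "GSM187153 GSM187154",
  "7.199346 7.394519",
  "6.910426 null",
  "8.85921 9.152096",
  sep = "\n"
)

# stringsAsFactors = TRUE reproduces the pre-4.0.0 default: the column that
# contains the stray token is read as text and becomes a factor.
raw <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE)
print(sapply(raw, class))

# Declaring the stray token as a missing-value marker lets read.table()
# parse both columns as numeric, with NA where the token was.
fixed <- read.table(text = txt, header = TRUE, na.strings = c("NA", "null"))
print(sapply(fixed, class))
```

If the cause in a real file is different (a wrong separator or decimal mark, a header read as data), the sep, dec and header arguments of read.table() are the places to look instead.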
ONKELINX, Thierry
2012-Apr-13 13:29 UTC
[R] How do I convert factors to numeric? It's an FAQ but...
f is a dataframe of factors, not a factor. Use either

    as.numeric(levels(f$your.factor))[f$your.factor]

or, if f only contains factors,

    sapply(f, function(x){as.numeric(levels(x))[x]})

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
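
Editor's note: a minimal sketch of the levels-indexing idiom from this reply, using a made-up three-value factor and a made-up two-column data frame (the names x, f, a and b are illustrative, not the poster's data). It also shows why the per-column function has to be applied with sapply() or lapply() rather than apply(): apply() converts a data frame to a matrix first, so the function no longer sees factors.

```r
# A factor holding numbers as text, as produced by a faulty import.
x <- factor(c("8.85921", "5.851665", "8.85921"))

# as.numeric() on a factor returns the internal integer codes, not the values.
codes <- as.numeric(x)
print(codes)

# Convert the distinct levels once, then index them by the factor's codes.
values <- as.numeric(levels(x))[x]
print(values)

# Per-column version for a data frame that contains only factors.
f <- data.frame(a = x, b = factor(c("6.4458", "6.331274", "9.872789")))
m <- sapply(f, function(col) as.numeric(levels(col))[col])
print(m)

# apply() hands each column over as a character vector, so levels() is NULL
# there and the same function would no longer be operating on a factor.
print(class(as.matrix(f)[, "a"]))
```

Note that sapply() returns a numeric matrix here; wrap the lapply() form in as.data.frame() if a data frame is wanted.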
David Winsemius
2012-Apr-13 13:34 UTC
[R] How do I convert factors to numeric? It's an FAQ but...
On Apr 13, 2012, at 9:08 AM, John Coulthard wrote:

> Dear R list people
>
> I loaded a file of numbers into R and got a dataframe of factors.
> So I tried to convert it to numeric as per the FAQ using as.numeric().

Actually you used as.numeric(as.character()), which should have been successful under ordinary circumstances. However, you applied it to an entire dataframe, when you should have applied it to each column separately. The last error message told you that you were sending the function the wrong datatype (list).

> But I'm getting errors (please see example), so what am I getting
> wrong?
>
> Thanks for your time.
> John
>
> Example...
>
> #my data object
> > f
> [...]

That is not a reproducible example. You should provide the unedited output from dput(f).

Try:

numf <- lapply(f, function(x) as.numeric(as.character(x)) )  # returns a list
numf <- as.data.frame(numf)
str(numf)
'data.frame':   10 obs. of  7 variables:
 $ GSM187153: num  7.2 6.91 8.86 5.85 9.91 ...
 $ GSM187154: num  7.39 6.36 9.15 5.62 8.6 ...
 $ GSM187155: num  7.47 6.23 9.13 5.67 9.12 ...
 $ GSM187156: num  8.04 7.43 6.45 6.33 9.87 ...
 $ GSM187157: num  7.44 7.12 8.6 6.16 8.91 ...
 $ GSM187158: num  7.31 6.11 8.98 5.66 9.1 ...
 $ GSM187159: num  7.71 7.2 9.69 5.6 9.16 ...

Tested on

> dput(f)
structure(list(
GSM187153 = structure(c(4L, 3L, 8L, 1L, 10L, 2L, 6L, 9L, 7L, 5L),
  .Label = c("5.851665", "6.176691", "6.910426", "7.199346", "7.456822",
  "7.599871", "8.179803", "8.85921", "9.458262", "9.905257"), class = "factor"),
GSM187154 = structure(c(4L, 2L, 9L, 1L, 7L, 3L, 8L, 6L, 10L, 5L),
  .Label = c("5.621529", "6.360291", "6.429807", "7.394519", "8.037138",
  "8.39701", "8.596643", "8.795133", "9.152096", "9.868286"), class = "factor"),
GSM187155 = structure(c(5L, 3L, 10L, 2L, 9L, 4L, 7L, 8L, 1L, 6L),
  .Label = c("10.570601", "5.673689", "6.228221", "6.418132", "7.466155",
  "7.953766", "8.382509", "8.402015", "9.11741", "9.125067"), class = "factor"),
GSM187156 = structure(c(8L, 7L, 4L, 3L, 10L, 6L, 2L, 9L, 1L, 5L),
  .Label = c("4.905013", "5.887119", "6.331274", "6.4458", "6.666418",
  "6.849236", "7.42918", "8.035864", "9.0859", "9.872789"), class = "factor"),
GSM187157 = structure(c(4L, 3L, 7L, 1L, 8L, 2L, 6L, 9L, 10L, 5L),
  .Label = c("6.160159", "6.162308", "7.120322", "7.438536", "7.674927",
  "7.941895", "8.600319", "8.909299", "8.995632", "9.488779"), class = "factor"),
GSM187158 = structure(c(4L, 2L, 8L, 1L, 9L, 3L, 5L, 7L, 10L, 6L),
  .Label = c("5.65945", "6.108129", "6.432743", "7.308401", "7.666692",
  "7.995109", "8.427601", "8.97577", "9.104171", "9.148336"), class = "factor"),
GSM187159 = structure(c(5L, 3L, 10L, 1L, 8L, 2L, 6L, 7L, 9L, 4L),
  .Label = c("5.595156", "6.444664", "7.201477", "7.635158", "7.707994",
  "8.170374", "8.265105", "9.158998", "9.654022", "9.691167"), class = "factor")),
.Names = c("GSM187153", "GSM187154", "GSM187155", "GSM187156", "GSM187157",
  "GSM187158", "GSM187159"),
class = "data.frame",
row.names = c("13", "14", "15", "16", "17", "18", "19", "20", "21", "22"))

David Winsemius, MD
West Hartford, CT
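
Editor's note: the point that a data frame is a list of columns can be sketched as follows. The two-column data frame is a made-up stand-in for the poster's f, and the name num is illustrative. The sketch shows why as.character() on the whole object yields one string per column, and a variant of the lapply() approach that assigns into num[] so that the result stays a data frame.

```r
# Stand-in data: two factor columns holding numbers as text.
f <- data.frame(
  GSM187153 = factor(c("7.199346", "6.910426", "8.85921")),
  GSM187154 = factor(c("7.394519", "6.360291", "9.152096"))
)

# as.character() on the whole data frame gives one string per column (the
# column is deparsed as a whole), so as.numeric() can only return NAs.
per_column <- as.character(f)
print(length(per_column))
whole <- suppressWarnings(as.numeric(per_column))
print(whole)

# Converting column by column works. Assigning into num[] keeps the data
# frame structure (column names, row names) instead of returning a bare list.
num <- f
num[] <- lapply(f, function(x) as.numeric(as.character(x)))
str(num)
```

This is equivalent in effect to calling as.data.frame() on the lapply() result; the num[] form simply avoids the intermediate list.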