Methekar, Pushpa (GE Transportation, Non-GE)
2015-Feb-26 09:06 UTC
[R] covert entire dataset to numeric while persuing percentage values
Hi,

I am a little confused about how to convert an entire dataset to numeric. I read the data like this:

xelements <- read.csv(file.choose(), header = TRUE, stringsAsFactors = FALSE)

> str(xelements)
'data.frame': 731 obs. of 4 variables:
 $ Engine.Speed     : chr "rpm" "ES" "rpm" "1049" ...
 $ X..NG.by.Energy  : chr "" "% NG by Energy" "%" "0%" ...
 $ Int.Mfld.Temp    : chr "" "Int Mfld Temp" "?C" "49" ...
 $ Cmd.Advance.Angle: chr "" "Cmd Advance Angle" "?BTDC" "13.8" ...

The second column holds percentage values such as 0%, 10%, ... Whenever I convert the whole dataset at once, I get the warning "NAs introduced by coercion" and the second column becomes NA. Converting the columns separately works:

> xelements$Engine.Speed <- as.numeric(xelements$Engine.Speed)
Warning message:
NAs introduced by coercion
> xelements$X..NG.by.Energy <- as.numeric(sub("%", "", xelements$X..NG.by.Energy)) / 100
Warning message:
NAs introduced by coercion
> xelements$Int.Mfld.Temp <- as.numeric(xelements$Int.Mfld.Temp)
Warning message:
NAs introduced by coercion
> xelements$Cmd.Advance.Angle <- as.numeric(xelements$Cmd.Advance.Angle)
Warning message:
NAs introduced by coercion

But I want to convert the whole dataset in one step, and I would like to write a function that does this. I tried:

xelements <- data.frame(sapply(xelements, function(x) as.numeric(as.character(x))))
sapply(xelements, class)

but this cannot convert percentage values like 10%, 20%, ... Please help if you know a way.

Thank you

[[alternative HTML version deleted]]
JS Huang
2015-Feb-26 14:19 UTC
[R] covert entire dataset to numeric while persuing percentage values
The following data.frame x has one column named Percent.

> x
  Percent
1     10%
2     20%
3     30%
> as.numeric(substr(x$Percent, 1, nchar(x$Percent) - 1))
[1] 10 20 30
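The substr() call above removes the last character whether or not it is a percent sign, so a value such as "49" would lose its final digit. A minimal sketch of a slightly safer variant, assuming the same one-column data.frame (the sample data here is made up), strips the "%" only when it is actually present:

```r
# Hypothetical sample data mirroring the example above
x <- data.frame(Percent = c("10%", "20%", "30%"), stringsAsFactors = FALSE)

# Remove a trailing "%" only where one exists, then convert to numeric
pct <- as.numeric(sub("%$", "", x$Percent))
pct         # 10 20 30

# Divide by 100 if proportions rather than percent values are wanted
pct / 100   # 0.1 0.2 0.3
```

Whether to keep the values as 10, 20, 30 or rescale them to 0.1, 0.2, 0.3 depends on how the column is used later; the original post divides by 100.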
jim holtman
2015-Feb-26 14:34 UTC
[R] covert entire dataset to numeric while persuing percentage values
It would help a lot if you posted a subset of your data using 'dput' so that we know what it actually looks like. You have character data mixed with numerics, so you will get NAs in some cases. Conversion of percent to numeric is accomplished with something like this:

> x <- c('12%', '6%', '3.75%')
> # convert to a number
> as.numeric(gsub("%", "", x)) / 100
[1] 0.1200 0.0600 0.0375

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
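For the "whole dataset at a time" part of the question, the same gsub idea can be wrapped in a function and applied to every column. This is only a sketch: to_numeric is a hypothetical helper name, the sample data is made up to resemble the str() output in the original post, and rows that hold unit strings simply become NA.

```r
# Hypothetical helper: convert one character vector to numeric.
# Values ending in "%" are divided by 100; non-numeric strings become NA.
to_numeric <- function(x) {
  x <- trimws(as.character(x))
  is_pct <- grepl("%$", x)
  num <- suppressWarnings(as.numeric(sub("%$", "", x)))
  ifelse(is_pct, num / 100, num)
}

# Made-up sample resembling the str() output in the original post
xelements <- data.frame(
  Engine.Speed      = c("rpm", "1049", "1050"),
  X..NG.by.Energy   = c("%", "0%", "10%"),
  Int.Mfld.Temp     = c("?C", "49", "50"),
  Cmd.Advance.Angle = c("?BTDC", "13.8", "14.1"),
  stringsAsFactors  = FALSE
)

# Assigning into xelements[] keeps the data.frame and its column names
xelements[] <- lapply(xelements, to_numeric)
str(xelements)
```

Using lapply with xelements[] rather than data.frame(sapply(...)) avoids the intermediate matrix that sapply builds. Note that suppressWarnings hides the coercion warning, so rows that turn into NA should be checked afterwards, for example with complete.cases().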
Jeff Newmiller
2015-Feb-26 15:05 UTC
[R] covert entire dataset to numeric while persuing percentage values
I think you are getting ahead of yourself. You use the term "dataset", which is colloquial and not precise. The read.csv function returns a data.frame, in which each column can have its own storage mode ("type"). Most data.frames do not have all columns of the same type... if they did, you might consider converting to a matrix, but with different units in each column that would not be a good idea in this case.

From your str output, I think you need to skip loading the second and third lines of your file in the first place, since it looks like they consist of unit strings. Something like:

fname <- file.choose()
xelements <- read.csv( fname, header=FALSE, skip=3, stringsAsFactors=FALSE )

but this does not get your column names. One way to get those would be:

names( xelements ) <- names( read.csv( fname ) )

As for the percent signs, you can convert those with something like:

xelements$X..NG.by.Energy <- as.numeric( sub( "%", "", xelements$X..NG.by.Energy ) )

In the future, please don't post in HTML format, as it just leads to confusion on this plain text mailing list. Read the Posting Guide for other warnings, and let us follow your journey to your problem with a reproducible example. There are various discussions online of what is reproducible... you might start with [1]. Note that the read.csv function supports a "text" argument that lets you embed a sample of lines from your file into your example, so we could troubleshoot your input process better if that is where your problem is.

[1] http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

---------------------------------------------------------------------------
Jeff Newmiller
DCN: <jdnewmil at dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.
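The "text" argument mentioned above can be combined with the skip approach in a self-contained example. The sketch below is an illustration under assumptions: the CSV content is made up, and it assumes exactly one units line under the header, so the skip value would need adjusting to match the real file.

```r
# Made-up sample text; the real file layout is an assumption
csv_text <- "Engine Speed,% NG by Energy,Int Mfld Temp,Cmd Advance Angle
rpm,%,degC,degBTDC
1049,0%,49,13.8
1050,10%,50,14.1"

# Read only the header (plus one row) to obtain the cleaned column names
col_names <- names(read.csv(text = csv_text, nrows = 1))

# Read the data, skipping the header line and the single units line
xelements <- read.csv(text = csv_text, header = FALSE, skip = 2,
                      col.names = col_names, stringsAsFactors = FALSE)

# Only the percent column is still character; convert it to a proportion
xelements$X..NG.by.Energy <-
  as.numeric(sub("%", "", xelements$X..NG.by.Energy)) / 100
str(xelements)
```

Because the unit strings never enter the data.frame, read.csv can infer numeric types for the other columns on its own, and no NAs are introduced. The division by 100 follows the original post; drop it to keep percent values.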