jpm miao
2013-May-02 03:55 UTC
[R] Problems with reading data by readWorksheetFromFile of XLConnect Package
Hi, Attached are two datasheet to be read. My raw data "130502temp.xlsx" contains numbers with ' symbols, and they can't be read as numbers. Even if I copy and paste as numbers to form a new file "130502temp_number1.xlsx", they could not be read smoothly. 1. How can I read the datasheet as numbers? 2. How can I treat the notation "-" as (1) "NA" or (2) zero? Thanks, Miao> temp<-readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE,startRow=2, endRow= 11, startCol=2, endCol=5)> tempCol1 Col2 Col3 Col4 1 647,853 1,413 57,662 27,897 2 491,400 1,365 40,919 20,411 3 38,604 - 5,505 985 4 576 - 20 54 5 80,845 21 10,211 4,494 6 36,428 27 1,007 1,953 7 269,915 587 32,988 12,779 8 224,494 - 30,554 9,184 9 11,858 587 - 686 10 3,742 - 81 415> temp[2,2][1] "1,365"> temp[2,2]+3Error in temp[2, 2] + 3 : non-numeric argument to binary operator> temp_num<-readWorksheetFromFile("130502temp_number1.xlsx", sheet=1,header=FALSE, startRow=2, endRow= 11, startCol=2, endCol=5)> temp_num[2,2][1] "1,365"> temp_num[2,2]+3Error in temp_num[2, 2] + 3 : non-numeric argument to binary operator> as.numeric(temp_num[2,2])+3[1] NA Warning message: NAs introduced by coercion
Anthony Damico
2013-May-02 09:39 UTC
[R] Problems with reading data by readWorksheetFromFile of XLConnect Package
try adding colTypes = 'numeric' to your readWorkSheetFromFile() call if that doesn't work, try a few other steps # view what data types your file is being read in as sapply( temp , class ) # convert all fields to character if they're factor variables.. but i don't think you need this, readWorksheet defaults to `character` temp <- sapply( temp , as.character ) # you can also convert a subset like this temp[ , c( 1 , 3:4 ) ] <- sapply( temp[ , c( 1 , 3:4 ) ] , as.character ) # remove commas from character strings temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) ) # convert all fields to numeric temp <- sapply( temp , as.numeric ) # convert all NA fields to zeroes if you prefer temp[ is.na( temp ) ] <- 0 On Wed, May 1, 2013 at 11:55 PM, jpm miao <miaojpm@gmail.com> wrote:> Hi, > > Attached are two datasheet to be read. > My raw data "130502temp.xlsx" contains numbers with ' symbols, and they > can't be read as numbers. Even if I copy and paste as numbers to form a new > file "130502temp_number1.xlsx", they could not be read smoothly. > > 1. How can I read the datasheet as numbers? > 2. How can I treat the notation "-" as (1) "NA" or (2) zero? > > Thanks, > > Miao > > > > > > temp<-readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE, > startRow=2, endRow= 11, startCol=2, endCol=5) > > > temp > > Col1 Col2 Col3 Col4 > > 1 647,853 1,413 57,662 27,897 > > 2 491,400 1,365 40,919 20,411 > > 3 38,604 - 5,505 985 > > 4 576 - 20 54 > > 5 80,845 21 10,211 4,494 > > 6 36,428 27 1,007 1,953 > > 7 269,915 587 32,988 12,779 > > 8 224,494 - 30,554 9,184 > > 9 11,858 587 - 686 > > 10 3,742 - 81 415 > > > temp[2,2] > > [1] "1,365" > > > temp[2,2]+3 > > Error in temp[2, 2] + 3 : non-numeric argument to binary operator > > > temp_num<-readWorksheetFromFile("130502temp_number1.xlsx", sheet=1, > header=FALSE, startRow=2, endRow= 11, startCol=2, endCol=5) > > > temp_num[2,2] > > [1] "1,365" > > > temp_num[2,2]+3 > > Error in temp_num[2, 2] + 3 : non-numeric argument to binary operator > > > as.numeric(temp_num[2,2])+3 > > [1] NA > > Warning message: > > NAs introduced by coercion > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > >[[alternative HTML version deleted]]