David Winsemius
2013-Dec-14 00:35 UTC
[R] colClasses does not cause read.table to coerce to numeric; anymore?
I thought that setting colClasses to numeric would coerce errant data to NA. Instead, read.table is throwing errors. This is not what I remember from prior experience with read.table, and it is not what I read the help page as promising:

BE<-
c(" 1841 96 42.26 31.50 73.75 ",
" 1841 97 29.56 20.78 50.34 ",
" 1841 98 18.71 10.59 29.30 ",
" 1841 99 10.48 6.23 16.71 ",
" 1841 100 6.14 4.23 10.37 ",
" 1841 101 3.31 2.06 5.38 ",
" 1841 102 1.50 0.83 2.34 ",
" 1841 103 0.33 0.05 0.38 ",
" 1841 104 0.00 0.00 0.00 ",
" 1841 105 0.00 0.00 0.00 ",
" 1841 106 0.00 0.00 0.00 ",
" 1841 107 0.00 0.00 0.00 ",
" 1841 108 0.00 0.00 0.00 ",
" 1841 109 0.00 0.00 0.00 ",
" 1841 110+ 0.00 0.00 0.00 ",
" 1842 0 60290.60 62238.19 122528.79 ",
" 1842 1 54893.31 55849.06 110742.37 ",
" 1842 2 51991.87 53033.62 105025.49 ",
" 1842 3 49697.90 50789.01 100486.91 ",
" 1842 4 47598.24 48414.78 96013.02 ",
" 1842 5 46202.38 47106.34 93308.72 "
)
#-----------
BELe<-read.table(text=BE,
header=FALSE, colClasses="numeric", as.is=TRUE)
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
scan() expected 'a real', got '110+'

I originally got this when reading from a file, but the error is from scan(). Was this an unfortunate side effect of adding the `text` argument to read.table? It still persists when the character vector is passed through textConnection to the file argument:

BELe<-read.table(file=textConnection(BE),
header=FALSE, colClasses="numeric")
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
scan() expected 'a real', got '110+'

My memory was that such coercion was effective in past years.

--
David Winsemius
Alameda, CA, USA
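The message quoted in the post comes from scan() itself, so it can be reproduced without read.table or the `text` argument of read.table. The following is a minimal sketch, not part of the original post, that calls scan() directly on the offending field and contrasts it with as.numeric(), which does coerce to NA (with a warning). The variable names `msg` and `x` are illustrative only.

```r
# scan() refuses a field that is not a valid number when what = numeric()
msg <- tryCatch(
  scan(text = "110+", what = numeric(), quiet = TRUE),
  error = function(e) conditionMessage(e)
)
print(msg)

# as.numeric() on a character value coerces to NA instead of failing;
# the coercion warning is suppressed here to keep the output clean
x <- suppressWarnings(as.numeric("110+"))
print(x)
```

The contrast suggests that NA coercion is a property of as.numeric() applied to character data, not of reading a field that has been declared numeric up front.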
Uwe Ligges
2013-Dec-14 15:53 UTC
[R] colClasses does not cause read.table to coerce to numeric; anymore?
David,

how should R interpret "110+"? It cannot be numeric; perhaps you did not notice the "+" there?

Uwe

On 14.12.2013 01:35, David Winsemius wrote:
>
> I thought that setting colClasses to numeric would coerce errant data to NA. Instead, read.table is throwing
> errors. This is not what I remember from prior experience with read.table, and it is not what I read the help page as promising:
>
> BE<-
> c(" 1841 96 42.26 31.50 73.75 ",
> " 1841 97 29.56 20.78 50.34 ",
> " 1841 98 18.71 10.59 29.30 ",
> " 1841 99 10.48 6.23 16.71 ",
> " 1841 100 6.14 4.23 10.37 ",
> " 1841 101 3.31 2.06 5.38 ",
> " 1841 102 1.50 0.83 2.34 ",
> " 1841 103 0.33 0.05 0.38 ",
> " 1841 104 0.00 0.00 0.00 ",
> " 1841 105 0.00 0.00 0.00 ",
> " 1841 106 0.00 0.00 0.00 ",
> " 1841 107 0.00 0.00 0.00 ",
> " 1841 108 0.00 0.00 0.00 ",
> " 1841 109 0.00 0.00 0.00 ",
> " 1841 110+ 0.00 0.00 0.00 ",
> " 1842 0 60290.60 62238.19 122528.79 ",
> " 1842 1 54893.31 55849.06 110742.37 ",
> " 1842 2 51991.87 53033.62 105025.49 ",
> " 1842 3 49697.90 50789.01 100486.91 ",
> " 1842 4 47598.24 48414.78 96013.02 ",
> " 1842 5 46202.38 47106.34 93308.72"
> )
> #-----------
> BELe<-read.table(text=BE,
> header=FALSE, colClasses="numeric", as.is=TRUE)
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
> scan() expected 'a real', got '110+'
>
> I originally got this when reading from a file, but the error is from scan(). Was this an unfortunate side effect of adding the `text` argument to read.table? It still persists when the character vector is passed through textConnection to the file argument:
>
> BELe<-read.table(file=textConnection(BE),
> header=FALSE, colClasses="numeric")
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
> scan() expected 'a real', got '110+'
>
> My memory was that such coercion was effective in past years.
>
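Since "110+" cannot be read as a number, one way to get the behaviour asked about is to read the second column as character and convert it afterwards. This is a minimal sketch under that assumption, not something proposed in the thread; it uses a shortened copy of the posted data, and the names `age_na` and `age_110` are hypothetical.

```r
# Shortened copy of the data from the original post
BE <- c(" 1841 109 0.00 0.00 0.00 ",
        " 1841 110+ 0.00 0.00 0.00 ",
        " 1842 0 60290.60 62238.19 122528.79 ")

# Declare only the second column as character so scan() accepts "110+"
BELe <- read.table(text = BE, header = FALSE,
                   colClasses = c("numeric", "character",
                                  "numeric", "numeric", "numeric"))

# Option 1: coerce errant values to NA
# (as.numeric() warns about the coercion; the warning is suppressed here)
age_na <- suppressWarnings(as.numeric(BELe$V2))

# Option 2: keep the open-ended age group by dropping the trailing "+"
age_110 <- as.numeric(sub("\\+$", "", BELe$V2))

print(age_na)
print(age_110)
```

Option 1 matches the "coerce to NA" expectation in the question; Option 2 may be preferable if the "110+" row should stay in the data as age 110.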