David Winsemius
2013-Dec-14  00:35 UTC
[R] colClasses does not cause read.table to coerce to numeric; anymore?
I thought that setting colClasses to "numeric" would coerce errant data to NA.
Instead, read.table is throwing errors. This is not what I remember from prior
experience with read.table, and it is not what I read the help page as promising:
BE<-
c("   1841       96           42.26        31.50        73.75 ", 
"   1841       97           29.56        20.78        50.34 ", 
"   1841       98           18.71        10.59        29.30 ", 
"   1841       99           10.48         6.23        16.71 ", 
"   1841      100            6.14         4.23        10.37 ", 
"   1841      101            3.31         2.06         5.38 ", 
"   1841      102            1.50         0.83         2.34 ", 
"   1841      103            0.33         0.05         0.38 ", 
"   1841      104            0.00         0.00         0.00 ", 
"   1841      105            0.00         0.00         0.00 ", 
"   1841      106            0.00         0.00         0.00 ", 
"   1841      107            0.00         0.00         0.00 ", 
"   1841      108            0.00         0.00         0.00 ", 
"   1841      109            0.00         0.00         0.00 ", 
"   1841      110+           0.00         0.00         0.00 ", 
"   1842        0        60290.60     62238.19    122528.79 ", 
"   1842        1        54893.31     55849.06    110742.37 ", 
"   1842        2        51991.87     53033.62    105025.49 ", 
"   1842        3        49697.90     50789.01    100486.91 ", 
"   1842        4        47598.24     48414.78     96013.02 ", 
"   1842        5        46202.38     47106.34     93308.72 "
)
#-----------
 BELe<-read.table(text=BE,
                  header=FALSE, colClasses="numeric", as.is=TRUE)
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
  scan() expected 'a real', got '110+'
I originally got this when reading from a file, but the error is from scan().
Was this an unfortunate side effect of adding the `text` argument to read.table?
It still persists when the character vector is passed through textConnection
to the file argument:
BELe<-read.table(file=textConnection(BE),
                 header=FALSE, colClasses="numeric")
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
  scan() expected 'a real', got '110+'
My memory was that such coercion was effective in past years.
-- 
David Winsemius
Alameda, CA, USA
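
For reference, a minimal sketch of one way to get the NA coercion described in the question. It assumes it is acceptable to read every column as character first and convert afterwards. The sample is a shortened copy of BE, and the object name `raw` is only illustrative.

```r
# Shortened sample of the BE vector from the post (three of the original rows).
BE <- c("   1841      109            0.00         0.00         0.00 ",
        "   1841      110+           0.00         0.00         0.00 ",
        "   1842        0        60290.60     62238.19    122528.79 ")

# Read all columns as character so scan() never tries to parse '110+' as a real.
raw <- read.table(text = BE, header = FALSE, colClasses = "character")

# Convert column by column. as.numeric() returns NA for strings it cannot
# parse and issues a warning, which is suppressed here because NA is intended.
BELe <- as.data.frame(lapply(raw, function(x) suppressWarnings(as.numeric(x))))

str(BELe)
```

The conversion happens after the read, so the error from scan() never arises; the cost is that every unparseable value becomes NA silently once the warning is suppressed.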
Uwe Ligges
2013-Dec-14  15:53 UTC
[R] colClasses does not cause read.table to coerce to numeric; anymore?
David,

how should R interpret "110+"? It cannot be numeric. Perhaps you did not notice the "+" there?

Uwe
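
Two possible ways to tell R how to interpret "110+" while keeping colClasses="numeric", as a sketch on a shortened copy of BE. Reading "110+" as an open-ended group starting at 110 is an assumption about the data, not something stated in the thread, and the object names `asNA` and `as110` are illustrative.

```r
# Shortened sample of the BE vector from the original post.
BE <- c("   1841      109            0.00         0.00         0.00 ",
        "   1841      110+           0.00         0.00         0.00 ",
        "   1842        0        60290.60     62238.19    122528.79 ")

# Option 1: declare '110+' a missing-value marker, so it is read as NA
# instead of triggering the "expected 'a real'" error.
asNA <- read.table(text = BE, header = FALSE, colClasses = "numeric",
                   na.strings = c("NA", "110+"))

# Option 2 (assumption: '110+' means "110 and over"): strip the plus sign
# before parsing, so the field is read as the number 110.
as110 <- read.table(text = sub("+", "", BE, fixed = TRUE),
                    header = FALSE, colClasses = "numeric")

asNA$V2
as110$V2
```

Option 1 loses the information in that row's second column, while Option 2 keeps it but hides the fact that the group is open-ended.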