Thank you, Sarah. Updating to a newer version does indeed seem to solve the
problem. For completeness, the first sessionInfo() below is from the setup in
which it works properly (readr 1.3.1), and the second is from the setup in
which I observe the problem I described (readr 1.1.1).
> sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] readr_1.3.1
loaded via a namespace (and not attached):
[1] compiler_3.5.3 assertthat_0.2.1 R6_2.4.0 cli_1.1.0 hms_0.4.2
[6] tools_3.5.3 pillar_1.3.1 tibble_2.1.1 Rcpp_1.0.1 crayon_1.3.4
[11] utf8_1.1.4 fansi_0.4.0 pkgconfig_2.0.2 rlang_0.3.4
> sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] readr_1.1.1
loaded via a namespace (and not attached):
[1] compiler_3.4.2 assertthat_0.2.0 R6_2.2.2 cli_1.0.0 hms_0.3 tools_3.4.2
[7] pillar_1.3.0 tibble_1.4.2 Rcpp_1.0.0 crayon_1.3.4 utf8_1.1.4 fansi_0.2.3
[13] rlang_0.3.0.1
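For anyone who cannot update readr, the fallback raised in the original
message (treating the layout's "N" as double) might be sketched as below. This
is only a sketch under assumptions: the layout codes are taken to be N, C and D
as in this thread, the data lines are copied from the original message into a
temporary file, and `layout_types` and `to_col_types` are hypothetical names,
not part of readr. In readr's compact col_types string, "n" means col_number()
and "d" means col_double(), so the only change is what "N" is translated to.

```r
# Hypothetical column types as reported by the legacy layout file
layout_types <- c("N", "N", "N")

# Translate layout codes to readr's compact col_types string.
# Assumption: N and D should both be read as double ("d"), C as character ("c").
to_col_types <- function(codes) {
  chartr("NCD", "dcd", paste0(codes, collapse = ""))
}

types <- to_col_types(layout_types)

# Recreate the pseudo-data from the original message in a temporary file
myFile <- tempfile(fileext = ".txt")
writeLines(c("11e-201043", "1712201043", "1912201055"), myFile)

# Only attempt the read if readr is installed
if (requireNamespace("readr", quietly = TRUE)) {
  pos <- readr::fwf_positions(c(1, 2, 7), c(1, 6, 10))
  print(readr::read_fwf(file = myFile, col_positions = pos, col_types = types))
}
```

Whether this or simply updating readr is the better fix depends on how much
control you have over the installed packages; in my case, updating was enough.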
-----Original Message-----
From: Sarah Goslee <sarah.goslee at gmail.com>
Sent: Wednesday, April 24, 2019 11:12 AM
To: Doran, Harold <HDoran at air.org>
Cc: r-help at r-project.org
Subject: Re: [R] Read_fwf in package readr, double vs. numeric
Hi,
I can't reproduce your problem: with readr 1.3.1 on Linux, it works as
expected. Letting read_fwf guess the types also works fine. (See
below.)
If you aren't running the current version of readr, update and retry.
If you are, then we probably need more info, at least sessionInfo().
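
A quick way to check which readr is installed before retrying could look like
the sketch below. The helper name `needs_update` and the comparison against
1.3.1 are assumptions for illustration; 1.3.1 is simply the version that
appears in the sessionInfo() output in this thread, not an official minimum.

```r
# Hypothetical helper: TRUE if `installed` is older than `known_good`
needs_update <- function(installed, known_good = "1.3.1") {
  package_version(installed) < package_version(known_good)
}

if (requireNamespace("readr", quietly = TRUE)) {
  # packageVersion() reports the version of the installed package
  installed <- as.character(utils::packageVersion("readr"))
  message("readr ", installed, " is installed")
  if (needs_update(installed)) {
    message("Consider running install.packages(\"readr\") and retrying")
  }
} else {
  message("readr is not installed")
}
```
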
Sarah
library(readr)
myFile <- "foo.txt"
pos <- fwf_positions(c(1,2,7), c(1,6,10))
type <- c('N','D','N')
types <- paste0(type, collapse = '')
types <- chartr('NCD', 'ncd', types)
read_fwf(file = myFile, col_positions = pos, col_types = types)
# A tibble: 3 x 3
X1 X2 X3
<dbl> <dbl> <dbl>
1 1 1.00e-20 1043
2 1 7.12e+ 4 1043
3 1 9.12e+ 4 1055
type <- c('N','N','N')
types <- paste0(type, collapse = '')
types <- chartr('NCD', 'ncd', types)
read_fwf(file = myFile, col_positions = pos, col_types = types)
# A tibble: 3 x 3
X1 X2 X3
<dbl> <dbl> <dbl>
1 1 1.00e-20 1043
2 1 7.12e+ 4 1043
3 1 9.12e+ 4 1055
> read_fwf(file = myFile, col_positions = pos, col_types = NULL)
Parsed with column specification:
cols(
X1 = col_double(),
X2 = col_double(),
X3 = col_double()
)
# A tibble: 3 x 3
X1 X2 X3
<dbl> <dbl> <dbl>
1 1 1.00e-20 1043
2 1 7.12e+ 4 1043
3 1 9.12e+ 4 1055
> sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Fedora 28 (Workstation Edition)
Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] readr_1.3.1 colorout_1.2-0
loaded via a namespace (and not attached):
[1] compiler_3.5.3 assertthat_0.2.0 R6_2.4.0 cli_1.0.1
[5] hms_0.4.2 tools_3.5.3 pillar_1.3.1 tibble_2.0.1
[9] Rcpp_1.0.0 crayon_1.3.4 utf8_1.1.4 fansi_0.4.0
[13] pkgconfig_2.0.2 rlang_0.3.1
On Wed, Apr 24, 2019 at 10:56 AM Doran, Harold <HDoran at air.org> wrote:
>
> Suppose I have the following data sitting in a fwf file 'foo.txt'. The point
> of this email is to ask the group how to properly read in the value in this
> pseudo-data "1e-20" using the read_fwf function in the package readr.
>
> 11e-201043
> 1712201043
> 1912201055
>
> First, suppose I do it this way, where in this case "D" is used for double
> precision.
>
> library(readr)
> pos <- fwf_positions(c(1,2,7), c(1,6,10))
> type <- c('N','D','N')
> types <- paste0(type, collapse = '')
> types <- chartr('NCD', 'ncd', types)
>
> read_fwf(file = myFile, col_positions = pos, col_types = types)
>
> # A tibble: 3 x 3
> X1 X2 X3
> <dbl> <dbl> <dbl>
> 1 1 1.00e-20 1043
> 2 1 7.12e+ 4 1043
> 3 1 9.12e+ 4 1055
>
> This seemingly works well and properly captures the value. However, if
> I instead were to indicate to the function that *all* of my columns
> were numeric (just insert this one line in lieu of the other above)
>
> type <- c('N','N','N')
>
> # A tibble: 3 x 3
> X1 X2 X3
> <dbl> <dbl> <dbl>
> 1 1 1 1043
> 2 1 71220 1043
> 3 1 91220 1055
>
> The read in is not correct. Here is the pragmatic issue. I have a legacy
> program that spits out the layout structure of the fwf file (start, end
> positions) and also indicates what the column types are. This layout file we
> receive always uses a column type of numeric (N) for any numeric types
> (including the column holding values such as 1e-20).
>
> This layout file will not change so I need to figure out how to solve the
> problem within my read in program. I suppose one option is that I could
> manually change any values of "N" to "D" in my R code. That seems to work.
> But not sure if that is the "right" way to solve this issue.
>
> Thanks
> Harold
>
--
Sarah Goslee (she/her)
http://www.numberwright.com