thr3ads.net - R help - [R] Need fresh eyes to see what I'm missing [Sep 2021]


Bert Gunter

2021-Sep-14 15:38 UTC

[R] Need fresh eyes to see what I'm missing

Remove all your as.integer() and as.double() coercions. They are
unnecessary (unless you are preparing input for C code; also, all R
non-integers are double precision) and may be the source of your
problems.

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Tue, Sep 14, 2021 at 8:31 AM Eric Berger <ericjberger at gmail.com> wrote:
>
> Before you create vel_by_month you can check vel for NAs and NaNs by
>
> sum(is.na(vel))
> sum(unlist(lapply(vel,is.nan)))
>
> HTH,
> Eric
>
>
> On Tue, Sep 14, 2021 at 6:21 PM Rich Shepard <rshepard at appl-ecosys.com> wrote:
>
> > The data file begins this way:
> > year,month,day,hour,min,fps
> > 2016,03,03,12,00,1.74
> > 2016,03,03,12,10,1.75
> > 2016,03,03,12,20,1.76
> > 2016,03,03,12,30,1.81
> > 2016,03,03,12,40,1.79
> > 2016,03,03,12,50,1.75
> > 2016,03,03,13,00,1.78
> > 2016,03,03,13,10,1.81
> >
> > The script to process it:
> > library('tidyverse')
> > vel <- read.csv('../data/water/vel.dat', header = TRUE, sep = ',',
> >                 stringsAsFactors = FALSE)
> > vel$year <- as.integer(vel$year)
> > vel$month <- as.integer(vel$month)
> > vel$day <- as.integer(vel$day)
> > vel$hour <- as.integer(vel$hour)
> > vel$min <- as.integer(vel$min)
> > vel$fps <- as.double(vel$fps, length = 6)
> >
> > # use dplyr to filter() by year, month, day; summarize() to get monthly
> > # means
> > vel_by_month = vel %>%
> >      group_by(year, month) %>%
> >      summarize(flow = mean(fps, na.rm = TRUE))
> >
> > R's display after running the script:
> > > source('vel.R')
> > `summarise()` has grouped output by 'year'. You can override using the
> > `.groups` argument.
> > Warning messages:
> > 1: In eval(ei, envir) : NAs introduced by coercion
> > 2: In eval(ei, envir) : NAs introduced by coercion
> > 3: In eval(ei, envir) : NAs introduced by coercion
> >
> > The dataframe created by the read.csv() command:
> > > head(vel)
> >    year month day hour min  fps
> > 1 2016     3   3   12   0 1.74
> > 2 2016     3   3   12  10 1.75
> > 3 2016     3   3   12  20 1.76
> > 4 2016     3   3   12  30 1.81
> > 5 2016     3   3   12  40 1.79
> > 6 2016     3   3   12  50 1.75
> >
> > and the resulting grouping:
> > > vel_by_month
> > # A tibble: 67 × 3
> > # Groups:   year [8]
> >      year month   flow
> >     <int> <int>  <dbl>
> >   1     0    NA NaN
> >   2  2016     3   2.40
> >   3  2016     4   3.00
> >   4  2016     5   2.86
> >   5  2016     6   2.51
> >   6  2016     7   2.18
> >   7  2016     8   1.89
> >   8  2016     9   1.38
> >   9  2016    10   1.73
> > 10  2016    11   2.01
> > # … with 57 more rows
> >
> > I cannot find why line 1 is there. Other data sets don't produce this
> > result.
> >
> > TIA,
> >
> > Rich
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>

Rich Shepard

2021-Sep-14 15:48 UTC


[R] Need fresh eyes to see what I'm missing

On Tue, 14 Sep 2021, Bert Gunter wrote:
> Remove all your as.integer() and as.double() coercions. They are
> unnecessary (unless you are preparing input for C code; also, all R
> non-integers are double precision) and may be the source of your problems.
Bert,

When I remove coercions the script produces warnings like this:
1: In mean.default(fps, na.rm = TRUE) :
   argument is not numeric or logical: returning NA

and str(vel) displays this:
'data.frame':	565675 obs. of  6 variables:
  $ year : chr  "2016" "2016" "2016" "2016" ...
  $ month: int  3 3 3 3 3 3 3 3 3 3 ...
  $ day  : int  3 3 3 3 3 3 3 3 3 3 ...
  $ hour : chr  "12" "12" "12" "12" ...
  $ min  : int  0 10 20 30 40 50 0 10 20 30 ...
  $ fps  : chr  "1.74" "1.75" "1.76" "1.81" ...

so month, day, and min are recognized as integers but year, hour, and fps
are seen as characters. I don't understand why.

Regards,

Rich
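
A column read as chr usually means that at least one entry in it could not be parsed as a number. The thread does not show which entries in vel.dat are responsible, so the following is only a sketch on a small in-memory stand-in: the first two data rows come from the original post, and the third, malformed row is invented purely for illustration. It lists, for each character column, the rows whose value is present but not numeric.

```r
# Hypothetical stand-in for vel.dat. Rows 1-2 are from the post; row 3 is an
# invented malformed row, NOT taken from the real file.
txt <- "year,month,day,hour,min,fps
2016,03,03,12,00,1.74
2016,03,03,12,10,1.75
2O16,03,03,1x,20,n/a"

vel <- read.csv(text = txt, header = TRUE, stringsAsFactors = FALSE)

# One non-numeric entry is enough for read.csv() to keep a whole column
# as character.
col_classes <- sapply(vel, class)
chr_cols <- names(vel)[col_classes == "character"]

# For each character column, find rows whose value is present but does not
# convert to a number (these are the rows that yield "NAs introduced by
# coercion" when as.integer() or as.double() is applied).
bad_rows <- lapply(vel[chr_cols], function(x) {
  which(!is.na(x) & is.na(suppressWarnings(as.numeric(x))))
})

bad_rows
vel[sort(unique(unlist(bad_rows))), ]
```

Run against the real data frame, the last line would print the offending records so they can be checked in the source file.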

Rich Shepard

2021-Sep-14 15:51 UTC


[R] Need fresh eyes to see what I'm missing

On Tue, 14 Sep 2021, Bert Gunter wrote:
> Remove all your as.integer() and as.double() coercions. They are
> unnecessary (unless you are preparing input for C code; also, all R
> non-integers are double precision) and may be the source of your
> problems.
Bert,

Are all columns but the fps factors?

Rich
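
One way to check column types directly is to ask R for each column's class. The sketch below again uses an in-memory stand-in rather than the real vel.dat (the two clean rows are from the original post; the malformed row appended later is invented). It shows that with stringsAsFactors = FALSE no column becomes a factor, and that passing colClasses makes read.csv() stop with an error on an unparseable entry instead of silently returning character columns.

```r
# Clean sample rows copied from the original post.
txt <- "year,month,day,hour,min,fps
2016,03,03,12,00,1.74
2016,03,03,12,10,1.75"

vel <- read.csv(text = txt, header = TRUE, stringsAsFactors = FALSE)

# Class of every column, and whether any column is a factor.
classes <- sapply(vel, class)
has_factor <- any(sapply(vel, is.factor))
classes
has_factor

# Append an invented malformed row (letter O in the year) and request
# explicit column types; the read is expected to fail rather than coerce.
bad_txt <- paste0(txt, "\n2O16,03,03,12,20,1.76")
strict <- tryCatch(
  read.csv(text = bad_txt, header = TRUE,
           colClasses = c(rep("integer", 5), "numeric")),
  error = function(e) e
)
inherits(strict, "error")
```

Failing early this way points at the bad record at read time, which may be easier to trace than a stray NA group in a later summary.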
