thr3ads.net - R help - [R] NADA Package: Referencing Data Frame Columns [Aug 2012]


Rich Shepard

2012-Aug-07 16:26 UTC

[R] NADA Package: Referencing Data Frame Columns

The sample data sets that come with the NADA package are limited to one or
two variables and a censored measurement indicator column. I try to mimic
examples using my data but keep missing the target.

   My water chemistry data is available in two formats: long (as seen in a
database table) and wide (as seen in a spreadsheet). The two structures are:

str(chem)
'data.frame':	65349 obs. of  8 variables:
  $ site    : Factor w/ 64 levels "D-1","D-2","D-3",..: 1 1 1 1 1 1 1 ...
  $ sampdate: Date, format: "2007-12-12" "2007-12-12" ...
  $ era     : Factor w/ 2 levels "Post","Pre": 1 1 1 1 1 1 1 1 1 1 ...
  $ param   : Factor w/ 64 levels "AgDis","AgTot",..: 2 4 5 7 11 15 25 ...
  $ quant   : num  1.30e-04 1.06e-01 2.31e+02 1.13e-02 5.00e-03 ...
  $ ceneq1  : logi  TRUE FALSE FALSE FALSE TRUE FALSE ...
  $ floor   : num  0 0.106 231 0.0113 0 100 0 1.43 0 0.0239 ...
  $ ceiling : num  1.30e-04 1.06e-01 2.31e+02 1.13e-02 5.00e-03 2.39e-02 ...

and

str(chem.cast)
'data.frame':	56938 obs. of  70 variables:
  $ site     : Factor w/ 64 levels "D-1","D-2","D-3",..: 1 1 1 1 1 ...
  $ sampdate : Date, format: "2007-12-12" "2007-12-12" ...
  $ era      : Factor w/ 2 levels "Post","Pre": 1 1 1 1 1 1 1 1 1 1 ...
  $ ceneq1   : logi  TRUE FALSE FALSE FALSE TRUE FALSE ...
  $ floor    : num  0 0.106 231 0.0113 0 100 0 1.43 0 0.0239 ...
  $ ceiling  : num  1.30e-04 1.06e-01 2.31e+02 1.13e-02 5.00e-03 ...
  $ AgDis    : num  NA NA NA NA NA NA NA NA NA NA ...
  $ AgTot    : num  0.00013 NA NA NA NA NA NA NA NA NA ...
  $ AlDis    : num  NA NA NA NA NA NA NA NA NA NA ...
  $ AlTot    : num  NA 0.106 NA NA NA NA NA NA NA NA ...
  $ Alk      : num  NA NA 231 NA NA NA NA NA NA NA ...
  $ AsDis    : num  NA NA NA NA NA NA NA NA NA NA ...
   and so on.

   I do not know if the latter is appropriate; that is, that the ceneq1,
floor, and ceiling values are available for each site, sampdate, and
chemical.

   Is the appropriate way to use the NADA methods for analyses and plotting
to subset each chemical separately from the 'chem' data frame? Or, is there
a syntax other than, for example,

cenboxplot(chem&Vdis, chem$ceneq1, chem$era)
Error in cenros(obs[group == i], cen[group == i]) :
   error in evaluating the argument 'obs' in selecting a method for function
'ros': Error: object 'Vdis' not found

   I get the same error when trying to use the 'chem.cast' data frame.

Rich

R. Michael Weylandt

2012-Aug-07 18:31 UTC


On Tue, Aug 7, 2012 at 11:26 AM, Rich Shepard <rshepard at appl-ecosys.com> wrote:
> cenboxplot(chem&Vdis, chem$ceneq1, chem$era)
> Error in cenros(obs[group == i], cen[group == i]) :
>   error in evaluating the argument 'obs' in selecting a method for function
> 'ros': Error: object 'Vdis' not found
>
>   I get the same error when trying to use the 'chem.cast' data frame.
>

Take a look at with()
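
A minimal sketch of what with() does in this situation. The toy data frame below only stands in for 'chem.cast' and its values are invented; the NADA call is left commented out because it assumes the package is installed and that each group has enough detected values.

```r
# Toy stand-in for 'chem.cast'; the values are invented for illustration.
chem.cast <- data.frame(
  era    = c("Pre", "Pre", "Post", "Post"),
  Vdis   = c(0.5, 1.2, 0.8, 2.0),
  ceneq1 = c(TRUE, FALSE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Inside with(), bare column names are looked up in the data frame first,
# so no 'chem.cast$' prefix is needed.
n.censored    <- with(chem.cast, sum(ceneq1))
mean.detected <- with(chem.cast, mean(Vdis[!ceneq1]))

# The same pattern with a NADA function (assumes NADA is installed and the
# data have enough detected values in each group):
# library(NADA)
# with(chem.cast, cenboxplot(Vdis, ceneq1, era))
```

Because the column names are typed without the data frame prefix, there is also less room for a slip such as '&' in place of '$'.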

Michael
> Rich
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Rich Shepard

2012-Aug-07 19:59 UTC


On Tue, 7 Aug 2012, R. Michael Weylandt wrote:
> Take a look at with()

Michael,

   Works like a charm! Thanks again for the pointer.

Rich

MacQueen, Don

2012-Aug-08 20:08 UTC


Hi Rich,

I may not have the complete picture here, but I do see what looks to me
like a problem with your chem.cast.

Specifically, since it has only a single detection indicator column
(ceneq1), it implies that within any single sample either all the analytes
were detected, or all were not. Not what I would expect.

If the typo that others pointed out was not the entire answer to your
question, then I would add:

As to your larger question of which layout is appropriate for use with
NADA functions, the answer is that either can be used. The "trick" is to
use the appropriate syntax to extract the values needed to pass the data
to a NADA function. The syntax is different for the long vs. the wide
format. At this point, it's not really a NADA issue, just a matter of R
syntax. There are multiple ways to do either one. I suppose each has pros
and cons; to some extent it depends on what kinds of graphics or analyses
you need to do, and there's plenty of room for personal preference.


For the long format you subset the rows, then pass the appropriate
columns. Here's one way:

   with(subset(chem, param=='AgDis') , ros(quant,ceneq1))
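
If the same long-format call is wanted for every parameter, one possible approach is to split the data frame by 'param' and apply a function to each piece. This is only a sketch: the toy values are invented, and the ros() line is commented out since it assumes NADA is installed and that each parameter has enough detected values.

```r
# Toy long-format data standing in for 'chem'; values are invented.
chem <- data.frame(
  param  = c("AgDis", "AgDis", "AgTot", "AgTot", "AgTot"),
  quant  = c(0.00013, 0.002, 0.106, 0.05, 0.001),
  ceneq1 = c(TRUE, FALSE, FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# One data frame per parameter, named by the parameter.
by.param <- split(chem, chem$param)

# Example per-parameter summary: percent of censored results.
pct.censored <- sapply(by.param, function(d) 100 * mean(d$ceneq1))

# The same loop with a NADA fit per parameter (assumes NADA is installed):
# library(NADA)
# fits <- lapply(by.param, function(d) with(d, ros(quant, ceneq1)))
```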


For the wide format you pass the appropriate columns:

   ros( chem.cast$AgDis, chem.cast$AgDis.ceneq1 )

where I have invented the name of a new column that has the censoring
indicator specific to AgDis.
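
One way such per-analyte indicator columns might be produced is to cast the long data with base R's reshape(), carrying both 'quant' and 'ceneq1' as value columns. This is a sketch on invented toy data, and note that reshape() names the resulting columns 'quant.AgDis' and 'ceneq1.AgDis' rather than the 'AgDis' and 'AgDis.ceneq1' names used above.

```r
# Toy long-format data standing in for 'chem'; values are invented.
chem <- data.frame(
  site     = c("D-1", "D-1", "D-2", "D-2"),
  sampdate = c("2007-12-12", "2007-12-12", "2007-12-12", "2007-12-12"),
  param    = c("AgDis", "AgTot", "AgDis", "AgTot"),
  quant    = c(0.00013, 0.106, 0.002, 0.05),
  ceneq1   = c(TRUE, FALSE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Cast to wide: one row per site/sampdate, and for every parameter both a
# value column (quant.<param>) and its own censoring column (ceneq1.<param>).
wide <- reshape(chem,
                idvar     = c("site", "sampdate"),
                timevar   = "param",
                v.names   = c("quant", "ceneq1"),
                direction = "wide")

# Each analyte now has its own indicator, e.g. (assumes NADA is installed):
# library(NADA)
# ros(wide$quant.AgDis, wide$ceneq1.AgDis)
```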

Hope this helps.

-Don

(p.s., I still think you'll be better off in the long run if you store
site, param, and maybe era, as character objects, not factors.)
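
For what it's worth, a small sketch of that conversion on an invented toy data frame:

```r
# Toy data with factor columns, as in the str() output; values are invented.
chem <- data.frame(
  site  = factor(c("D-1", "D-2", "D-1")),
  param = factor(c("AgDis", "AgTot", "AgTot")),
  quant = c(0.00013, 0.106, 0.05)
)

# as.character() on a factor returns its labels, not the integer codes.
chem$site  <- as.character(chem$site)
chem$param <- as.character(chem$param)
```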

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





