thr3ads.net - R help - [R] Making objects global in a package [Jul 2018]

William Dunlap

2018-Jul-14 01:50 UTC

[R] Making objects global in a package

What the OP is doing looks fine to me.

The environment holding the data vectors is not necessary, but it helps
organize things - you know where to look for this sort of data vector.

I would avoid the *.rda file, since it is not text, hence not readily
editable or trackable with most source control systems.


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, Jul 13, 2018 at 6:17 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
> a) There is a mailing list for package development questions:
> R-package-devel.
>
> b) This seems like a job for the sysdata.rda file... no explicit
> environments needed. See the Writing R Extensions manual.
>
> On July 13, 2018 5:51:06 PM PDT, Michael Hannon <
> jmhannon.ucdavis at gmail.com> wrote:
> >Greetings.  I'm putting together a small package in which I use
> >`readr::read_csv()` to read CSV files from several different sources.
> >I do this in several different files, but with various kinds of
> >subsequent processing, depending on the file.
> >
> >I find it useful to specify column types, as the apparent data type of
> >a given column sometimes changes unexpectedly deep into the file.  I.e.,
> >a field that consistently looks like an integer suddenly becomes a
> >fraction:
> >
> >    1, 1, ..., 1, 1/2, 1, ...
> >
> >Hence, the column type has to be treated as a character, rather than as
> >an integer (with the possibility of later conversion to double, if
> >necessary).  (This is just an example.)
> >
> >Therefore I use the `col_types` argument in all of the calls to
> >`read_csv()`.
> >
> >These calls are spread over several files, but I want to keep all of
> >the column types in a single place, yet have them available in each of
> >the several files.  This is just for the sake of maintainability.
> >
> >At the moment I do this by putting the column-type definitions into a
> >single file:
> >
> >    000_define_data_attributes.R
> >
> >that:
> >
> >    (1) is named so that it's parsed first by `devtools::build()`
> >    (2) sets up an environment and stuffs the column types into it:
> >
> >            data_env <- new.env(parent=emptyenv())
> >            data_env$col_types_alpha <- list(
> >                Date = col_date(),
> >                var1 = col_double(),
> >                ...
> >            )
> >
> >There are a few other things that go into the file as well.
> >
> >Then I pick off the appropriate stuff from the environment in the other
> >files:
> >
> >    foo_alpha <- read_csv("alpha.csv",
> >                          col_types = data_env$col_types_alpha)
> >
> >This seems to work, but it doesn't "feel" right to me.  (If this were
> >Python, people would accuse me of being "non-pythonic").
> >
> >Hence, I'm seeking suggestions for the best practice for this kind of
> >thing.
> >
> >BTW, I note that both the sources of data ("alpha", etc.) and the
> >column types are more or less guaranteed to be static for the
> >foreseeable future.  Hence, there really isn't much danger in just
> >replicating the column-type definitions in each of the various files,
> >which would obviate the need for the "000..." file.  In other words,
> >this is mostly a style thing.
> >
> >Thanks for any advice you can provide.
> >
> >-- Mike
> >
> >______________________________________________
> >R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
>
> --
> Sent from my phone. Please excuse my brevity.
>

Jeff Newmiller

2018-Jul-14 01:54 UTC


Avoiding rda files because they don't track well with version control seems
weak to me, since you should be creating the rda with an R file in the tools
directory.
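
A minimal sketch of the kind of generator script described above, so that the
text source stays under version control and the binary file is only a build
product. The script name `tools/make_sysdata.R` and the helper
`write_sysdata()` are hypothetical, and only the two columns shown in the
original post are listed; the elided columns are left out rather than guessed.

```r
# tools/make_sysdata.R (hypothetical script name); run from the package root.
library(readr)

# Column types for the "alpha" source. Only the columns shown in the thread
# appear here; the rest of the original list is elided and not reproduced.
col_types_alpha <- list(
  Date = col_date(),
  var1 = col_double()
)

# Hypothetical helper: writes the internal data file to <pkg_root>/R.
write_sysdata <- function(pkg_root = ".") {
  out_dir <- file.path(pkg_root, "R")
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  out_file <- file.path(out_dir, "sysdata.rda")
  # Objects stored in R/sysdata.rda are visible to the package's own code
  # but are not exported to users.
  save(col_types_alpha, file = out_file)
  invisible(out_file)
}
```

Calling `write_sysdata()` from the package root regenerates `R/sysdata.rda`;
the package code can then refer to `col_types_alpha` directly, with no
explicit environment.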

-- 
Sent from my phone. Please excuse my brevity.

Michael Hannon

2018-Jul-16 21:16 UTC


Thanks to all for your replies.  So far as I can see, there was
nothing wrong with my original approach, but I've decided to stuff all
the relevant definitions into a function (or functions), as this seems
to make "devtools::check()" happier.

-- Mike
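
A minimal sketch of what the function-based variant might look like, assuming
the same two columns shown in the original post. The file name
`R/col_types.R` and the reader `read_alpha()` are hypothetical, and the sketch
makes no claim about why `devtools::check()` prefers this form.

```r
# R/col_types.R (hypothetical file name)

# Returns the column specification for the "alpha" source. Only the columns
# shown in the thread are listed; the others are elided there and omitted here.
col_types_alpha <- function() {
  list(
    Date = readr::col_date(),
    var1 = readr::col_double()
  )
}

# Hypothetical reader that any file in the package could define or call.
read_alpha <- function(path) {
  readr::read_csv(path, col_types = col_types_alpha())
}
```

Because the definitions live inside a function body, nothing is evaluated
until a reader calls `col_types_alpha()`, and the file no longer needs a
special name to be sourced first.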


