thr3ads.net - R help - [R] Seeking Packaging advice [Aug 2003]

Ross Boylan

2003-Aug-26 20:58 UTC

[R] Seeking Packaging advice

I have two questions about packaging up code.

1) Weave/tangle advisable?
In the course of extending some C code already in S, I had to work out
the underlying math.  It seems to me useful to keep this information
with the code, using Knuth's tangle/weave type tools.  I know there is
some support for this in R code, but my question is about the wisdom of
doing this with C (or Fortran, or other source) code.

Against the advantage of having the documentation and code nicely
integrated are the drawbacks of added complexity in the build process
and portability concerns.  Some of this is mitigated by the existing
dependence on TeX.

An intermediate approach would be to provide both the web (in the Knuth
sense) source and the C output; the latter could be used directly by
those not wishing to hassle with web.  This isn't ideal, since the
resulting C is likely to be a bit cryptic, and if someone edits the C
without changing the web source confusion will reign.

So do people have any thoughts about whether introducing this is a step
forward or back?

2) Modifications of existing packages.
I modified the survival package (I'm not sure if that's properly called
a "base" package, but it's close).  I know in this particular case, if
I'm serious, I probably should contact the package maintainer.  But this
kind of operation will probably be pretty common for me; I imagine many
on this list have already done it.  In general, is the best thing to do
a) package the new routines as a small additional package, with a
dependence on the base package if necessary (the particular change I've
made actually produces a few distinct files, slight tweaks of existing
ones, that can stand on their own)
b) package the new things in with the old under the same name as the old
(obviously requires working with package maintainer)
c) package the new things with the old and give it a new name.

I'm also curious about what development strategy is best; I did b), and
it seemed to work OK.  But I kept expecting it to cause disaster (it
probably helped that I usually didn't load the baseline survival
packages; clearly that wouldn't be an option if working with one of the
automatically loaded packages).

Thanks.
-- 
Ross Boylan                                      wk:  (415) 502-4031
530 Parnassus Avenue (Library) rm 115-4          ross at biostat.ucsf.edu
Dept of Epidemiology and Biostatistics           fax: (415) 476-9856
University of California, San Francisco
San Francisco, CA 94143-0840                     hm:  (415) 550-1062

Prof Brian Ripley

2003-Aug-27 06:56 UTC

[R] Seeking Packaging advice

On Tue, 26 Aug 2003, Ross Boylan wrote:
> I have two questions about packaging up code.
> 
> 1) Weave/tangle advisable?
> In the course of extending some C code already in S, I had to work out
> the underlying math.  It seems to me useful to keep this information
> with the code, using Knuth's tangle/weave type tools.  I know there is
> some support for this in R code, but my question is about the wisdom of
> doing this with C (or Fortran, or other source) code.
> 
> Against the advantage of having the documentation and code nicely
> integrated are the drawbacks of added complexity in the build process
> and portability concerns.  Some of this is mitigated by the existing
> dependence on TeX.
There is none. We don't assume a working latex/tex, although some manuals 
will not be produced without working (pdf)latex (or texinfo->pdf).

One quick comment: the pre-compiled packages (for Windows now and MacOS X
for the next release) are produced automatically without user
intervention.  So if you want to have a package on CRAN, it needs to work
out of the box, and there is no dependence on TeX, let alone weave/tangle,
in the standard procedure.
> An intermediate approach would be to provide both the web (in the Knuth
> sense) source and the C output; the latter could be used directly by
> those not wishing to hassle with web.  This isn't ideal, since the
> resulting C is likely to be a bit cryptic, and if someone edits the C
> without changing the web source confusion will reign.
> 
> So do people have any thoughts about whether introducing this is a step
> forward or back?
A useful analogue: we now distribute Fortran code not the original Ratfor.

> 2) Modifications of existing packages.
> I modified the survival package (I'm not sure if that's properly called
> a "base" package, but it's close).  I know in this particular case, if
It's a `recommended' package, as the DESCRIPTION file says.  There is a 
base package, and several standard packages bundled with R, which have
priority "base" and are often called `base packages'.
> I'm serious, I probably should contact the package maintainer.  But this
> kind of operation will probably be pretty common for me; I imagine many
> on this list have already done it.  In general, is the best thing to do
> a) package the new routines as a small additional package, with a
> dependence on the base package if necessary (the particular change I've
> made actually produces a few distinct files, slight tweaks of existing
> ones, that can stand on their own)
> b) package the new things in with the old under the same name as the old
> (obviously requires working with package maintainer)
> c) package the new things with the old and give it a new name.
> 
> I'm also curious about what development strategy is best; I did b), and
> it seemed to work OK.  But I kept expecting it to cause disaster (it
> probably helped that I usually didn't load the baseline survival
> packages; clearly that wouldn't be an option if working with one of the
> automatically loaded packages).
I think a) is the best, including changing the names of any R functions 
you alter, and changing the entry points in any compiled code you alter.
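
The renaming advice for compiled code can be sketched as follows. This is only an illustration under stated assumptions: the add-on package name `survextra` and the routine `survextra_cumsum` are hypothetical and do not come from survival or from this thread. The routine follows the calling convention of R's `.C()` interface (void return, every argument a pointer), and its name carries the add-on package's prefix so that it cannot clash with an entry point in the package it was adapted from.

```c
/* Hypothetical add-on package "survextra" (name invented for illustration).
 * Routines called through R's .C() interface return void and receive
 * every argument as a pointer. */

/* The "survextra_" prefix keeps this symbol distinct from any entry
 * point of the original package, even if both are loaded together. */
void survextra_cumsum(double *x, int *n, double *out)
{
    int i;
    double total = 0.0;

    /* Running sum of x[0..n-1], written into out[0..n-1]. */
    for (i = 0; i < *n; i++) {
        total += x[i];
        out[i] = total;
    }
}
```

From R, such a routine would typically be reached with something like `.C("survextra_cumsum", as.double(x), as.integer(length(x)), out = double(length(x)), PACKAGE = "survextra")`, wrapped in an R function whose name is likewise different from the one it was derived from.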

Package maintainers may have very good reasons not to go along with b), 
including their not being the original authors (true for survival), 
workload, lack of interest in the proposed changes, complications of 
ownership and copyright, ....

c) is I believe unwise.  It may be allowed by the licence (or may not) but
in the couple of cases where I have seen it done it did not give anything
like adequate credit to the original authors (who were never consulted)
and the modified code distributed was out-of-date when originally 
released, let alone now.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Thomas Lumley

2003-Aug-27 14:49 UTC

[R] Seeking Packaging advice

On Tue, 26 Aug 2003, Ross Boylan wrote:
>
> 2) Modifications of existing packages.
> I modified the survival package (I'm not sure if that's properly called
> a "base" package, but it's close).  I know in this particular case, if
> I'm serious, I probably should contact the package maintainer.  But this
> kind of operation will probably be pretty common for me; I imagine many
> on this list have already done it.  In general, is the best thing to do
> a) package the new routines as a small additional package, with a
> dependence on the base package if necessary (the particular change I've
> made actually produces a few distinct files, slight tweaks of existing
> ones, that can stand on their own)
I think that's best
> b) package the new things in with the old under the same name as the old
> (obviously requires working with package maintainer)
The problem in this case is that the package maintainer is not the author.
Additional functionality might well be ok, but that could easily be done
with method (a).  Substantial changes to existing functions are going to
cause problems when the next few thousand lines of diffs arrive from Mayo
Clinic.
> c) package the new things with the old and give it a new name.
Keeping this in sync is hard.

	-thomas
