thr3ads.net - R devel - [Rd] Documentation issues [Was: Function hints] [Jun 2006]


Heather Turner

2006-Jun-20 09:18 UTC

[Rd] Documentation issues [Was: Function hints]

I would like to follow up on another one of the documentation issues raised in
the discussion on function hints. Duncan mentioned that the R core were working
on preprocessing directives for .Rd files, which could possibly include some
sort of include directive. I was wondering if an "includeexamples"
directive might also be considered.

It often makes sense to use the same example to illustrate the use of different
functions, or perhaps extend an example used to illustrate one function to
illustrate another. One way to do this is simply to put

example(fnA)

in the \examples for fnB, but this is not particularly helpful for people
reading the help pages as they either need to look at both help pages or run the
example. The alternative is to maintain multiple copies of the same code, which
is not ideal.

So it would be useful to be able to put

\includeexamples(fnA)

so that the code is replicated in fnB.Rd. Perhaps an include directive could do
this anyway, but it might be useful to have a special directive for examples so
that R CMD check is set up to only check the original example to save time (and
unnecessary effort).
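
As a partial workaround for the duplication problem described above, the example code of one topic can be retrieved as text rather than run. This is a minimal sketch, assuming a current version of R in which utils::example() accepts give.lines = TRUE (an argument that, as far as can be told, was not available when this thread was written); the topic "mean" merely stands in for the hypothetical fnA.

```r
# Fetch the example code of an existing help topic as a character vector,
# without running it. "mean" stands in for the hypothetical fnA.
ex_lines <- utils::example("mean", package = "base",
                           character.only = TRUE, give.lines = TRUE)

# Show the first few lines so the code can be read inline.
writeLines(head(ex_lines))
```

This only helps at the console; it does not make the code appear on the help page for fnB, which is what the proposed directive is for.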

On a related issue, it would be nice if source() had an option to print comments
contained in the source file, so that example() and demo() could print out
annotation.

Heather

Dr H Turner
Research Assistant
Dept. of Statistics
The University of Warwick
Coventry
CV4 7AL

Tel: 024 76575870
Fax: 024 7652 4532
Url: www.warwick.ac.uk/go/heatherturner
>>> <Mark.Bravington at csiro.au> 06/20/06 01:43am >>>
[This is not about the feasibility of a "hints" function-- which would
be incredibly useful, but perhaps very very hard to do-- but about some
of the other documentation issues raised in Hadley's post and in
Duncan's reply]

WRTO documentation & code together: for several years, I've successfully
used the 'mvbutils' package to keep every function definition & its
documentation together, editing them together in the same file--
function first, then documentation in plain-text (basically the format
you see if you use "vanilla help" inside R). Storage-wise, the
documentation is just kept as an attribute of the function (with a print
method that hides it by default)-- I also keep a text backup of the
combination. Any text editor will do. When it's time to create a
package, the Rd file is generated automatically.
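
The storage idea described above (documentation kept as an attribute of the function, with a print method that hides it) can be sketched in a few lines. This is not the 'mvbutils' implementation; the helper with_doc(), the class name "docfun" and the attribute name "doc" are all hypothetical, chosen only for illustration.

```r
# Attach plain-text documentation to a function as an attribute.
# 'with_doc' and the class "docfun" are hypothetical, not mvbutils API.
with_doc <- function(f, doc) {
  stopifnot(is.function(f), is.character(doc))
  attr(f, "doc") <- doc
  class(f) <- c("docfun", class(f))
  f
}

# Print method that hides the documentation: strip the extra attributes
# and print what is left, i.e. the bare function.
print.docfun <- function(x, ...) {
  bare <- x
  attr(bare, "doc") <- NULL
  class(bare) <- NULL
  print(bare, ...)
  invisible(x)
}

area <- with_doc(
  function(r) pi * r^2,
  c("area: area of a circle", "Usage: area(r)", "r: radius, a numeric vector")
)

area(2)                        # the function still works as usual
print(area)                    # shows only the code
writeLines(attr(area, "doc"))  # the documentation is available on request
```

Because the text travels with the function object, saving or copying the function keeps code and documentation together, which is the property being argued for.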

For me, it's been extremely helpful to keep function & documentation
together during editing-- it greatly increases the chance that I will
actually update the doco when I change the code, rather than putting it
off until I've forgotten what I did. Also, writing Rd format is a
nightmare (again, personal opinion)-- being able to write plain-text
makes the whole documentation thing bearable.

The above is not quite to the point of the original post, I think, which
talks about storing the documentation as commented bits *inside* the
function code. However, I'm not sure the latter is really desirable;
there is some merit in forcing authors to write an explicit "Details" or
"Description" section that is not just a paraphrase of programming
comments, and such sections are unlikely to fit easily inside code. At
any rate, I wouldn't want to have to interpret my *own* programming
comments as a usage guide!

WRTO automatic "usage" sections: it is easy to write code to do this
('prompt', and there is also some in 'mvbutils'-- not sure if it's in
the current release though) but at least as far as the "usage" section
goes, I think people should be "vigorously encouraged" to write their
own, showing as far as possible how one might actually *use* the
function. For many functions, just duplicating the argument list is not
helpful to the user-- a function can often be invoked in several
different ways, with different arguments relevant to different
invocations. I think it's good to show how this can be done in the
"usage" section, with comments, rather than deferring all practical
usage to "examples". For one thing, "usage" is near the top, and so
gives a very quick reminder without having to scroll through the entire
doco; for another, "usage" and "arguments" are visually adjacent,
whereas "examples" can be widely separated from "arguments".

My general point here is: the documenting process should be as
painless as possible, but not more so. Defaults that are likely to lead
to unhelpful documentation are perhaps best avoided.
For this general reason, I applaud R's fairly rigid documentation
standards, even though I frequently curse them. (And I would like to see
some bits more rigid, such as compulsory "how-to-use-this" documentation
for each package!)

The next version of 'mvbutils' will include various tools for easy "live
editing" and automated preparation of packages-- I've been using them
for a while, but still have to get round to finishing the documentation
;) 

Mark Bravington
CSIRO Mathematical & Information Sciences
Marine Laboratory
Castray Esplanade
Hobart 7001
TAS

ph (+61) 3 6232 5118
fax (+61) 3 6232 5012
mob (+61) 438 315 623
 
> -----Original Message-----
> From: r-devel-bounces at r-project.org 
> [mailto:r-devel-bounces at r-project.org] On Behalf Of Duncan Murdoch
> Sent: Tuesday, 20 June 2006 12:39 AM
> To: hadley wickham; R-devel
> Subject: Re: [Rd] [R] Function hints
> 
> I've moved this from R-help to R-devel, where I think it is 
> more appropriate, and interspersed comments below.
> 
> 
> 
> On 6/19/2006 8:51 AM, hadley wickham wrote:
> > One of the recurring themes in the recent useR! conference was that
> > many people find it difficult to find the functions they need for a
> > particular task.  Sandy Weisberg suggested a small idea he would like
> > to see: a hints function that, given an object, lists likely
> > operations.  I've done my best to implement this function using the
> > tools currently available in R, and my code is included at the bottom
> > of this email (I hope that I haven't just duplicated something already
> > present in R).  I think Sandy's idea is genuinely useful, even in the
> > limited form provided by my implementation, and I have already
> > discovered a few useful functions that I was unaware of.
> > 
> > While developing and testing this function, I ran into a few problems
> > which, I think, represent underlying problems with the current
> > documentation system.  These are typified by the results of running
> > hints on an object produced by glm (having class c("glm", "lm")).  I
> > have outlined (very tersely) some possible solutions.  Please note
> > that while these solutions are largely technological, the problem is
> > at heart sociological: writing documentation is no easier (and perhaps
> > much harder) than writing a scientific publication, but the rewards
> > are fewer.
> > 
> > Problems:
> > 
> >  * Many functions share the same description (eg. head, tail).
> > Solution: each Rd file should only describe one method. Problem:
> > Writing Rd files is tedious, there is a lot of information
> > duplicated between the code and the documentation (eg. the usage
> > statement) and some functions share a lot of similar information.
> > Solution: make it easier to write documentation (eg. documentation
> > inline with code), and easier to include certain common descriptions
> > in multiple methods (eg. new include command)
> 
> I think it's bad to document dissimilar functions in the same file, but
> similar related functions *should* be documented together.  Not doing
> this just adds to the burden of documenting them, and the risk of
> modifying only part of the documentation so that it is inconsistent.
> The user also gets the benefit of seeing a common description all at
> once, rather than having to decide whether to follow "See also" links.
> 
> Your solutions would both be interesting on their own merits regardless
> of the above.  We did decide to work on preprocessing directives for .Rd
> files at the R core meetings; some sort of include directive may be
> possible.
> 
> I don't think I would want complete documentation mixed with the
> original source, but it would certainly be interesting to have partial
> documentation there.  (Complete documentation is too long, and would
> make it harder to read the source without a dedicated editor that could
> hide it.  Though ESS users may see it as a reasonable requirement to
> have everyone use the same editor, I don't think it is.)  However, this
> is a lot of work, depending on infrastructure that is not in place.
> 
> >  * It is difficult to tell which functions are commonly
> > used/important. Solution: break down by keywords. Problem: keywords
> > are not useful at the moment.  Solution:  make better list of keywords
> > available and encourage people to use it.  Problem: people won't
> > unless there is a strong incentive, plus good keywording requires
> > considerable expertise (especially in building up list).  This is
> > probably insoluble unless one person systematically keywords all of
> > the base packages.
> 
> I think it is worse than that.  There are concepts in packages that just
> don't arise in base R, and hence there would be no keywords for them
> other than "misc", even if someone redesigned the current system.
> Keywording is hard, and it's not clear to me how to do much better than
> we currently do.
> 
> We do already have user-defined keywords (via \concept), but these are
> not widely used.
> 
> > 
> >  * Some functions aren't documented (eg. simulate.lm, formula.glm) -
> > typically, these are methods where the documentation is in the
> > generic.  Solution: these methods should all be aliased to the generic
> > (by default?), and R CMD check should be amended to check for this
> > situation.  You could also argue that this is a deficiency with my
> > function, and easily fixed by automatically referring to the generic
> > if the specific isn't documented.
> 
> I'd say it's a deficiency of your function.  You might want to look at
> the code in get("?") and .helpForCall() to see how those functions work
> out things like
> 
> ?simulate(x)
> 
> where x is an lm object.  (But notice that .helpForCall is an
> undocumented internal function; don't depend on its implementation
> working forever).
> 
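
The lookup discussed above (work out which method a call such as simulate(x) would use, and fall back to the generic when there is no specific method) can be approximated without relying on undocumented internals. This is a minimal sketch using utils::getS3method(); the helper name doc_topic_for() is hypothetical, and it handles S3 classes only.

```r
# For a call such as generic(x), report which S3 method would be used,
# falling back to the default method or to the generic itself.
# 'doc_topic_for' is a hypothetical helper, not part of R.
doc_topic_for <- function(generic, x) {
  for (cl in c(class(x), "default")) {
    method <- utils::getS3method(generic, cl, optional = TRUE)
    if (!is.null(method)) return(paste(generic, cl, sep = "."))
  }
  generic
}

fit <- lm(dist ~ speed, data = cars)
doc_topic_for("simulate", fit)  # a method exists for class "lm"
doc_topic_for("nobs", 1:3)      # no "integer" method, so the default is used
```

A hints-style function could pass the returned name to help() and, if no page is found under that name, retry with the generic's name.
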
> >  * It can't supply suggestions when there isn't an explicit method
> > (ie. .default is used), this makes it pretty useless for basic
> > vectors.  This may not really be a problem, as all possible operations
> > are probably too numerous to list.
> > 
> >  * Provides full name for function, when best practice is to use
> > generic part only when calling function.  However, getting precise
> > documentation may require that full name.
> 
> No, not if the call syntax above is used.
> 
> > I do the best I can
> > (returning the generic if specific is alias to a documentation file
> > with the same method name), but this reflects a deeper problem that
> > the name you should use when calling a function may be different to
> > the name you use to get documentation.
> > 
> >  * Can only display methods from currently loaded packages.  This is a
> > shortcoming of the methods function, but I suspect it is difficult to
> > find S3 methods without loading a package.
> > 
> > Relatively trivial problems:
> > 
> >  * Needs wide display to be effective.  Could be dealt with by
> > breaking description in a sensible manner (there may already be R code
> > to do this.  Please let me know if you know of any)
> 
> I think strwrap() may do what you want.
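
A short illustration of the strwrap() suggestion: it breaks a long description into lines that fit a narrow display. The description text and the width of 40 are made up for the example.

```r
# Wrap a long one-line description so it fits a narrow display.
desc <- paste("Fitting Generalized Linear Models: glm is used to fit",
              "generalized linear models, specified by giving a symbolic",
              "description of the linear predictor.")

# exdent indents continuation lines so wrapped entries stay readable.
wrapped <- strwrap(desc, width = 40, exdent = 2)
writeLines(wrapped)
```
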
> > 
> >  * Doesn't currently include S4 methods.  Solution: add some more code
> > to wrap showMethods
> > 
> >  * Personally, I think sentence case is more aesthetically pleasing
> > (and more flexible) than title case.
> 
> It's quite hard to go from existing title case to sentence case, because
> we don't have any markup to indicate proper names.  One would think it
> would be easier to go in the opposite direction, but in fact the same
> problem arises:  "van Beethoven" for example, not "Van Beethoven".
> 
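
The proper-name problem is easy to reproduce. The sketch below is a deliberately naive converter (the helper name to_sentence_case() is hypothetical): it keeps the first character and lower-cases everything else, which is about all that can be done without markup for proper names.

```r
# Naive title case to sentence case: keep the first character,
# lower-case the rest. 'to_sentence_case' is a hypothetical helper.
to_sentence_case <- function(title) {
  paste0(substr(title, 1, 1), tolower(substring(title, 2)))
}

ok  <- to_sentence_case("Fitting Generalized Linear Models")
bad <- to_sentence_case("Student's t-Test and the Wilcoxon Test")

ok   # "Fitting generalized linear models"
bad  # "Student's t-test and the wilcoxon test": the proper name is lost
```
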
> 
> > 
> > 
> > Hadley
> > 
> > 
> > hints <- function(x) {
> 
> I don't like the name "hints".  I think we already have too many ways
> into the help system:
> 
> help
> ?
> help.search
> apropos
> etc.?
> 
> I like your function, but I'd rather see it attached to one of the 
> existing help functions, probably help.search().  For example,
> 
> help.search(x)
> 
> could look for functions designed to work with the class of x, if it had
> one.  (There's some ambiguity here:  perhaps x contains a string, and I
> want help on that string.)
> 
> Anyway, thanks for your efforts on this so far; I hope we end up with 
> something that can make it into the next release.
> 
> Duncan Murdoch
> 
> > 	db <- eval(utils:::.hsearch_db())
> > 	if (is.null(db)) {
> > 		help.search("abcd!", rebuild=TRUE, agrep=FALSE)
> > 		db <- eval(utils:::.hsearch_db())
> > 	}
> > 
> > 	base <- db$Base
> > 	alias <- db$Aliases
> > 	key <- db$Keywords
> > 
> > 	m <- all.methods(class=class(x))
> > 	m_id <- alias[match(m, alias[,1]), 2]
> > 	keywords <- lapply(m_id, function(id) key[key[,2] %in% id, 1])
> > 
> > 	f.names <- cbind(m, base[match(m_id, base[,3]), 4])
> > 	f.names <- unlist(lapply(1:nrow(f.names), function(i) {
> > 		if (is.na(f.names[i, 2])) return(f.names[i, 1])
> > 		a <- methodsplit(f.names[i, 1])
> > 		b <- methodsplit(f.names[i, 2])
> > 		
> > 		if (a[1] == b[1]) f.names[i, 2] else f.names[i, 1]
> > 	}))
> > 	
> > 	hints <- cbind(f.names, base[match(m_id, base[,3]), 5])
> > 	hints <- hints[order(tolower(hints[,1])),]
> > 	hints <- rbind(c("--------", "---------------"), hints)
> > 	rownames(hints) <- rep("", nrow(hints))
> > 	colnames(hints) <- c("Function", "Task")
> > 	hints[is.na(hints)] <- "(Unknown)"
> > 	
> > 	class(hints) <- "hints"
> > 	hints
> > }
> > 
> > print.hints <- function(x, ...) print(unclass(x), quote=FALSE)
> > 
> > all.methods <- function(classes) {
> > 	methods <- do.call(rbind,lapply(classes, function(x) {
> > 		m <- methods(class=x)
> > 		t(sapply(as.vector(m), methodsplit)) #m[attr(m, "info")$visible]
> > 	}))
> > 	rownames(methods[!duplicated(methods[,1]),])
> > }
> > 
> > methodsplit <- function(m) {
> > 	parts <- strsplit(m, "\\.")[[1]]
> > 	if (length(parts) == 1) {
> > 		c(name=m, class="")
> > 	} else{
> > 		c(name=paste(parts[-length(parts)], collapse="."), class=parts[length(parts)])
> > 	}	
> > }
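
Of the quoted code, methodsplit() is the piece that is easiest to try on its own. The version below is the same logic written out without the mail quoting; the example inputs are added here for illustration. Splitting at the last dot is a heuristic, and it misreads methods for classes whose names contain a dot.

```r
# Split an S3 method name into generic and class at the last dot.
methodsplit <- function(m) {
  parts <- strsplit(m, "\\.")[[1]]
  if (length(parts) == 1) {
    c(name = m, class = "")
  } else {
    c(name = paste(parts[-length(parts)], collapse = "."),
      class = parts[length(parts)])
  }
}

methodsplit("simulate.lm")       # name "simulate", class "lm"
methodsplit("print")             # name "print", class ""
methodsplit("print.data.frame")  # name "print.data", class "frame": misparsed
```
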
> > 
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help 
> > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel 
> 
>

Duncan Murdoch

2006-Jun-20 10:58 UTC


[Rd] Documentation issues [Was: Function hints]

On 6/20/2006 5:18 AM, Heather Turner wrote:
> I would like to follow up on another one of the documentation issues raised
> in the discussion on function hints. Duncan mentioned that the R core were
> working on preprocessing directives for .Rd files, which could possibly include
> some sort of include directive. I was wondering if an "includeexamples"
> directive might also be considered.
> 
> It often makes sense to use the same example to illustrate the use of
> different functions, or perhaps extend an example used to illustrate one
> function to illustrate another. One way to do this is simply to put
> 
> example(fnA)
> 
> in the \examples for fnB, but this is not particularly helpful for people
> reading the help pages as they either need to look at both help pages or run the
> example. The alternative is to maintain multiple copies of the same code, which
> is not ideal.
> 
> So it would be useful to be able to put
> 
> \includeexamples(fnA)
> 
> so that the code is replicated in fnB.Rd. Perhaps an include directive
> could do this anyway, but it might be useful to have a special directive for
> examples so that R CMD check is set up to only check the original example to save
> time (and unnecessary effort).

Thanks, that's a good suggestion.  My inclination would be towards just 
one type of \include; it could be surrounded by notation saying not to 
check it in all but one instance if the author wanted to save testing time.

> On a related issue, it would be nice if source() had an option to print
> comments contained in the source file, so that example() and demo() could print
> out annotation.

Yes, this has been a long-standing need, but it's somewhat tricky 
because of the way source currently works:  it parses the whole file, 
then executes the parsed version.  The first step loses the comments, so 
you see a deparsed version when executing.  What I think it should do is 
have pointers back from the parsed version to the original source code, 
but that needs fairly low level changes.  This is some of the missing 
"infrastructure" I mentioned below.
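
The effect described above can be seen by echoing the same file in two ways. This is a minimal sketch, assuming a current version of R in which source() has a keep.source argument and parsed code can carry references back to the original text (not available when this thread was written); the file contents are made up for illustration.

```r
# Write a small script containing a comment to a temporary file.
script <- tempfile(fileext = ".R")
writeLines(c("# running total of the first ten integers",
             "total <- sum(1:10)"), script)

# Echo from the deparsed expressions: the comment is lost.
deparsed <- capture.output(source(script, echo = TRUE, keep.source = FALSE))

# Echo from the retained source text: the comment is shown.
retained <- capture.output(source(script, echo = TRUE, keep.source = TRUE))

any(grepl("running total", deparsed))
any(grepl("running total", retained))
unlink(script)
```

In the first call the parser discards the comment and the echo is reconstructed from the parsed expression; in the second the echo is taken from the stored source lines, so the annotation survives.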

Duncan Murdoch

Heather Turner

2006-Jun-20 11:35 UTC


[Rd] Documentation issues [Was: Function hints]

>>> Duncan Murdoch <murdoch at stats.uwo.ca> 06/20/06 11:58am >>>
On 6/20/2006 5:18 AM, Heather Turner wrote:
> I would like to follow up on another one of the documentation issues raised
> in the discussion on function hints. Duncan mentioned that the R core were
> working on preprocessing directives for .Rd files, which could possibly include
> some sort of include directive. I was wondering if an "includeexamples"
> directive might also be considered.
> 
> It often makes sense to use the same example to illustrate the use of
> different functions, or perhaps extend an example used to illustrate one
> function to illustrate another. One way to do this is simply to put
> 
> example(fnA)
> 
> in the \examples for fnB, but this is not particularly helpful for people
> reading the help pages as they either need to look at both help pages or run the
> example. The alternative is to maintain multiple copies of the same code, which
> is not ideal.
> 
> So it would be useful to be able to put
> 
> \includeexamples(fnA)
> 
> so that the code is replicated in fnB.Rd. Perhaps an include directive
> could do this anyway, but it might be useful to have a special directive for
> examples so that R CMD check is set up to only check the original example to save
> time (and unnecessary effort).

Thanks, that's a good suggestion.  My inclination would be towards just 
one type of \include; it could be surrounded by notation saying not to 
check it in all but one instance if the author wanted to save testing time.

Fair enough, but at the moment I don't think such notation exists - using
\dontrun would skip the check, but would also mean the code would not get run by
example(), leading to missing/broken examples. You could introduce a \dontcheck
directive but this might be dangerous!

Heather

> On a related issue, it would be nice if source() had an option to print
> comments contained in the source file, so that example() and demo() could print
> out annotation.

Yes, this has been a long-standing need, but it's somewhat tricky
because of the way source currently works:  it parses the whole file, 
then executes the parsed version.  The first step loses the comments, so 
you see a deparsed version when executing.  What I think it should do is 
have pointers back from the parsed version to the original source code, 
but that needs fairly low level changes.  This is some of the missing 
"infrastructure" I mentioned below.
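
A minimal sketch of the parse-then-deparse behaviour described above. It assumes a current R version in which parse() accepts a keep.source argument and attaches "srcref" attributes; the snippet and its variable names are illustrative and not taken from source() itself. Parsed without source references, the expression deparses with its comment gone. With source references kept, the original text, comment included, can still be recovered.

```r
# Illustrative input: a function definition with a comment in its body.
src <- "f <- function(x) {\n  # double the input\n  2 * x\n}"

# Step 1 of the parse-then-execute approach: parse the text.
# Without source references, only the code structure is retained.
exprs <- parse(text = src, keep.source = FALSE)

# What gets echoed is a deparsed version of the parsed expression.
shown <- paste(deparse(exprs[[1]]), collapse = "\n")
cat(shown, "\n")

# Keeping source references gives each expression a pointer back to
# the original text, so the comment is still available.
exprs_src <- parse(text = src, keep.source = TRUE)
original <- as.character(attr(exprs_src, "srcref")[[1]])
cat(original, sep = "\n")
```

The second half is in the spirit of the "pointers back from the parsed version to the original source code" idea, but it is not a claim about how source() is or was implemented.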

Duncan Murdoch
> 
> Heather
> 
> Dr H Turner
> Research Assistant
> Dept. of Statistics
> The University of Warwick
> Coventry
> CV4 7AL
> 
> Tel: 024 76575870
> Fax: 024 7652 4532
> Url: www.warwick.ac.uk/go/heatherturner 
> 
