Ross Boylan
2006-Oct-02  19:15 UTC
[Rd] documentation duplication and proposed automatic tools
I've been looking at documenting S4 classes and methods, though I have a feeling many of these issues apply to S3 as well.

My impression is that the documentation system requires or recommends creating basically the same information in several places. I'd like to explain that, see if I'm correct, and suggest that a more automated framework might make life easier.

PROBLEM

Consider a class A and a method foo that operates on A. As I understand it, I must document the generic function foo (or ?foo will not produce a response) and the method foo (or methods ? foo will not produce a response). Additionally, it appears to be recommended that I document foo in the Methods section of the documentation for class A. Finally, I may want to document the method foo with specific arguments (particularly if it uses "unusual" arguments, but presumably also if the semantics are different in a class that extends A).

This seems like a lot of work to me, and it also seems error prone and subject to synchronization errors. R CMD check checks vanilla function documentation for agreement with the code, but I'm not sure that method documentation in other contexts gets much help.

To complete the picture, suppose there is another function, "bar", that operates on A. B extends A, and reimplements foo, but not bar.

I think the suggestion is that I go back and add the B-flavored method foo to the general methods documentation for foo. I also have a choice whether I should mention bar in the documentation for the class B. If I mention it, it's easier for the reader to grasp the whole interface that B presents. However, I make it harder to determine which methods implement new functionality.
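For concreteness, the scenario described above might look roughly like this in S4. This is only a sketch: the slot x and the method bodies are made up for illustration, and only the names A, B, foo and bar come from the text.

```r
library(methods)

# Class A, and a class B that extends it. The slot 'x' is a placeholder.
setClass("A", representation(x = "numeric"))
setClass("B", contains = "A")

# Two generics that operate on A.
setGeneric("foo", function(object, ...) standardGeneric("foo"))
setGeneric("bar", function(object, ...) standardGeneric("bar"))

setMethod("foo", "A", function(object, ...) "foo for A")
setMethod("bar", "A", function(object, ...) "bar for A")

# B reimplements foo but not bar.
setMethod("foo", "B", function(object, ...) "foo for B")

b <- new("B", x = 1)
foo(b)  # dispatches to the method defined for B
bar(b)  # falls back to the method inherited from A
```

Even in this small case there are four candidate places for documentation: the generic foo, the methods of foo, the Methods section of class A, and the Methods section of class B.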
SOLUTION

There are a bunch of things users of OO systems typically want to know:
1) the relations between classes
2) the methods implemented by a class (for B, just foo)
3) the interface provided by a class (for B, foo and bar)
4) the various implementations of a particular method

All of these can be discovered dynamically by the user. The problem is that the current documentation system attempts to reproduce this dynamic information in static pages. The prompt, promptClass and promptMethods functions generate templates that contain much of the information (or at least they're supposed to; they seem to miss stuff for me, for example saying there are no methods when there are methods). This is helpful, but has two weaknesses. First, the class developer must enter very similar information in multiple places (specifically, function, methods, and class documentation). Second, that information is likely to get dated as the classes are modified and extended.

I think it would be better if the class developer could enter the information once, and the documentation system assembled it dynamically when the user asks a question. For example, if the user asks for documentation on a class, the resulting page would be constructed by pulling together the class description, appropriate method descriptions, and links to classes the focal class extends (as well, possibly, as classes that extend it). Similarly, a request for methods could assemble a page out of the snippets documenting the individual methods, including links to the relevant classes.

I realize that implementing this is not trivial, and I'm not necessarily advocating it as a priority. But I wonder how it strikes people.

-- 
Ross Boylan                                      wk:  (415) 514-8146
185 Berry St #5700                               ross at biostat.ucsf.edu
Dept of Epidemiology and Biostatistics           fax: (415) 514-8150
University of California, San Francisco
San Francisco, CA 94107-1739                     hm:  (415) 550-1062
Henrik Bengtsson
2006-Oct-02  22:27 UTC
[Rd] documentation duplication and proposed automatic tools
Hi.  Far from complete, but some sketches of a solution are in the
Rdoc-to-Rd translator of the R.oo package.  I, the author, never made
this very public because it uses a very ugly parser, but the basics
are there and I use it to generate \usage{} statements and class
hierarchies automatically, import example code and so on.  I never
have to worry about consistency between code and Rd usage because
the usage is generated automatically.
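To illustrate the general idea of deriving \usage{} from the code itself, here is a minimal base-R sketch. It is not the Rdoc compiler from R.oo; the helper name rd_usage and the example function f are hypothetical, and it only handles ordinary closures.

```r
# Hypothetical helper: build an Rd \usage{} entry from a function's formals.
rd_usage <- function(name, fun) {
  # args() returns a function with the same formals and a NULL body, so its
  # deparsed text is the "function (...)" header followed by a "NULL" line.
  sig <- deparse(args(fun))
  sig <- paste(trimws(sig[-length(sig)]), collapse = " ")
  # Replace the leading "function " with the documented name.
  call <- sub("^function ?", name, sig)
  sprintf("\\usage{\n%s\n}", call)
}

# Example function, made up for illustration.
f <- function(x, n = 10, ...) NULL
cat(rd_usage("f", f), "\n")
```

Because the usage text is regenerated from the function definition each time, it cannot drift out of sync with the code the way a hand-written \usage{} section can.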
An alternative for generating dynamic help pages in HTML (or other
formats) is provided by the R.rsp package, which allows you to
write things like:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
        "http://www.w3.org/TR/REC-html40/loose.dtd">
<%title="R Server Pages - Hello world!"%>
<html>
 <head>
  <title><%=title%></title>
 </head>
 <body>
  <h1><%=title%></h1>
  <p>
  This is an example of an RSP page.  This HTML page was generated
  from an RSP file on a <%=format(Sys.time(), "%A")%> at
  <%=format(Sys.time(), "%H:%M:%S")%>.
  </p>
 </body>
</html>
and have R translate it to HTML code.  It even has a built-in HTTP
daemon (from Rpad) to do this automatically.  This would allow you to
generate instant help on classes, objects etc.
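As a rough sketch of the information such an instant help page could pull from a running session, the following uses only functions from the methods package. The helper class_summary is hypothetical, the classes A and B and the generics foo and bar are the ones from the original post, and the slot x and method bodies are placeholders.

```r
library(methods)

# Minimal setup mirroring the example in the original post.
setClass("A", representation(x = "numeric"))
setClass("B", contains = "A")
setGeneric("foo", function(object, ...) standardGeneric("foo"))
setGeneric("bar", function(object, ...) standardGeneric("bar"))
setMethod("foo", "A", function(object, ...) "foo for A")
setMethod("bar", "A", function(object, ...) "bar for A")
setMethod("foo", "B", function(object, ...) "foo for B")

# Hypothetical helper: collect, for one class and a set of generics,
# the facts a class help page would need.
class_summary <- function(cls, generics) {
  # Methods defined for exactly this class (no inheritance).
  defined_here <- vapply(generics, function(g) existsMethod(g, cls),
                         logical(1))
  # Methods that dispatch would find, including inherited ones.
  available <- vapply(generics, function(g) {
    !is.null(selectMethod(g, cls, optional = TRUE))
  }, logical(1))
  list(
    class       = cls,
    extends     = setdiff(extends(cls), cls),
    extended_by = as.character(names(getClass(cls)@subclasses)),
    implements  = generics[defined_here],
    interface   = generics[available]
  )
}

str(class_summary("B", c("foo", "bar")))
```

A page template could then print these fields next to the stored description of each class and method, so that the class hierarchy and method lists never have to be typed in by hand.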
/Henrik