thr3ads.net - R help - [R] History pruning [Jul 2008]

Ken Williams

2008-Jul-30 18:12 UTC

[R] History pruning

Hi,

I find that a typical workflow for me looks something like this:

1) import some data from files
2) mess around with the data for a while
3) mess around with plotting for a while
4) get a plot or analysis that looks good
5) go back through my history to make a list of the shortest command
sequence to recreate the plot or analysis
6) send out that sequence to colleagues, along with the generated plots
or analysis output

I wonder if there are any tools people have developed to help with step
5.  Typically I do something like this:

5a) save my entire history to a text file
5b) open it up in Emacs
5c) prune any lines that don't have assignment operators
5d) prune any plotting commands that were superseded by later plots

and then start on other more subtle stuff like pruning assignments that
were later overwritten, unless the later assignments have variable
overlap between the LHS and the RHS.  Then I just start eyeballing it.
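
As a rough illustration of step 5c, the history text can be parsed rather than edited by hand, keeping only the top-level statements that are assignments. This is only a sketch: `keep_assignments` and `history_lines` are made-up names, and it assumes the saved history parses cleanly (one mistyped line makes `parse()` fail for the whole file).

```r
# Hypothetical helper (not from any package): keep only the top-level
# statements that are assignments. Calls such as assign("x", 1) are missed.
keep_assignments <- function(lines) {
  exprs <- parse(text = lines, keep.source = FALSE)
  is_assign <- vapply(exprs, function(e) {
    # "x -> y" is already turned into "y <- x" by the parser
    is.call(e) && as.character(e[[1]])[1] %in% c("<-", "<<-", "=")
  }, logical(1))
  vapply(exprs[is_assign],
         function(e) paste(deparse(e), collapse = "\n"),
         character(1))
}

# Made-up history, including a command typed across two lines
history_lines <- c("x <- 1:10", "summary(x)", "y = x^2",
                   "plot(x, y)", "z <- x +", "  y")
kept <- keep_assignments(history_lines)
print(kept)
```

Because the filter works on parsed expressions rather than raw lines, the two-line command comes back as the single statement `z <- x + y`.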

Would any deeper introspection of the history expressions be feasible,
e.g. detecting statements that have no side effects, dead ends, etc.?

The holy grail would be something like "show me all the statements that
contributed to the current plot" or the like.
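
A very rough approximation of this, for variables rather than plots, is to walk the parsed history backwards and keep only the assignments that a chosen target depends on. The sketch below is an assumption-laden illustration, not an existing tool: `trace_back` is a made-up name, it looks only at top-level `<-` and `=` assignments, and it cannot see side effects (graphics state, `assign()`, functions that modify globals). It does handle the overwritten-assignment rule described above: a later `x <- ...` makes earlier definitions of `x` unnecessary unless `x` also appears on its right-hand side.

```r
# Hypothetical helper: return the assignments that `target` depends on.
trace_back <- function(lines, target) {
  exprs  <- parse(text = lines, keep.source = FALSE)
  needed <- target                       # variables still needing a definition
  keep   <- logical(length(exprs))
  for (i in rev(seq_along(exprs))) {
    e <- exprs[[i]]
    if (!(is.call(e) && length(e) == 3 &&
          as.character(e[[1]])[1] %in% c("<-", "="))) next
    lhs_vars <- all.vars(e[[2]])
    defined  <- lhs_vars[1]              # e.g. "x" in both x <- ... and x[i] <- ...
    if (!(defined %in% needed)) next     # dead end, or overwritten later
    keep[i] <- TRUE
    # A plain `x <- ...` fully redefines x; `x[i] <- ...` only modifies it,
    # so in that case x stays in the needed set.
    if (is.name(e[[2]])) needed <- setdiff(needed, defined)
    needed <- union(needed, c(lhs_vars[-1], all.vars(e[[3]])))
  }
  vapply(exprs[keep],
         function(e) paste(deparse(e), collapse = "\n"),
         character(1))
}

history_lines <- c(
  "x <- 1:10",                  # overwritten below, so not needed
  "x <- seq(0, 1, by = 0.1)",
  "junk <- mean(x)",            # dead end
  "x <- x * 2",                 # x on both sides: earlier definition is kept
  "y <- x^2",
  "summary(y)"
)
slice <- trace_back(history_lines, "y")
print(slice)
```

For a plot, one could pass the variables used in the final plotting call as `target`, but anything that depends on `par()` settings or other hidden state would still need a human eye.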

Thanks.

-- 
Ken Williams
Research Scientist
The Thomson Reuters Corporation
Eagan, MN

Marc Schwartz

2008-Jul-30 18:59 UTC

[R] History pruning

on 07/30/2008 01:12 PM Ken Williams wrote:
> Hi,
> 
> I find that a typical workflow for me looks something like this:
> 
> 1) import some data from files
> 2) mess around with the data for a while
> 3) mess around with plotting for a while
> 4) get a plot or analysis that looks good
> 5) go back through my history to make a list of the shortest command
> sequence to recreate the plot or analysis
> 6) send out that sequence to colleagues, along with the generated plots
> or analysis output
> 
> I wonder if there are any tools people have developed to help with step
> 5.  Typically I do something like this:
> 
> 5a) save my entire history to a text file
> 5b) open it up in Emacs
> 5c) prune any lines that don't have assignment operators
> 5d) prune any plotting commands that were superseded by later plots
> 
> and then start on other more subtle stuff like pruning assignments that
> were later overwritten, unless the later assignments have variable
> overlap between the LHS and the RHS.  Then I just start eyeballing it.
> 
> Would any deeper introspection of the history expressions be feasible,
> e.g. detecting statements that have no side effects, dead ends, etc.
> 
> The holy grail would be something like "show me all the statements that
> contributed to the current plot" or the like.
> 
> Thanks.

I (and many others) use ESS (Emacs Speaks Statistics), in which case, I 
have an R source buffer in the upper frame and an R session in the lower 
frame.

In my particular case, I also happen to use ECB (Emacs Code Browser) 
which also has a left hand column spanning the full vertical length, to 
provide access to other things (file browser, R function and data 
objects, etc.). It also helps integrate Sweave/LaTeX functionality to 
further centralize things and increase productivity. I have also tied in 
Subversion functionality to enable me to engage in version control of my 
code and other key files.

I do all of my editing in the upper frame and use the built-in ESS 
functions to submit the code to the R session. This also provides for 
code syntax highlighting, which makes it easier to visualize code as 
well as to check for things like matching parens/braces, etc.

In this way, your working code (including comments) is kept functionally 
intact in the upper frame and you can edit and use it without having to 
scroll through a long history of commands (which is still there if you 
need it).

More information here:

   http://ess.r-project.org/

HTH,

Marc Schwartz

Ken Williams

2008-Jul-31 18:08 UTC

[R] History pruning

On 7/31/08 11:01 AM, "hadley wickham" <h.wickham at gmail.com> wrote:
> I think that would be a very hard task -

Well, at least medium-hard.  But I think significant automatic steps could
be made, and then a human can take over for the last few steps.  That's why
I was enquiring about "tools" rather than a complete solution.

Does R provide facilities for introspection or interrogation of expression
objects?  I couldn't find anything useful on first look:

> methods(class="expression")
no methods were found
> dput(expression(foo  <- 5 * bar))
expression(foo <- 5 * bar)
> str(expression(foo <- 5 * bar))
  expression(foo <- 5 * bar)

> it's equivalent to taking a
> long rambling conversation and then automatically turning it into a
> concise summary of what was said.  I think you must have human
> intervention.

It's not really equivalent: natural language has ambiguities and subtleties
that computer languages, especially functional languages, intentionally
don't have.  By their nature, computer languages can be turned into parse
trees unambiguously and then those trees can be manipulated.

But coincidentally I work in a Natural Language Processing group, and one of
the things we do is create exactly the kind of concise summaries you
describe. =)

-- 
Ken Williams
Research Scientist
The Thomson Reuters Corporation
Eagan, MN

Antony Unwin

2008-Aug-01 10:46 UTC

[R] History pruning

JGR's "Copy Commands" command works well for me (even if it is both
fascinating and embarrassing how little is sometimes left over).  It
retains only commands that worked, so it is still not the minimum  
possible.

Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
Mathematics Institute,
University of Augsburg,
86135 Augsburg, Germany
Tel: + 49 821 5982218

antony.unwin@math.uni-augsburg.de

http://stats.math.uni-augsburg.de/





Richard M. Heiberger

2008-Aug-01 17:40 UTC

[R] History pruning

>5a) save my entire history to a text file
>5b) open it up in Emacs
>5c) prune any lines that don't have assignment operators
>
>
>Ken Williams
>Research Scientist
>The Thomson Reuters Corporation
>Eagan, MN

No one has yet mentioned the obvious.  ESS does your 5a 5b 5c with
   M-x ess-transcript-clean-buffer
It works in either the *R* buffer or a *.rt or *.st buffer.
It handles multiple-line commands correctly.

Make sure the buffer is writable (C-x C-q on the *.rt buffer)
M-x ess-transcript-clean-buffer
Save the buffer as a *.r file.



On automatic content analysis, that is tougher. I would be scared to do your

>5d) prune any plotting commands that were superseded by later plots

because I don't know what supersede means.  I can imagine situations, for
example,
par(mfrow=c(1,2))
plot(y ~ x)
x <- x + 1
plot(y ~ x)
where I want to keep both plots.
You also have to trust that there are no side effects, which I wouldn't
want to do, because plot() changes the value of par() parameters.
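
The side effect is easy to observe. The sketch below opens a null PDF device (so nothing is written to disk) and compares the "usr" graphics parameter before and after a plot() call; the data are made up.

```r
pdf(NULL)                      # null device: no file is produced
before <- par("usr")           # plot-region coordinates on a fresh device
plot(1:10, (1:10)^2)
after  <- par("usr")           # now reflects the ranges of the plotted data
invisible(dev.off())

identical(before, after)       # FALSE: plot() changed the graphics state
```

Any later low-level call such as points() or abline() depends on that state, which is one reason a plot() that looks superseded cannot always be dropped safely.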

Greg Snow

2008-Aug-08 18:04 UTC

[R] History pruning

Ken,

Others have given hints on pruning the history, but are you committed to doing
it this way?

An alternative would be something more like sink, where when you get to a place
that you know you want to start saving the commands you run a function to start
saving your commands, then at the end you run a command to stop saving the
commands.  One tool for doing this is in the TeachingDemos package, see the help
on ?txtStart.

The main goal of this set of functions was more to save a transcript of a
session (including graphical output if you use the etxtStart interface and an
external tool), but it has a possible side effect of saving the commands issued
in a file that could be 'source'd to rerun the set of commands (which
seems similar to what you want).  Commands (actually expressions) that result in
an error are not included and you can use the txtSkip function to run a command
without saving the command in the file (for things like "?plot" that
you don't want to rerun).  This may give you what you want, or at least
something that needs less editing to get at what you want.
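
The "errors are not recorded" idea can be illustrated in base R without the package. The following is only a sketch of the principle and is not how TeachingDemos implements txtStart; `record_ok` and the sample commands are invented for the example.

```r
# Hypothetical helper: evaluate each statement and keep the text of those
# that ran without error, so the result can be written out and source()d.
record_ok <- function(lines, envir = new.env(parent = globalenv())) {
  exprs <- parse(text = lines, keep.source = FALSE)
  kept  <- character(0)
  for (i in seq_along(exprs)) {
    ok <- tryCatch({
      eval(exprs[[i]], envir)
      TRUE
    }, error = function(err) FALSE)
    if (ok) kept <- c(kept, paste(deparse(exprs[[i]]), collapse = "\n"))
  }
  kept
}

cmds <- c("a <- 1:5",
          "b <- a + no_such_object_xyz",   # fails: object not found
          "b <- a * 2")
good <- record_ok(cmds)
print(good)
# writeLines(good, "session.R") would then give a file that can be source()d
```

This replays commands after the fact, whereas the txtStart family records them as they are typed; it also does nothing about commands that succeed but are irrelevant, which is where the checks mentioned below would come in.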

Another option would be to take the source code for the above utilities and add
some checks that will decide whether to save the command or not (check if an
assignment was made, check if any 'par'ameters were changed, etc.).

Another option if you just want some code to recreate the current plot is to
look at the plot2script function in the TeachingDemos package.  It will create a
script (put it on the clipboard by default) to recreate the current plot.  It
does NOT use the same set of commands that you used to create the plot, but
rather lowlevel commands, but it creates a script that you can edit to recreate
the plot with just your changes (the current version needs some edits (line
wrapping, fixing the box command) before running the script, but it may be
another place to start).

Hope this helps,


--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
(801) 408-8111


> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Ken Williams
> Sent: Wednesday, July 30, 2008 12:13 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] History pruning
>
> Hi,
>
> I find that a typical workflow for me looks something like this:
>
> 1) import some data from files
> 2) mess around with the data for a while
> 3) mess around with plotting for a while
> 4) get a plot or analysis that looks good
> 5) go back through my history to make a list of the shortest
> command sequence to recreate the plot or analysis
> 6) send out that sequence to colleagues, along with the
> generated plots or analysis output
>
> I wonder if there are any tools people have developed to help
> with step 5.  Typically I do something like this:
>
> 5a) save my entire history to a text file
> 5b) open it up in Emacs
> 5c) prune any lines that don't have assignment operators
> 5d) prune any plotting commands that were superseded by later plots
>
> and then start on other more subtle stuff like pruning
> assignments that were later overwritten, unless the later
> assignments have variable overlap between the LHS and the
> RHS.  Then I just start eyeballing it.
>
> Would any deeper introspection of the history expressions be
> feasible, e.g. detecting statements that have no side
> effects, dead ends, etc.
>
> The holy grail would be something like "show me all the
> statements that contributed to the current plot" or the like.
>
> Thanks.
>
> --
> Ken Williams
> Research Scientist
> The Thomson Reuters Corporation
> Eagan, MN
>
