On 11/7/2007 7:46 AM, Alexy Khrabrov wrote:
> Greetings -- coming from Python/Ruby perspective, I'm wondering about
> certain features of R as a programming language.
Lots of questions; I'll intersperse some answers.
>
> Say I have a huge table t of the form
> 
> run     ord     unit    words   new
> 1       1       6939    1013    641
> 1       2       275     1001    518
> 1       3       3314    1008    488
> 1       4       14154   1018    463
> 1       5       2982    1006    421
> 
> Alternatively, it may have a part column in front.  For each run (in  
> a part if present), I select ord and new columns as x and y and plot  
> their functions in various ways.  t is huge.  So I want to select the  
> subset to plot, as follows:
> 
> t.xy <- function(t,part=NA,run=NA) {
> 	if (is.na(run)) {
> 		# TODO does this entail a full copy -- or how do we do references in R?
> 		r <- t
Semantically it acts as a full copy, though there is some internal 
optimization that means the copy won't be made until necessary, i.e. one 
of r or t changes.
There are some kinds of objects in R that are handled as references: 
environments, external pointers, names, NULL. (I may have missed some.) 
There are various kludges to expand this list to other kinds of objects, 
the most common way being to wrap an object in an environment.  But 
there is a fond wish that people use R as a functional language and 
avoid doing this.
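A minimal sketch of the difference, using a small made-up data frame in place of your real table (the values and the names e1, e2 and count are only for illustration):

```r
# Ordinary objects behave as if they were copied on assignment.
t <- data.frame(run = c(1, 1, 2), ord = c(1, 2, 1), new = c(641, 518, 488))
r <- t                 # no data needs to be duplicated at this point
r$new[1] <- 0          # modifying r is what forces a copy; t is untouched
t$new[1]               # still 641

# Environments are references: both names refer to the same object.
e1 <- new.env()
e1$count <- 1
e2 <- e1
e2$count <- 2
e1$count               # 2, because e1 and e2 are the same environment
```

So a plain r <- t inside your function should be cheap as long as neither r nor t is modified afterwards.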
> 	} else if (is.na(part)) {
> 		r <- t[t$run == run,]
> 	} else { # part present too
> 		r <- t[t$part == part & t$run == run,]
> 	}
> 	x <- r$ord
> 	y <- r$new
> 	xy.coords(x,y)
> }
> 
> What I'm wondering about is whether r <-t will copy the complete t,
> and how do I minimize copying in R.  I heard it's a functional  
> language -- is there lazy evaluation in place here?
There is lazy evaluation of function arguments, but assignments trigger 
evaluation of their RHS.
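Here is a small sketch of what that means in practice. The functions f and g and the flag variable are hypothetical, made up just to show when evaluation happens:

```r
# An argument is only evaluated if the function body actually uses it.
f <- function(x, y) {
  if (x > 0) "y was never needed" else y
}
res <- f(1, stop("this argument is never evaluated"))   # no error is raised

flag <- FALSE
g <- function(arg) NULL        # never touches arg
g({ flag <- TRUE })            # the braces are passed unevaluated
before <- flag                 # FALSE: the argument was never forced

# An assignment evaluates its right-hand side immediately.
z <- { flag <- TRUE; 42 }
after <- flag                  # TRUE
```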
> 
> Additionally, tried to use --args command line arguments, and found a  
> way only due to David Brahm -- who helped with several important R  
> points (thanks Dave!):
> 
> #!/bin/sh
> # graph a fertility run
> tail --lines=+4 "$0" | R --vanilla --slave --args $*; exit
> args <- commandArgs()[-(1:4)]
> ...
> 
> And, still no option processing as in GNU long options, or python or  
> ruby's optparse.
> 
> What's the semantics of parameter passing -- by value or by reference?
By value.
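For example (bump is a made-up function, just to illustrate), a change made to an argument inside a function is not visible to the caller:

```r
bump <- function(v) {
  v[1] <- 99     # modifies the function's local copy only
  v
}
x <- c(1, 2, 3)
y <- bump(x)
x    # 1 2 3, unchanged in the caller
y    # 99 2 3
```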
> Is there anything less ugly than
> 
> print(paste("x=",x,"y=",y))
> 
> -- for routine printing?  Can [1] be eliminated from such simple  
> printing?  What about formatted printing?
You can use cat() instead of print(), and avoid the numbering and 
quoting.  Remember to explicitly specify a "\n" newline at the end.
At first I thought you were complaining about the syntax, which I find 
ugly.  There was a proposal last year to overload + to do concatenation 
of strings, so you'd type cat("x=" + x + "y=" + y + "\n"), but there was
substantial resistance, on the grounds that + should be commutative.
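On the formatted printing question, one option is sprintf(), which works much like the C function. A short sketch with made-up values for x and y:

```r
x <- 3L
y <- 4.5

# cat() prints without the [1] prefix or quotes; sep = "" suppresses
# the space that cat() otherwise puts between its arguments.
cat("x=", x, " y=", y, "\n", sep = "")      # prints: x=3 y=4.5

# sprintf() builds a formatted string, which cat() can then print.
msg <- sprintf("x=%d y=%.2f", x, y)
cat(msg, "\n")                              # prints: x=3 y=4.50
```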
> Is there a way to assign all of
> 
> a <- args[1]
> b <- args[2]
> c <- args[3]
> 
> in one fell swoop, à la Python's
> 
> a,b,c = args
No, but you can do
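abc <- as.list(args[1:3])
names(abc) <- c('a', 'b', 'c')
and refer to the components as abc$a, etc.  (The as.list() is needed 
because $ does not work on a plain character vector; without it you 
would write abc["a"] instead.)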
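A sketch of both variants, with stand-in values for args (the file names here are invented): the $ operator needs a list, while a named character vector is indexed with brackets.

```r
args <- c("in.txt", "out.txt", "10")   # stand-in values for illustration

# As a list, components can be reached with $.
abc <- as.list(args[1:3])
names(abc) <- c("a", "b", "c")
abc$a          # "in.txt"
abc[["c"]]     # "10"

# As a named character vector, use [ ] or [[ ]] rather than $.
v <- setNames(args[1:3], c("a", "b", "c"))
v[["b"]]       # "out.txt"
```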
> What's the simplest way to check whether a filename ends in ".rda"?
Probably something like
if (regexpr("\\.rda$", filename) > 0) ...
You double the backslash so that a single one survives R's string 
parsing and reaches the RE, and then the regexpr function uses it to 
escape the dot in the RE.
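A quick check of that pattern on a few made-up file names. The grepl() and endsWith() calls are alternatives available in current versions of R (endsWith() needs R 3.3.0 or later), so treat them as assumptions about your installation:

```r
filenames <- c("model.rda", "notes.txt", "rda.csv", "xrda")

# regexpr() returns the match position, or -1 when there is no match.
by_regexpr <- as.vector(regexpr("\\.rda$", filenames) > 0)

# grepl() returns the logical vector directly.
by_grepl <- grepl("\\.rda$", filenames)

# endsWith() is a plain suffix test with no regular expression involved.
by_suffix <- endsWith(filenames, ".rda")

by_regexpr     # TRUE FALSE FALSE FALSE
```

Note that "xrda" does not match, which is the reason for escaping the dot.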
Duncan Murdoch
> Will ask more as I go programming...
> 
> (Will someone here please write an O'Reilly's "Programming in R"?  :)
> 
> Cheers,
> Alexy