I'm very new to R and utterly blown away by not only the language but the unbelievable set of packages and the documentation and the documentation standards and...

I was an early APL user and never lost my love for it, and in R I find most of the essential things I loved about APL except for one thing. At this early stage of my learning I can't yet determine if there is a way to effect what in APL was zero index origin, where the ordinality of indexes starts with 0 instead of 1. Is it possible to effect that in R without a lot of difficulty?

I come here today from the world of DSP research and development, where Matlab has a near hegemony. I see no reason whatsoever that R couldn't replace it with a _far_ better and _far_ less idiosyncratic framework. I'd be interested in working on a Matlab-equivalent DSP package for R (if that isn't being done by someone), and one of the things most criticized about Matlab from the standpoint of the DSP programmer is its insistence on 1-origin indexing. Any feedback greatly appreciated.

Thanks,

Bob

--
"Things should be described as simply as possible, but no simpler."  A. Einstein
Much of R is itself written in R, so you cannot possibly change something as fundamental as this. Further, index 0 has a special meaning that you would lose if R had 0-based indexing.

However, the R way of thinking is to work with whole objects (vectors, arrays, lists, ...), and you rather rarely need to know what numbers are in an index vector. There are usages such as 1:n, and those are quite often wrong: they should be seq(length=n) or seq(along=x) or some such, since n might be zero.

If you are writing code that works with single elements, you are probably a lot better off writing C code to link into R (and C is 0-based ...).

On Wed, 31 Mar 2004, Bob Cain wrote:

> [original message snipped]

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel: +44 1865 272861 (self)
1 South Parks Road,               +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax: +44 1865 272595
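To make the "special meaning of index 0" concrete, here is a small sketch of standard R indexing semantics, easily checked at the prompt:

```
# In R, a 0 in an index vector selects nothing; it is silently dropped.
x <- c(10, 20, 30)
x[0]        # numeric(0) -- a zero-length vector
x[c(0, 1)]  # 10 -- the 0 is ignored
x[-1]       # 20 30 -- negative indices mean "drop these", another 1-origin idiom
```

Both the zero-selects-nothing convention and negative-index dropping would become ambiguous if 0 were a valid element position.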
Bob Cain wrote:

> [original message snipped]

Hallo Bob,

in APL we control index origin by "QUAD.IO", with QUAD.IO \in {0,1}. Suppose that within a function the index origin is unknown:

a) If we want to work with origin 0 we write   x[ i + QUAD.IO ]
b) ... with origin 1 ...                       x[ i - !QUAD.IO ]

So set QUAD.IO <- 1 and use a) --- APL-like. Or define an index shift function:

io.0 <- function(ind) ind + 1

to be able to type x[io.0(0:5)].

I am sure that your first experiments have been to implement APL functions like take, drop, rotate, ... and now you are looking for a more elegant way to manage origin 0 than "+QUAD.IO".

Peter Wolf
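For what it's worth, APL-style take/drop/rotate can be written in ordinary 1-origin R without ever mentioning an index number. The helper names below are illustrative (not from any package), only non-negative counts are handled, and the code is an untested sketch:

```
# Hypothetical APL-flavoured helpers in plain 1-origin R.
# Non-negative n only; seq(length=)/seq(along=) cope safely with n == 0.
take   <- function(x, n) x[seq(length = min(n, length(x)))]  # first n elements
drop.  <- function(x, n) x[seq(along = x) > n]               # all but first n
rotate <- function(x, n) {                                   # rotate left by n
    n <- n %% length(x)
    c(x[seq(along = x) > n], x[seq(length = n)])
}

take(1:5, 3)    # 1 2 3
drop.(1:5, 3)   # 4 5
rotate(1:5, 2)  # 3 4 5 1 2
```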
Bob Cain wrote:

> At this early stage of my learning I can't yet determine if there is a way
> to effect what in APL was zero index origin, the ordinality of indexes
> starts with 0 instead of 1. Is it possible to effect that in R without
> a lot of difficulty?

Clearly R wasn't written by Dijkstra:

http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF

This text was pointed out to me when I started using Python, which has zero-based indexing. Python can look so much like R, but there are subtle differences.

R:

> x=c(5,4,3,2,1)
> x[3]
[1] 3
> x[2:4]
[1] 4 3 2

compare Python:

>>> x=[5,4,3,2,1]
>>> x[3]
2
>>> x
[5, 4, 3, 2, 1]
>>> x[2:4]
[3, 2]

A single element from a sequence in Python is indexed from zero, hence x[3] == 2, but a range indexes from the commas between the limits of the range. Hence x[2:4] is the elements between comma 2 and comma 4 -- hence it's only 2 elements. Did my head in when I first started pythoning.

Flipping between R and Python is not recommended; kudos to all those involved in such R-Python links...

Baz
Hi Bob, Jonathan Rougier's Oarray package might be what you want. Jim
Dear Bob,

One approach would be to introduce a class of objects for which zero-based indexing is implemented. Here's a simple example:

> "[.io0" <- function(x, i) as.vector(x)[i + 1]
> v <- 0:10
> class(v) <- "io0"
> v[0]
[1] 0
> v[0:5]
[1] 0 1 2 3 4 5

Of course, a serious implementation would handle arrays and perhaps other kinds of objects, and would be more careful about the subscript.

I hope that this helps,
John

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Bob Cain
> Sent: Wednesday, March 31, 2004 3:16 AM
> To: R-help
> Subject: [R] Zero Index Origin?
>
> [original message snipped]
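As a rough illustration of what "handling arrays" might look like, here is one untested sketch that shifts matrix subscripts by one. The class name io0m and its method are hypothetical, and negative, logical and character subscripts are deliberately ignored:

```
# Hypothetical sketch: zero-origin subscripting for matrices only.
# Not a serious implementation -- numeric non-negative subscripts only.
as.io0m <- function(x) structure(x, class = "io0m")
"[.io0m" <- function(x, i, j) {
    m <- unclass(x)
    if (missing(i) && missing(j)) return(m)
    if (missing(i)) return(m[, j + 1])
    if (missing(j)) return(m[i + 1, ])
    m[i + 1, j + 1]
}

m <- as.io0m(matrix(1:6, nrow = 2))   # a 2 x 3 matrix
m[0, 0]   # 1  (what plain R calls m[1, 1])
m[1, 2]   # 6  (plain R's m[2, 3])
```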
If you are willing to do it yourself you can define a class for which indexing behaves that way. For example, here is a start for a limited implementation for vectors. The first statement defines the constructor, the second defines [, the third converts an index-0-based vector back to a regular vector, the next implements lvalues and the last provides a print method.

as.vector0 <- function(x) structure(x, class = "vector0")
"[.vector0" <- function(x, i) as.vector0(as.vector.vector0(x)[i+1])
as.vector.vector0 <- function(x) unclass(x)
"[<-.vector0" <- function(x, i, value) {
    x <- as.vector.vector0(x)
    x[i+1] <- value
    as.vector0(x)
}
print.vector0 <- function(x) print(as.vector.vector0(x))

# Test:
x <- as.vector0(1:10)
x[0:4] <- 100 * x[0:4]
x

Bob Cain <arcane <at> arcanemethods.com> writes:

: [original message snipped]
I note that

(1) "[" is a class-based function in R, so it would be possible to define a class of zero-origin arrays. This would mean that indexing these things would be quite incompatible with all the other indexing in R, so it's not clear that it would be a good thing.

(2) R lets you define "left-hand functions", so if you define

sub <- function (a, i) a[i+1]
"sub<-" <- function (a, i, value) { a[i+1] <- value; a }

(a replacement function must return the modified object, hence the trailing a) then you can use sub(a,i) and sub(a,i) <- ... for indexing. This still wouldn't work like R indexing in general, but it wouldn't _look_ as if it should, so that might be better.

I used to love APL myself, and am a great fan of index origin 0, but I'm *not* a fan of having two different origins in one language controlled by a variable; it made life quite difficult trying to mix code from different libraries. In R, I find myself wondering whether a["fred"] (string indexing) should depend on index origin, and if not, why not. (:-)

Despite my love for origin 0, I've decided that the rest of R is worth it; indeed, for the things that I do with R, index origin 1 really does seem to work better as a human interface.

It would be interesting to see some sample code where origin 0 is supposed to make life easier, and to see what R experts do to make it even easier than that.
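To make the left-hand-function idea concrete, here is a small self-contained sketch; note that in R a replacement function must return the whole modified object, so the body of "sub<-" ends with a:

```
# Zero-origin access via an ordinary function plus a replacement function.
sub <- function(a, i) a[i + 1]
"sub<-" <- function(a, i, value) {
    a[i + 1] <- value
    a   # replacement functions must return the modified object
}

a <- c(10, 20, 30)
sub(a, 0)        # 10 -- the "zeroth" element
sub(a, 2) <- 99  # replaces the last element
a                # 10 20 99
```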
Gabor Grothendieck wrote:

[snip good stuff]

> Of course the above is motherhood and some specific examples
> might put a sharper edge to the discussion.

I really appreciate your point of view on this and think you are probably right. A question I have, from my as yet very limited understanding of OO, is whether such objects could be passed to legacy functions with any expectation of correct results.

Thanks,

Bob

--
"Things should be described as simply as possible, but no simpler."  A. Einstein
Bob Cain <arcane <at> arcanemethods.com> writes:

> A question I have from my very limited understanding yet of OO is
> whether such objects could be passed to legacy functions with any
> expectation of correct results.

Typically you need to write a wrapper, but sometimes you get it for free. For example, in the vector0 class that I displayed previously, note that no multiplication method was defined for it, yet the example included multiplication and it worked. Oarray would get the same benefit.

Note that if a routine f.legacy(x) does not make use of indices at all then f.legacy(unclass(v)), where v is of class vector0 or Oarray, would be sufficient (untested). Also you can define a generic that dispatches to the appropriate class:

f <- function(x) UseMethod("f")   # defines a generic
f.Oarray <- function(x) f(unclass(x))
f.default <- f.legacy

Now you can call f(x), and if x is a legacy variable f.legacy gets used, and if x is an Oarray then f.Oarray gets used.
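A small self-contained sketch of that dispatch pattern, using a made-up legacy routine and the vector0 class from earlier in the thread (all names here are illustrative):

```
# Hypothetical demo: wrapping a legacy function with an S3 generic.
f.legacy <- function(x) sum(x)           # stand-in for any index-free routine

f <- function(x) UseMethod("f")          # the generic
f.vector0 <- function(x) f(unclass(x))   # strip the class, re-dispatch
f.default <- f.legacy                    # plain objects go straight through

v <- structure(1:10, class = "vector0")
f(v)      # 55 -- dispatches via f.vector0, then f.default
f(1:10)   # 55 -- goes straight to f.default
```

Unclassing and re-dispatching is what makes the wrapper one line: after unclass(x) the object has no class attribute, so UseMethod falls through to f.default.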
Bob,

on a remark about Brian Ripley's remark

  if you're actually using indices explicitly, you probably haven't
  wrapped your head around how powerful the indexing structure and
  "whole object" approach is in S and in R,

Gabor Grothendieck wrote:

> I believe the point was not so much that R has powerful indexing
> (which it does) but that the capabilities of dealing with objects as
> a whole, i.e. without using indexing at all, makes facilities for
> indexing less important.

I believe what REALLY makes the question of zero origin versus 1-origin fairly unimportant is the fact that even when you use indexing, you rarely need to know what the index numbers are. In typical indexing expressions such as x[y==z] (that is, selecting a subset from one vector, based on agreement between two other vectors) you just do not care whether the index numbers that are used implicitly in the process start at 0, 1, or, for that matter, 314.

And, as Brian Ripley has also pointed out: even expressions of the form 1:n should usually rather be something like seq(length=n) or seq(along=x) instead, which again would even allow indexing starting at 314 (and progressing in steps of 17 if you'd like).

Lutz

Prof. Dr. Lutz Prechelt; prechelt at inf.fu-berlin.de
Institut für Informatik; Freie Universität Berlin
Takustr. 9; 14195 Berlin; Germany
+49 30 838 75115; http://www.inf.fu-berlin.de/inst/ag-se/
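The 1:n pitfall and the logical-indexing style described in this post can be sketched in a few lines of plain R (seq(along=x) being the spelling in use at the time):

```
# Why 1:n is often wrong: when n is 0, the colon operator counts *down*.
x <- numeric(0)          # an empty vector
1:length(x)              # 1 0 -- two iterations, both bogus
seq(along = x)           # integer(0) -- zero iterations, as intended

# Logical indexing never needs index numbers at all:
y <- c("a", "b", "a", "c")
z <- c("a", "a", "b", "c")
w <- c(10, 20, 30, 40)
w[y == z]                # 10 40 -- the elements where y and z agree
```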
I asked where index origin 0 would help. (I am familiar with Dijkstra's article on why counting should start at 0 and agree whole-heartedly. I am also convinced that "kicking against the goads" is not helpful.) Peter Wolf <s-plus at wiwi.uni-bielefeld.de> has provided two examples.

(1) Converting pseudo-code which assumes index origin 0. I've done this myself, although not yet in R. I've also done it the other way, converting index origin 1 code to C. A good method, I've found, is to start by converting to index-free form. This applies whatever the source and target languages are.

(2) A sorting algorithm.

sort.6 <- function (a) {
    n <- length(a)
    adapt <- function (i) {i+1}
    a <- c(0,a)
    for (i in 2:n) {
        j <- i-1
        a[adapt(0)] <- a[adapt(i)]
        while (a[adapt(j)] > a[adapt(0)]) {
            a[adapt(j+1)] <- a[adapt(j)]
            j <- j-1
        }
        a[adapt(j+1)] <- a[adapt(0)]
    }
    a[-1]
}

The really interesting thing here is that this basically is an index origin 1 algorithm. The original array and the final result start at 1, not 0, and position 0 is used for a "sentinel". Let's convert it to 1-origin. I'll skip the details of how I did it because that's not the point I want to make.

sort.VI <- function (a) {
    a <- c(0, a)
    for (i in 3:length(a)) {
        j <- i-1
        a[1] <- a[i]
        while (a[j] > a[1]) {
            a[j+1] <- a[j]
            j <- j-1
        }
        a[j+1] <- a[1]
    }
    a[-1]
}

What do you get if you move up to index-free form?

sort.six <- function (a) {
    s <- c()
    for (x in a) {
        f <- s <= x
        s <- c(s[f], x, s[!f])    # insert x stably into s
    }
    s
}

It's clear that sort.six is shorter, clearer, and easier to get right than sort.VI. But how much do we have to pay for this? How much efficiency do we lose?

> a <- runif(400)
> system.time(sort.VI(a))
[1]  3.64  0.02 12.56  0.00  0.00
> system.time(sort.six(a))
[1] 0.15 0.01 0.16 0.00 0.00

We don't lose any efficiency at all. We gain, considerably. (Not as much as we'd gain by using the built-in sort(), of course.)
I expect that this will happen fairly often: rewriting the code to be index-free will *usually* make it shorter and clearer, and will *always* make it easier to adapt to a language with a different index origin. When the target language is R or S, the result is likely to be faster than a direct conversion.