I can't find any documentation describing how '[]' is implemented. For example, if x is a matrix or a data.frame, how is the lookup of 'colname1' in x[, 'colname1'] executed? Does R look the name up in a hash of the colnames? Is the lookup O(1) or O(n), where n is the second dimension of x?
Hi Peng,

If I understood your point, try this:

   x <- runif(10)
   y <- runif(10)
   z <- runif(10)
   w <- runif(10)
   myDF <- data.frame(cbind(x, y, z, w))
   myDF
   myDF[, c("w", "z")]

Happy new year,
miltinho

On Thu, Dec 31, 2009 at 5:15 PM, Peng Yu <pengyu.ut@gmail.com> wrote:
> I don't see where describes the implementation of '[]'.
>
> For example, if x is a matrix or a data.frame, how the lookup of
> 'colname1' is x[, 'colname1'] executed. Does R perform a lookup in the
> a hash of the colnames? Is the reference O(1) or O(n), where n is the
> second dim of x?
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Peng Yu
> Sent: Thursday, December 31, 2009 2:16 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] How x[, 'colname1'] is implemented?
>
> I don't see where describes the implementation of '[]'.

You'd probably have to look in the source code for implementation details like that.

> For example, if x is a matrix or a data.frame, how the lookup of
> 'colname1' is x[, 'colname1'] executed. Does R perform a lookup in the
> a hash of the colnames? Is the reference O(1) or O(n), where n is the
> second dim of x?

You can easily run timing tests in R by using system.time(). The sum of the first 2 components of its output gives the CPU time. E.g.,

   > f <- function(ncol) {
        d <- data.frame(as.list(1:ncol))
        names(d) <- paste("Col", 1:ncol)
        sum(system.time(for(i in 1:100) d['Col 1'])[1:2])
     }
   > z <- sapply(n <- 2^(0:20), f)
   > z
    [1]  0.06  0.01  0.01  0.02  0.00  0.03
    [7]  0.02  0.01  0.01  0.03  0.02  0.02
   [13]  0.02  0.10  0.16  0.33  0.63  1.49
   [19]  3.22  8.35 18.91
   > plot(n, z, log="xy") # neither O(1) nor O(ncol)

Compare the results to subscripting by number, and see how fast the column-name-to-column-number algorithm is with various naming schemes.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
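The comparison suggested above (subscripting by name versus by number) could be sketched roughly as follows. This is only an illustration: the helper names make_df, time_by_name and time_by_index are made up here and are not from the original post, the column counts are kept small so that it runs quickly, and the timings depend on the machine and the R version, so no expected output is shown.

```r
# Hypothetical helper: build a one-row data.frame with `ncol` columns
# named "Col 1", "Col 2", ..., as in the timing test above.
make_df <- function(ncol) {
  d <- data.frame(as.list(seq_len(ncol)))
  names(d) <- paste("Col", seq_len(ncol))
  d
}

# CPU time (user + system) for `reps` extractions of the first column by name.
time_by_name <- function(d, reps = 100) {
  sum(system.time(for (i in seq_len(reps)) d["Col 1"])[1:2])
}

# The same extraction by position, which skips the name-to-number step.
time_by_index <- function(d, reps = 100) {
  sum(system.time(for (i in seq_len(reps)) d[1])[1:2])
}

n <- 2^(0:10)
timings <- data.frame(
  ncol     = n,
  by_name  = sapply(n, function(k) time_by_name(make_df(k))),
  by_index = sapply(n, function(k) time_by_index(make_df(k)))
)
print(timings)
```

If the by_name column grows with ncol while by_index stays roughly flat, the difference can be attributed to the name lookup rather than to the extraction itself. Changing the paste() call in make_df is one way to try other naming schemes.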