hello,

i have been trying to convert my data frames to matrices in the hopes of
speeding up some of my more complicated scripts.

to assist with this, i am trying to create a "matrix column operator"
like $:

"%$%" = function(data,field) {
  as.numeric(data[,grep(field,unlist(dimnames(data)[2]))])
}

the idea here is that you can use a matrix like a dataframe:

matrix%$%"fieldname"

i am getting this matrix by converting a dataframe:

df = read.csv("data.csv")
matrix = data.matrix(df,rownames.force=FALSE)

this sets rownames to "NULL", but there is still an entry at
dimnames(matrix)[1], and so i have to access the actual column names as
dimnames(matrix)[2]. if there were only one dimension of dimnames, this
operator works quickly, but when i have to access [2], it is super slow.

am i way off base trying to do this? i'd like to have the ability to
talk about the columns by name, since they may not always be in the same
place. maybe i am making it more complicated than necessary?

thanks!

dan
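For readers following along, the dimnames() layout described in the question can be reproduced without a CSV file. This is only a minimal sketch: the data frame and the names df, m, price and qty are made up for illustration and are not from the original post.

```r
# Hypothetical stand-in for read.csv("data.csv")
df <- data.frame(price = c(1.5, 2.5, 3.5), qty = c(10L, 20L, 30L))
m <- data.matrix(df, rownames.force = FALSE)

# dimnames() of a matrix is a list with one slot per dimension,
# so the row slot exists even though the row names are NULL.
dn <- dimnames(m)
n_slots  <- length(dn)        # one slot for rows, one for columns
rows_nil <- is.null(dn[[1L]]) # the row names themselves are NULL

# Single brackets return a sub-list; double brackets return the element.
single <- dn[2L]    # a list of length one holding the column names
double <- dn[[2L]]  # the character vector of column names
```

The difference between dn[2L] and dn[[2L]] is why the original operator needs unlist(): single-bracket indexing of a list always gives back a list.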
Maybe you can time this and see if this is any better:

"%$%" = function(data,field) data[, pmatch(field,dimnames(data)[[2L]])]

# test
mat <- matrix(1:24, 6, dimnames = list(NULL, letters[1:4]))
mat%$%"c"

On Jan 4, 2008 2:29 PM, Dan Dube <ddube at advisen.com> wrote:
> to assist with this, i am trying to create a "matrix column operator"
> like $:
> "%$%" = function(data,field) {
>   as.numeric(data[,grep(field,unlist(dimnames(data)[2]))])
> }
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
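One way to do the timing suggested above is sketched here. The operator names %g% and %p%, the test matrix, and the iteration count are all assumptions chosen for illustration; absolute timings will depend on the machine and the R version, so no particular result is claimed.

```r
# Operator from the original post: regular-expression lookup via grep()
`%g%` <- function(data, field) {
  as.numeric(data[, grep(field, unlist(dimnames(data)[2]))])
}

# Operator suggested in the reply: lookup via pmatch()
`%p%` <- function(data, field) data[, pmatch(field, dimnames(data)[[2L]])]

# Made-up test matrix: 1000 rows, 26 columns named A..Z
set.seed(1)
mat <- matrix(rnorm(26000), 1000, dimnames = list(NULL, LETTERS))

# Both operators should return the same column
same <- isTRUE(all.equal(mat %g% "Q", mat %p% "Q"))

# Rough timings over repeated lookups
t_grep   <- system.time(for (i in 1:2000) mat %g% "Q")
t_pmatch <- system.time(for (i in 1:2000) mat %p% "Q")
print(rbind(grep = t_grep, pmatch = t_pmatch)[, 1:3])
```

Note that pmatch() still allows partial matches (for example "Q" would match a column called "Qty" if it were the only candidate), so it is not a strict exact-name lookup.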
Why do you want to do this? X[,"cname"] gives the column named "cname" of
matrix X (as a vector, unless you add drop=FALSE, which keeps it as a
one-column matrix). The $ operator on data frames is essentially equivalent
to this, anyway (see ?Extract).

-- Bert Gunter
Genentech

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Dan Dube
Sent: Friday, January 04, 2008 11:29 AM
To: r-help at r-project.org
Subject: [R] slow access to matrix dimnames

am i way off base trying to do this? i'd like to have the ability to
talk about the columns by name, since they may not always be in the same
place. maybe i am making it more complicated than necessary?
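The built-in indexing described above can be tried on a small example. This is a sketch using a made-up matrix X; the variable names are illustrative only.

```r
# Small matrix with named columns and no row names
X <- matrix(1:24, 6, dimnames = list(NULL, letters[1:4]))

v    <- X[, "c"]                # plain integer vector
m1   <- X[, "c", drop = FALSE]  # 6 x 1 matrix, column name kept
cols <- X[, c("d", "a")]        # several columns by name, in the order asked

# The same column pulled from the equivalent data frame with $
df <- as.data.frame(X)
same_as_dollar <- identical(df$c, v)
```

Because the lookup is by name, it keeps working if the columns are reordered, which covers the "may not always be in the same place" concern without a custom operator.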
On Fri, 4 Jan 2008, Dan Dube wrote:

> "%$%" = function(data,field) {
>   as.numeric(data[,grep(field,unlist(dimnames(data)[2]))])
> }
>
> this sets rownames to "NULL", but there is still an entry at
> dimnames(matrix)[1], and so i have to access the actual column names as
> dimnames(matrix)[2]. if there were only one dimension of dimnames, this
> operator works quickly, but when i have to access [2], it is super slow.
>
> am i way off base trying to do this? i'd like to have the ability to
> talk about the columns by name, since they may not always be in the same
> place. maybe i am making it more complicated than necessary?

Well,

	dimnames( matrix )[[ 2 ]]
	colnames( matrix )

and

	unlist( dimnames( matrix )[ 2 ] )

each result in a vector of column names. The last seems to require the
most time, presumably because it first builds a one-element list and then
has to flatten it.

Are you sure you want grep()? match() seems faster and will protect you
from _regular expression madness_.

HTH,

Chuck

Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
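To make the grep() versus match() point concrete, here is a small sketch. The column names are invented for illustration; they are chosen so that one name is a substring of others and one contains a regular-expression metacharacter.

```r
# Hypothetical column names with overlapping substrings
cn <- c("price", "price.adj", "unit_price", "qty")
m <- matrix(seq_len(8), 2, dimnames = list(NULL, cn))

# Three ways to obtain the same vector of column names
a  <- dimnames(m)[[2]]
b  <- colnames(m)
c3 <- unlist(dimnames(m)[2])
all_same <- identical(a, b) && identical(a, c3)

# grep() treats the field as a regular expression and matches substrings
g_price <- grep("price", cn)   # 1 2 3: every name containing "price"
g_dot   <- grep("q.y", cn)     # 4: "." matches any single character

# match() compares whole strings exactly
m_price <- match("price", cn)  # 1
m_dot   <- match("q.y", cn)    # NA: no column is literally named "q.y"

# With grep(), the "column" lookup silently returns three columns
wide <- m[, grep("price", cn)]
```

With names like these, the grep()-based operator would hand back several columns flattened into one numeric vector, while match() either finds the exact column or returns NA.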