hello,

i have been trying to convert my data frames to matrices in the hopes of
speeding up some of my more complicated scripts.

to assist with this, i am trying to create a "matrix column operator"
like $:

"%$%" = function(data,field) {
  as.numeric(data[,grep(field,unlist(dimnames(data)[2]))])
}

the idea here is that you can use a matrix like a dataframe:

matrix%$%"fieldname"

i am getting this matrix by converting a dataframe:

df = read.csv("data.csv")
matrix = data.matrix(df,rownames.force=FALSE)

this sets rownames to "NULL", but there is still an entry at
dimnames(matrix)[1], and so i have to access the actual column names as
dimnames(matrix)[2]. if there were only one dimension of dimnames, this
operator works quickly, but when i have to access [2], it is super slow.

am i way off base trying to do this? i'd like to have the ability to
talk about the columns by name, since they may not always be in the same
place. maybe i am making it more complicated than necessary?

thanks!

dan
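
For readers following along, here is a minimal sketch of the dimnames layout described in the post above. The small data frame is a hypothetical stand-in for read.csv("data.csv"), and the column names price and qty are made up for illustration.

```r
# Hypothetical stand-in for df = read.csv("data.csv")
df <- data.frame(price = c(1.5, 2.5, 3.5), qty = c(10L, 20L, 30L))
m <- data.matrix(df, rownames.force = FALSE)

rownames(m)        # NULL: no row names were kept
dn <- dimnames(m)  # a list with one slot per dimension: rows, then columns
length(dn)         # 2, even though the row slot is NULL

dn[2]              # single brackets: still a list (of length 1)
dn[[2]]            # double brackets: the character vector of column names
```

The NULL row slot is why the column names always sit in the second position of dimnames, and why [2] alone needs an extra unlist() before it can be searched.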
Maybe you can time this and see if this is any better:

"%$%" = function(data,field) data[, pmatch(field,dimnames(data)[[2L]])]

# test
mat <- matrix(1:24, 6, dimnames = list(NULL, letters[1:4]))
mat%$%"c"
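
A rough sketch of how the suggestion above might be tried out, assuming the same test matrix. The cols vector and the loop count are arbitrary choices for illustration, and timings will vary by machine, so none are shown.

```r
# The operator from the reply above, unchanged in behavior
"%$%" <- function(data, field) data[, pmatch(field, dimnames(data)[[2L]])]

mat <- matrix(1:24, 6, dimnames = list(NULL, letters[1:4]))
res <- mat %$% "c"   # third column, returned as a plain vector

# How pmatch() resolves names (hypothetical column names)
cols <- c("price", "price_usd", "qty")
pmatch("qty", cols)     # exact match
pmatch("q", cols)       # unique partial match
pmatch("price", cols)   # the exact match is preferred over "price_usd"
pmatch("pri", cols)     # ambiguous partial match gives NA

# Crude timing harness; compare against the grep()-based version
system.time(for (i in 1:10000) mat %$% "c")
```

Note that an ambiguous or missing name yields NA from pmatch(), so a production version would probably want to check for that before indexing.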
Why do you want to do this? X[,"cname"] gives the column named "cname" of
matrix X (as a vector, unless drop=FALSE). The $ operator on data frames is
essentially equivalent to this, anyway (see ? Extract).
-- Bert Gunter
Genentech
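
As an illustration of the point above, a small sketch with a made-up matrix X whose first column happens to be called "cname":

```r
# Hypothetical matrix with named columns and no row names
X <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("cname", "other")))

v  <- X[, "cname"]                # plain vector, dimensions dropped
m1 <- X[, "cname", drop = FALSE]  # stays a 3 x 1 matrix, column name kept

# The data frame equivalent returns the same values
df <- as.data.frame(X)
same <- identical(df$cname, v)
```

No helper operator is needed for this; indexing by column name works on any matrix that has column names set.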
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Dan Dube
Sent: Friday, January 04, 2008 11:29 AM
To: r-help at r-project.org
Subject: [R] slow access to matrix dimnames
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
On Fri, 4 Jan 2008, Dan Dube wrote:

> am i way off base trying to do this? i'd like to have the ability to
> talk about the columns by name, since they may not always be in the same
> place. maybe i am making it more complicated than necessary?

Well,

	dimnames( matrix )[[ 2 ]]

	colnames( matrix )

and

	unlist( dimnames( matrix )[ 2 ] )

each result in a vector of column names. The last seems to require the
most time for what seem obvious reasons.

Are you sure you want grep()? match() seems faster and will protect you
from _regular expression madness_.

HTH,

Chuck

Charles C. Berry                            (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu
UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/
La Jolla, San Diego 92093-0901
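
To make the two points above concrete, here is a short sketch. The column names "a.b" and "axb" are invented to show how a regular expression can match more than intended; they are not from the original data.

```r
# Hypothetical matrix whose column names contain a regex metacharacter
m <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("a.b", "axb", "c")))

# Three ways to get the column names; all give the same character vector
n1 <- dimnames(m)[[2]]
n2 <- colnames(m)
n3 <- unlist(dimnames(m)[2])   # extra unlist() call on a one-element list

# grep() treats the name as a pattern, so "." matches any character
hits_regex <- grep("a.b", n1)                # matches "a.b" and "axb"
hits_fixed <- grep("a.b", n1, fixed = TRUE)  # literal substring search

# match() looks up the whole name exactly
idx  <- match("a.b", n1)
none <- match("a", n1)   # no partial matching, so NA
```

If exact lookup is all that is needed, match() (or simply indexing with the name) avoids both the regex surprises and the extra unlist() step.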