Jonathan Greenberg
2010-Jun-08 02:33 UTC
[R] Matrix to "database" -- best practices/efficiency?
I have a matrix of, say, M and N dimensions:

my_matrix=matrix(c(1:60),nrow=6,ncol=10)

I have two "id" vectors corresponding to the rows and columns, e.g.:

id_m=seq(10,60,by=10)
id_n=seq(100,1000,by=100)

I would like to create a "proper" database (let's say a data.frame for this example -- I'm going to be loading these into an SQLite database, but we'll leave that complication out of this discussion for now) of m x n rows and 3 columns, where the 3 columns relate to the values from m, n, and my_matrix respectively, e.g. a single row follows the form:

c(id_m[a],id_n[b],my_matrix[a,b])

I can, of course, for-loop this thing with an if-then, e.g.:

***

for (a in 1:length(id_m))
{
    for (b in 1:length(id_n))
    {
        if ((a==1) && (b==1))
        {
            my_database=c(id_m[a],id_n[b],my_matrix[a,b])
        } else
        {
            my_database=rbind(my_database,c(id_m[a],id_n[b],my_matrix[a,b]))
        }
    }
}

***

But my gut is telling me this is an incredibly inefficient way of doing this -- is there a faster approach to doing this same process? Thanks!

--j
Gabor Grothendieck
2010-Jun-08 04:18 UTC
[R] Matrix to "database" -- best practices/efficiency?
Try this:

> mm <- matrix(1:6, 3, dimnames = list(LETTERS[1:3], letters[1:2]))
> mm
  a b
A 1 4
B 2 5
C 3 6
> library(reshape)
> melt(mm)
  X1 X2 value
1  A  a     1
2  B  a     2
3  C  a     3
4  A  b     4
5  B  b     5
6  C  b     6
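For readers without the reshape package installed, the long layout that melt() prints above can be sketched in base R. This is only an illustration of the resulting shape, not how reshape implements melt(); the column names X1, X2 and value are copied from the printed output, and the variable name long is made up for this example.

```r
# Same small matrix as in the reply above
mm <- matrix(1:6, 3, dimnames = list(LETTERS[1:3], letters[1:2]))

# row(mm) and col(mm) give the row and column index of every cell,
# in the same column-major order that as.vector(mm) uses
long <- data.frame(
  X1    = rownames(mm)[as.vector(row(mm))],
  X2    = colnames(mm)[as.vector(col(mm))],
  value = as.vector(mm)
)

print(long)
```

Each matrix cell becomes one row, with the row label, the column label, and the cell value side by side.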
Jorge Ivan Velez
2010-Jun-08 04:24 UTC
[R] Matrix to "database" -- best practices/efficiency?
Hi Jonathan,

Following Gabor Grothendieck's advice, try also:

mm <- matrix(1:6, 3, dimnames = list(LETTERS[1:3], letters[1:2]))
as.data.frame.table(mm)

HTH,
Jorge
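Applied to the matrix and id vectors from the original question, the same call might look like the sketch below. It assumes the ids are attached as dimnames first; the dimension names id_m and id_n and the responseName "value" are choices made for this example. Note that as.data.frame.table() returns the two id columns as factors, so they are converted back to numbers here.

```r
my_matrix <- matrix(1:60, nrow = 6, ncol = 10)
id_m <- seq(10, 60, by = 10)
id_n <- seq(100, 1000, by = 100)

# Attach the ids as (named) dimnames; they are stored as character strings
dimnames(my_matrix) <- list(id_m = id_m, id_n = id_n)

# One row per matrix cell: id_m, id_n, value
long <- as.data.frame.table(my_matrix, responseName = "value")

# The id columns come back as factors; go through character to recover the numbers
long$id_m <- as.numeric(as.character(long$id_m))
long$id_n <- as.numeric(as.character(long$id_n))

str(long)
```

Converting with as.numeric() directly on a factor would return the internal level codes rather than the ids, which is why the as.character() step is there.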
Bill.Venables at csiro.au
2010-Jun-08 04:30 UTC
[R] Matrix to "database" -- best practices/efficiency?
I think what you are groping for is something like this:

my_matrix <- matrix(1:60, nrow = 6)
id_a <- seq(10,60,by=10)
id_b <- seq(100,1000,by=100)
my_database <- cbind(
    expand.grid(id_a = id_a, id_b = id_b),
    mat = as.vector(my_matrix)
)
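As a quick sanity check, the expand.grid() result can be compared with what the nested loop from the original question produces. This is a sketch only: loop_db, vec_sorted and same_rows are names invented here, and the loop is simplified slightly (it starts from NULL instead of using the if-then). The one difference to be aware of is row order: the loop walks the matrix row by row, while expand.grid() varies id_a fastest, matching the column-by-column order of as.vector().

```r
my_matrix <- matrix(1:60, nrow = 6)
id_a <- seq(10, 60, by = 10)
id_b <- seq(100, 1000, by = 100)

# Vectorised version from the reply above
my_database <- cbind(
  expand.grid(id_a = id_a, id_b = id_b),
  mat = as.vector(my_matrix)
)

# Loop version, kept only as a reference result
loop_db <- NULL
for (a in seq_along(id_a)) {
  for (b in seq_along(id_b)) {
    loop_db <- rbind(loop_db, c(id_a[a], id_b[b], my_matrix[a, b]))
  }
}

# Put the vectorised rows in the loop's order (by id_a, then id_b) and compare
vec_sorted <- as.matrix(my_database[order(my_database$id_a, my_database$id_b), ])
same_rows <- all(unname(vec_sorted) == loop_db)
print(same_rows)
```

If the row order matters for the later database load, the order() call above is one way to get the loop's ordering from the vectorised result.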