gaurav kandoi
2015-Jul-20 19:29 UTC
[R] Printing row and column names of cells with specific value in a big matrix
Hi Sarah, sorry for posting in HTML.

I have two big matrices (5k*4k) with the same structure, i.e.:

,mRNA1,mRNA2,mRNA3
lncRNA1,0.395646498,0.949950035,0.761770206
lncRNA2,0.037909944,0.661258022,0.558657799
lncRNA3,0.678459646,0.652364052,0.359053653

Now, I would like to extract the names of the row,col pairs whose value is less than 0.05. In this case, I should get the output as (lncRNA2,mRNA1) and (lncRNA4,mRNA2) along with their values (0.03791 and 0.003). Since both matrices have the same structure, I would also like to retrieve the corresponding values and row,col names from the second matrix (lncRNA2,mRNA1 and lncRNA4,mRNA2 along with their values in the second matrix).

I'm using the following code:

> Pmatrix = read.table("pmatrix.csv", header=T, sep="," , row.names=1)
> sig_values <- which(Pmatrix<0.05, arr.ind=TRUE)
> sig_values
> Corr_Matrix = read.csv("corr_matrix.csv", header = T, row.names=1)
> Corr_Matrix[sig_values]

However, it only prints the row,col numbers (the sig_values command) or only the values (the Corr_Matrix[sig_values] command). How can I get the row and column names along with their values?

I've also tried printing using the following command:

> paste(rownames(Pmatrix)[sig_values[1]], colnames(Pmatrix)[sig_values[2]], sep=", ")

But it gives output like this:

[1] "lncRNA2, NA"

Sample input files available for download: https://goo.gl/xR6XDg

Regards

On Mon, Jul 20, 2015 at 2:11 PM, Sarah Goslee <sarah.goslee at gmail.com> wrote:
> Without a reproducible example, or at least a non-mangled one (please
> don't post in HTML), I'm not inclined to try it, but why not use
> sig_values to index row.names() and col.names() if you're after the
> names?
>
> Sarah
>
> On Mon, Jul 20, 2015 at 1:44 PM, gaurav kandoi <kandoigaurav at gmail.com> wrote:
>> Hi All
>>
>> I have two big matrices (5k*4k) with the same structure, i.e.:
>>
>>          mRNA1     mRNA2     mRNA3
>> lncRNA1  0.395646  0.94995   0.76177
>> lncRNA2  0.03791   0.661258  0.558658
>> lncRNA3  0.67846   0.652364  0.359054
>> lncRNA4  0.57769   0.003     0.459127
>>
>> Now, I would like to extract the names of the row,col pairs whose value is
>> less than 0.05. In this case, I should get the output as (lncRNA2,mRNA1)
>> and (lncRNA4,mRNA2) along with their values (0.03791 and 0.003). Since
>> both matrices have the same structure, I would also like to retrieve the
>> corresponding values and row,col names from the second matrix
>> (lncRNA2,mRNA1 and lncRNA4,mRNA2 along with their values in the second
>> matrix).
>>
>> I'm using the following code:
>>
>> > Pmatrix = read.table("pmatrix.csv", header=T, sep="," , row.names=1)
>> > sig_values <- which(Pmatrix<0.05, arr.ind=TRUE)
>> > sig_values
>> > Corr_Matrix = read.csv("corr_matrix.csv", header = T, row.names=1)
>> > Corr_Matrix[sig_values]
>>
>> However, it only prints the row,col numbers (the sig_values command) or only
>> the values (the Corr_Matrix[sig_values] command). How can I get the row and
>> column names along with their values?
>>
>> Regards
>>
>> --
>> Gaurav Kandoi
>
> --
> Sarah Goslee
> http://www.functionaldiversity.org

--
Gaurav Kandoi
Sarah Goslee
2015-Jul-20 19:55 UTC
[R] Printing row and column names of cells with specific value in a big matrix
Subsetting error. See below.

On Mon, Jul 20, 2015 at 3:29 PM, gaurav kandoi <kandoigaurav at gmail.com> wrote:
> Hi Sarah, sorry for posting in HTML.
>
> I have two big matrices (5k*4k) with the same structure, i.e.:
>
> ,mRNA1,mRNA2,mRNA3
> lncRNA1,0.395646498,0.949950035,0.761770206
> lncRNA2,0.037909944,0.661258022,0.558657799
> lncRNA3,0.678459646,0.652364052,0.359053653
>
> Now, I would like to extract the names of the row,col pairs whose
> value is less than 0.05. In this case, I should get the output as
> (lncRNA2,mRNA1) and (lncRNA4,mRNA2) along with their values (0.03791
> and 0.003). Since both matrices have the same structure, I would
> also like to retrieve the corresponding values and row,col names from
> the second matrix (lncRNA2,mRNA1 and lncRNA4,mRNA2 along with their
> values in the second matrix).
>
> I'm using the following code:
>
>> Pmatrix = read.table("pmatrix.csv", header=T, sep="," , row.names=1)
>> sig_values <- which(Pmatrix<0.05, arr.ind=TRUE)
>> sig_values
>> Corr_Matrix = read.csv("corr_matrix.csv", header = T, row.names=1)
>> Corr_Matrix[sig_values]
>
> However, it only prints the row,col numbers (the sig_values command) or
> only the values (the Corr_Matrix[sig_values] command). How can I get the
> row and column names along with their values?
>
> I've also tried printing using the following command:
>
>> paste(rownames(Pmatrix)[sig_values[1]], colnames(Pmatrix)[sig_values[2]], sep=", ")
>
> But it gives output like this:
>
> [1] "lncRNA2, NA"

Well, yes.

> sig_values[1]
[1] 2
> sig_values[2]
[1] 8

And there is no column 8, so no name.

paste(rownames(Pmatrix)[sig_values[,1]], colnames(Pmatrix)[sig_values[,2]], sep=", ")
[1] "lncRNA2, mRNA1" "lncRNA8, mRNA1" "lncRNA4, mRNA2" "lncRNA7, mRNA2" "lncRNA1, mRNA4"
[6] "lncRNA3, mRNA4" "lncRNA5, mRNA5"

> Sample input files available for download: https://goo.gl/xR6XDg

dput() is preferred to expecting people to download things from unknown sources.

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org
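
The difference between sig_values[2] and sig_values[,2] can be reproduced on a small matrix. The following is a minimal sketch, assuming the four rows quoted in the original post are typed in by hand rather than read from pmatrix.csv. which(..., arr.ind=TRUE) returns a two-column matrix (row, col); a single index such as sig_values[2] treats that matrix as a plain vector stored column by column, so it picks the second row number rather than the first column number.

```r
# Toy p-value matrix built from the four rows quoted in the original post
# (assumption: typed in by hand here instead of being read from pmatrix.csv).
Pmatrix <- matrix(
  c(0.395646, 0.94995,  0.76177,
    0.03791,  0.661258, 0.558658,
    0.67846,  0.652364, 0.359054,
    0.57769,  0.003,    0.459127),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("lncRNA", 1:4), paste0("mRNA", 1:3))
)

# Two-column matrix of (row, col) positions of every cell below 0.05.
sig_values <- which(Pmatrix < 0.05, arr.ind = TRUE)

# Single-index subsetting walks down the first column of sig_values:
# element 1 is the row of the first hit (2), element 2 is the row of the
# second hit (4). There is no column 4 in Pmatrix, hence the NA.
bad <- paste(rownames(Pmatrix)[sig_values[1]],
             colnames(Pmatrix)[sig_values[2]], sep = ", ")

# Column subsetting takes all row indices and all column indices.
good <- paste(rownames(Pmatrix)[sig_values[, "row"]],
              colnames(Pmatrix)[sig_values[, "col"]], sep = ", ")

print(bad)
print(good)
```

On this toy input, bad reproduces the "lncRNA2, NA" symptom and good yields one "row, col" label per significant cell.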
gaurav kandoi
2015-Jul-20 20:13 UTC
[R] Printing row and column names of cells with specific value in a big matrix
Thanks a lot Sarah. I think I've got what I wanted.

--
Gaurav Kandoi
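
For the second half of the original question (names plus values from both matrices), one possible way to collect everything in a single table is sketched below. This is not code from the thread: the correlation values are made up purely for illustration, and the column names lncRNA, mRNA, p and corr are hypothetical. With files loaded through read.table() or read.csv(), converting with as.matrix() first may be a reasonable precaution.

```r
# P-values from the rows quoted in the original post.
Pmatrix <- matrix(
  c(0.395646, 0.94995,  0.76177,
    0.03791,  0.661258, 0.558658,
    0.67846,  0.652364, 0.359054,
    0.57769,  0.003,    0.459127),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("lncRNA", 1:4), paste0("mRNA", 1:3))
)

# Hypothetical correlation matrix with the same shape and names.
# These numbers are invented for the example, not taken from the thread.
Corr_Matrix <- matrix(
  c( 0.10, -0.02,  0.05,
     0.81,  0.07, -0.11,
    -0.04,  0.09,  0.15,
     0.06, -0.92,  0.12),
  nrow = 4, byrow = TRUE,
  dimnames = dimnames(Pmatrix)
)

# (row, col) positions of the significant cells.
sig_values <- which(Pmatrix < 0.05, arr.ind = TRUE)

# A two-column index matrix selects one cell per row of sig_values,
# so the same index works on both matrices.
result <- data.frame(
  lncRNA = rownames(Pmatrix)[sig_values[, "row"]],
  mRNA   = colnames(Pmatrix)[sig_values[, "col"]],
  p      = Pmatrix[sig_values],
  corr   = Corr_Matrix[sig_values],
  stringsAsFactors = FALSE
)

print(result)
```

Because both matrices share the same dimensions and names, sig_values computed on the p-value matrix can be reused unchanged to pull the matching cells from the second matrix.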