arun
2013-Nov-12 03:40 UTC
[R] Apply function to every 20 rows between pairs of columns in a matrix
Hi,
Maybe this is what you wanted.

# For each sample (row name), stack that sample's row from every bin
res2 <- lapply(row.names(res[[1]]), function(x)
    do.call(rbind, lapply(res, function(y) y[match(x, row.names(y)), ])))
length(res2)
#[1] 48
dim(res2[[1]])
#[1] 2325    8

A.K.

On Monday, November 11, 2013 10:20 PM, Yu-yu Ren <renyangsu at gmail.com> wrote:

Thank you so much for that script, it works great. One additional request: how can I go about binding each of the 2325 matrices for each sample, resulting in 48 matrices of 8 columns by 2325 rows?

On Mon, Nov 11, 2013 at 10:02 PM, arun <smartpink111 at yahoo.com> wrote:
>
>Hi,
>I already sent a reply to R-help. I am not sure about the "2342".
>
>set.seed(25)
>dat1 <- as.data.frame(matrix(sample(c("A","T","G","C"),46482*56,replace=TRUE),ncol=56,nrow=46482),stringsAsFactors=FALSE)
>lst1 <- split(dat1,as.character(gl(nrow(dat1),20,nrow(dat1))))
>res <- lapply(lst1,function(x) sapply(x[,1:8],function(y) sapply(x[,9:56], function(z) sum(y==z)/20)))
>
>length(res)
>#[1] 2325  ### check here
>dim(res[[1]])
>#[1] 48  8
>
>A.K.
>
>
>On Monday, November 11, 2013 10:00 PM, Yu-yu Ren <renyangsu at gmail.com> wrote:
>
>Thank you, I have uploaded several example files, with intermediate outputs of what I have done and the logic flow.
>
>
>On Mon, Nov 11, 2013 at 9:37 PM, <smartpink111 at yahoo.com> wrote:
>
>>Hi,
>>
>>Comparing the first 8 columns separately with 9-56 columns is not clear. Also, please provide a reproducible example (using ?dput) for others to work on.
>>
>>A.K.
>><quote author='Renyulb28'>
>>Hi all, I have a set of genetic SNP data that looks like
>>
>>Founder1 Founder2 Founder3 Founder4 Founder5 Founder6 Founder7 Founder8 Sample1 Sample2 Sample3 Sample...
>>A A A T T T T T A T A T
>>A A A T T T T T A T A T
>>A A A T T T T T A T A T
>>A A A T T T T T A T A T
>>A A A T T T T T A T A T
>>A A A T T T T T A T A T
>>A A A T T T T T A T A T
>>A A A T T T T T A T A T
>>A A A T T T T T A T A T
>>A A A T T T T T A T A T
>>A A A T T T T T A T A T
>>A A A T T T T T A T A T
>>
>>The size of the matrix is 56 columns by 46482 rows. I need to first bin the
>>matrix by every 20 rows, then compare each of the first 8 columns (founders)
>>to each of columns 9-56, and divide the total number of matching
>>letters/alleles by the total number of rows (20). Ultimately I need 48
>>matrices of 8 columns by 2342 rows, which are essentially similarity
>>matrices. I have tried to extract each pair separately by something like
>>
>>"length(cbind(odd[,9],odd[,1])[cbind(odd[,9],cbind(odd[,9],odd[,1])[,1])[,1]=="T"
>>& cbind(odd[,9],odd[,1])[,2]=="T",])/nrow(cbind(odd[,9],odd[,1]))"
>>
>>but this is nowhere near efficient, and I do not know of a faster way of
>>applying the function to every 20 rows and across multiple pairs.
>>
>>In the example given above, if the rows were all identical as shown across
>>20 rows, then the first row of the matrix for Sample1 would be
>>
>>1 1 1 0 0 0 0 0
>>
>></quote>
>>Quoted from:
>>http://r.789695.n4.nabble.com/Apply-function-to-every-20-rows-between-pairs-of-columns-in-a-matrix-tp4680272.html
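
The two steps in the thread above (a similarity matrix per 20-row bin, then regrouping into one matrix per sample) can be tried on a small toy data set. This is only a sketch under assumptions that are not in the original posts: the data are random, there are 3 sample columns instead of 48, and the names bin_id, sim_by_bin and sim_by_sample are made up for illustration. It also departs from the code above in two ways. It bins on a numeric row index rather than as.character(gl(...)), since splitting on a character key appears to return the bins in alphabetical label order ("1", "10", "100", ...) rather than row order. And it uses mean() instead of sum()/20, because 46482 is not a multiple of 20, so the last bin of the real data holds only 2 rows.

```r
# Toy data: 8 founder columns + 3 sample columns, 45 rows (bins of 20, 20, 5)
set.seed(1)
n_rows   <- 45
bin_size <- 20
dat <- as.data.frame(
  matrix(sample(c("A", "T", "G", "C"), n_rows * 11, replace = TRUE), ncol = 11),
  stringsAsFactors = FALSE
)
names(dat) <- c(paste0("Founder", 1:8), paste0("Sample", 1:3))

# Integer bin index per row (0, 0, ..., 1, 1, ..., 2); a numeric key is
# sorted numerically by split(), so the bins stay in row order
bin_id <- (seq_len(n_rows) - 1) %/% bin_size
bins   <- split(dat, bin_id)

# Per bin: samples x founders matrix of match proportions.
# mean() divides by the actual number of rows in the bin.
sim_by_bin <- lapply(bins, function(b)
  sapply(b[, 1:8], function(f) sapply(b[, 9:11], function(s) mean(f == s))))

# Regroup: one bins x founders matrix per sample
sim_by_sample <- lapply(rownames(sim_by_bin[[1]]), function(s)
  do.call(rbind, lapply(sim_by_bin, function(m) m[s, ])))
names(sim_by_sample) <- rownames(sim_by_bin[[1]])

length(sim_by_sample)
#[1] 3
dim(sim_by_sample[["Sample1"]])
#[1] 3 8
```

If the fixed divisor of 20 is really what is wanted for the short last bin, replacing mean(f == s) with sum(f == s)/bin_size gives the same calculation as in the thread.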