Thanks Ben.
I need to learn more about apply. Do you have a link or tutorial about apply? The R documentation is very short.

How can I obtain:
z <- list(Col1, Col2, Col3, Col4, ...)?

Thanks
Karim
http://bioinformatics.tn/

On Mon, Jan 19, 2015 at 8:22 PM, Ben Tupper <btupper at bigelow.org> wrote:

> Hi again,
>
> On Jan 19, 2015, at 1:53 PM, Karim Mezhoud <kmezhoud at gmail.com> wrote:
>
>> Yes. Many thanks.
>> That is my request using lapply.
>>
>> do.call(cbind, col1)
>>
>> converts col1 to a matrix but does not fill empty values with NA.
>>
>> Even for
>>
>> matrix(unlist(col1), ncol = 5, byrow = FALSE)
>>
>> How can I get col1 as a matrix, with empty values filled with NA?
>
> Perhaps best is to determine the maximum number of rows required first,
> then force each subset to have that length.
>
> # make a list of matrices, each with nCol columns and differing
> # number of rows
> nCol <- 3
> nRow <- sample(3:10, 5)
> x <- lapply(nRow, function(x, nc) {matrix(x:(x + nc*x - 1), ncol = nc, nrow = x)}, nCol)
> x
>
> # make a simple function to get a single column from a matrix
> getColumn <- function(x, colNum, len = nrow(x)) {
>     y <- x[,colNum]
>     length(y) <- len
>     y
> }
>
> # what is the maximum number of rows
> n <- max(sapply(x, nrow))
>
> # use the function to get the column from each matrix
> col1 <- lapply(x, getColumn, 1, len = n)
> col1
>
> do.call(cbind, col1)
>       [,1] [,2] [,3] [,4] [,5]
>  [1,]    3    8    5    7    9
>  [2,]    4    9    6    8   10
>  [3,]    5   10    7    9   11
>  [4,]   NA   11    8   10   12
>  [5,]   NA   12    9   11   13
>  [6,]   NA   13   NA   12   14
>  [7,]   NA   14   NA   13   15
>  [8,]   NA   15   NA   NA   16
>  [9,]   NA   NA   NA   NA   17
>
> Ben
>
>> Thanks
>> Karim
>>
>> On Mon, Jan 19, 2015 at 4:36 PM, Ben Tupper <ben.bighair at gmail.com> wrote:
>>
>>> Hi,
>>>
>>> On Jan 18, 2015, at 4:36 PM, Karim Mezhoud <kmezhoud at gmail.com> wrote:
>>>
>>>> Dear All,
>>>> I am trying to get the correlation between Diseases (80) in columns and
>>>> samples in rows (UNEQUAL) using gene expression (at least 1000, numeric).
>>>> For this I can use the CORREP package with the cor.unbalanced function.
>>>>
>>>> But before I get this final matrix I need to load and store the
>>>> expression of 1000 genes for every Disease (80). Every disease has a
>>>> different number of samples (between 50 and 500).
>>>>
>>>> Is it possible to get a cube of matrices with equal columns but unequal
>>>> rows? I think NO, and I can't use the array function.
>>>>
>>>> I am trying to get a list of matrices having the same number of columns
>>>> but different numbers of rows, as
>>>>
>>>> Cubist <- vector("list", 1)
>>>> Cubist$Expression <- vector("list", 1)
>>>>
>>>> for (i in 1:80){
>>>>
>>>>     matrix <- function(getGeneExpression[i])
>>>>     Cubist$Expression[[Disease[i]]] <- matrix
>>>>
>>>> }
>>>>
>>>> At this step I have:
>>>> length(Cubist$Expression)
>>>> #80
>>>> dim(Cubist$Expression$Disease1)
>>>> #526 1000
>>>> dim(Cubist$Expression$Disease2)
>>>> #106 1000
>>>>
>>>> names(Cubist$Expression$Disease1[4])
>>>> #ABD
>>>>
>>>> names(Cubist$Expression$Disease2[4])
>>>> #ABD
>>>>
>>>> Now I need to build the final matrices for every gene (1000) that I will
>>>> use for the CORREP function.
>>>>
>>>> Is there a way to extract directly the first column (first gene) for all
>>>> Diseases (80) from Cubist$Expression? or
>>>
>>> I don't understand most of your question, but the above seems to be
>>> straightforward. Here's a toy example:
>>>
>>> # make a list of matrices, each with nCol columns and differing
>>> # number of rows, nRow
>>> nCol <- 3
>>> nRow <- sample(3:10, 5)
>>> x <- lapply(nRow, function(x, nc) {matrix(x:(x + nc*x - 1), ncol = nc, nrow = x)}, nCol)
>>> x
>>>
>>> # make a simple function to get a single column from a matrix
>>> getColumn <- function(x, colNum) {
>>>     return(x[,colNum])
>>> }
>>>
>>> # use the function to get the column from each matrix
>>> col1 <- lapply(x, getColumn, 1)
>>> col1
>>>
>>> Does that help answer this part of your question? If not, you may need
>>> to create a very small example of your data and post it here using the
>>> head() and dput() functions.
>>>
>>> Ben
>>>
>>>> I need to build 1000 matrices with 80 columns and unequal rows?
>>>>
>>>> Cublist$Diseases <- vector("list", 1)
>>>>
>>>> for (k in 1:1000){
>>>>     for (i in 1:80){
>>>>
>>>>         Cublist$Diseases[[gene[k] ]] <- Cubist$Expression[[Diseases[i] ]][k]
>>>>     }
>>>>
>>>> }
>>>>
>>>> This double loop is time consuming... Is there a way to do this faster?
>>>>
>>>> Thanks,
>>>> karim
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>
> Ben Tupper
> Bigelow Laboratory for Ocean Sciences
> 60 Bigelow Drive, P.O. Box 380
> East Boothbay, Maine 04544
> http://www.bigelow.org
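
The padding step in the quoted example works because assigning a longer length to a vector fills the new tail with NA, whereas cbind on vectors of unequal length recycles the shorter one instead. A minimal sketch with made-up values (padTo is a hypothetical helper name, not something from the thread):

```r
# Two vectors of unequal length, standing in for columns taken from
# matrices with different numbers of rows (toy values, not thread data).
a <- c(3, 4, 5)
b <- c(8, 9, 10, 11, 12)

# Hypothetical helper: assigning a longer length pads the tail with NA.
padTo <- function(v, len) {
  length(v) <- len
  v
}

# Pad both vectors to the longest length, then bind them as columns.
n <- max(length(a), length(b))
padded <- cbind(padTo(a, n), padTo(b, n))
padded
```

Binding the padded vectors gives one column per input and NA wherever an input ran out of rows.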
Hi,

On Jan 19, 2015, at 5:17 PM, Karim Mezhoud <kmezhoud at gmail.com> wrote:

> Thanks Ben.
> I need to learn more about apply. Have you a link or tutorial about apply.
> R documentation is very short.
>
> How can obtain:
> z <- list (Col1, Col2, Col3, Col4......)?

This may not be the most efficient way, and there certainly is no error checking, but you can wrap one lapply within another as shown below. The inner lapply iterates over your list of input matrices, extracting the specified column from each list element. The outer lapply iterates over the column numbers you want to extract.

getMatrices <- function(colNums, dataList = x){
    # the number of rows required
    n <- max(sapply(dataList, nrow))
    # iterate along requested columns
    lapply(colNums, function(x, dat, n) {
        # iterate along input data list
        do.call(cbind, lapply(dat, getColumn, x, len = n))
    }, dataList, n)
}

getMatrices(c(1,3), dataList = x)

If we are lucky, one of the plyr package users might show us how to do the same with a one-liner.

There are endless resources online; here are some gems.

http://www.r-project.org/doc/bib/R-books.html
http://www.rseek.org/
http://www.burns-stat.com/documents/
http://www.r-bloggers.com/

Also, I found "Data Manipulation with R" ( http://www.r-project.org/doc/bib/R-books_bib.html#R:Spector:2008 ) helpful.

Ben

Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org
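
This is not the plyr one-liner hoped for above, but a base R variant of the same nested iteration may be useful for comparison. In this sketch sapply stands in for do.call(cbind, lapply(...)); the function name columnsAcross and the toy matrices are assumptions made for illustration only.

```r
# Toy input: three matrices with 2 columns and 2, 4 and 3 rows (made-up values).
mats <- list(matrix(1:4, ncol = 2),
             matrix(1:8, ncol = 2),
             matrix(1:6, ncol = 2))

# Hypothetical helper: returns one NA-padded matrix per requested column.
columnsAcross <- function(colNums, dataList) {
  # the number of rows every padded column must have
  n <- max(sapply(dataList, nrow))
  # outer loop: requested column numbers
  lapply(colNums, function(j) {
    # inner loop: input matrices; sapply binds the equal-length
    # results into the columns of a matrix
    sapply(dataList, function(m) {
      y <- m[, j]
      length(y) <- n   # pad with NA up to n rows
      y
    })
  })
}

res <- columnsAcross(1:2, mats)
res[[1]]
```

sapply simplifies to a matrix only when every piece has the same length (and that length is greater than one), which the padding step is there to guarantee.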
I use plyr and am learning dplyr and magrittr, but those are just syntactic
sugar. What I have been having difficulty with in this thread is the idea that
it somehow makes sense to pad vectors with NA values... because I really
don't think it does. It seems more like a hammer looking for a nail because
that is what it knows how to deal with.
You have a list of matrices with data in them, and switching from for loops to
lapply is not in itself going to fix a memory or speed problem... normally the
big improvements are in the way you allocate and use your data. Burns talks
about pre-allocating the result to speed things up, but I don't understand
the problem well enough to suggest an efficient data structure to pre-allocate.
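
For readers unfamiliar with the term, pre-allocation in its simplest form looks like the sketch below. It only illustrates the general idea on a toy list (nGenes and the squared values are placeholders) and is not a proposed data structure for Karim's problem.

```r
nGenes <- 5   # stand-in for the 1000 genes mentioned in the thread

# Growing: the result list is extended on every iteration.
grown <- list()
for (k in seq_len(nGenes)) {
  grown[[k]] <- k^2
}

# Pre-allocated: the list is created once at its final length, then filled.
filled <- vector("list", nGenes)
for (k in seq_len(nGenes)) {
  filled[[k]] <- k^2
}

# Both approaches produce the same result; only the allocation differs.
identical(grown, filled)
```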
I suggest that Karim read and adhere to the Posting Guide (particularly the bits
about giving a reproducible example and posting in plain text so it doesn't
get scrambled) if help with optimizing is desired. The discussion at [1] might
clarify what "reproducible" means.
I will also mention that efficient algorithms for this subject area are
frequently available in the Bioconductor project, so I hope you are not
re-inventing the wheel and have already reviewed their tools.
[1]
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
---------------------------------------------------------------------------
Jeff Newmiller
DCN: <jdnewmil at dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.
Thanks Ben, Jeff and Roy,
Here is an example of my data
Disease <- list()
Diseases <- list()
ListMatByGene <- list()

# three toy diseases with 6, 7 and 8 samples (rows) and 5 genes (columns)
for (i in 1:3) {
    Disease[[i]] <- matrix(sample(-30:30, 25 + (5 * i)), 5 + i)
    rownames(Disease[[i]]) <- paste0("Sample", 1:(5 + i))
    colnames(Disease[[i]]) <- paste0("Gene", 1:5)
    D <- paste0("Disease", i)
    Diseases[[D]] <- Disease[[i]]
}

# get a single column from a matrix, padded with NA to length len
getColumn <- function(x, colNum, len = nrow(x)) {
    y <- x[, colNum]
    length(y) <- len
    y
}

getMatrices <- function(colNums, dataList = x) {
    # the number of rows required
    n <- max(sapply(dataList, nrow))
    # iterate along requested columns
    lapply(colNums, function(x, dat, n) {
        # iterate along input data list
        do.call(cbind, lapply(dat, getColumn, x, len = n))
    }, dataList, n)
}

G <- paste0("Gene", 1:5)
ListMatByGene[G] <- getMatrices(seq_len(ncol(Diseases[[1]])), dataList = Diseases)

## get Disease correlation by gene
## (use = "na" is partially matched to "na.or.complete")
DiseaseCorrelation <- lapply(ListMatByGene, function(x)
    cor(x, use = "na.or.complete", method = "spearman"))

## convert the list of matrices to an array
ArrayDiseaseCor <- array(unlist(DiseaseCorrelation),
                         dim = c(nrow(DiseaseCorrelation[[1]]),
                                 ncol(DiseaseCorrelation[[1]]),
                                 length(DiseaseCorrelation)))
dimnames(ArrayDiseaseCor) <- list(names(Diseases), names(Diseases),
                                  colnames(Diseases[[1]]))

## keep only the correlations with absolute value above 0.5
FilterDiseaseCor <- apply(ArrayDiseaseCor, MARGIN = c(1, 2),
                          function(x) x[abs(x) > 0.5])
FilterDiseaseCor

         Disease1   Disease2  Disease3
Disease1 Numeric,5  Numeric,2 -0.9428571
Disease2 Numeric,2  Numeric,5 Numeric,2
Disease3 -0.9428571 Numeric,2 Numeric,5
The question is: how can I get a table like this:

D1        D2        Cor    Gene
Disease1  Disease2  -0.94  Gene2
Disease1  Disease2   0.78  Gene4
Disease3  Disease2   0.5   Gene5
...

and a matrix counting the genes above the threshold for each pair of diseases:

         Disease1 Disease2 Disease3
Disease1        5        1        0
Disease2        1        5        3
Disease3        0        3        5
Thanks
Karim
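
One possible route to both tables, offered as a sketch rather than a tested answer for the real data: index the correlation array directly with which(..., arr.ind = TRUE) instead of filtering it with apply. The array arr below is a small invented stand-in for ArrayDiseaseCor (3 diseases, 2 genes, made-up correlations), and the names keep, idx, long and counts are illustrative.

```r
# Toy stand-in for ArrayDiseaseCor: disease x disease x gene,
# filled with invented correlation values.
dis <- paste0("Disease", 1:3)
gen <- paste0("Gene", 1:2)
arr <- array(0, dim = c(3, 3, 2), dimnames = list(dis, dis, gen))
arr[, , "Gene1"] <- matrix(c( 1.0,  0.8,  0.1,
                              0.8,  1.0, -0.6,
                              0.1, -0.6,  1.0), nrow = 3)
arr[, , "Gene2"] <- matrix(c( 1.0,  0.2, -0.9,
                              0.2,  1.0,  0.3,
                             -0.9,  0.3,  1.0), nrow = 3)

# TRUE wherever a correlation is present and strong enough.
keep <- !is.na(arr) & abs(arr) > 0.5

# One row of indices (row, column, slice) per retained correlation.
idx <- which(keep, arr.ind = TRUE)

# Long table: one row per disease pair and gene.
long <- data.frame(D1   = dis[idx[, 1]],
                   D2   = dis[idx[, 2]],
                   Cor  = arr[idx],
                   Gene = gen[idx[, 3]],
                   stringsAsFactors = FALSE)
# Drop the diagonal and keep each unordered pair only once.
long <- long[idx[, 1] < idx[, 2], ]

# Count matrix: number of genes above the threshold for each disease pair.
counts <- apply(keep, c(1, 2), sum)

long
counts
```

With this layout the diagonal of counts equals the number of genes, as in the desired matrix above, because every disease correlates perfectly with itself.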