On Jun 15, 2013, at 2:38 PM, Fabrice Tourre wrote:
> Dear expert,
>
> How can I make follow codematrix function more faster?
>
> top20.dat <- top20.dat[,7:length(top20.dat[1,])]
> top40.dat <-
>
read.table("top40snps.ped",header=F,sep="\t",stringsAsFactors=F)
Did you attach a file with a non-'.txt' extension?
> row.names(top40.dat) <- top40.dat[,1]
> top40.dat <- top40.dat[,7:length(top40.dat[1,])]
> codematrix <- function(dat)
> {
> new.dat <- dat
> for(col in 1:length(dat[1,]))
> {
I'm guessing that using `ifelse` would be much faster than going column by
column and and then row by painful row through this testing with nested
`if(cond){conseq}else{alter}` . You should gain efficiency by setting up the
results of the `strsplit` operation on a full column at a time. Build a function
that would work one column at a time and then lapply it to the dataframe.
> tbl <- table(dat[,col])
> max.allel <- names(which(tbl==max(table(dat[,col]))))
> for(row in 1:length(dat[,1]))
> {
> if(dat[row,col]=="0 0")
> {
> new.dat[row,col]=NA
> }else{
> if(dat[row,col]==max.allel) {
> new.dat[row,col]=0
> }else{
> allele <- unlist(strsplit(
> as.character(dat[row,col])," "))
> if(allele[1]==allele[2]){
> new.dat[row,col]=2
> }else{
> new.dat[row,col]=1
You could leave the "==max.allelle" test on the outer of nested ifelse
operations to "overwrite" the resutls of the testing of the two
split-bits. But I would make it a %in%-test so that it won't fail when mor
than one maximum occur.
Perhaps (untested and a lot of guesswork):
testsplitfunc <- function(col){
temptbl <- table(col)
tempspl <- strsplit(as.character(col) , split=" ")
allele <-cbind( sapply(temp, "[", 1),
sapply(temp, "[", 2) )
res <- ifelse ( col %in% names(temptbl)[ which(tbl==max(temptbl))] ,
0,
ifelse( allele[,1]==allele[,2], 2, 1) )
is.na(res) <- col=="0 0"
}
code.top20 <- do.call(cbind, lapply(top20.dat, testsplitfunc) )
> }
> }
> }
> }
> #})
> cat(paste(col," ",sep=""))
> }
> return(new.dat)
> }
> code.top20 <- codematrix(top20.dat)
In the absence of a problem description I will leave the details
unaddressed.>
> [[alternative HTML version deleted]]
David Winsemius
Alameda, CA, USA