Soheila Khodakarim
2012-Feb-17 11:44 UTC
[R] (subscript) logical subscript too long in using apply
Dear ALL

I have this function in R:

func_LN <- function(data){

  med_ge <- matrix(c(rep(NA,nrow(data)*ncol(data))), nrow = nrow(data),
                   ncol=ncol(data), byrow=TRUE)
  T <- matrix(c(rep(NA,length(n)*ncol(data))), nrow = length(n),
              ncol=ncol(data), byrow=TRUE)
  Tdiff<- matrix(c(rep(NA,length(n)*ncol(data))), nrow = length(n),
                 ncol=ncol(data), byrow=TRUE)
  T1<- c(rep(NA,ncol(data)))
  T0<- c(rep(NA,ncol(data)))
  cov_rank<-matrix(c(rep(NA,ncol(data)*ncol(data))), nrow = ncol(data),
                   ncol = ncol(data) , byrow=TRUE)

  med <- c(rep(NA,ncol(data)))
  mean_ge <- c(rep(NA,ncol(data)))
  n<-c(NA,2)
  if (ncol(data)>1){
    for(m_j in 1:ncol(data)){
      med[m_j]<-median(data[,m_j])}

    for(m_j in 1:ncol(data))
      for(m_i in 1:nrow(data))
      {
        if(data[m_i,m_j]>med[m_j])
          med_ge[m_i,m_j]=0
        else
          med_ge[m_i,m_j]=1
      }

    y=c(1,1,1,1,1,1,0,0,0,0)

    n<-c(sum(y == 1),sum(y==0))
    touse3 <- y==1

    T1<- apply(med_ge[touse3,], 2, mean)
    T0<- apply(med_ge[!touse3,], 2, mean)

    T=rbind(T1,T0)
    Tbar=colMeans(T)
    Tdiff=T-Tbar
    cov_rank=cov(med_ge)
    inv_cov_rank=ginv(cov_rank)

    LN=0
    for(m_i in 1:length(n)) {
      LN <- LN+((Tdiff[m_i,]%*%inv_cov_rank)%*%t(Tdiff)[,m_i])*n[m_i]

    }
    return(LN)
  }}

func_LN(data)

Now, I want to try this function on subgroups of data.
So I used "apply":

result <- apply(gs , 1 , function(z) func_LN(data[which(z==1),]))

but I saw this error:

Error in apply(med_ge[touse3, ], 2, mean) :
  (subscript) logical subscript too long

I would appreciate your help.

PS: the elements of gs are 1 or 0.
dim(data)=24*2665
dim(gs)=107*2665

Soheila
Petr Savicky
2012-Feb-17 12:10 UTC
[R] (subscript) logical subscript too long in using apply
On Fri, Feb 17, 2012 at 12:44:44PM +0100, Soheila Khodakarim wrote:
> Dear ALL
> I have this function in R:
>
> func_LN <- function(data){
[...]
> func_LN(data)
>
> Now, I want to try this function on subgroups of data.
> So I used "apply"
> result <- apply(gs , 1 , function(z) func_LN(data[which(z==1),]))
>
> but I saw this error:
>
> Error in apply(med_ge[touse3, ], 2, mean) :
>   (subscript) logical subscript too long
>
> I will appreciate if you help me.
>
> PS:the elements of gs are 1 0r 0.
> dim(data)=24*2665
> dim(gs)=107*2665

Hi.

Without a reproducible example, it is hard to determine the problem.
You can try

  options(error=utils::recover)

to get more information on the values of the variables when the error
occurs.

However, I am not sure why you use data[which(z==1),] and not
data[,which(z==1)].
The reason is that the call "apply(gs , 1 , func)" applies "func" to the
rows of "gs". These rows have length 2665, which is equal to the number of
columns of "data". So, I would expect "z" to be used to select columns,
not rows, of "data". Can you comment on this?

Petr Savicky.
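To illustrate the row-versus-column point above, here is a minimal sketch on toy objects. The shapes (4 x 6 for "data", 3 x 6 for "gs") and the helper subgroup_dim() are assumptions made only for illustration; subgroup_dim() is a hypothetical stand-in for func_LN and simply reports the dimensions of the subgroup it receives.

```r
# Toy stand-ins for the poster's objects (assumed shapes, far smaller
# than the 24 x 2665 and 107 x 2665 matrices in the thread).
set.seed(42)
data <- matrix(rnorm(4 * 6), nrow = 4, ncol = 6)
gs <- rbind(c(1, 1, 0, 0, 1, 0),
            c(0, 1, 1, 1, 0, 0),
            c(1, 0, 0, 1, 1, 1))

# Hypothetical placeholder for func_LN: return the subgroup's dimensions.
subgroup_dim <- function(d) dim(d)

# Each row z of gs has length ncol(data), so it is used to pick columns.
# drop = FALSE keeps a matrix even if only one column is selected.
res <- apply(gs, 1, function(z) {
  subgroup_dim(data[, which(z == 1), drop = FALSE])
})
print(res)  # one column per row of gs: (rows, columns) of each subgroup

# Using the same indices on the rows fails here, because column
# positions can exceed nrow(data).
bad <- tryCatch(data[which(gs[1, ] == 1), ], error = function(e) e)
print(inherits(bad, "error"))
```

With column selection, every subgroup keeps all rows of "data", so a row index such as touse3 inside the function still matches the number of rows.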
Petr PIKAL
2012-Feb-17 12:24 UTC
[R] (subscript) logical subscript too long in using apply
Hi

apply probably does not understand your function. I do not want to go too
deeply into it, but I noticed a few issues in it. See inline.

> Dear ALL
> I have this function in R:
>
> func_LN <- function(data){
>
> med_ge <- matrix(c(rep(NA,nrow(data)*ncol(data))), nrow = nrow(data),
> ncol=ncol(data), byrow=TRUE)
> T <- matrix(c(rep(NA,length(n)*ncol(data))), nrow = length(n),
> ncol=ncol(data), byrow=TRUE)
> Tdiff<- matrix(c(rep(NA,length(n)*ncol(data))), nrow = length(n),
> ncol=ncol(data), byrow=TRUE)
> T1<- c(rep(NA,ncol(data)))
> T0<- c(rep(NA,ncol(data)))
> cov_rank<-matrix(c(rep(NA,ncol(data)*ncol(data))), nrow = ncol(data),
> ncol = ncol(data) , byrow=TRUE)
>
> med <- c(rep(NA,ncol(data)))
> mean_ge <- c(rep(NA,ncol(data)))
> n<-c(NA,2)
> if (ncol(data)>1){
> for(m_j in 1:ncol(data)){
> med[m_j]<-median(data[,m_j])}

This should be the same as

  med <- apply(data, 2, median)

> for(m_j in 1:ncol(data))
> for(m_i in 1:nrow(data))
> {
> if(data[m_i,m_j]>med[m_j])
> med_ge[m_i,m_j]=0
> else
> med_ge[m_i,m_j]=1
> }

AFAIK this should be the same as

  med_ge <- (sweep(data, 2, med) <= 0) * 1

> y=c(1,1,1,1,1,1,0,0,0,0)
>
> n<-c(sum(y == 1),sum(y==0))
> touse3 <- y==1
>
> T1<- apply(med_ge[touse3,], 2, mean)
> T0<- apply(med_ge[!touse3,], 2, mean)
>
> T=rbind(T1,T0)
> Tbar=colMeans(T)
> Tdiff=T-Tbar
> cov_rank=cov(med_ge)
> inv_cov_rank=ginv(cov_rank)
>
> LN=0
> for(m_i in 1:length(n)) {
> LN <- LN+((Tdiff[m_i,]%*%inv_cov_rank)%*%t(Tdiff)[,m_i])*n[m_i]
>
> }
> return(LN)
> }}
>
> func_LN(data)
>
> Now, I want to try this function on subgroups of data.
> So I used "apply"
> result <- apply(gs , 1 , function(z) func_LN(data[which(z==1),]))
>
> but I saw this error:
>
> Error in apply(med_ge[touse3, ], 2, mean) :
>   (subscript) logical subscript too long

The rest is quite complicated for me, so I did not dig into it. The obvious
source of the error is

  apply(med_ge[touse3, ], 2, mean)

which tells you that touse3 is longer than the number of rows of med_ge.
The error message probably could not be more precise.

Regards
Petr

> I will appreciate if you help me.
>
> PS:the elements of gs are 1 0r 0.
> dim(data)=24*2665
> dim(gs)=107*2665
>
> Soheila
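The two points in this reply can be checked on a small example. The sketch below is only an illustration under assumed toy dimensions (10 x 4 instead of 24 x 2665): it compares the loop version from the original post with the vectorized apply/sweep version, and then reproduces the error by indexing a 5-row matrix with a logical vector of length 10. The names med_loop, med_ge_loop, small, same_result and msg are introduced here for the example and do not appear in the thread.

```r
set.seed(1)
# Toy data (assumed shape; the data in the thread is 24 x 2665).
data <- matrix(rnorm(10 * 4), nrow = 10, ncol = 4)

# Loop version, following the logic of the original post.
med_loop <- rep(NA_real_, ncol(data))
med_ge_loop <- matrix(NA_real_, nrow = nrow(data), ncol = ncol(data))
for (j in seq_len(ncol(data))) {
  med_loop[j] <- median(data[, j])
  for (i in seq_len(nrow(data))) {
    med_ge_loop[i, j] <- if (data[i, j] > med_loop[j]) 0 else 1
  }
}

# Vectorized version suggested in the reply.
med <- apply(data, 2, median)             # column medians
med_ge <- (sweep(data, 2, med) <= 0) * 1  # 1 where value <= column median

same_result <- isTRUE(all.equal(med, med_loop)) && all(med_ge == med_ge_loop)
print(same_result)

# The reported error: a logical row index longer than the number of rows.
touse3 <- c(1, 1, 1, 1, 1, 1, 0, 0, 0, 0) == 1  # length 10, as in the post
small <- med_ge[1:5, , drop = FALSE]            # a subgroup with only 5 rows
msg <- tryCatch(
  {
    small[touse3, ]
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

Note that R recycles a logical index that is shorter than the dimension it indexes, so the same length-10 touse3 applied to a 24-row matrix would not raise this error; it would be recycled silently, which may not be what is intended either.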