Hi all,

I have a matrix called 'data', which looks like:

> data[1:4,1:4]
      Probe_ID Gene_Symbol M1601 M1602
1  A_23_P105862    13CDNA73  -1.6  0.16
2   A_23_P76435      15E1.2  0.18  0.59
3  A_24_P402115      15E1.2  1.63 -0.62
4  A_32_P227764      15E1.2 -0.76 -0.42
> dim(data)
[1] 23963    85

What I want to do is make a new matrix called 'data2', obtained by
subtracting the mean of each row from the matrix 'data'. There are some
NAs in the matrix and I do want to keep them.

I tried to take the mean of each row first by using:

a <- rowMeans(data[,3:85], na.rm = FALSE)

but I got:

> a <- rowMeans(data[,3:85], na.rm = FALSE)
Error in rowMeans(data[, 3:85], na.rm = FALSE) : 'x' must be numeric

Can anybody suggest how to get around this?

Thank you very much!

Allen
Hello -

ss wrote:
> Hi all,
>
> I have a matrix called 'data', which looks like:
>
> > data[1:4,1:4]
>       Probe_ID Gene_Symbol M1601 M1602
> 1  A_23_P105862    13CDNA73  -1.6  0.16
> 2   A_23_P76435      15E1.2  0.18  0.59
> 3  A_24_P402115      15E1.2  1.63 -0.62
> 4  A_32_P227764      15E1.2 -0.76 -0.42
> > dim(data)
> [1] 23963    85

Do you really have a matrix, or a data.frame? Try

  class(data)

> What I want to do is to make a new matrix called 'data2', which would be
> transformed by subtracting the mean of each row from matrix 'data'.
> There are some 'NA's in the matrix and I do want to keep it.

See ?scale

> I tried to take 'mean's from each row first by using:
>
> a<- rowMeans(data[,3:85],na.rm = FALSE)
>
> but I got:
>
> > a<- rowMeans(data[,3:85],na.rm = FALSE)
> Error in rowMeans(data[, 3:85], na.rm = FALSE) : 'x' must be numeric
>
> Can anybody suggest me how to get around this?

Figure out what you are giving the rowMeans function. If you really have
a matrix, then

  all(apply(data[,3:85], 2, class) == "numeric")

should be TRUE.

> Thank you very much!
>
> Allen
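The two hints in this reply can be tried on a small made-up data frame. The sketch below uses toy values and hypothetical column names (S1 to S3), not the poster's file. It also relies on the fact that scale() centers columns, so centering rows means transposing first and transposing back; this is one possible reading of the "See ?scale" hint, not necessarily what the replier had in mind.

```r
# Toy stand-in for the poster's object (hypothetical values and names).
toy <- data.frame(
  Probe_ID    = c("p1", "p2", "p3"),
  Gene_Symbol = c("g1", "g2", "g2"),
  S1 = c(-1.6, 0.18, 1.63),
  S2 = c(0.16, NA, -0.62),
  S3 = c(0.50, 0.59, 0.22),
  stringsAsFactors = FALSE
)

class(toy)          # "data.frame", so this is not a matrix
sapply(toy, class)  # per-column classes; the first two are "character"

# Take only the measurement columns as a numeric matrix.
m <- as.matrix(toy[, 3:5])

# scale() works column-wise, so transpose, center, and transpose back.
# Centers are computed with NAs ignored; NA cells stay NA in the result.
centered <- t(scale(t(m), center = TRUE, scale = FALSE))
centered
```

Each row of centered then has mean zero over its non-missing entries, and the NA in the second row is preserved.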
ss wrote:
> Hi all,
>
> I have a matrix called 'data', which looks like:
>
> > data[1:4,1:4]
>       Probe_ID Gene_Symbol M1601 M1602
> 1  A_23_P105862    13CDNA73  -1.6  0.16
> 2   A_23_P76435      15E1.2  0.18  0.59
> 3  A_24_P402115      15E1.2  1.63 -0.62
> 4  A_32_P227764      15E1.2 -0.76 -0.42
> > dim(data)
> [1] 23963    85
>
> What I want to do is to make a new matrix called 'data2', which would be
> transformed by subtracting the mean of each row from matrix 'data'.
> There are some 'NA's in the matrix and I do want to keep it.
>
> I tried to take 'mean's from each row first by using:
>
> a<- rowMeans(data[,3:85],na.rm = FALSE)
>
> but I got:
>
> > a<- rowMeans(data[,3:85],na.rm = FALSE)
> Error in rowMeans(data[, 3:85], na.rm = FALSE) : 'x' must be numeric

sure, at least the first two columns are not numeric

> Can anybody suggest me how to get around this?

you can compute row means based on only those columns which are numeric
as follows:

  a = rowMeans(data[sapply(data, is.numeric)])

what you do with NAs is another story.

vQ
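As for the NA question left open above, here is a minimal sketch on made-up data (the names toy, num, and data2 are illustrative only). It assumes the goal is to center each row on the mean of its non-missing values while leaving NA cells as NA, which is one reasonable reading of "I do want to keep it".

```r
# Toy data frame with one text column and one missing value (hypothetical).
toy <- data.frame(id = c("a", "b"), x = c(1, 4), y = c(3, NA),
                  stringsAsFactors = FALSE)

# Keep only the numeric columns, as suggested in the reply above.
keep <- sapply(toy, is.numeric)
num  <- as.matrix(toy[keep])

# With na.rm = FALSE (the default), any row containing an NA gets an NA mean.
mu_strict <- rowMeans(num)

# With na.rm = TRUE, the mean is taken over the non-missing entries only.
mu_lenient <- rowMeans(num, na.rm = TRUE)

# Subtract each row's mean from that row (MARGIN = 1 means rows).
# Cells that were NA in the input remain NA in the output.
data2 <- sweep(num, 1, mu_lenient, "-")
data2
```

Using mu_strict in the sweep() call instead would turn every row that contains an NA entirely into NA, which is probably not what is wanted here.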
ss wrote:
> Thank you very much, Wacek! It works very well.
> But there is a minor problem. I did the following:
>
> > data <- read.table('E-TABM-1-processed-data-1342561271_log2_with_symbols.txt',
> +                    row.names = NULL, header = TRUE, fill = TRUE)

looks like you have a data frame, not a matrix

> > dim(data)
> [1] 23963    85
> > data[1:4,1:4]
>       Probe_ID Gene_Symbol M16012391010920 M16012391010525
> 1  A_23_P105862    13CDNA73            -1.6            0.16
> 2   A_23_P76435      15E1.2            0.18            0.59
> 3  A_24_P402115      15E1.2            1.63           -0.62
> 4  A_32_P227764      15E1.2           -0.76           -0.42
> > data1 <- data[sapply(data, is.numeric)]
> > dim(data1)
> [1] 23963    82
> > data1[1:4,1:4]
>   M16012391010525 M16012391010843 M16012391010531 M16012391010921
> 1            0.16           -0.23           -1.40            0.90
> 2            0.59            0.28           -0.30            0.08
> 3           -0.62           -0.62           -0.22           -0.18
> 4           -0.42            0.01            0.28           -0.79
>
> You will notice that, after using 'data[sapply(data, is.numeric)]' and
> getting data1, the first sample in data, called 'M16012391010920', was
> missed in data1.
>
> Any further suggestions?

surely there must be an entry in column 3 that makes it non-numeric.
what does

  is.numeric(data[[3]])

say? (use double brackets or data[,3] here: data[3] is a one-column data
frame, for which is.numeric is always FALSE.) NAs should not make a
column non-numeric, unless there are only NAs there, which is not the
case here. check your data for non-numeric entries in column 3, there
can be a typo.

vQ
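One way to hunt for such an entry is sketched below on a made-up column. The vector col3 and its malformed value "1..63" are hypothetical, standing in for data[[3]]. The idea is to convert the column to numeric and look for values that were present as text but became NA in the conversion.

```r
# Hypothetical column as read.table() might return it when one entry is
# malformed; the real column would be data[[3]].
col3 <- c("-1.6", "0.18", "1..63", NA, "-0.76")

is.numeric(col3)   # FALSE: the whole column was read as text

# as.character() first, so this also works if the column is a factor
# (the default for text columns in older versions of R).
parsed <- suppressWarnings(as.numeric(as.character(col3)))

# Entries that were not missing in the input but failed to parse.
bad <- which(is.na(parsed) & !is.na(col3))
bad         # positions of the offending entries
col3[bad]   # the offending text itself
```

Genuine NAs in the input are not flagged, only the entries that could not be read as numbers.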