David Winsemius
2014-Apr-06 17:13 UTC
[R] Fwd: Recoding in R conditioned on a certain value.
On Apr 5, 2014, at 8:37 PM, Kate Ignatius wrote:

> Thanks,
>
> I ended up using this. I was curious how to get the mean of multiple
> columns by chrom (or Plan with the example below). Using this data
> for example:
>
> Plan   X       mm      mm2
>    1  95 0.323000 0.400303
>    1 275 0.341818 0.400303
>    1   2 0.618000 0.400303
>    1  75 0.320000 0.400303
>    1  13 0.399000 0.400303
>    1  20 0.400000 0.400303
>    2 219 0.393000 0.353350
>    2  50 0.060000 0.353350
>    2 213 0.390000 0.353350
>    2 204 0.496100 0.353350
>    2  19 0.393000 0.353350
>    2 201 0.388000 0.353350
>
> I've tried:
>
> pp$meanmm <- with(pp, ave(pp[,3:4], Plan, FUN = mean))

People should do some 'dimensional analysis' when they get errors. (And they should report the text of the errors.) The length and width of what is specified on the LHS should be the same as what would be produced on the RHS of the assignment. But that would not have been the first error that was encountered. You tried to pass two columns as the first argument to a function that expected one, and then you tried to assign the result to one column. This might have a better chance:

pp[ c('mean.m1' , 'mean.m2') ] <- lapply( pp[ , 3:4] , function(x) ave(x, pp$Plan, FUN=mean) )

> pp
  Plan   X       mm      mm2  mean.m1  mean.m2
1    1  95 0.323000 0.400303 0.400303 0.400303
2    1 275 0.341818 0.400303 0.400303 0.400303
3    1   2 0.618000 0.400303 0.400303 0.400303
snipped

--
David.

>
> But that doesn't seem to work.
>
> On Sat, Apr 5, 2014 at 4:18 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>>
>> On Apr 5, 2014, at 9:51 AM, Kate Ignatius wrote:
>>
>>> I'm trying to work out the average of a certain value by chromosome.
>>> I've done the following, but it doesn't seem to work:
>>>
>>> Say, I want to find average AD for chromosome 1 only and paste the
>>> value next to all the positions on chromosome 1:
>>>
>>> [sam$chrom == '1'] <-
>>> (sam$ad)/(colSums(sam[c(1:nrow(sam$chrom=='1'))],))
>>
>> It "looks" wrong to me because of the mismatching lengths of the
>> lhs and rhs but since you have not provided a test dataset that's
>> all I will say.
>>
>> The usual way to calculate a function within categorical groupings
>> that will be "re-inserted" alongside the original dataframe is to
>> use `ave`:
>>
>> sam$mmad <- with( sam, ave(ad, chrom, FUN=mean) )
>>
>>>
>>> I know this is convoluted and possibly wrong... but I would like
>>> to do this for all chromosomes.
>>>
>>> Thanks!
>> --
>> David Winsemius
>> Alameda, CA, USA
>>

David Winsemius, MD
Alameda, CA, USA
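
For anyone who wants to run the suggestion end to end, here is a minimal self-contained sketch. It rebuilds `pp` from the rows posted in the thread and applies `ave` one column at a time. The helper vector `cols` and the resulting column names `mean.mm` and `mean.mm2` are illustrative choices, not names used in the thread.

```r
# Data copied from the example posted in the thread
pp <- data.frame(
  Plan = rep(1:2, each = 6),
  X    = c(95, 275, 2, 75, 13, 20, 219, 50, 213, 204, 19, 201),
  mm   = c(0.323, 0.341818, 0.618, 0.320, 0.399, 0.400,
           0.393, 0.060, 0.390, 0.4961, 0.393, 0.388),
  mm2  = rep(c(0.400303, 0.353350), each = 6)
)

# Columns to average within each Plan (illustrative helper, not from the thread)
cols <- c("mm", "mm2")

# ave() works on one vector at a time, so call it once per column;
# the LHS names two new columns and the RHS is a list of two vectors,
# so both sides have matching dimensions
pp[paste0("mean.", cols)] <- lapply(pp[cols],
                                    function(x) ave(x, pp$Plan, FUN = mean))

head(pp, 3)
```

The same pattern should carry over to the chromosome case by swapping `pp$Plan` for the grouping column (for example `sam$chrom`) and listing the columns to be averaged in `cols`.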