Hello,
I have a data file with 36 columns. I want to run loess on each column,
divide the original data by the fitted data, and then divide the columns
that end in A and B by those that end in C. However, something is wrong in
my first step and I am completely stuck on the third. Could someone help me
please?
Here's a snippet of the data file:
Order  Target  GC           AA_001_A  AA_001_B  AA_001_C
1      a       0.584507042  422.94    302.32    412.19
2      b       0.630434783  193.44    182.88    224.96
3      c       0.649350649  132.67    116       136.12
4      d       0.635359116  306.78    203.68    306.98
5      e       0.609271523  276.32    214.73    307.03
6      f       0.626373626  333.93    249.28    421.97
7      g       0.618834081  216.22    200.94    236.27
All columns have 3722 rows. The columns repeat in that pattern out to
AA_012_C.
This is the script that I've tried:
gc<-read.delim("AA1_3_GC.txt")
gc2<-gc[,-c(1:2)]
res=cbind()
for(i in colnames(gc[,-1])){
  temp<-loess(i~GC,gc2)
  temp2<-predict(temp)
  if (length(res)==0){
    res=temp2
  }else{
    res=cbind(res,temp2)
  }
}
But I keep getting this error:
Error in model.frame.default(formula = i ~ GC, data = gc2) :
variable lengths differ (found for 'GC')
If I manually type in a name, then it works just fine, but obviously I don't
want to do that for 36 columns. (Or 72 for the next project.) Where am I
going wrong and how do I fix this?
For the second step (dividing columns after I divide gc2/res), I really am
unsure of where to even start. I would guess that it would be something
along the lines of
for(i in colnames(gc[,-1])){
res[i]/res[i+2]}
But that would only get me A/C, then B/D, etc. I've spent the last hour
searching for this, but I'm clearly not using the right terms. Could
someone even point me in the right direction please?
Any help/suggestions you can give for either/both parts would be really
appreciated.
Thanks,
Rose
--
View this message in context:
http://r.789695.n4.nabble.com/Looping-column-names-tp4333870p4333870.html
Sent from the R help mailing list archive at Nabble.com.
Hello,

> But I keep getting this error:
> Error in model.frame.default(formula = i ~ GC, data = gc2) :
> variable lengths differ (found for 'GC')

Simple: you are using a variable's name, not the variable itself. Inside the
loop, i is a single character string such as "AA_001_A", so the formula
i ~ GC tries to model a length-1 string against all 3722 values of GC.
Your code corrected should be

res <- NULL
for(i in colnames(gc2[,-1])){
  temp <- loess(gc2[, i]~GC,gc2) # fit the vector, NOT its name
  temp2 <- predict(temp)
  res <- cbind(res, temp2)
}
colnames(res) <- colnames(gc2[,-1])
res

But even better, without the loop,

apply(gc2[,-1], 2, function(x) predict(loess(x~GC, data=gc2)))

> For the second step (dividing columns after I divide gc2/res), I really am
> unsure of where to even start. I would guess that it would be something
> along the lines of
> for(i in colnames(gc[,-1])){
> res[i]/res[i+2]}
> But that would only get me A/C, then B/D, etc.

Create indexes on the columns:

res2 <- gc2[, -1]/res
n <- ncol(res2)
ainx <- seq(1, n, 3)
binx <- seq(2, n, 3)
cinx <- seq(3, n, 3)
res2[, ainx]/res2[, cinx] # A columns divided by C columns
res2[, binx]/res2[, cinx] # B columns divided by C columns

One final note. You've named your data.frame 'gc' but since this is the name
of a function in R, it's a bad choice. I've renamed it 'gc1'.

Hope this helps,
Rui Barradas
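For reference, the three steps discussed in this thread can be sketched end to end as below. This is only an illustration under stated assumptions: the data frame `dat` is synthetic (the real file AA1_3_GC.txt is not available here), the object names `samples`, `fitted_vals` and `normalised` are made up for the example, and the A/B/C columns are paired by name with `grep()`/`sub()` rather than by position, which is a swapped-in alternative to the `seq()` indexes above. It also builds each formula from the column name with `reformulate()` instead of indexing the vector.

```r
# Synthetic stand-in for the real file (all values are made up).
set.seed(1)
n <- 50
dat <- data.frame(
  Order  = seq_len(n),
  Target = paste0("t", seq_len(n)),
  GC     = runif(n, 0.55, 0.65)
)
for (nm in c("AA_001_A", "AA_001_B", "AA_001_C",
             "AA_002_A", "AA_002_B", "AA_002_C")) {
  dat[[nm]] <- 200 + 500 * (dat$GC - 0.6) + rnorm(n, sd = 20)
}

# Sample columns: everything except the identifier columns and GC.
samples <- setdiff(names(dat), c("Order", "Target", "GC"))

# Step 1: loess fit of every sample column against GC.
# reformulate() turns the column NAME into a real formula, e.g. AA_001_A ~ GC.
fitted_vals <- sapply(samples, function(nm) {
  predict(loess(reformulate("GC", response = nm), data = dat))
})

# Step 2: original values divided by the loess-fitted values.
normalised <- as.matrix(dat[samples]) / fitted_vals

# Step 3: pair columns by name, so the result does not depend on column order.
c_cols <- grep("_C$", samples, value = TRUE)
a_cols <- sub("_C$", "_A", c_cols)
b_cols <- sub("_C$", "_B", c_cols)
stopifnot(all(c(a_cols, b_cols) %in% samples))  # every C has its A and B

a_over_c <- normalised[, a_cols, drop = FALSE] / normalised[, c_cols, drop = FALSE]
b_over_c <- normalised[, b_cols, drop = FALSE] / normalised[, c_cols, drop = FALSE]
```

Matching by name costs a few more lines than the positional indexes, but it fails loudly if a column is missing or the file's column order changes, which may matter when moving from 36 to 72 columns.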