Hi all,

I have a large data.frame: 1530 observations with 6 columns. I want to merge in a 7th column, a transformation of the response variable (hospital admissions), namely

    trans<-sqrt(copd$admissions+0.25)
    trans<-data.frame(trans)

And now when I do

    copd2<-merge(copd,trans)

(copd being my original data.frame), R either crashes or takes an extremely long time to do the computation. I had expected the computation to finish almost instantly, as I have done similar things in R recently, but my system becomes so slow that it is unusable.

Would people expect this computation to take a long time? Most of the data in the data.frame are just integers. I want to perform linear regression on the data in trans, so I would like it in the data.frame to save typing.

Any ideas?

Robin Williams
Met Office summer intern - Health Forecasting
robin.williams@metoffice.gov.uk
> I have a large data.frame, 1530 observation with 6 columns. I want to
> merge a 7th column, a transformation of the response variable (hospital
> admissions), namely
> trans<-sqrt(copd$admissions+0.25)
> trans<-data.frame(trans)
> And now when I do
> copd2<-merge(copd,trans)
> (copd being my original data.frame), R either crashes or is taking an
> extremely long time to do the computation. I had expected the
> computation to be done almost instantly as I have done similar things in
> R recently, however my system becomes very slow to the point of being
> unusable.

If I understand correctly, all you want to do is add another column to your data.frame. Unless I have overlooked something, a simple assignment should do:

    copd$trans <- copd$admissions+0.25

The reason your computation takes so long (or crashes) is that you are merging two data frames which have no common columns to merge on. So merge() is generating all possible combinations of rows for you.

    > a = data.frame(1:3)
    > b = data.frame(4:6)
    > merge(a,b)
      X1.3 X4.6
    1    1    4
    2    2    4
    3    3    4
    4    1    5
    5    2    5
    6    3    5
    7    1    6
    8    2    6
    9    3    6

This is most likely not what you intended. I guess you were looking for cbind() rather than merge().

cu
Philipp

-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
85350 Freising, Germany
http://mips.gsf.de/staff/pagel
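
To put a number on the explanation above, here is a minimal sketch using a made-up stand-in for copd. The real data are not shown in the thread, so everything except the row count (1530) and the column name `admissions` is an assumption. With no shared column names, merge() would pair every row of one data frame with every row of the other, which here means 1530 * 1530 = 2,340,900 rows, while cbind() keeps the original 1530.

```r
# Hypothetical stand-in for the poster's data; only the row count (1530)
# and the column name `admissions` come from the thread.
set.seed(1)
copd  <- data.frame(admissions = rpois(1530, lambda = 20))
trans <- data.frame(trans = sqrt(copd$admissions + 0.25))

# merge() joins on the column names the two data frames share; here there are none.
shared <- intersect(names(copd), names(trans))
length(shared)

# Number of rows the full merge(copd, trans) would therefore have to build.
rows_if_merged <- nrow(copd) * nrow(trans)
rows_if_merged

# The same behaviour on a small scale: 3 rows merged with 3 rows.
small <- merge(copd[1:3, , drop = FALSE], trans[1:3, , drop = FALSE])
nrow(small)

# cbind() pairs rows by position instead, so the row count is unchanged.
copd2 <- cbind(copd, trans)
dim(copd2)
```

On the real data the cross product would also carry all 6 original columns, which is probably why the machine slows down so badly.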
On Wed, Jul 30, 2008 at 12:24:22PM +0200, Philipp Pagel wrote:
> If I understand correctly, all you want to do is add another column to
> your data.frame. Unless I have overlooked something, a simple
> assignement should do:
>
> copd$trans <- copd$admissions+0.25

Sorry, that should have been:

    copd$trans <- sqrt(copd$admissions+0.25)

cu
Philipp
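
Since the original question mentions running a linear regression on the transformed response, the corrected assignment might be used roughly as follows. This is only a sketch: `admissions` is the one column named in the thread, while `temperature` and all the simulated values are invented for illustration.

```r
# Hypothetical data: `admissions` is named in the thread; `temperature` is an
# invented predictor used only for illustration.
set.seed(1)
copd <- data.frame(
  admissions  = rpois(1530, lambda = 20),
  temperature = rnorm(1530, mean = 12, sd = 5)
)

# Add the transformed response as a new column (the corrected assignment).
copd$trans <- sqrt(copd$admissions + 0.25)

# The new column can now be referred to by name in a model formula.
fit <- lm(trans ~ temperature, data = copd)
coef(fit)

# Alternative: apply the transformation inside the formula itself,
# which avoids storing the extra column at all.
fit2 <- lm(sqrt(admissions + 0.25) ~ temperature, data = copd)
coef(fit2)
```

Both fits estimate the same coefficients; storing the column is mainly a convenience if the transformed values are needed elsewhere, for example in plots.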