Hi, I'm having (yet another) problem with R.

I have a few data sets in this sort of format:

dataset1
ID    DATA
1234  value
2345  value
3456  value

dataset2
ID    DATA
1111  value
2345  value
3333  value

What I really want to do is write an R script that says "if the ID of dataset1 and dataset2 match (2nd row), print out that whole row into a new dataset3". I have no idea how to do that, though. Normally I would just write the data out to a text file and then write a Perl script to do the matching. However, these files are HUGE and Perl would take forever to do this! I'm hoping there's a quicker solution in R...

Any help appreciated.

--
View this message in context: http://n4.nabble.com/Matching-rows-in-a-Data-set-I-m-Stuck-tp1576432p1576432.html
Sent from the R help mailing list archive at Nabble.com.
I have not explained this properly. I meant to say: if the ID values of the rows in each file match, print the data into a new data frame with this structure:

ID    DATA1   DATA2
2345  VALUE1  VALUE2

where VALUE1 is from the first dataset and VALUE2 is from the second.
On 3/3/10, BioStudent <s0975764 at sms.ed.ac.uk> wrote:
> What i really want to do is write an R script that says "if the ID of
> dataset1 and 2 match (2nd row), print out that whole row into a new
> dataset3".

Would this do what you want?

> x1 <- iris[1:5,]
> x1
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
> x2 <- iris[4:7,]
> x2
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
7          4.6         3.4          1.4         0.3  setosa
> x.ids <- intersect(rownames(x1), rownames(x2))
> x.ids
[1] "4" "5"
> x3 <- cbind(x1[x.ids, ], x2[x.ids, ])

Liviu
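The example above matches on row names, whereas the data in the question carries its key in an ID column. A minimal sketch of the same intersect idea applied to that column follows. The DATA values are made up for illustration, and the code assumes each ID occurs at most once per data set, because match() only returns the first position it finds.

```r
# Toy data mirroring the layout in the original post (values are made up).
dataset1 <- data.frame(ID = c(1234, 2345, 3456),
                       DATA = c("a1", "b1", "c1"),
                       stringsAsFactors = FALSE)
dataset2 <- data.frame(ID = c(1111, 2345, 3333),
                       DATA = c("x2", "y2", "z2"),
                       stringsAsFactors = FALSE)

# IDs present in both data sets.
common <- intersect(dataset1$ID, dataset2$ID)

# match() gives the row position of each common ID in each data set,
# so the two DATA columns line up row by row.
dataset3 <- data.frame(
  ID    = common,
  DATA1 = dataset1$DATA[match(common, dataset1$ID)],
  DATA2 = dataset2$DATA[match(common, dataset2$ID)],
  stringsAsFactors = FALSE
)
print(dataset3)
```

If IDs can repeat within a data set, this sketch silently keeps only the first occurrence, so merge() is probably the safer choice in that case.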
Thanks! I'm just trying to do it now, but I'm having issues with memory...

test <- merge(file1, file2, by.x = "col1")

Will this give me the output I was hoping for?

ID VALUE1 VALUE2

Thanks
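A note on the call above: supplying only by.x leaves by.y at its default, which is the set of all column names the two data frames share, so the call can fail or join on more columns than intended. The sketch below names the key on both sides and renames the result to the ID / VALUE1 / VALUE2 layout. The contents of file1 and file2 are hypothetical stand-ins, and the pre-filtering step is only an assumption about what might ease the memory pressure, not a tested remedy for the poster's data.

```r
# Hypothetical stand-ins for the poster's file1 and file2; "col1" is the
# key column name taken from the post, the other names are assumptions.
file1 <- data.frame(col1 = c(1234, 2345, 3456),
                    DATA = c("a1", "b1", "c1"),
                    stringsAsFactors = FALSE)
file2 <- data.frame(col1 = c(1111, 2345, 3333),
                    DATA = c("x2", "y2", "z2"),
                    stringsAsFactors = FALSE)

# Optional pre-filter: keep only rows whose key occurs in the other data
# frame, so merge() has fewer rows to handle.
keep1 <- file1[file1$col1 %in% file2$col1, ]
keep2 <- file2[file2$col1 %in% file1$col1, ]

# Name the key column explicitly on both sides of the join.
test <- merge(keep1, keep2, by.x = "col1", by.y = "col1")

# merge() labels the clashing columns DATA.x and DATA.y; rename them.
names(test) <- c("ID", "VALUE1", "VALUE2")
print(test)
```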
On Mar 3, 2010, at 5:52 AM, BioStudent wrote:

> Hi, I'm having (yet another) problem with R.
>
> I have a few data sets that have this sort of format
>
> dataset1
> ID DATA
> 1234 value
> 2345 value
> 3456 value
>
> dataset2
> ID DATA
> 1111 value
> 2345 value
> 3333 value
>
> What i really want to do is write an R script that says "if the ID of
> dataset1 and 2 match (2nd row), print out that whole row into a new
> dataset3". No idea how to do that though. Normally I would just write out
> the files to a txt file then write a perl script that would do just that.
> However these files are HUGE and perl will take forever to do this!! I'm
> hoping theres a quicker solution in R...
>
> Any help appreciated

See ?merge, which performs SQL-like join operations:

> dataset1
    ID   DATA
1 1234 value1
2 2345 value1
3 3456 value1

> dataset2
    ID   DATA
1 1111 value2
2 2345 value2
3 3333 value2

> merge(dataset1, dataset2, by = "ID")
    ID DATA.x DATA.y
1 2345 value1 value2

HTH,

Marc Schwartz
Hi,

are your data frames really called file1 and file2? Then it will be something like this:

test