Hi everyone,

I am a real beginner to R and probably have a very naive question. I have a small data frame with three columns: Unique Sample ID, Gene1 and Gene2 (the Gene1 and Gene2 columns are empty). I also have two separate tables, one per gene, each containing the Unique Subject ID in one column and, in another column called Condition, whether the gene is mutated in that subject (M) or not (N/M). I want to write a loop that reads each Unique Subject ID from my data frame, looks up the same ID in the two tables and, depending on whether the gene is mutated (M) or not mutated (N/M), inserts Yes (like an emoticon) or No (N) in the appropriate gene column (Gene1/Gene2) for that Subject ID.

If anyone can help, I would really appreciate it.

Thanks in advance,
Fazal

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Gene2.png
Type: image/png
Size: 6717 bytes
Desc: Gene2.png
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20150501/dc7de010/attachment.png>
Hi Fazal,

In order to help you we probably need some sample data. Any code you have been trying would also be useful. The png is helpful, but it is much better to supply the actual data (or a good sample of it).

The best way to supply data to R-help is to use the dput() function. See ?dput for some basic information on how to use it. In very simple terms, if you have a data set called mydata, run dput(mydata), then copy the output and paste it into your e-mail. Done. Fini! This provides the R-help readers with an exact copy of your data.

For general information about how to ask questions on R-help, see http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

Welcome to R-help.

John Kane
Kingston ON Canada

> -----Original Message-----
> From: fazal.hadi at curie.fr
> Sent: Fri, 1 May 2015 20:05:22 +0000
> To: r-help at r-project.org
> Subject: [R] Help with making Loop
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
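The dput() advice in this reply can be sketched as follows. The data frame here is a made-up stand-in (the column names SubjectID and Condition are assumptions, not the poster's actual names); the point is only that the text dput() prints can be pasted back into R to rebuild the same object.

```r
# Hypothetical stand-in for the poster's data; column names are assumptions.
mydata <- data.frame(
  SubjectID = c("S1", "S2", "S3"),
  Condition = c("M", "N/M", "M"),
  stringsAsFactors = FALSE
)

# dput() prints R code that recreates the object.
dput(mydata)

# Simulate what a reader does: take the printed text and evaluate it.
txt <- paste(capture.output(dput(mydata)), collapse = "\n")
rebuilt <- eval(parse(text = txt))

# Compare the rebuilt copy with the original.
all.equal(mydata, rebuilt)
```

For a large data set, something like dput(head(mydata, 20)) is usually enough to give readers a workable sample.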
Dear Fazal,

I think part of your problem can be addressed with merge(). Type ?merge at the R prompt.

--
Michael
http://www.dewey.myzen.co.uk/home.html
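As a minimal sketch of what merge() does for this kind of lookup, assuming made-up IDs and column names (SubjectID, Condition) rather than the poster's real data: by default merge() keeps only IDs present in both tables, while all.x = TRUE keeps every row of the first table and fills unmatched entries with NA.

```r
# Hypothetical data; IDs and column names are made up for illustration.
samples <- data.frame(SubjectID = c("S1", "S2", "S3"),
                      stringsAsFactors = FALSE)
gene1 <- data.frame(SubjectID = c("S3", "S1"),
                    Condition = c("N/M", "M"),
                    stringsAsFactors = FALSE)

# Default merge() is an inner join: S2 has no row in gene1, so it is dropped.
inner <- merge(samples, gene1, by = "SubjectID")

# all.x = TRUE keeps every row of the first table; unmatched IDs get NA.
left <- merge(samples, gene1, by = "SubjectID", all.x = TRUE)

print(inner)
print(left)
```

Whether dropped rows or NA entries are preferable depends on whether every subject is expected to appear in each gene table.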
Fazal,

I am not sure what you want, but I have guessed. I have tried to provide a straightforward, simplistic solution. If you examine the intermediate results, I think what is being done will be clear.

Mark

Michael Dewey's suggestion to look at merge is excellent. You may also need to look at the other functions used below. All are commonly used.

# Code follows
set.seed(1) # ensures you get the same results
id <- paste0("id_", 1:10) # I did not want to copy down your Ids so I made them up.
small_dataframe <- data.frame(unique_sample_id = id,
                              g_1 = character(length(id)),
                              g_2 = character(length(id)))
# Making up some genotypes for g_1_table and g_2_table
g_1_table <- data.frame(unique_sample_id = sample(id, length(id), replace = FALSE),
                        cond = sample(c("M", "N/M"), length(id), replace = TRUE,
                                      prob = c(0.5, 0.5)))
g_2_table <- data.frame(unique_sample_id = sample(id, length(id), replace = FALSE),
                        cond = sample(c("M", "N/M"), length(id), replace = TRUE,
                                      prob = c(0.5, 0.5)))
new_dataframe <- merge(small_dataframe, g_1_table, by = "unique_sample_id")
names(new_dataframe) <- c("unique_sample_id", "g_1", "g_2", "g_1_cond")
new_dataframe <- merge(new_dataframe, g_2_table, by = "unique_sample_id")
names(new_dataframe) <- c("unique_sample_id", "g_1", "g_2", "g_1_cond", "g_2_cond")
new_dataframe$g_1_emoticon <- ifelse(new_dataframe$g_1_cond == "M", ":-)", "No")
new_dataframe$g_2_emoticon <- ifelse(new_dataframe$g_2_cond == "M", ":-)", "No")
new_dataframe
# End of code

# Output of last line of code.
   unique_sample_id g_1 g_2 g_1_cond g_2_cond g_1_emoticon g_2_emoticon
1              id_1                M      N/M          :-)           No
2             id_10              N/M      N/M           No           No
3              id_2                M        M          :-)          :-)
4              id_3              N/M        M           No          :-)
5              id_4              N/M      N/M           No           No
6              id_5                M      N/M          :-)           No
7              id_6                M      N/M          :-)           No
8              id_7              N/M        M           No          :-)
9              id_8              N/M        M           No          :-)
10             id_9                M        M          :-)          :-)
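An alternative to merging, sketched here under assumed names (SubjectID, Condition, Gene1, Gene2 and the helper lookup() are all hypothetical), is to fill the empty gene columns in place with match(), which returns the row position of each ID in a gene table. The explicit loop the original question asked for is shown for comparison; it should give the same result for Gene1.

```r
# Hypothetical data standing in for the poster's three tables.
df <- data.frame(SubjectID = c("S1", "S2", "S3"),
                 Gene1 = NA_character_, Gene2 = NA_character_,
                 stringsAsFactors = FALSE)
gene1 <- data.frame(SubjectID = c("S3", "S1", "S2"),
                    Condition = c("N/M", "M", "M"),
                    stringsAsFactors = FALSE)
gene2 <- data.frame(SubjectID = c("S2", "S3"),
                    Condition = c("M", "N/M"),
                    stringsAsFactors = FALSE)

# Hypothetical helper: find each ID's Condition and recode it.
# IDs absent from the table come back as NA rather than "No".
lookup <- function(ids, tbl) {
  cond <- tbl$Condition[match(ids, tbl$SubjectID)]
  ifelse(cond == "M", "Yes", "No")
}

df$Gene1 <- lookup(df$SubjectID, gene1)
df$Gene2 <- lookup(df$SubjectID, gene2)

# The same lookup for Gene1 written as an explicit loop.
gene1_loop <- character(nrow(df))
for (i in seq_len(nrow(df))) {
  hit <- gene1$Condition[gene1$SubjectID == df$SubjectID[i]]
  gene1_loop[i] <- if (length(hit) == 0) {
    NA_character_
  } else if (hit[1] == "M") {
    "Yes"
  } else {
    "No"
  }
}

print(df)
```

The vectorised version avoids the loop entirely and keeps the rows in the original order of df, whereas merge() sorts the result by the key column by default.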