Hi,

Not sure what your expected output would be. Also, 'CEBPA' was not present in Data.txt.

gset <- read.table("Names.txt",header=TRUE,stringsAsFactors=FALSE)
temp1 <- read.table("Data.txt",header=TRUE,stringsAsFactors=FALSE)
lst1 <- split(temp1,temp1$Names)
mat1 <- combn(gset[-1,1],2) #removed CEBPA
library(plyr)
lst2 <- lapply(split(mat1,col(mat1)),function(x) {
  x1 <- join_all(lst1[x],by="patient_id",type="inner")
  x1["patient_id"]
})
names(lst2) <- apply(mat1,2,paste,collapse="_")
do.call(rbind,lst2)
#                   patient_id
#DNMT3A_FLT3.1 LAML-AB-2811-TB #common ids between DNMT3A and FLT3
#DNMT3A_FLT3.2 LAML-AB-2816-TB
#DNMT3A_FLT3.3 LAML-AB-2818-TB
#DNMT3A_IDH1.1 LAML-AB-2802-TB #common ids between DNMT3A and IDH1
#DNMT3A_IDH1.2 LAML-AB-2822-TB
#DNMT3A_NPM1.1 LAML-AB-2802-TB
#DNMT3A_NPM1.2 LAML-AB-2809-TB
#DNMT3A_NPM1.3 LAML-AB-2811-TB
#DNMT3A_NPM1.4 LAML-AB-2816-TB
#DNMT3A_NRAS   LAML-AB-2816-TB
#FLT3_NPM1.1   LAML-AB-2811-TB
#FLT3_NPM1.2   LAML-AB-2812-TB
#FLT3_NPM1.3   LAML-AB-2816-TB
#FLT3_NRAS     LAML-AB-2816-TB
#IDH1_NPM1     LAML-AB-2802-TB
#NPM1_NRAS     LAML-AB-2816-TB

If you wanted it as separate dataframes, use `lst2`.

A.K.


Hello R experts,

I am trying to solve the following logic. I have two input files. The first file (Names.txt) has two columns:

Column1  Column2
CEBPA    CEBPA
DNMT3A   DNMT3A
FLT3     FLT3
IDH1     IDH1
NPM1     NPM1
NRAS     NRAS

The second input file (Data.txt) has two columns, Names and patient_id:

Name    patient_id
DNMT3A  LAML-AB-2802-TB
DNMT3A  LAML-AB-2809-TB
DNMT3A  LAML-AB-2811-TB
DNMT3A  LAML-AB-2816-TB
DNMT3A  LAML-AB-2818-TB
DNMT3A  LAML-AB-2822-TB
DNMT3A  LAML-AB-2824-TB
FLT3    LAML-AB-2811-TB
FLT3    LAML-AB-2812-TB
FLT3    LAML-AB-2814-TB
FLT3    LAML-AB-2816-TB
FLT3    LAML-AB-2818-TB
FLT3    LAML-AB-2825-TB
FLT3    LAML-AB-2830-TB
FLT3    LAML-AB-2834-TB
IDH1    LAML-AB-2802-TB
IDH1    LAML-AB-2821-TB

What I am attempting to do is, for each name in the first column of Names.txt, a pairwise comparison with the other names in the second column, based on which patient ids are common.

As an example: I extract the patient_ids for CEBPA and DNMT3A and see which are common, then I do the same for CEBPA and FLT3, and so on for CEBPA and each following name in column 2.

So far, the script I have written only does the comparison with the first name in the list, so essentially with itself. I am not sure why this logic is not working for all the names in column 2 for a single name in column 1.

Below is my script:

gset<-read.table("Names.txt",header=F,na.strings = ".", as.is=T) # reading in the genes
temp<-read.table("Data.txt",header=T,sep="\t")
#################################################
all<-length(unique(temp$fpatient_id))
final<-c()
both.ab <- list()
both <- list()
temp.b <- matrix()

for(i in 1:nrow(gset))  # Loop for genes in the first column
{
    temp2<-temp[which(temp$Column1 %in% gset[i,]),]
    num.mut<-length(unique(temp2$patient_id))
    temp.a <-temp[which(temp$Column1 == gset[i,1]),]

    for(j in 1:(nrow(gset))  # Loop for genes in the second column
    {
        temp.b <-temp[which(temp$Column2 == gset[j,2]),]
        # See which patient_ids of temp.a are in temp.b
        both.ab[[i]]<-temp.a[which(temp.a$patient_id %in% temp.b$patient_id),]
    }
    both[[i]]<-both.ab[[i]]
    num.both<-length(unique(both[[i]]$patient_id))
    line<-c(paste(gset[i, which(!(is.na(gset[i,]))) ],collapse="/"), num.mut, all, num.mut/all, num.both)
    final<-rbind(final,line)
}

Attachments: Names.txt, Data.txt, Script.txt
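
For reference, the pairwise "which patient ids are common" step described in the question can also be sketched in base R with `combn()` and `intersect()`, without plyr. This is only an illustration, not the method from the reply above: the data frame `dat` is typed in by hand from the Data.txt excerpt shown in the question (so genes missing from the excerpt, such as NPM1 and NRAS, do not appear), and the names `dat`, `ids_by_gene`, `pairs`, `common` and `n_common` are made up for this sketch.

```r
# Rows copied by hand from the Data.txt excerpt in the question (assumption:
# the real file has more rows; the column is called Names as in the reply).
dat <- data.frame(
  Names = c(rep("DNMT3A", 7), rep("FLT3", 8), rep("IDH1", 2)),
  patient_id = c(
    "LAML-AB-2802-TB", "LAML-AB-2809-TB", "LAML-AB-2811-TB", "LAML-AB-2816-TB",
    "LAML-AB-2818-TB", "LAML-AB-2822-TB", "LAML-AB-2824-TB",
    "LAML-AB-2811-TB", "LAML-AB-2812-TB", "LAML-AB-2814-TB", "LAML-AB-2816-TB",
    "LAML-AB-2818-TB", "LAML-AB-2825-TB", "LAML-AB-2830-TB", "LAML-AB-2834-TB",
    "LAML-AB-2802-TB", "LAML-AB-2821-TB"
  ),
  stringsAsFactors = FALSE
)

# Named list: one character vector of patient ids per gene
ids_by_gene <- split(dat$patient_id, dat$Names)

# All unordered pairs of genes, one pair per column
pairs <- combn(names(ids_by_gene), 2)

# For each pair, keep the patient ids that occur under both genes
common <- lapply(seq_len(ncol(pairs)), function(k) {
  intersect(ids_by_gene[[pairs[1, k]]], ids_by_gene[[pairs[2, k]]])
})
names(common) <- apply(pairs, 2, paste, collapse = "_")

# Number of shared patients per pair (the question's num.both, per pair)
n_common <- sapply(common, length)
```

On this excerpt, `common$DNMT3A_FLT3` holds the same three ids as the DNMT3A_FLT3 rows in the reply's output. Pairs with no shared patient come back as empty vectors rather than being dropped, which may or may not be what you want.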