Hi,

It is not that clear. If VAR1 is a match between columns AB001A and AB0002A, VAR2 between AB001A and AB362, and VAR3 between AB0002A and AB362, then the code below should work. Also, I assume the row 8 match would be taken as 1.

dat1 <- read.table(text="
S.No AB001A AB0002A AB362
1    -/-    C/C     A/A
2    C/C    C/C     A/A
3    C/C    C/C     A/A
4    C/C    C/C     A/A
5    C/C    C/C     A/A
6    C/C    C/C     A/A
7    C/C    C/C     A/A
8    -/-    -/-     -/-
9    C/C    C/C     A/A
10   C/C    C/C     A/A
11   -/-    C/C     A/A
12   C/C    C/C     A/A
13   C/C    C/C     A/A
14   C/C    C/C     A/A
16   C/C    -/-     A/A
17   -/-    C/C     A/A
18   C/C    C/C     A/A
19   C/C    C/C     A/A
",sep="",header=TRUE,stringsAsFactors=FALSE)

library(plyr)
# Each VAR is 1 when the two columns agree in that row, else 0
res <- mutate(dat1,
              VAR1=1*(AB001A==AB0002A),
              VAR2=1*(AB001A==AB362),
              VAR3=1*(AB0002A==AB362),
              SUM=rowSums(cbind(VAR1,VAR2,VAR3)),
              MATCH=(SUM/3)*100,
              Rank=rank(MATCH))
head(res)
#  S.No AB001A AB0002A AB362 VAR1 VAR2 VAR3 SUM    MATCH Rank
#1    1    -/-     C/C   A/A    0    0    0   0  0.00000  2.5
#2    2    C/C     C/C   A/A    1    0    0   1 33.33333 11.0
#3    3    C/C     C/C   A/A    1    0    0   1 33.33333 11.0
#4    4    C/C     C/C   A/A    1    0    0   1 33.33333 11.0
#5    5    C/C     C/C   A/A    1    0    0   1 33.33333 11.0
#6    6    C/C     C/C   A/A    1    0    0   1 33.33333 11.0

#or, giving tied rows the lowest rank of the tie instead of the average
res <- mutate(dat1,
              VAR1=1*(AB001A==AB0002A),
              VAR2=1*(AB001A==AB362),
              VAR3=1*(AB0002A==AB362),
              SUM=rowSums(cbind(VAR1,VAR2,VAR3)),
              MATCH=(SUM/3)*100,
              Rank=rank(MATCH,ties.method="min"))
head(res)
#  S.No AB001A AB0002A AB362 VAR1 VAR2 VAR3 SUM    MATCH Rank
#1    1    -/-     C/C   A/A    0    0    0   0  0.00000    1
#2    2    C/C     C/C   A/A    1    0    0   1 33.33333    5
#3    3    C/C     C/C   A/A    1    0    0   1 33.33333    5
#4    4    C/C     C/C   A/A    1    0    0   1 33.33333    5
#5    5    C/C     C/C   A/A    1    0    0   1 33.33333    5
#6    6    C/C     C/C   A/A    1    0    0   1 33.33333    5

A.K.

> Hi to all bloggers,
> my data looks like this:
>
> S.No  AB001A  AB0002A  AB362  VAR1  VAR2  VAR3  SUM  %Match  Rank
> 1     -/-     C/C      A/A
> 2     C/C     C/C      A/A
> 3     C/C     C/C      A/A
> 4     C/C     C/C      A/A
> 5     C/C     C/C      A/A
> 6     C/C     C/C      A/A
> 7     C/C     C/C      A/A
> 8     -/-     -/-      -/-
> 9     C/C     C/C      A/A
> 10    C/C     C/C      A/A
> 11    -/-     C/C      A/A
> 12    C/C     C/C      A/A
> 13    C/C     C/C      A/A
> 14    C/C     C/C      A/A
> 16    C/C     -/-      A/A
> 17    -/-     C/C      A/A
> 18    C/C     C/C      A/A
> 19    C/C     C/C      A/A
>
> I want to match obs 3 with obs 2: if it matches exactly the score will be 1, else 0. That score will be stored in VAR1 for AB001A, in VAR2 for AB0002A and in VAR3 for AB362. I then want to calculate the sum of all the 1's, the observation match percent and the rank (top ten matchers). I did this successfully in Excel, but it took me a lot of time: I used an IF condition like =IF(A3=A$2,1,0), dragged it across all obs, and then computed the sum, %match and rank.
> My question is: how can I do this in R? Can I use the match package for this, or will other packages help me? My data is big, with 5,15,567 obs. Can anyone guide me how to do this in SAS, because I want to reduce the time it takes to analyze my data?
> Thanking you
> Regards,
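A side note on the ranking in the reply above: rank() works in ascending order, so the rows with the highest match percentage get the largest rank numbers. If rank 1 is meant to be the best matcher (an assumption about what "top ten matchers" means here), one option is to rank the negated percentage. A small sketch with made-up values; match_pct, avg_rank, min_rank and best_first are illustrative names, not part of the thread's data:

```r
# Hypothetical match percentages for five rows (made-up values)
match_pct <- c(0, 100/3, 100/3, 100, 0)

# Default ties.method = "average": tied rows share the mean of their ranks
avg_rank <- rank(match_pct)                          # 1.5 3.5 3.5 5.0 1.5

# ties.method = "min": tied rows all get the lowest rank of the tie
min_rank <- rank(match_pct, ties.method = "min")     # 1 3 3 5 1

# Negating the values puts the best match at rank 1
best_first <- rank(-match_pct, ties.method = "min")  # 4 2 2 1 4
```

Which convention to use depends on how the ranks are read afterwards; the sums and percentages are the same either way.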
Hi,

Maybe this helps. As you wanted to match only rows 3 onwards against row 2, the corresponding values for row 1 and row 2 were set to NA.

dat1 <- read.table(text="
S.No AB001A AB0002A AB362
P1   -/-    C/C     A/A
P2   C/C    C/C     A/A
3    C/C    C/C     A/A
4    C/C    C/C     A/A
5    C/C    C/C     A/A
6    C/C    C/C     A/A
7    C/C    C/C     A/A
8    -/-    -/-     -/-
9    C/C    C/C     A/A
10   C/C    C/C     A/A
11   -/-    C/C     A/A
12   C/C    C/C     A/A
13   C/C    C/C     A/A
14   C/C    C/C     A/A
15   C/C    -/-     A/A
16   -/-    C/C     A/A
17   A/A    A/C     A/A
18   C/A    A/A     A/A
",sep="",header=TRUE,stringsAsFactors=FALSE)

# Compare every marker column with its value in row 2 (P2); 1 = match, 0 = no match
dat2 <- cbind(dat1,(1*mapply("==",dat1[,-1],dat1[2,-1])))
# The new columns repeat the marker names, so add a "_1" suffix to the duplicates
names(dat2)[duplicated(names(dat2))] <- paste0(names(dat2)[duplicated(names(dat2))],"_1")

library(plyr)
dat3 <- mutate(dat2,SUM=rowSums(cbind(AB001A_1,AB0002A_1,AB362_1)), MATCH=(SUM/3)*100)
# Rows 1 and 2 (P1, P2) are not compared, so blank their scores
dat3[1:2,5:9] <- NA
res <- mutate(dat3,RANK=rank(MATCH,ties.method="min"))
head(res)
#  S.No AB001A AB0002A AB362 AB001A_1 AB0002A_1 AB362_1 SUM MATCH RANK
#1   P1    -/-     C/C   A/A       NA        NA      NA  NA    NA   17
#2   P2    C/C     C/C   A/A       NA        NA      NA  NA    NA   18
#3    3    C/C     C/C   A/A        1         1       1   3   100    7
#4    4    C/C     C/C   A/A        1         1       1   3   100    7
#5    5    C/C     C/C   A/A        1         1       1   3   100    7
#6    6    C/C     C/C   A/A        1         1       1   3   100    7

A.K.

> Hi Arun,
> Thank you very much for your help in solving my problem.
>
> S.No  AB001A  AB0002A  AB362  AB001A  AB0002A  AB362  SUM  %Match  Rank
> P1    -/-     C/C      A/A
> P2    C/C     C/C      A/A
> 3     C/C     C/C      A/A
> 4     C/C     C/C      A/A
> 5     C/C     C/C      A/A
> 6     C/C     C/C      A/A
> 7     C/C     C/C      A/A
> 8     -/-     -/-      -/-
> 9     C/C     C/C      A/A
> 10    C/C     C/C      A/A
> 11    -/-     C/C      A/A
> 12    C/C     C/C      A/A
> 13    C/C     C/C      A/A
> 14    C/C     C/C      A/A
> 16    C/C     -/-      A/A
>
> Actually I want to match observations 3 to 16 with the value in P2 (i.e. 3 with P2, 4 with P2, 5 with P2, etc.). If they match, I would like to give the value 1 and store it in the corresponding dummy variable, i.e. AB001A, and I would like to do the same thing for the remaining vars too, storing the results in their dummy vars. Finally I want to sum all the matches (i.e. scores of 1) in each row, calculate the percentage of match, and then rank. This is what I want; sorry for not expressing my problem in an understandable way.
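With many marker columns, typing each one into mutate() gets unwieldy. Here is a minimal base R sketch of the same idea that loops over whatever marker columns are present. It assumes S.No is the only non-marker column and that row 2 (P2) is the reference; dat, markers, ref_row, hits and top are illustrative names, and the toy values are made up. Unlike the reply above, it ranks so that 1 is the best match and leaves the unscored rows as NA instead of ranking them last.

```r
# Toy data in the same shape as the thread's example (values made up)
dat <- data.frame(
  S.No    = c("P1", "P2", "3", "4", "5"),
  AB001A  = c("-/-", "C/C", "C/C", "-/-", "C/A"),
  AB0002A = c("C/C", "C/C", "C/C", "C/C", "A/A"),
  AB362   = c("A/A", "A/A", "A/A", "A/A", "A/A"),
  stringsAsFactors = FALSE
)

markers <- setdiff(names(dat), "S.No")  # every genotype column
ref_row <- 2                            # assumption: P2 is the reference row

# One 0/1 column per marker: does the row equal the reference row's value?
hits <- sapply(markers, function(m) as.integer(dat[[m]] == dat[[m]][ref_row]))

dat$SUM   <- rowSums(hits)
dat$MATCH <- 100 * dat$SUM / length(markers)

# Rows up to and including the reference are not compared, so leave them unscored
dat[seq_len(ref_row), c("SUM", "MATCH")] <- NA

# Rank 1 = best match; na.last = "keep" leaves the unscored rows as NA
dat$RANK <- rank(-dat$MATCH, ties.method = "min", na.last = "keep")

# Best matchers first, at most ten rows, unscored rows dropped
top <- head(dat[order(dat$RANK, na.last = NA), ], 10)
```

The comparison is vectorised per column, so the cost grows with rows times markers. I have not timed it on data of the size mentioned in the question.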