Folkes, Michael
2014-Jul-09 19:58 UTC
using match to obtain non-sorted index values from non-sorted vector
Hello all,

I've been struggling with the best way to find index values from a large vector with elements that will match elements of a subset vector [the table argument in match()]. BUT the index values can't come out sorted (as we'd get in which(X %in% Y)). My 'population' vector can't be sorted.

pop.df <- data.frame(pop=c(1,6,4,3,10))

The subset:

Tset = c(10,3,6)

So I'd like to get these index values (from pop.df), in this order: 5,4,2

If it could be sorted I could use:

which(sort(pop.df$pop) %in% sort(Tset))

But sorting will cause more grief later, so best not mess with it. Here is my hopefully adequate MWE of a solution. I'm keen to see if anybody has a better suggestion.

Thanks!
_____________________
###BEGIN R
# pop is the full set of values; it has no info on their ranking.
# I don't want to sort these data. They need to remain in this order.
pop.df <- data.frame(pop=c(1,6,4,3,10))

# rank.df is my dataframe that tells me the top three rankings (derived elsewhere)
rank.df <- data.frame(rank=1:3, Tset = c(10,3,6)) # Target set

# match.df will be my source of row index based on rank
match.df <- data.frame(match.vec = match(pop.df$pop, table=rank.df$Tset),
                       index.vec = 1:nrow(pop.df))

# rank.df will now include the index location in pop.df where I can find the top three ranks.
rank.df <- merge(rank.df, match.df, by.x='rank', by.y='match.vec')
rank.df
####END
_______________________________________________________
Michael Folkes
Salmon Stock Assessment
Canadian Dept. of Fisheries & Oceans
Pacific Biological Station
3190 Hammond Bay Rd.
Nanaimo, B.C., Canada V9T-6N7
Ph (250) 756-7264
Fax (250) 756-7053
Michael.Folkes@dfo-mpo.gc.ca
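
On the "better suggestion" question in the post above: one possible shortcut, offered as a sketch rather than a definitive answer, is to swap the arguments to match(). Because match(x, table) reports, for each element of x, its position in table, passing Tset as x and the population as table should give the indices in Tset order with no sorting or merging. The idx name and the rebuilt rank.df are only illustrative, and the sketch assumes the population values are unique.

```r
# Same data as in the post above
pop.df <- data.frame(pop = c(1, 6, 4, 3, 10))
Tset <- c(10, 3, 6)

# For each element of Tset, find its position in pop.df$pop.
# The result follows the order of Tset, not the order of pop.
idx <- match(Tset, pop.df$pop)
idx
# 5 4 2

# Caveats: a Tset value absent from pop comes back as NA, and if pop
# held duplicates only the first matching position would be returned.
rank.df <- data.frame(rank = seq_along(Tset), Tset = Tset, index.vec = idx)
rank.df
```

Indexing back with pop.df$pop[idx] returns 10, 3, 6, which is a quick way to confirm the positions line up with Tset.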