It is not clear to me what you want for the general case. Perhaps:

> v <- letters[c(2,2,1,2,1,1)]
> wh <- tapply(seq_along(v), factor(v), '[', 1)
> w <- wh[match(v, v[wh])]
> w
b b a b a a
1 1 3 1 3 3
> ## and if you want NA's for the first occurrences of unique values
> ## of course:
> w[wh] <- NA
> w
 b  b  a  b  a  a
NA  1 NA  1  3  3

I'd like to see a cleverer solution that vectorizes and avoids the tapply(), though.

Cheers,
Bert

On Mon, Nov 12, 2018 at 8:33 PM Bert Gunter <bgunter.4567 at gmail.com> wrote:

> > match(v, unique(v))
> [1] 1 2 2 1
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Mon, Nov 12, 2018 at 5:08 PM Duncan Murdoch <murdoch.duncan at gmail.com>
> wrote:
>
>> The duplicated() function gives TRUE if an item in a vector (or row in a
>> matrix, etc.) is a duplicate of an earlier item. But what I would like
>> to know is which item does it duplicate?
>>
>> For example,
>>
>> v <- c("a", "b", "b", "a")
>> duplicated(v)
>>
>> returns
>>
>> [1] FALSE FALSE TRUE TRUE
>>
>> What I want is a fast way to calculate
>>
>> [1] NA NA 2 1
>>
>> or (equally useful to me)
>>
>> [1] 1 2 2 1
>>
>> The result should have the property that if result[i] == j, then v[i] ==
>> v[j], at least for i != j.
>>
>> Does this already exist somewhere, or is it easy to write?
>>
>> Duncan Murdoch
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
[[alternative HTML version deleted]]
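The property Duncan asks for (if result[i] == j, then v[i] == v[j]) can be checked mechanically. The sketch below is an illustration only; the helper name points_to_equal is hypothetical and not from the thread. It suggests why match(v, unique(v)) works for the four-element example but not in general: it returns positions in unique(v), not positions in v, whereas the tapply() result above does index into v.

```r
# Hypothetical helper (not from the thread): TRUE if every non-NA
# result[i] == j satisfies v[i] == v[j].
points_to_equal <- function(v, result) {
  ok <- !is.na(result)
  all(v[ok] == v[result[ok]])
}

v1 <- c("a", "b", "b", "a")
points_to_equal(v1, match(v1, unique(v1)))   # TRUE for this example

v2 <- letters[c(2, 2, 1, 2, 1, 1)]           # "b" "b" "a" "b" "a" "a"
u  <- match(v2, unique(v2))                  # 1 1 2 1 2 2: positions in unique(v2)
points_to_equal(v2, u)                       # FALSE: v2[3] is "a" but v2[2] is "b"

# The tapply() construction from the message above indexes into v2 itself.
wh <- tapply(seq_along(v2), factor(v2), `[`, 1)
w  <- wh[match(v2, v2[wh])]
points_to_equal(v2, as.vector(w))            # TRUE
```

The two approaches agree on Duncan's example only because the first occurrences of "a" and "b" there happen to sit at positions 1 and 2.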
"I'd like to see a cleverer solution that vectorizes..." and Herve provided it. Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) On Mon, Nov 12, 2018 at 9:43 PM Bert Gunter <bgunter.4567 at gmail.com> wrote:> It is not clear to what you want for the general case. Perhaps: > > > v <- letters[c(2,2,1,2,1,1)] > > wh <- tapply(seq_along(v),factor(v), '[',1) > > w <- wh[match(v,v[wh])] > > w > b b a b a a > 1 1 3 1 3 3 > > ## and if you want NA's for the first occurences of unique values > > ## of course: > > w[wh] <- NA > > w > b b a b a a > NA 1 NA 1 3 3 > > I'd like to see a cleverer solution that vectorizes and avoids the > tapply(), though. > > Cheers, > Bert > > > > > On Mon, Nov 12, 2018 at 8:33 PM Bert Gunter <bgunter.4567 at gmail.com> > wrote: > >> > match(v, unique(v)) >> [1] 1 2 2 1 >> >> Bert Gunter >> >> "The trouble with having an open mind is that people keep coming along >> and sticking things into it." >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) >> >> >> On Mon, Nov 12, 2018 at 5:08 PM Duncan Murdoch <murdoch.duncan at gmail.com> >> wrote: >> >>> The duplicated() function gives TRUE if an item in a vector (or row in a >>> matrix, etc.) is a duplicate of an earlier item. But what I would like >>> to know is which item does it duplicate? >>> >>> For example, >>> >>> v <- c("a", "b", "b", "a") >>> duplicated(v) >>> >>> returns >>> >>> [1] FALSE FALSE TRUE TRUE >>> >>> What I want is a fast way to calculate >>> >>> [1] NA NA 2 1 >>> >>> or (equally useful to me) >>> >>> [1] 1 2 2 1 >>> >>> The result should have the property that if result[i] == j, then v[i] =>>> v[j], at least for i != j. >>> >>> Does this already exist somewhere, or is it easy to write? 
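The vectorized solution Bert refers to is not reproduced in this part of the thread. As a hedged illustration only (not necessarily what Herve posted), one tapply()-free way to get both of the outputs Duncan asked for is to match the vector against itself, since match() returns the position of the first matching element. The function name first_occurrence and its na_for_first argument are hypothetical.

```r
# Hypothetical helper (name and argument are assumptions, not from the thread).
first_occurrence <- function(v, na_for_first = FALSE) {
  idx <- match(v, v)   # position of the first element equal to v[i]
  if (na_for_first) {
    idx[!duplicated(v)] <- NA_integer_   # blank out the first occurrences
  }
  idx
}

v <- c("a", "b", "b", "a")
first_occurrence(v)                        # 1 2 2 1
first_occurrence(v, na_for_first = TRUE)   # NA NA 2 1

v2 <- letters[c(2, 2, 1, 2, 1, 1)]
first_occurrence(v2)                       # 1 1 3 1 3 3
```

For v2 this gives the same values as the tapply() construction quoted above, without the names.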
Hi

similar result (with different numerical values) could be achieved by making v a factor.

> v <- letters[c(2,2,1,2,1,1)]
> vf <- factor(v)
> as.numeric(vf)
[1] 2 2 1 2 1 1

Cheers
Petr

> -----Original Message-----
> From: R-help <r-help-bounces at r-project.org> On Behalf Of Bert Gunter
> Sent: Tuesday, November 13, 2018 6:44 AM
> To: Duncan Murdoch <murdoch.duncan at gmail.com>
> Cc: R-help <R-help at r-project.org>
> Subject: Re: [R] which element is duplicated?
>
> [...]
>>>>> PIKAL Petr
>>>>> on Tue, 13 Nov 2018 08:42:22 +0000 writes:

> Hi
> similar result (with different numerical values) could
> be achieved by making v a factor.
>
> > v <- letters[c(2,2,1,2,1,1)]
> > vf<-factor(v)
> > as.numeric(vf)
> [1] 2 2 1 2 1 1
>
> Cheers
> Petr

Yes, as was already remarked by Michael Sumner.

But really the power is in match(): it is called *twice* by factor().

Martin
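To make the link between factor() and match() concrete, here is a small sketch using the same v as in Petr's message. It only illustrates that the integer codes of a factor are positions in its (sorted) levels, which is why they differ from the numbering produced by match(v, unique(v)); the variable names codes_factor and codes_match are made up for this example.

```r
v  <- letters[c(2, 2, 1, 2, 1, 1)]      # "b" "b" "a" "b" "a" "a"
vf <- factor(v)

codes_factor <- as.integer(vf)          # 2 2 1 2 1 1
codes_match  <- match(v, levels(vf))    # same codes, computed directly

identical(codes_factor, codes_match)    # TRUE

# Factor codes index into the sorted levels ("a", "b"), whereas
# match(v, unique(v)) numbers the values in order of first appearance.
match(v, unique(v))                     # 1 1 2 1 2 2
```

Neither numbering is a position in v itself, so both differ from the "index of the first occurrence" result Duncan asked about.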