R help - Aug 2012 - return first index for each unique value in a vector


Bronwyn Rayfield

2012-Aug-28 19:58 UTC

[R] return first index for each unique value in a vector

I would like to efficiently find the first index of each unique value in a
very large vector.

For example, if I have a vector

A<-c(9,2,9,5)

I would like to return not only the unique values (2,5,9) but also their
first indices (2,4,1).

I tried using a for loop with which(A==unique(A)[i])[1] to find the first
index of each unique value but it is very slow.

What I am trying to do is easily and quickly done with the "unique"
function in MATLAB (see
mathworks.com/help/techdoc/ref/unique.html).

Thank you for your help,
Bronwyn


R. Michael Weylandt

2012-Aug-28 22:32 UTC

[R] return first index for each unique value in a vector

On Tue, Aug 28, 2012 at 2:58 PM, Bronwyn Rayfield
<bronwynrayfield at gmail.com> wrote:
> I would like to efficiently find the first index of each unique value in a
> very large vector.
>
> For example, if I have a vector
>
> A<-c(9,2,9,5)
>
> I would like to return not only the unique values (2,5,9) but also their
> first indices (2,4,1).
>
> I tried using a for loop with which(A==unique(A)[i])[1] to find the first
> index of each unique value but it is very slow.
You'll get marginally more speed from which.max(), but I'm sure there's
a better way. I'll write if I can think of it.

Michael
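
For reference, the loop described in the question and the which.max() variant might look like the sketch below. The function names first_idx_loop and first_idx_whichmax are made up for illustration and do not appear in the thread. Both still scan the whole vector once per unique value, which is why neither scales well to very large inputs.

```r
# Illustrative only: the two loop-based approaches discussed above.
A <- c(9, 2, 9, 5)

# Loop with which(...)[1]: builds the full index vector, then keeps the first.
first_idx_loop <- function(x) {
  u <- unique(x)
  idx <- integer(length(u))
  for (i in seq_along(u)) {
    idx[i] <- which(x == u[i])[1]
  }
  idx
}

# which.max() on a logical vector returns the position of the first TRUE,
# so it avoids materialising every matching index.
first_idx_whichmax <- function(x) {
  u <- unique(x)
  vapply(u, function(v) which.max(x == v), integer(1))
}

first_idx_loop(A)      # indices in order of first appearance: 1 2 4
first_idx_whichmax(A)  # same result
```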

Noia Raindrops

2012-Aug-28 22:49 UTC

[R] return first index for each unique value in a vector

Hi,

Try this:
order(A)[!duplicated(sort(A))]

-- 
Noia Raindrops
noia.raindrops at gmail.com
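
To see why the one-liner above works: order(A) lists the positions of A in ascending order of value, and ties are left in their original order, so the first position listed for each value is its first occurrence. !duplicated(sort(A)) then marks the start of each run of equal values. A step-by-step sketch on the example vector (the intermediate variable names are only for illustration):

```r
A <- c(9, 2, 9, 5)

o <- order(A)             # positions sorted by value: 2 4 1 3
s <- sort(A)              # values in ascending order: 2 5 9 9
first <- !duplicated(s)   # TRUE at the start of each run of equal values

first_index  <- o[first]  # 2 4 1
unique_value <- s[first]  # 2 5 9
```

Unlike the duplicated()-only approaches, this returns the values already sorted, which matches the output requested in the question.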

arun

2012-Aug-28 23:21 UTC

[R] return first index for each unique value in a vector

HI,

I was thinking about duplicated(). But Bert already posted the solution. The
solution below is not very efficient.
A<-c(9,2,9,5)
unik<-as.numeric(names(table(A)))
match(unik,A)
#[1] 2 4 1

#Bert's solution wins here.
system.time({
set.seed(1)
A<-sample(1:5,1e6,replace=TRUE)
unik <- !duplicated(A) ## logical vector of unique values
seq_along(A)[unik] ## indices
A[unik]})
   user  system elapsed 
  0.040   0.016   0.056 
#My solution
system.time({
set.seed(1)
A<-sample(1:5,1e6,replace=TRUE)
#unik<-as.numeric(names(table(A)))
match(as.numeric(names(table(A))),A)})
   user  system elapsed 
  0.344   0.036   0.383 
#Robert's solution
system.time({
set.seed(1)
A<-sample(1:5,1e6,replace=TRUE)
as.numeric(rownames(unique(data.frame(A)[1])))})
   user  system elapsed 
  0.056   0.012   0.069 
A.K.
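
Note that each timing above also includes the cost of set.seed() and sample(). A sketch of a comparison that generates the data once and times only the lookup, assuming the same simulated vector; timings vary by machine, so none are shown here:

```r
set.seed(1)
A <- sample(1:5, 1e6, replace = TRUE)

# Time only the lookup itself, not the data generation.
t_dup   <- system.time(i_dup   <- which(!duplicated(A)))
t_table <- system.time(i_table <- match(as.numeric(names(table(A))), A))

# The two approaches return the same indices in different orders:
# duplicated() follows order of appearance, table() sorts the values.
stopifnot(identical(sort(i_dup), sort(i_table)))

# Elapsed seconds for each approach.
c(duplicated = t_dup[["elapsed"]], table_match = t_table[["elapsed"]])
```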






arun

2012-Aug-28 23:33 UTC

[R] return first index for each unique value in a vector

HI,
Replacing seq_along() with which() slightly improved CPU time.

system.time({
 set.seed(1)
 A<-sample(1:5,1e6,replace=TRUE)
 which(!duplicated(A))
 A[which(!duplicated(A))]
 })
#   user  system elapsed 
#  0.040   0.012   0.052
A.K.
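
The timed block above evaluates !duplicated(A) twice. A small variation that computes the logical mask once and reuses it for both the indices and the values (variable names are illustrative, not from the original post):

```r
set.seed(1)
A <- sample(1:5, 1e6, replace = TRUE)

keep <- !duplicated(A)       # TRUE at the first occurrence of each value
first_index  <- which(keep)  # positions of those first occurrences
unique_value <- A[keep]      # the values themselves, in order of appearance
```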




William Dunlap

2012-Aug-29 03:22 UTC

[R] return first index for each unique value in a vector

Here are two methods:
> A<-c(9,2,9,5)
> f1 <- function(x) { d <- !duplicated(x) ; data.frame(uniqueValue=x[d], firstIndex=which(d)) }
> f2 <- function(x) { u <- unique(x) ; data.frame(uniqueValue=u, firstIndex=match(u, x))}
> f1(A)
  uniqueValue firstIndex
1           9          1
2           2          2
3           5          4
> identical(f1(A), f2(A))
[1] TRUE
> A6 <- sample(1e6, size=5e5, replace=TRUE)
> system.time(z1 <- f1(A6))
   user  system elapsed 
   0.25    0.02    0.27 
> system.time(z2 <- f2(A6))
   user  system elapsed 
   0.09    0.02    0.11 
> identical(z1, z2)
[1] TRUE

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
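
Both functions above report values in order of first appearance (9, 2, 5), whereas the question asked for them sorted (2, 5, 9). A minimal sketch of a sorted variant built on the match() approach; the name f2_sorted is hypothetical and not part of the original post, and sort() drops NA values by default, so this assumes the input has none:

```r
f2_sorted <- function(x) {
  u <- sort(unique(x))                   # unique values in ascending order
  data.frame(uniqueValue = u,
             firstIndex  = match(u, x))  # first position of each value in x
}

A <- c(9, 2, 9, 5)
res <- f2_sorted(A)
res  # uniqueValue 2 5 9 with firstIndex 2 4 1
```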

