thr3ads.net - R devel - [Rd] Ordering of values returned by unique [Sep 2004]

Witold Eryk Wolski

2004-Sep-29 17:17 UTC

[Rd] Ordering of values returned by unique

Hi,

Is the ordering of the returned values something I can rely on, 
a form of a standard, so that a function called unique in R (in future 
versions) will return the unique elements in order of their first occurrence?

 > x<-c(2,2,1,2)
 > unique(x)
[1] 2 1

It seems not to be the standard. E.g. Matlab:
 >> x=[2,2,1,2]
x =
     2     2     1     2
 >> unique(x)
ans =
     1     2

I only noticed it because the way it works now is extremely 
useful for some applications (e.g. tree traversal), so I use it in a 
script. But I am a little worried about whether I can rely on this behaviour 
in future versions. And furthermore, can I assume that someone reading the 
code will expect it to work that way?
Or is it better to define an additional function?
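keeporderunique <- function(x)
{
    # preallocate one slot per distinct value
    res <- rep(NA, length(unique(x)))
    count <- 0
    for (i in x)
    {
        # append i only the first time it is seen
        if (!i %in% res)
        {
            count <- count + 1
            res[count] <- i
        }
    }
    res
}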

/E



-- 
Dipl. bio-chem. Witold Eryk Wolski         
MPI-Moleculare Genetic
Ihnestrasse 63-73 14195 Berlin           _
tel: 0049-30-83875219                   'v'
http://www.molgen.mpg.de/~wolski       /   \
mail: witek96@users.sourceforge.net  ---W-W----
      wolski@molgen.mpg.de

Tony Plate

2004-Sep-29 18:09 UTC

[Rd] Ordering of values returned by unique

AFAIK, it has always worked that way in S-plus and R.  Furthermore, the 
documentation in R for 'unique' says that it removes duplicated 
elements.  This does seem to leave open the possibility that an element 
other than the first of a set of duplicates is retained, which could mess 
up the order.  However, the documentation for 'duplicated' is clearer: it 
says that 'duplicated' identifies duplicates of earlier elements.  Also, in 
the examples for 'duplicated', it says that x[!duplicated(x)] == unique(x) 
(paraphrased).

I depend on this all the time, so I also checked some references.  In the 
Blue book the documentation for the functions unique and duplicated is 
combined and implies the above.  In MASS 4th Ed, the page referred to by 
the index entry for 'unique' (p48, #9 in my copy) states that 'unique' 
removes duplicates as identified by 'duplicated', which implies that the 
order of retained elements is not changed.  The Green book has no index 
entry for 'unique'.  In S-plus the implementation of unique.default(x) 
uses x[!duplicated(x)].

So, I think the evidence is pretty strong that unique(x) will always return 
elements in the same order as they first appear in x.  But it would be nice 
if the documentation for 'unique' explicitly stated that this is the 
behavior for all methods.  (It does state this for the array method for 
'unique').
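
As a quick illustration of the equivalence mentioned above, here is a minimal sketch in R. The helper name first_occurrence_unique is made up for this example and is not part of R or S-plus; it just spells out the x[!duplicated(x)] idiom so the intent is visible to anyone reading the code.

```r
# Hypothetical helper (not a base R function): keep the first occurrence
# of each value, in the order in which the values first appear in x.
first_occurrence_unique <- function(x) x[!duplicated(x)]

x <- c(2, 2, 1, 2)

first_occurrence_unique(x)                          # 2 1
identical(unique(x), first_occurrence_unique(x))    # TRUE

# If sorted output (as in the Matlab session) is what is wanted,
# asking for it explicitly avoids relying on either convention.
sort(unique(x))                                     # 1 2
```

Using a named wrapper like this, or an explicit sort(), is one way to make the ordering assumption obvious to a reader who comes from a system where unique sorts its result.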

-- Tony Plate

>______________________________________________
>R-devel@stat.math.ethz.ch mailing list
>https://stat.ethz.ch/mailman/listinfo/r-devel
