G FANG
2010-Jun-20 23:46 UTC
[R] how to convert a set of strings to a list of unique numeric id?
Hi,

I have been a MATLAB user and am learning R.

I want to convert a large list of strings to a list of unique numeric ids to reduce storage space.

For example, there is a string list (with duplicates):

ABCDDDD
ACCDEDF
ACCGEDF
ACCGEGF
.....
ACCDEDF
ACCGEGF

and I want to have a corresponding numeric id list:

1
2
3
4
....
2
4

In MATLAB, the 'unique' function can do this in addition to giving the unique set, but in R, 'unique' only gives the unique set.

Please advise me on this.

Thanks,

Gang
Gabor Grothendieck
2010-Jun-20 23:58 UTC
[R] how to convert a set of strings to a list of unique numeric id?
Use a variable of class "factor":

> s <- c("ABCDDDD", "ACCDEDF", "ACCGEDF", "ACCGEGF", "ACCDEDF", "ACCGEGF")
> fs <- factor(s)
> levels(fs)
[1] "ABCDDDD" "ACCDEDF" "ACCGEDF" "ACCGEGF"
> unclass(fs)
[1] 1 2 3 4 2 4
attr(,"levels")
[1] "ABCDDDD" "ACCDEDF" "ACCGEDF" "ACCGEGF"
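As a small follow-up sketch (not part of the original reply), the factor approach can be reduced to a plain integer vector with as.integer(), and match() against unique() is one alternative if the ids should follow order of first appearance rather than the sorted order of the factor levels. The variable names ids_sorted, ids_first, u and restored are made up for illustration.

```r
s <- c("ABCDDDD", "ACCDEDF", "ACCGEDF", "ACCGEGF", "ACCDEDF", "ACCGEGF")

# Integer codes from a factor; factor() sorts its levels by default,
# so the ids follow the sorted order of the unique strings.
fs <- factor(s)
ids_sorted <- as.integer(fs)

# Alternative: number the strings by order of first appearance.
u <- unique(s)
ids_first <- match(s, u)

# Either id vector can be mapped back to the original strings.
restored <- u[ids_first]
```

For this particular input the two numberings coincide, because the strings happen to first appear in sorted order; in general they can differ, and levels(fs)[ids_sorted] or u[ids_first] is the matching lookup for each.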