thr3ads.net - R help - [R] Unique subsetting question [Sep 2010]

AndrewPage

2010-Sep-22 14:55 UTC

[R] Unique subsetting question

Hi all,

I'm looking at a large data set, and I'm interested in removing rows where
only one variable is duplicated.  Here's an example:

> presidents
     Qtr1 Qtr2 Qtr3 Qtr4
1945   NA   87   82   75
1946   63   50   43   32
1947   35   60   54   55
1948   36   39   NA   NA
1949   69   57   57   51
1950   45   37   46   39
1951   36   24   32   23
1952   25   32   NA   32
1953   59   74   75   60
1954   71   61   71   57
1955   71   68   79   73
1956   76   71   67   75
1957   79   62   63   57
1958   60   49   48   52
1959   57   62   61   66
1960   71   62   61   57
1961   72   83   71   78
1962   79   71   62   74
1963   76   64   62   57
1964   80   73   69   69
1965   71   64   69   62
1966   63   46   56   44
1967   44   52   38   46
1968   36   49   35   44
1969   59   65   65   56
1970   66   53   61   52
1971   51   48   54   49
1972   49   61   NA   NA
1973   68   44   40   27
1974   28   25   24   24

See how in 1954 and 1955, the Qtr1 approval rating is the same?  Let's say I
wanted to return the presidents data frame, but only have unique values for
Qtr1.  It doesn't matter which years are displayed for duplicated values; it
just matters that each value is not displayed more than once.  Any way I can
do this but still have it be a data frame that shows Qtr2, 3, and 4 values?

Thanks in advance,
Andrew
-- 
View this message in context:
http://r.789695.n4.nabble.com/Unique-subsetting-question-tp2550453p2550453.html
Sent from the R help mailing list archive at Nabble.com.

Ivan Calandra

2010-Sep-22 16:27 UTC

Hi,

Take a look at ?duplicated and ?unique
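
As a rough sketch of how ?duplicated applies to the question: the code below assumes the poster's data match R's built-in `presidents` series, which the printed table appears to be. That object is a quarterly time series rather than a data frame, so the reshaping step and the names `pres` and `pres_unique` are illustrative assumptions, not part of the original post.

```r
# `presidents` ships with base R as a quarterly time series (1945-1974),
# so first reshape it into a data frame with one row per year.
pres <- as.data.frame(
  matrix(as.numeric(presidents), ncol = 4, byrow = TRUE,
         dimnames = list(1945:1974, paste0("Qtr", 1:4)))
)

# duplicated() flags the second and later occurrences of each value,
# so negating it keeps the first row seen for every distinct Qtr1 rating.
# NA counts as a value here, so at most one row with a missing Qtr1 survives.
pres_unique <- pres[!duplicated(pres$Qtr1), ]

print(head(pres_unique))
```

Indexing rows with a logical vector keeps all four columns, so the result is still a data frame. By contrast, unique(pres$Qtr1) would return only the vector of distinct ratings.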

HTH,
Ivan

On 9/22/2010 16:55, AndrewPage wrote:
> Hi all,
>
> I'm looking at a large data set, and I'm interested in removing rows where
> only one variable is duplicated.  Here's an example:
>
>> presidents
>       Qtr1 Qtr2 Qtr3 Qtr4
> 1945   NA   87   82   75
> 1946   63   50   43   32
> 1947   35   60   54   55
> 1948   36   39   NA   NA
> 1949   69   57   57   51
> 1950   45   37   46   39
> 1951   36   24   32   23
> 1952   25   32   NA   32
> 1953   59   74   75   60
> 1954   71   61   71   57
> 1955   71   68   79   73
> 1956   76   71   67   75
> 1957   79   62   63   57
> 1958   60   49   48   52
> 1959   57   62   61   66
> 1960   71   62   61   57
> 1961   72   83   71   78
> 1962   79   71   62   74
> 1963   76   64   62   57
> 1964   80   73   69   69
> 1965   71   64   69   62
> 1966   63   46   56   44
> 1967   44   52   38   46
> 1968   36   49   35   44
> 1969   59   65   65   56
> 1970   66   53   61   52
> 1971   51   48   54   49
> 1972   49   61   NA   NA
> 1973   68   44   40   27
> 1974   28   25   24   24
>
> See how in 1954 and 1955, the Qtr1 approval rating is the same?  Let's say I
> wanted to return the presidents data frame, but only have unique values for
> Qtr1.  It doesn't matter which years are displayed for duplicated values; it
> just matters that each value is not displayed more than once.  Any way I can
> do this but still have it be a data frame that shows Qtr2, 3, and 4 values?
>
> Thanks in advance,
> Andrew
-- 
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calandra@uni-hamburg.de

**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php



Ista Zahn

2010-Sep-22 18:34 UTC

Hi Andrew,
Perhaps you did not notice my previous email. The answer is still the
same (see below):

On Wed, Sep 22, 2010 at 1:48 PM, AndrewPage <savejarvis at yahoo.com> wrote:
>
> How about this:
>
> s = c("aa", "bb", "cc", "", "aa", "dd", "", "aa")
> n = c(2, 3, 5, 6, 7, 8, 9, 3)
> b = c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE)
> df = data.frame(n, s, b)       # df is a data frame
>
> I want to display df with no value in s occurring more than once.

df <- df[!duplicated(df$s),]

> Also, I want to delete the rows where s contains "".

Same idea here:
df[df$s != "",]
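
Putting the two steps together, a minimal end-to-end sketch on the toy data from the quoted message might look like the following. The intermediate names `df_unique` and `df_clean` are introduced here for illustration only. The filter indexes with the column of the current data frame rather than the standalone vector `s`, so the logical index always has one entry per remaining row.

```r
# Toy data from the quoted message.
s <- c("aa", "bb", "cc", "", "aa", "dd", "", "aa")
n <- c(2, 3, 5, 6, 7, 8, 9, 3)
b <- c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE)
df <- data.frame(n, s, b)

# Step 1: keep only the first row for each distinct value of s.
df_unique <- df[!duplicated(df$s), ]

# Step 2: drop rows whose s is the empty string. The condition is built
# from df_unique$s so it lines up with the rows that remain after step 1.
df_clean <- df_unique[df_unique$s != "", ]

print(df_clean)
```

The order of the two steps does not change the final rows here, but filtering out the empty strings first would avoid treating "" as one of the values to deduplicate.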

-Ista

-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org
