thr3ads.net - R help - [R] How to delete rows [Jul 2005]


Michael Graber

2005-Jul-27 16:43 UTC

[R] How to delete rows

Dear R-users,

I am very new to R, so maybe my question is very easy to answer.
I have the following table:
TAB1<-data.frame(Name,Number), "Name" and "Number" are all character strings,
it looks like this:

Name  Number
ab      2
ab      2
NA     15
NA     15
NA     15
cd      3
ef      1
NA     15
NA     15
gh     15
gh     15

I want to delete all the rows which begin with "NA"
and all the rows where names are duplicates
(for example the second row).
I have tried this, but I only get numbers:

for (i in 1:ZeileMax) {
  if (TAB1[[1]][i] != "NA") {
    cat(TAB1[[1]][i], file = "Name.txt", fill = TRUE, append = TRUE, sep = "")
    cat(TAB1[[2]][i], file = "Number.txt", fill = TRUE, append = TRUE, sep = "")
  }
}
Name <- readLines("Name.txt")
Number <- readLines("Number.txt")
TAB <- data.frame(Name, Number)

 
Thanks in advance,

 

Michael Graber

Prof Brian Ripley

2005-Jul-27 17:02 UTC


To delete duplicate rows, use unique(TAB1): see its help page.

It looks to me as if the names are missing values (NA) and do *not* start
with the string "NA".  If so, you want to use

TAB1[!is.na(TAB1$Name), ]

Otherwise, perhaps TAB1[substr(TAB1$Name, 1, 2) != "NA", ].
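
As a small illustration of the first two suggestions, here is a sketch that rebuilds the table from the original post. It assumes the names really are missing values (NA) and not the string "NA"; the values are copied from the post, and stringsAsFactors = FALSE is an assumption made here to keep both columns as character strings.

```r
# Rebuild the table from the original post (assumption: Name holds true NA
# values, and both columns are kept as character strings).
TAB1 <- data.frame(
  Name   = c("ab", "ab", NA, NA, NA, "cd", "ef", NA, NA, "gh", "gh"),
  Number = c("2", "2", "15", "15", "15", "3", "1", "15", "15", "15", "15"),
  stringsAsFactors = FALSE
)

# unique() drops rows that repeat an earlier row in every column.
no_dups <- unique(TAB1)

# is.na() detects missing values, so this keeps only rows with a Name.
no_na <- TAB1[!is.na(TAB1$Name), ]

# The string "NA" is ordinary text, not a missing value.
is.na("NA")   # FALSE
is.na(NA)     # TRUE
```

Note that unique() compares whole rows, so two rows with the same Name but a different Number would both be kept.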

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Barry Rowlingson

2005-Jul-27 17:05 UTC


Michael Graber wrote:
> Dear R-users,
>
> I am very new to R, so maybe my question is very easy to answer.
> I have the following table:
> TAB1<-data.frame(Name,Number), "Name" and "Number" are all character
> strings,
> it looks like this:
>
> Name  Number
> ab      2
  [etc]
> gh     15
> gh     15
>
> for (i in 1:ZeileMax )  {if ( TAB1[[1]] [i] != "NA" )
> {cat(TAB1[[1]][i],file = "Name.txt",fill= TRUE,append = TRUE ,sep = "");
> cat(TAB1[[2]][i], file="Number.txt", fill=TRUE,append=TRUE, sep="")}}
> Name<-readLines("Name.txt")
> Number<-readLines("Number.txt")
> TAB<-data.frame(Name,Number)
  I'm not going to bother working out why that fails!
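
  For what it's worth, one plausible explanation (an assumption, since the post does not show how Name was stored): data.frame() in R versions before 4.0.0 converted character columns to factors by default, and cat() writes a factor's underlying integer codes, not its labels. A minimal sketch, with values taken from the first rows of the table:

```r
# Assumption: Name ended up as a factor, as older versions of data.frame()
# would have made it by default.
Name <- factor(c("ab", "ab", "cd"))

# cat() prints the integer code behind the factor level ("cd" is level 2).
as_written <- capture.output(cat(Name[3]))

# Converting to character first prints the label itself.
as_intended <- capture.output(cat(as.character(Name[3])))

as_written    # "2"
as_intended   # "cd"
```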

  The following assumes you want to keep one of any row that has a 
duplicated Name, in this case the first instance. I think your mail was 
a bit ambiguous as to whether you wanted to delete all rows with a 
duplicate Name...
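
  If the other reading was intended, that is, dropping every row whose Name occurs more than once, a sketch along these lines might work. It assumes the names are the string "NA" (not missing values) and that both columns are character strings; the fromLast argument of duplicated() is what flags the earlier copies as well.

```r
# Table from the original post (assumption: "NA" is stored as a string).
TAB1 <- data.frame(
  Name   = c("ab", "ab", "NA", "NA", "NA", "cd", "ef", "NA", "NA", "gh", "gh"),
  Number = c("2", "2", "15", "15", "15", "3", "1", "15", "15", "15", "15"),
  stringsAsFactors = FALSE
)

# Drop the "NA" rows first.
not_na <- TAB1[TAB1$Name != "NA", ]

# TRUE for every row whose Name appears more than once, first copy included.
repeated <- duplicated(not_na$Name) | duplicated(not_na$Name, fromLast = TRUE)

# Keep only the rows whose Name occurs exactly once.
TAB_strict <- not_na[!repeated, ]
```

With this data only the "cd" and "ef" rows survive, since "ab" and "gh" each occur twice.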

  You can do it in two lines. First select the rows that don't have
Name=="NA", and then select the rows that don't have a duplicated Name:

  > TAB <- TAB1[TAB1$Name!="NA",]
  > TAB <- TAB[!duplicated(TAB$Name),]

  > TAB
    Name Number
1    ab      2
6    cd      3
7    ef      1
10   gh     15

  Or you can do it in one line:

  > TAB=TAB1[!duplicated(TAB1$Name) & TAB1$Name!="NA",]
  > TAB

    Name Number
1    ab      2
6    cd      3
7    ef      1
10   gh     15

  Don't think of it as deleting rows: you are selecting the rows you want
and creating a new data frame.
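
  The same selection can be sketched for the case where the names are true missing values (NA) and not the string "NA". Comparing NA with != yields NA, not TRUE or FALSE, so is.na() is used for the test. The data is copied from the original post; stringsAsFactors = FALSE is an assumption made here.

```r
# Table from the original post (assumption: Name holds true NA values).
TAB1 <- data.frame(
  Name   = c("ab", "ab", NA, NA, NA, "cd", "ef", NA, NA, "gh", "gh"),
  Number = c("2", "2", "15", "15", "15", "3", "1", "15", "15", "15", "15"),
  stringsAsFactors = FALSE
)

# Logical vector marking the rows to keep: a Name is present and it is
# the first occurrence of that Name.
keep <- !is.na(TAB1$Name) & !duplicated(TAB1$Name)

# Selecting with the logical vector builds the new data frame.
TAB <- TAB1[keep, ]
TAB
```

The original row numbers (1, 6, 7, 10) are kept as row names, which makes it easy to see which rows were selected.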

  Any simple intro to R (see www.r-project.org for plenty) will have
examples on selecting rows and columns.

Baz
