Avi Gross
2021-Sep-15 04:39 UTC
[R] How to remove all rows that have a numeric in the first (or any) column
Calling something a data.frame does not make it a data.frame.
The abbreviated object shown below is a list of singletons. If it is a column in
a larger object that is a data.frame, then it is a list column, which is valid
but can be ticklish to handle in base R, and less so in the tidyverse.
For example, if I try to make a data.frame the normal way, each list element
becomes its own column and is recycled down every row. That is not what was
expected. I think some tidyverse functionality does better.
Like this:
library(tidyverse)
temp=list("Hello", 1, 1.1, "bye")
Now making a data.frame has an odd result:
> mydf=data.frame(alpha=1:4, beta=temp)
> mydf
alpha beta..Hello. beta.1 beta.1.1 beta..bye.
1 1 Hello 1 1.1 bye
2 2 Hello 1 1.1 bye
3 3 Hello 1 1.1 bye
4 4 Hello 1 1.1 bye
But a tibble handles it:
> mydf=tibble(alpha=1:4, beta=temp)
> mydf
# A tibble: 4 x 2
alpha beta
<int> <list>
1 1 <chr [1]>
2 2 <dbl [1]>
3 3 <dbl [1]>
4 4 <chr [1]>
So the data may well look like this, with a list column, but access can be
tricky, as subsetting a list with [] returns a list and you need [[]] to get at
the element itself.
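To make the [] versus [[]] point concrete, here is a minimal sketch in base R. It reuses the temp list from the example above; the names single and double are only illustrative.

```r
# Same shape as the `temp` list used earlier in this message
temp <- list("Hello", 1, 1.1, "bye")

single <- temp[2]     # single brackets: still a list, of length 1
double <- temp[[2]]   # double brackets: the element itself

class(single)         # "list"
class(double)         # "numeric"

is.numeric(single)    # FALSE, because a list is never numeric
is.numeric(double)    # TRUE
```

This is why a type test has to be run on each element, rather than on the column or on a [] subset of it.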
I found a somewhat odd solution like this:
mydf %>%
filter(!map_lgl(beta, is.numeric)) -> mydf2
# A tibble: 2 x 2
alpha beta
<int> <list>
1 1 <chr [1]>
2 4 <chr [1]>
When I saved that result into mydf2, I got this.
Original:
> str(mydf)
tibble [4 x 2] (S3: tbl_df/tbl/data.frame)
$ alpha: int [1:4] 1 2 3 4
$ beta :List of 4
..$ : chr "Hello"
..$ : num 1
..$ : num 1.1
..$ : chr "bye"
Output when any row with a numeric is removed:
> str(mydf2)
tibble [2 x 2] (S3: tbl_df/tbl/data.frame)
$ alpha: int [1:2] 1 4
$ beta :List of 2
..$ : chr "Hello"
..$ : chr "bye"
So if you try variations on your code motivated by what I show, good luck. I am
sure there are many better ways but I repeat, it can be tricky.
-----Original Message-----
From: R-help <r-help-bounces at r-project.org> On Behalf Of Jeff Newmiller
Sent: Tuesday, September 14, 2021 11:54 PM
To: Gregg Powell <g.a.powell at protonmail.com>
Cc: Gregg Powell via R-help <r-help at r-project.org>
Subject: Re: [R] How to remove all rows that have a numeric in the first (or
any) column
You cannot apply vectorized operators to list columns... you have to use a map
function like sapply or purrr::map_lgl to obtain a logical vector by running the
function once for each list element:
sapply( VPN_Sheet1$HVA, is.numeric )
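As a rough sketch of how that logical vector could be used to drop the rows, assuming a small stand-in data frame (sheet and its id column are hypothetical; only the HVA column name comes from the thread). It uses vapply rather than sapply, a swap made here only because vapply guarantees a logical vector back.

```r
# Hypothetical stand-in for VPN_Sheet1; only the column name HVA is from the thread
sheet <- data.frame(id = 1:4)
sheet$HVA <- list("Eloisa Libas", 1, "Percival Esquejo", 2)

# is.numeric() on the whole list returns one FALSE, so the negation is one TRUE.
# That single TRUE is recycled over all rows: nothing is removed, nothing errors.
whole <- !is.numeric(sheet$HVA)

# Test each element instead, once per list entry
is_num <- vapply(sheet$HVA, is.numeric, logical(1))

# Keep only the rows whose HVA entry is not numeric
kept <- sheet[!is_num, , drop = FALSE]
```

Note that the numbers here are stored as doubles, so is.integer would report FALSE for them even when applied element by element; is.numeric is the test that matches.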
On September 14, 2021 8:38:35 PM PDT, Gregg Powell <g.a.powell at protonmail.com> wrote:
>Here is the output:
>
>> str(VPN_Sheet1$HVA)
>List of 2174
> $ : chr "Email: fffd at fffffffffff.com"
> $ : num 1
> $ : chr "Eloisa Libas"
> $ : chr "Percival Esquejo"
> $ : chr "Louchelle Singh"
> $ : num 2
> $ : chr "Charisse Anne Tabarno, RN"
> $ : chr "Sol Amor Mucoy"
> $ : chr "Josan Moira Paler"
> $ : num 3
> $ : chr "Anna Katrina V. Alberto"
> $ : chr "Nenita Velarde"
> $ : chr "Eunice Arrances"
> $ : num 4
> $ : chr "Catherine Henson"
> $ : chr "Maria Carla Daya"
> $ : chr "Renee Ireine Alit"
> $ : num 5
> $ : chr "Marol Joseph Domingo - PS"
> $ : chr "Kissy Andrea Arriesgado"
> $ : chr "Pia B Baluyut, RN"
> $ : num 6
> $ : chr "Gladys Joy Tan"
> $ : chr "Frances Zarzua"
> $ : chr "Fairy Jane Nery"
> $ : num 7
> $ : chr "Gladys Tijam, RMT"
> $ : chr "Sarah Jane Aramburo"
> $ : chr "Eve Mendoza"
> $ : num 8
> $ : chr "Gloria Padolino"
> $ : chr "Joyce Pearl Javier"
> $ : chr "Ayza Padilla"
> $ : num 9
> $ : chr "Walfredson Calderon"
> $ : chr "Stephanie Anne Militante"
> $ : chr "Rennua Oquilan"
> $ : num 10
> $ : chr "Neil John Nery"
> $ : chr "Maria Reyna Reyes"
> $ : chr "Rowella Villegas"
> $ : num 11
> $ : chr "Katelyn Mendiola"
> $ : chr "Maria Riza Mariano"
> $ : chr "Marie Vallianne Carantes"
> $ : num 12
>
>------- Original Message -------
>
>On Tuesday, September 14th, 2021 at 8:32 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>
>> An atomic column of data by design has exactly one mode, so if any
>> values are non-numeric then the entire column will be non-numeric.
>> What does
>>
>> str(VPN_Sheet1$HVA)
>>
>> tell you? It is likely either a factor or character data.
>>
>> On September 14, 2021 7:01:53 PM PDT, Gregg Powell via R-help r-help at r-project.org wrote:
>>
>> > Stuck on this problem - How does one remove all rows in a dataframe
>> > that have a numeric in the first (or any) column?
>> >
>> > Seems straight forward - but I'm having trouble.
>> >
>> > I've attempted to used:
>> >
>> > VPN_Sheet1 <- VPN_Sheet1[!is.numeric(VPN_Sheet1$HVA),]
>> >
>> > and
>> >
>> > VPN_Sheet1 <- VPN_Sheet1[!is.integer(VPN_Sheet1$HVA),]
>> >
>> > Neither work - Neither throw an error.
>> >
>> > class(VPN_Sheet1$HVA) returns:
>> >
>> > [1] "list"
>> >
>> > So, the HVA column returns a list.
>> >
>> > Data looks like the attached screen grab -
>> >
>> > The ONLY rows I need to delete are the rows where there is a
>> > numeric in the HVA column.
>> >
>> > There are some 5000+ rows in the actual data.
>> >
>> > Would be grateful for a solution to this problem.
>> >
>> > How to get R to detect whether the value in column 1 is a number
>> > so the rows with the number values can be deleted?
>> >
>> > Thanks in advance to any and all willing to help on this problem.
>> >
>> > Gregg Powell
>> >
>> > Sierra Vista, AZ
>>
>> --
>>
>> Sent from my phone. Please excuse my brevity.
--
Sent from my phone. Please excuse my brevity.
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Andrew Simmons
2021-Sep-15 04:44 UTC
[R] How to remove all rows that have a numeric in the first (or any) column
I'd like to point out that base R can handle a list as a data frame column;
you just have to give the list class "AsIs" by wrapping it in I(). So in
your example
temp <- list("Hello", 1, 1.1, "bye")
data.frame(alpha = 1:4, beta = I(temp))
means that column "beta" will still be a list.
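Building on that, a minimal sketch of the row filter in base R alone, with no tidyverse. The names mydf, mydf2 and is_num follow the earlier message or are illustrative, and vapply is used here as one reasonable way to test each element.

```r
temp <- list("Hello", 1, 1.1, "bye")

# I() marks the list "AsIs", so data.frame() keeps it as a single list column
mydf <- data.frame(alpha = 1:4, beta = I(temp))

# One logical value per list element: TRUE where the entry is numeric
is_num <- vapply(mydf$beta, is.numeric, logical(1))

# Drop the rows whose beta entry is numeric
mydf2 <- mydf[!is_num, ]
```

After this, mydf2 holds only the rows with alpha equal to 1 and 4, and beta is still a list column.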