Jeff Newmiller
2021-Sep-15 03:54 UTC
[R] How to remove all rows that have a numeric in the first (or any) column
You cannot apply vectorized operators to list columns... you have to use a map function like sapply or purrr::map_lgl to obtain a logical vector by running the function once for each list element:

sapply( VPN_Sheet1$HVA, is.numeric )

On September 14, 2021 8:38:35 PM PDT, Gregg Powell <g.a.powell at protonmail.com> wrote:
>Here is the output:
>
>> str(VPN_Sheet1$HVA)
>List of 2174
> $ : chr "Email: fffd at fffffffffff.com"
> $ : num 1
> $ : chr "Eloisa Libas"
> $ : chr "Percival Esquejo"
> $ : chr "Louchelle Singh"
> $ : num 2
> $ : chr "Charisse Anne Tabarno, RN"
> $ : chr "Sol Amor Mucoy"
> $ : chr "Josan Moira Paler"
> $ : num 3
> $ : chr "Anna Katrina V. Alberto"
> $ : chr "Nenita Velarde"
> $ : chr "Eunice Arrances"
> $ : num 4
> $ : chr "Catherine Henson"
> $ : chr "Maria Carla Daya"
> $ : chr "Renee Ireine Alit"
> $ : num 5
> $ : chr "Marol Joseph Domingo - PS"
> $ : chr "Kissy Andrea Arriesgado"
> $ : chr "Pia B Baluyut, RN"
> $ : num 6
> $ : chr "Gladys Joy Tan"
> $ : chr "Frances Zarzua"
> $ : chr "Fairy Jane Nery"
> $ : num 7
> $ : chr "Gladys Tijam, RMT"
> $ : chr "Sarah Jane Aramburo"
> $ : chr "Eve Mendoza"
> $ : num 8
> $ : chr "Gloria Padolino"
> $ : chr "Joyce Pearl Javier"
> $ : chr "Ayza Padilla"
> $ : num 9
> $ : chr "Walfredson Calderon"
> $ : chr "Stephanie Anne Militante"
> $ : chr "Rennua Oquilan"
> $ : num 10
> $ : chr "Neil John Nery"
> $ : chr "Maria Reyna Reyes"
> $ : chr "Rowella Villegas"
> $ : num 11
> $ : chr "Katelyn Mendiola"
> $ : chr "Maria Riza Mariano"
> $ : chr "Marie Vallianne Carantes"
> $ : num 12
>
>------- Original Message -------
>
>On Tuesday, September 14th, 2021 at 8:32 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>
>> An atomic column of data by design has exactly one mode, so if any values are non-numeric then the entire column will be non-numeric. What does
>>
>> str(VPN_Sheet1$HVA)
>>
>> tell you? It is likely either a factor or character data.
>>
>> On September 14, 2021 7:01:53 PM PDT, Gregg Powell via R-help r-help at r-project.org wrote:
>>
>> > Stuck on this problem - How does one remove all rows in a dataframe that have a numeric in the first (or any) column?
>> >
>> > Seems straight forward - but I'm having trouble.
>> >
>> > I've attempted to used:
>> >
>> > VPN_Sheet1 <- VPN_Sheet1[!is.numeric(VPN_Sheet1$HVA),]
>> >
>> > and
>> >
>> > VPN_Sheet1 <- VPN_Sheet1[!is.integer(VPN_Sheet1$HVA),]
>> >
>> > Neither work - Neither throw an error.
>> >
>> > class(VPN_Sheet1$HVA) returns:
>> >
>> > [1] "list"
>> >
>> > So, the HVA column returns a list.
>> >
>> > Data looks like the attached screen grab -
>> >
>> > The ONLY rows I need to delete are the rows where there is a numeric in the HVA column.
>> >
>> > There are some 5000+ rows in the actual data.
>> >
>> > Would be grateful for a solution to this problem.
>> >
>> > How to get R to detect whether the value in column 1 is a number so the rows with the number values can be deleted?
>> >
>> > Thanks in advance to any and all willing to help on this problem.
>> >
>> > Gregg Powell
>> >
>> > Sierra Vista, AZ
>>
>> --
>>
>> Sent from my phone. Please excuse my brevity.

--
Sent from my phone. Please excuse my brevity.
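For readers following the thread, here is a minimal base R sketch of why the two attempts quoted above silently keep every row, and of the per-element test suggested in the reply. The data frame `sheet` and its values are hypothetical stand-ins for VPN_Sheet1, and vapply is swapped in for sapply only because it guarantees a logical vector; sapply would work the same way here.

```r
# Hypothetical stand-in for VPN_Sheet1: the HVA column is a list
# that mixes character and numeric elements
sheet <- data.frame(id = 1:5)
sheet$HVA <- list("Email: someone at example.com", 1, "Name A", "Name B", 2)

# is.numeric() tests the column object as a whole. A list is not numeric,
# so the result is a single FALSE; !FALSE is a single TRUE, which is
# recycled over all rows. Nothing is dropped and no error is raised.
whole_column <- is.numeric(sheet$HVA)
unchanged <- sheet[!whole_column, , drop = FALSE]

# Test each list element separately to get one logical value per row
is_num <- vapply(sheet$HVA, is.numeric, logical(1))

# Keep only the rows whose HVA element is not numeric
cleaned <- sheet[!is_num, , drop = FALSE]
```

Applied to the real data, the same pattern would presumably read VPN_Sheet1[!sapply(VPN_Sheet1$HVA, is.numeric), ], but that has not been run against the poster's data.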
Avi Gross
2021-Sep-15 04:39 UTC
[R] How to remove all rows that have a numeric in the first (or any) column
Calling something a data.frame does not make it a data.frame.

The abbreviated object shown below is a list of singletons. If it is a column in a larger object that is a data.frame, then it is a list column, which is valid but can be ticklish to handle within base R and less so in the tidyverse.

For example, if I try to make a data.frame the normal way, the list gets made into multiple columns and copied to each row. Not what was expected. I think some tidyverse functionality does better.

Like this:

library(tidyverse)
temp=list("Hello", 1, 1.1, "bye")

Now making a data.frame has an odd result:

> mydf=data.frame(alpha=1:4, beta=temp)
> mydf
  alpha beta..Hello. beta.1 beta.1.1 beta..bye.
1     1        Hello      1      1.1        bye
2     2        Hello      1      1.1        bye
3     3        Hello      1      1.1        bye
4     4        Hello      1      1.1        bye

But a tibble handles it:

> mydf=tibble(alpha=1:4, beta=temp)
> mydf
# A tibble: 4 x 2
  alpha beta
  <int> <list>
1     1 <chr [1]>
2     2 <dbl [1]>
3     3 <dbl [1]>
4     4 <chr [1]>

So if the data does look like this, with a list column, access can be tricky, as subsetting a list with [] returns a list and you need [[]] to get at the element itself.

I found a somewhat odd solution like this:

mydf %>% filter(!map_lgl(beta, is.numeric)) -> mydf2

# A tibble: 2 x 2
  alpha beta
  <int> <list>
1     1 <chr [1]>
2     4 <chr [1]>

When I saved that result into mydf2, I got this.

Original:

> str(mydf)
tibble [4 x 2] (S3: tbl_df/tbl/data.frame)
 $ alpha: int [1:4] 1 2 3 4
 $ beta :List of 4
  ..$ : chr "Hello"
  ..$ : num 1
  ..$ : num 1.1
  ..$ : chr "bye"

Output when any row with a numeric is removed:

> str(mydf2)
tibble [2 x 2] (S3: tbl_df/tbl/data.frame)
 $ alpha: int [1:2] 1 4
 $ beta :List of 2
  ..$ : chr "Hello"
  ..$ : chr "bye"

So if you try variations on your code motivated by what I show, good luck. I am sure there are many better ways but I repeat, it can be tricky.
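The same toy example can also be sketched without the tidyverse. This is only an illustration under stated assumptions, not the approach used in the post: it wraps the list in base R's I() so that data.frame() keeps it as one list column, and the names `mydf_base`, `keep`, and `mydf2_base` are made up for the sketch.

```r
# Same toy data as in the post above
temp <- list("Hello", 1, 1.1, "bye")

# I() marks the list "as is", so data.frame() stores it as a single
# list column instead of spreading its elements across four columns
mydf_base <- data.frame(alpha = 1:4, beta = I(temp))

# One logical value per row: TRUE where the element is not numeric
keep <- !vapply(mydf_base$beta, is.numeric, logical(1))

# Drop the rows whose beta element is numeric
mydf2_base <- mydf_base[keep, , drop = FALSE]

# [ ] on a list column returns a list; [[ ]] returns the element itself
first_as_list <- mydf_base$beta[1]
first_value <- mydf_base$beta[[1]]
```

The last two lines illustrate the [] versus [[]] distinction mentioned above: the first is still a list of length one, the second is the character string stored in row 1.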