You may need a few more steps than that, Val.
I commend you for stating your need clearly, showing a reasonable set of test
data, and spelling out the expected result.
If your data is polluted the way you describe, then read.table() likely will
treat those columns as character and not numeric. In your example you want to
recognize "13X" as having an X. Similarly "3BC" has a B.
Those two columns can be handled by the same technique and later made numeric.
You seem to want any numerals in the first column to disqualify it too.
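If you want to check what read.table() actually did with the columns, something
along these lines may help. It is only a sketch: strip.white=TRUE is my
addition, not part of your original call, so that the padding after the commas
does not end up inside the values.

```r
# Sample data from the original post; strip.white = TRUE is an added assumption
dat1 <- read.table(text = "Name, Age, Weight
Alex, 20, 13X
Bob, 25, 142
Carol, 24, 120
John, 3BC, 175
Katy, 35, 160
Jack3, 34, 140", sep = ",", header = TRUE,
                   stringsAsFactors = FALSE, strip.white = TRUE)

# Class of each column; Age and Weight are not numeric because of "3BC" and "13X"
col_classes <- sapply(dat1, class)
print(col_classes)
```

All three columns should come back as character, which is why the numeric
conversion has to wait until the bad rows are gone.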
So consider what techniques you have learned and thus what you are allowed to do
for such an assignment. Unless we know otherwise, we may assume this is homework
of some sort.
We, reading this, have no idea what parts of basic R you can use and I hope
nobody jumps in offering tidyverse packages.
So ask yourself how to use one of the dozens of ways to make a copy of your data
that includes only rows where column 1 follows the rule of containing no digits
between 0 and 9. You can use things that, say, count characters of some kind and
compare the count to the length of the item, for example. You might use regular
expressions. Whatever you do should remove your sixth row in the example and
nothing else.
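For illustration only, here are two of the many possible ways to do that first
step in base R, assuming the columns hold clean character values with no stray
padding. The names has_digit, all_letters and step1 are just ones I made up,
and the counting version assumes names use only the plain letters a-z and A-Z.

```r
# Sample data rebuilt by hand so the sketch runs on its own
dat1 <- data.frame(
  Name   = c("Alex", "Bob", "Carol", "John", "Katy", "Jack3"),
  Age    = c("20", "25", "24", "3BC", "35", "34"),
  Weight = c("13X", "142", "120", "175", "160", "140"),
  stringsAsFactors = FALSE
)

# Regular-expression route: TRUE where Name contains at least one digit
has_digit <- grepl("[0-9]", dat1$Name)

# Counting route: count the letters in each name and compare to its length
n_letters <- sapply(strsplit(dat1$Name, ""),
                    function(ch) sum(ch %in% c(letters, LETTERS)))
all_letters <- n_letters == nchar(dat1$Name)

# Keep only rows whose Name has no digit; this drops the sixth row (Jack3)
step1 <- dat1[!has_digit, ]
print(step1)
```

On this data both routes flag the same single row, so either one would do.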
Can you now take the result and shorten it by removing any row whose column 2
value contains one or more letters, using some new technique that detects them?
An example might be to try converting the value to an integer and back to
character and seeing if they match. Again, lots of possibilities but you need
only one that works.
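The integer round trip might look roughly like the sketch below. It starts from
a hand-built copy of the data with the Jack3 row already gone, and the names
age_int, round_trip_ok and step2 are my own. Note that this test is strict: a
value such as "020" or " 20" would also fail it, which may or may not be what
you want.

```r
# Data as it would look after the Name filter (assumed, built by hand)
step1 <- data.frame(
  Name   = c("Alex", "Bob", "Carol", "John", "Katy"),
  Age    = c("20", "25", "24", "3BC", "35"),
  Weight = c("13X", "142", "120", "175", "160"),
  stringsAsFactors = FALSE
)

# as.integer() returns NA (with a warning) for values like "3BC"
age_int <- suppressWarnings(as.integer(step1$Age))

# A value passes only if it converts and converts back to the same text
round_trip_ok <- !is.na(age_int) & as.character(age_int) == step1$Age

# Keep the rows that pass; this drops the John row
step2 <- step1[round_trip_ok, ]
print(step2)
```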
Can you take that shorter version and repeat pretty much the same filter on
column 3?
That should work and, if you are ambitious, you can even find a way to create a
compound filter that does all three columns at once.
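One possible shape for such a compound filter, again as a sketch in base R and
assuming clean character columns, is shown here. The names keep and result are
hypothetical, and requiring Age and Weight to be digits only is my reading of
the rule, so decimals or negative numbers would be rejected too.

```r
# Sample data rebuilt by hand so the sketch runs on its own
dat1 <- data.frame(
  Name   = c("Alex", "Bob", "Carol", "John", "Katy", "Jack3"),
  Age    = c("20", "25", "24", "3BC", "35", "34"),
  Weight = c("13X", "142", "120", "175", "160", "140"),
  stringsAsFactors = FALSE
)

# One logical vector combining all three rules
keep <- !grepl("[0-9]", dat1$Name) &      # no digit anywhere in Name
        grepl("^[0-9]+$", dat1$Age) &     # Age is digits only
        grepl("^[0-9]+$", dat1$Weight)    # Weight is digits only

result <- dat1[keep, ]

# Now that the bad rows are gone, the two columns can be made numeric
result$Age    <- as.integer(result$Age)
result$Weight <- as.integer(result$Weight)
rownames(result) <- NULL
print(result)
```

On the sample data this leaves the Bob, Carol and Katy rows, matching the
desired output in the original post.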
-----Original Message-----
From: Val <valkremk at gmail.com>
To: r-help at R-project.org (r-help at r-project.org) <r-help at r-project.org>
Sent: Fri, Jan 28, 2022 10:08 pm
Subject: [R] Row exclude
Hi All,
I want to remove rows that contain a character string in an integer
column or a digit in a character column.
Sample data
dat1 <-read.table(text="Name, Age, Weight
Alex, 20, 13X
Bob, 25, 142
Carol, 24, 120
John, 3BC, 175
Katy, 35, 160
Jack3, 34, 140",sep=",",header=TRUE,stringsAsFactors=F)
If the Age/Weight column contains any character(s) then remove that row;
if the Name column contains a digit then remove that row.
Desired output
   Name Age weight
1   Bob  25    142
2 Carol  24    120
3  Katy  35    160
Thank you,