thr3ads.net - R help - [R] Select dataframe row containing a digit [Nov 2022]

Luigi Marongiu

2022-Nov-30 12:40 UTC

[R] Select dataframe row containing a digit

Hello,
I have a data frame in which some rows contain strings that include digits.
How do I select those rows and change their values?

In essence, I have a data frame with different values assigned to the
column "val". I am formatting everything to either "POS" or "NEG",
but values entered as numbers should get the value "NUM".
How do I change such values?

-- 
Best regards,
Luigi


```
df = data.frame(id = runif(10, 1, 100),
                val = c("", "POs", "Pos", "P", "Y",
                        "13.6", "Neg", "N", "0.5", "58.4"),
                stringsAsFactors = FALSE)
df$val[df$val == ""] = NA
df$val[df$val == "POs"] = "POS"
df$val[df$val == "Pos"] = "POS"
df$val[df$val == "P"] = "POS"
df$val[df$val == "Y"] = "POS"
df$val[df$val == "Neg"] = "NEG"
df$val[df$val == "N"] = "NEG"
```

Ivan Krylov

2022-Nov-30 13:02 UTC

On Wed, 30 Nov 2022 13:40:50 +0100
Luigi Marongiu <marongiu.luigi at gmail.com> wrote:
> I am formatting everything to either "POS" and "NEG",
> but values entered as number should get the value "NUM".
> How do I change such values?

Thanks for providing an example!

One idea would be to use a regular expression to locate numbers. For
example, grepl('[0-9]', df$val) will return a logical vector indexing
the rows containing digits. Alternatively, grepl('^[0-9.]+$', df$val,
perl = TRUE) will index all strings consisting solely of digits and
decimal separators.
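
As a minimal sketch of how the two patterns differ, the snippet below runs both on the example values. The vector name `val` and the extra mixed entry "P1" are assumptions added for illustration; they are not part of the original data.

```r
# Example values from the thread, plus a hypothetical mixed entry "P1"
val <- c("", "POs", "Pos", "P", "Y", "13.6", "Neg", "N", "0.5", "58.4", "P1")

# TRUE wherever the string contains at least one digit
has_digit <- grepl("[0-9]", val)

# TRUE only where the whole string consists of digits and dots
only_num <- grepl("^[0-9.]+$", val, perl = TRUE)

# Entries on which the two patterns disagree
val[has_digit & !only_num]
```

Note that the second pattern is still permissive: it would also accept strings such as "1.2.3" or a lone ".", so it may need tightening depending on the data.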

Another idea would be to parse all of the strings as numbers and filter
out those that didn't succeed. Use as.numeric() to perform the parsing,
suppressWarnings() to silence the messages telling you that the parsing
failed for some of the strings and is.na() to get the logical vector
indexing those entries that failed to parse.
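
The parsing approach could look roughly like this. It is only a sketch on a plain character vector; `val`, `parsed` and `is_num` are illustrative names, not taken from the original post.

```r
# Example values from the thread
val <- c("", "POs", "Pos", "P", "Y", "13.6", "Neg", "N", "0.5", "58.4")

# Strings that cannot be parsed as numbers become NA;
# suppressWarnings() silences the coercion warning
parsed <- suppressWarnings(as.numeric(val))

# TRUE for the entries that parsed successfully
is_num <- !is.na(parsed)

# Recode the numeric entries
val[is_num] <- "NUM"
val
```

Keep in mind that as.numeric() accepts more than plain decimals (for example "1e3"), so this treats such strings as numbers too.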

-- 
Best regards,
Ivan

Rui Barradas

2022-Nov-30 14:38 UTC

At 12:40 on 30/11/2022, Luigi Marongiu wrote:
> Hello,
> I have a data frame where some lines containing strings including digits.
> How do I select those rows and change their values?
>
> In essence, I have a data frame with different values assigned to the
> column "val". I am formatting everything to either "POS" and "NEG",
> but values entered as number should get the value "NUM".
> How do I change such values?
>
Hello,

Here is a way with grep.


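# values starting with P or Y (any case) become "POS"
i <- grep("^P|^Y", df$val, ignore.case = TRUE)
df$val[i] <- "POS"
# values starting with N (any case) become "NEG"
i <- grep("^N", df$val, ignore.case = TRUE)
df$val[i] <- "NEG"
# values containing at least one digit become "NUM"
i <- grep("\\d+", df$val)
df$val[i] <- "NUM"
# empty strings become NA
is.na(df$val) <- df$val == ""
df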


Hope this helps,

Rui Barradas

Rolf Turner

2022-Nov-30 19:59 UTC

On Wed, 30 Nov 2022 13:40:50 +0100
Luigi Marongiu <marongiu.luigi at gmail.com> wrote:
> Hello,
> I have a data frame where some lines containing strings including
> digits. How do I select those rows and change their values?
> 
> In essence, I have a data frame with different values assigned to the
> column "val". I am formatting everything to either "POS" and "NEG",
> but values entered as number should get the value "NUM".
> How do I change such values?
> 
What I do in such circumstances:

suppressWarnings(X$val[!is.na(as.numeric(X$val))] <- "NUM")

The "suppressWarnings()" bit is just included due to my OCD.

This avoids fooling about with regular expressions, which always
requires a huge amount of trial and error, and a great diminishment of
the amount of hair on one's head (as a result of tearing out).

Note that I have changed the name of your data frame from "df" to "X",
since df() is a built-in R function (density of the F-distribution).
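
A small sketch of what the clash looks like in practice; the two-row data frame here is made up for illustration. Nothing breaks outright, because R skips non-function objects when a name is used in a call, but the shared name can make code and error messages confusing.

```r
# stats::df is the density function of the F-distribution
is.function(stats::df)

# A hypothetical data frame stored under the same name
df <- data.frame(val = c("Pos", "13.6"), stringsAsFactors = FALSE)

# Plain lookups of `df` now find the data frame ...
is.data.frame(df)

# ... while a call to df() still reaches the F density function
df(1, df1 = 2, df2 = 3)
```

If the data frame is never created (say, because an earlier line failed), `df$val` will try to subset the function instead, and the resulting error message says nothing about a missing data frame.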

See fortunes::fortune("might clash").

cheers,

Rolf Turner

-- 
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

Luigi Marongiu

2022-Dec-01 11:25 UTC

Thank you, those are all viable solutions.
Regards
Luigi

-- 
Best regards,
Luigi
