Hello Bert and Avi,
Sorry, that was a typo. It should be:
for (i in colnames(df)){
  ......
}
Below is the code I'm currently using:
try2.un$ab2 <-
  ifelse(grepl("ab2",try2.un$data1), try2.un$data1,
    ifelse(grepl("ab2",try2.un$data2), try2.un$data2,
      ifelse(grepl("ab2",try2.un$data3), try2.un$data3,
        ifelse(grepl("ab2",try2.un$data4), try2.un$data4,
          ifelse(grepl("ab2",try2.un$data5), try2.un$data5, NA
          ) ) ) ) )
As you can see, it uses 5 fields (data1 to data5) in the ifelse function. I want to
turn it into a for loop, because the number of data fields is dynamic. In this
sample it is 5, but it may be more than 15 in some situations. So I want to use a loop
to solve it and avoid writing that many ifelse statements. Also, the try2.un
data frame has many other fields that I don't need to use in the
loop.
I'm not sure if a loop is the correct solution, but I'm willing to learn
any other suggestions from you.
Thanks,
Kai
On Tuesday, November 15, 2022 at 09:23:03 AM PST, avi.e.gross at gmail.com
<avi.e.gross at gmail.com> wrote:
Kai,
As Bert pointed out, it may not be clear what you want.
As a GUESS, you have some arbitrary data.frame object with multiple columns and
you want to do something on selected columns. Consider changing your idea to be
in several stages for simplicity and then optionally later rewriting it.
So step 1 is to get a vector of column names. The normal way to do this in base
R is not with a function called columns(df) but colnames(df) ...
Step 2 is to use one of many techniques that take that vector of names and
select the ones you want to keep. In base R there are many ways to do that
including using regular expressions as in the "grep" family of
functions. You may end up with a new vector of names perhaps shorter or in a
different order.
Step 3 is to use those names in your loop. If you want say to convert a column
from character to numeric, and your loop index is "current" you might
write something like:
    df[[current]] <- as.numeric(df[[current]])
There are many ways and it depends on what exactly you want to do. There are
packages designed to make some of these things fairly simple, such as dplyr
where you can ask to match names that start or end a certain way or that are of
certain types.
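The three steps above can be sketched in base R roughly as follows. This is only an illustration under stated assumptions: the data frame and its column names are made up, and the pattern "date" is taken from the original question.

```r
# Hypothetical data frame, for illustration only
df <- data.frame(
  id         = 1:3,
  start_date = c("20220101", "20220215", "20220301"),
  end_date   = c("20220131", "20220228", "20220331"),
  note       = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Step 1: get a vector of all column names
all.cols <- colnames(df)

# Step 2: keep only the names that contain "date"
date.cols <- grep("date", all.cols, value = TRUE)

# Step 3: loop over the selected names; [[ ]] extracts the column as a vector
for (current in date.cols) {
  df[[current]] <- as.numeric(df[[current]])
}
```

Columns whose names do not match the pattern (here id and note) are left untouched.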
Avi
-----Original Message-----
From: R-help <r-help-bounces at r-project.org> On Behalf Of Kai Yang via
R-help
Sent: Tuesday, November 15, 2022 11:18 AM
To: R-help Mailing List <r-help at r-project.org>
Subject: [R] add specific fields in for loop
Hi Team,
I can write a for loop like this:
for (i in columns(df)){
  ......
}
But it will work on all columns in the data frame df. If I want to work on only some
specific fields (say, the fields whose names contain 'date'), how should I
modify the for loop? I changed the code as below, but it doesn't work.
for (i in columns(df) %in% 'date' ){
  .....
}
Thank you,
Kai
[[alternative HTML version deleted]]
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
avi.e.gross at gmail.com
2022-Nov-15 22:54 UTC
[R] add specific fields in for loop
Kai,
I have read all the messages exchanged so far and what I have not yet seen is a
clear explanation of what you want to do. I mean not as R code that may have
mistakes, but as what your goal is.
Your code below was a gigantic set of nested if statements that is not trivial
to parse.
So help explain a bit or you may keep getting great solutions to problems you
are not trying to solve.
You have a data.frame you called "df" that seems to currently have no relation
to the rest of the code. You do seem to have a data.frame called "try2.un"
instead so I assume you want an answer using that.
Your code seems to want to make a new column called "ab2" by using info
currently held in columns "data1" through "data5" but you want a solution that
is more general. First I want to see what your code does do and make sure that
is what you want.
Your code starts like this (see below for the complete code):
ifelse(grepl("ab2",try2.un$data1), try2.un$data1, # else clauses below
The above uses grepl, the logical version of grep, and it seems that you are
asking for all of the items in the column vector data1 to be searched for the
unanchored presence of the string "ab2" and the first result is a vector of
TRUE/FALSE. For those that are TRUE, meaning "ab2" was found, you want the
actual result copied into the new column named "ab2" and for those marked as
FALSE, continue with the next code line. I note you do not show any
initialization for the new column to something like NA and depend on the final
nested ifelse to set that as a default.
If what I wrote above is correct, then for any rows where data1 did not contain
the specified text, you now search in data2:
ifelse(grepl("ab2",try2.un$data2), try2.un$data2,
In this design, anything found in multiple places will only match the first
place found. Anything not found anywhere ends up with an NA.
So in English, IFF the above is what you want, you want a search across all
columns for the designated search string of "ab2" but only keep the first.
To make a loop I suggest something like this:
try2.un$ab2 <- NA
Then choose what columns you want but do NOT choose "ab2". If you want ALL other
columns, then BEFORE the above line, save the current names as in:
loop.cols <- names(try2.un)
If you only want a subset, use some code that narrows down what you want. You
have not told us enough to make a suggestion. The point remains to have a
variable (vector) that can be used in a loop that holds exactly the columns you
want and in the right order. Unless I read you wrong, the order MATTERS as the
first match wins and if the columns have different matches like "I am ab2" and
"ab2 was my mother" you get the idea that you are keeping the exact text of the
first match.
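As one possible way to build such a vector, here is a small sketch. It assumes the columns of interest are named "data" followed by a number (data1, data2, ...), as in the code shown in this thread; the other names in the example are made up.

```r
# Hypothetical column names; only the data<number> columns should be searched
all.names <- c("id", "data10", "data2", "data1", "other", "metadata", "ab2")

# Keep names that are exactly "data" followed by digits
loop.cols <- grep("^data[0-9]+$", all.names, value = TRUE)

# Sort by the numeric suffix so that data2 comes before data10
loop.cols <- loop.cols[order(as.integer(sub("^data", "", loop.cols)))]
```

Sorting by the numeric suffix rather than alphabetically matters here because the first match wins, and a plain sort() would put data10 ahead of data2.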
If my guess of your need was wrong, the rest is not going to make much sense.
So here is a loop:
for (i in loop.cols) { print(i)}
I used "i" because you seem to like it. I prefer a more useful name. All the
above does is print the names so you see if what you are doing makes sense.
Now rewrite that to do what you want and find a way to only update an NA value.
You may want to think about what that means.
One idea is
try2.un$ab2 <-
  ifelse(is.na(try2.un$ab2) & grepl("ab2",try2.un[[i]]),
         try2.un[[i]],
         try2.un$ab2)
The above, which I have not tried, would be run in a loop and checks both
whether an entry is still NA, and whether the current ith column has what you
want. If both are true, it selects the value for those entries/rows from the
column being looped on. If not, it retains the current non-NA setting from an
earlier iteration of the loop.
You need to flesh this out for yourself as I am not supplying complete and
tested code.
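For reference, one way the pieces might fit together is sketched below. The data are invented for illustration, the column selection assumes names of the form data1, data2, ..., and the loop keeps the text of the first column (in loop order) that contains "ab2". It is a sketch of the idea described above, not a tested solution for the real data.

```r
# Hypothetical data; column names follow the thread (data1, data2, ...)
try2.un <- data.frame(
  id    = 1:4,
  data1 = c("I am ab2", "xyz", "abc", NA),
  data2 = c("ab2 was my mother", "has ab2", "def", "none"),
  data3 = c("q", "ab2 again", "ghi", "ab2 late"),
  stringsAsFactors = FALSE
)

# Columns to search, chosen before the new column is added
loop.cols <- grep("^data[0-9]+$", names(try2.un), value = TRUE)

# Start with NA everywhere, then fill in the first match per row
try2.un$ab2 <- NA_character_
for (i in loop.cols) {
  # rows that are still unfilled AND contain "ab2" in the current column
  hit <- is.na(try2.un$ab2) & grepl("ab2", try2.un[[i]])
  try2.un$ab2[hit] <- try2.un[[i]][hit]
}
```

Rows where no column matches stay NA, which mirrors the final NA in the nested ifelse version.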
But note this is a very different meaning than some of us guessed and may still
not be what you want. There are many such questions about doing something the
same to each of the selected columns in a data.frame as in replacing all values
of 999 with NA. In many such cases the order does not matter. Other such
questions may want to check if any of the columns matches and simply return
TRUE/FALSE in a new column or externally. Some of such requests are potentially
simpler and easier.
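To make the contrast concrete, here is a rough sketch of those two other kinds of request, where column order does not matter. The data frame, the column names, and the choice of which columns to process are all assumptions made for illustration.

```r
# Hypothetical data frame
df <- data.frame(
  id    = 1:3,
  data1 = c(1, 999, 3),
  data2 = c(999, 5, 6),
  text1 = c("ab2 here", "no", "no"),
  text2 = c("no", "no", "ab2 too"),
  stringsAsFactors = FALSE
)

# Same operation on each selected column: replace 999 with NA
num.cols <- c("data1", "data2")
for (current in num.cols) {
  df[[current]][df[[current]] %in% 999] <- NA
}

# TRUE/FALSE per row: does ANY of the selected columns contain "ab2"?
txt.cols <- c("text1", "text2")
df$any.ab2 <- Reduce(`|`, lapply(df[txt.cols], function(x) grepl("ab2", x)))
```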
So you need to be very clear on what you want. I am going by what I think your
sample code DOES and am not too sure it is exactly what you want.
From: Kai Yang <yangkai9999 at yahoo.com>
Sent: Tuesday, November 15, 2022 1:53 PM
To: 'R-help Mailing List' <r-help at r-project.org>; avi.e.gross at gmail.com
Subject: Re: [R] add specific fields in for loop
Hello Bert and Avi,
Sorry, that was a typo. It should be:
for (i in colnames(df)){
  ......
}
Below is the code I'm currently using:
try2.un$ab2 <-
  ifelse(grepl("ab2",try2.un$data1), try2.un$data1,
    ifelse(grepl("ab2",try2.un$data2), try2.un$data2,
      ifelse(grepl("ab2",try2.un$data3), try2.un$data3,
        ifelse(grepl("ab2",try2.un$data4), try2.un$data4,
          ifelse(grepl("ab2",try2.un$data5), try2.un$data5, NA
          ) ) ) ) )
As you can see, it uses 5 fields (data1 to data5) in the ifelse function. I want to
turn it into a for loop, because the number of data fields is dynamic. In this
sample it is 5, but it may be more than 15 in some situations. So I want to use a loop
to solve it and avoid writing that many ifelse statements. Also, the try2.un
data frame has many other fields that I don't need to use in the
loop.
I'm not sure if a loop is the correct solution, but I'm willing to learn
any other suggestions from you.
Thanks,
Kai