James Milks
2021-Aug-10 10:58 UTC
[R] Replacing certain rows with values from a different column
I have two columns in a larger data set: one lists countries and the other
lists, in some cases, individual provinces within a country or overseas
territories. I have country populations in a second data set that I'm planning
to use to calculate per capita rates in the first data set. My issue: I need to
match my two data sets. Here are some examples:
First data set:
Province <- c("Australian Capital Territory", "New South Wales",
"Northern Territory", "Queensland", "South Australia",
"Tasmania", "Victoria", "Western Australia",
"", "", "", "Faroe Islands", "Greenland")
Country <- c("Australia", "Australia",
"Australia", "Australia", "Australia",
"Australia", "Australia", "Australia",
"Austria", "Azerbaijan", "Denmark",
"Denmark", "Denmark")
firstdf <- data.frame(Province, Country)
Second data set:
Country <- c("Australia", "Austria",
"Azerbaijan", "Denmark", "Faroe Islands",
"Greenland")
seconddf <- data.frame(Country)
In this example, I need to aggregate (sum) the Australian provinces into
Australia while keeping Faroe Islands and Greenland separate from Denmark.
What I'd like to do is create a column that looks like this:
firstdf$nation <- c("Australia", "Australia",
"Australia", "Australia", "Australia",
"Australia", "Australia", "Australia",
"Austria", "Azerbaijan", "Denmark",
"Faroe Islands", "Greenland")
Is there a way to do this or am I stuck doing this by hand?
Thanks for any help on this vexing issue.
Jim Milks
Gerrit Eichner
2021-Aug-10 11:07 UTC
[R] Replacing certain rows with values from a different column
Hi, James,
if I understand you correctly, maybe,
with(firstdf,
ifelse(Province %in% seconddf$Country,
Province,
Country)
)
does what you want?
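
To make the suggestion easy to try, here is a minimal, self-contained sketch that rebuilds the two example data frames from the original post and stores the result in a new column. The column name nation comes from the question; the explicit stringsAsFactors = FALSE is an added assumption so that the columns are character vectors even on R versions before 4.0, where data.frame() creates factors by default and ifelse() may then return integer codes rather than names.

```r
# Example data from the original post, with each name on a single line
Province <- c("Australian Capital Territory", "New South Wales",
              "Northern Territory", "Queensland", "South Australia",
              "Tasmania", "Victoria", "Western Australia",
              "", "", "", "Faroe Islands", "Greenland")
Country <- c(rep("Australia", 8), "Austria", "Azerbaijan",
             rep("Denmark", 3))

# Keep both columns as character vectors (assumption, see note above)
firstdf <- data.frame(Province, Country, stringsAsFactors = FALSE)
seconddf <- data.frame(
  Country = c("Australia", "Austria", "Azerbaijan", "Denmark",
              "Faroe Islands", "Greenland"),
  stringsAsFactors = FALSE)

# Use the Province value when it has its own entry in seconddf$Country,
# otherwise fall back to the Country value
firstdf$nation <- with(firstdf,
                       ifelse(Province %in% seconddf$Country,
                              Province,
                              Country))

print(firstdf$nation)
```

With the example data, the eight Australian rows map to "Australia", the empty Province rows keep their Country, and "Faroe Islands" and "Greenland" stay separate from "Denmark". One caveat in a larger data set: any province whose name happens to match a country name in seconddf would also be picked up by this test, so the result is worth spot-checking.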
Hth -- Gerrit
---------------------------------------------------------------------
Dr. Gerrit Eichner Mathematical Institute, Room 212
gerrit.eichner at math.uni-giessen.de Justus-Liebig-University Giessen
Tel: +49-(0)641-99-32104 Arndtstr. 2, 35392 Giessen, Germany
http://www.uni-giessen.de/eichner
---------------------------------------------------------------------