thr3ads.net - R help - [R] conditional filling of data.frame

Ebert,Timothy Aaron

2022-Mar-10 17:58 UTC

[R] conditional filling of data.frame - improve code

You could try some of the "join" commands from dplyr.
https://dplyr.tidyverse.org/reference/mutate-joins.html
https://statisticsglobe.com/r-dplyr-join-inner-left-right-full-semi-anti
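
A minimal sketch of the join approach, assuming the dplyr package is installed. The lookup table `expts` is typed in by hand here (same content as in Jeff's reply quoted below), and the name `result` is only illustrative:

```r
library(dplyr)

# Lookup table: one row per sample, with the experiment it belongs to
expts <- data.frame(
  expt   = c("ex1", "ex1", "ex2", "ex2", "ex2"),
  sample = c("sample1-1", "sample1-2", "sample2-1", "sample2-2", "sample2-3"),
  stringsAsFactors = FALSE
)

# Data containing only the sample IDs (duplicates allowed)
mydata <- data.frame(
  sample = c("sample2-2", "sample2-3", "sample1-1",
             "sample1-1", "sample1-1", "sample2-1"),
  stringsAsFactors = FALSE
)

# left_join() keeps every row of mydata and adds the matching 'expt' column
result <- left_join(mydata, expts, by = "sample")
```

Samples in `mydata` that have no match in `expts` would get NA in the new column.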

Regards,
Tim
-----Original Message-----
From: R-help <r-help-bounces at r-project.org> On Behalf Of Jeff Newmiller
Sent: Thursday, March 10, 2022 11:25 AM
To: r-help at r-project.org; Ivan Calandra <ivan.calandra at rgzm.de>;
R-help <r-help at r-project.org>
Subject: Re: [R] conditional filling of data.frame - improve code

Use merge.

expts <- read.csv( text = "expt,sample
ex1,sample1-1
ex1,sample1-2
ex2,sample2-1
ex2,sample2-2
ex2,sample2-3
", header=TRUE, as.is=TRUE )

mydata <- data.frame(sample = c("sample2-2", "sample2-3",
"sample1-1", "sample1-1", "sample1-1",
"sample2-1"))

merge( mydata, expts, by="sample", all.x=TRUE )
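
Note that merge() sorts the result on the "by" column by default, so the rows of mydata do not come back in their original order. If the order matters, one possible alternative (a sketch, not part of the reply above) is to look up the experiment with match() and add it as a new column. The column name "experiment" is taken from the original question:

```r
# Lookup table, same content as the CSV text above
expts <- data.frame(
  expt   = c("ex1", "ex1", "ex2", "ex2", "ex2"),
  sample = c("sample1-1", "sample1-2", "sample2-1", "sample2-2", "sample2-3"),
  stringsAsFactors = FALSE
)

mydata <- data.frame(
  sample = c("sample2-2", "sample2-3", "sample1-1",
             "sample1-1", "sample1-1", "sample2-1"),
  stringsAsFactors = FALSE
)

# match() returns, for each sample in mydata, the row of expts that holds it
# (NA if the sample is missing from expts); row order of mydata is unchanged
mydata$experiment <- expts$expt[match(mydata$sample, expts$sample)]
```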

On March 10, 2022 7:50:23 AM PST, Ivan Calandra <ivan.calandra at rgzm.de> wrote:
>Dear useRs,
>
>I would like to improve my ugly (though working) code, but I think I 
>need a completely different approach and I just can't think out of my box!
>
>I have some external information about which sample(s) belong to which 
>experiment. I need to get that manually into R (either typing directly 
>in a script or read a CSV file, but that makes no difference):
>exp <- list(ex1 = c("sample1-1", "sample1-2"),
>            ex2 = c("sample2-1", "sample2-2", "sample2-3"))
>
>Then I have my data, only with the sample IDs:
>mydata <- data.frame(sample = c("sample2-2", "sample2-3", "sample1-1",
>                                "sample1-1", "sample1-1", "sample2-1"))
>
>Now I want to add a column to mydata with the experiment ID. The best I 
>could find is that:
>for (i in names(exp)) mydata[mydata[["sample"]] %in% exp[[i]], 
>"experiment"] <- i
>
>In this example, the experiment ID could be extracted from the sample 
>IDs, but this is not the case with my real data so it really is a 
>matter of matching. Of course I also have other columns with my real data.
>
>I'm pretty sure the last line (with the loop) can be improved in terms 
>of readability (speed is not an issue here). I have close to no 
>constraints on 'exp' (here I chose a list, but anything could do), the
>only thing that cannot change is the format of 'mydata'.
>
>Thank you in advance!
>Ivan
>
--
Sent from my phone. Please excuse my brevity.

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Ivan Calandra

2022-Mar-11 07:24 UTC

[R] conditional filling of data.frame - improve code

Thank you Jeff and Tim for your ideas. Indeed merge/join is probably the 
nicest way. Still, the code becomes much longer because I need more 
formatting of the input and output objects than with my ugly for loop :)
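
For what it is worth, the extra formatting can stay fairly short in base R. The following is only a sketch under the assumption that 'exp' stays a named list as in the original question; the name `lookup` is illustrative:

```r
# External information, kept as a named list (as in the original question)
exp <- list(ex1 = c("sample1-1", "sample1-2"),
            ex2 = c("sample2-1", "sample2-2", "sample2-3"))

mydata <- data.frame(
  sample = c("sample2-2", "sample2-3", "sample1-1",
             "sample1-1", "sample1-1", "sample2-1"),
  stringsAsFactors = FALSE
)

# stack() turns the list into a two-column data frame:
# 'values' holds the sample IDs, 'ind' holds the list names (as a factor)
lookup <- stack(exp)

# Look up each sample and store the experiment ID as plain character
mydata$experiment <- as.character(lookup$ind[match(mydata$sample, lookup$values)])
```

The format of 'mydata' is untouched apart from the added column, and the row order is kept.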

Cheers,
Ivan

--
Dr. Ivan Calandra
Imaging lab
RGZM - MONREPOS Archaeological Research Centre
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra

