thr3ads.net - R help - [R] Joining two datasets - recursive procedure? [Mar 2015]

If this information is useful, please help other people find it:
Share via:

Bert Gunter

2015-Mar-21 17:13 UTC

[R] Joining two datasets - recursive procedure?

... or cleaner:

z1 <- with(f1,v4 + z -ave(z,v1,v2,FUN=mean))


Just for curiosity, was this homework? (in which case I should
probably have not provided you an answer -- that is, assuming that I
HAVE provided an answer).

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
Clifford Stoll




On Sat, Mar 21, 2015 at 7:53 AM, Bert Gunter <bgunter at gene.com>
wrote:> z <- rnorm(nrow(f1)) ## or anything you want
> z1 <- f1$v4 + z - with(f1,ave(z,v1,v2,FUN=mean))
>
>
> aggregate(v4~v1,f1,sum)
> aggregate(z1~v1,f1,sum)
> aggregate(v4~v2,f1,sum)
> aggregate(z1~v2,f1,sum)
> aggregate(v4~v3,f1,sum)
> aggregate(z1~v3,f1,sum)
>
>
> Cheers,
> Bert
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
> (650) 467-7374
>
> "Data is not information. Information is not knowledge. And knowledge
> is certainly not wisdom."
> Clifford Stoll
>
>
>
>
> On Sat, Mar 21, 2015 at 6:49 AM, Luca Meyer <lucam1968 at gmail.com>
wrote:
>> Hi Bert,
>>
>> Thank you for your message. I am looking into ave() and tapply() as you
>> suggested but at the same time I have prepared a example of input and
output
>> files, just in case you or someone else would like to make an attempt
to
>> generate a code that goes from input to output.
>>
>> Please see below or download it from
>> https://www.dropbox.com/s/qhmpkkrejjkpbkx/sample_code.txt?dl=0
>>
>> # this is (an extract of) the INPUT file I have:
>> f1 <- structure(list(v1 = c("A", "A",
"A", "A", "A", "A", "B",
"B",
>> "B", "B", "B", "B"), v2 =
c("A", "B", "C", "A", "B",
"C", "A",
>> "B", "C", "A", "B",
"C"), v3 = c("B", "B", "B",
"C", "C", "C",
>> "B", "B", "B", "C",
"C", "C"), v4 = c(18.18530, 3.43806,0.00273, 1.42917,
>> 1.05786, 0.00042, 2.37232, 3.01835, 0, 1.13430, 0.92872,
>> 0)), .Names = c("v1", "v2", "v3",
"v4"), class = "data.frame", row.names >> c(2L,
>> 9L, 11L, 41L, 48L, 50L, 158L, 165L, 167L, 197L, 204L, 206L))
>>
>> # this is (an extract of) the OUTPUT file I would like to obtain:
>> f2 <- structure(list(v1 = c("A", "A",
"A", "A", "A", "A", "B",
"B",
>> "B", "B", "B", "B"), v2 =
c("A", "B", "C", "A", "B",
"C", "A",
>> "B", "C", "A", "B",
"C"), v3 = c("B", "B", "B",
"C", "C", "C",
>> "B", "B", "B", "C",
"C", "C"), v4 = c(17.83529, 3.43806,0.00295, 1.77918,
>> 1.05786, 0.0002, 2.37232, 3.01835, 0, 1.13430, 0.92872,
>> 0)), .Names = c("v1", "v2", "v3",
"v4"), class = "data.frame", row.names >> c(2L,
>> 9L, 11L, 41L, 48L, 50L, 158L, 165L, 167L, 197L, 204L, 206L))
>>
>> # please notice that while the aggregated v4 on v3 has changed ?
>> aggregate(f1[,c("v4")],list(f1$v3),sum)
>> aggregate(f2[,c("v4")],list(f2$v3),sum)
>>
>> # ? the aggregated v4 over v1xv2 has remained unchanged:
>> aggregate(f1[,c("v4")],list(f1$v1,f1$v2),sum)
>> aggregate(f2[,c("v4")],list(f2$v1,f2$v2),sum)
>>
>> Thank you very much in advance for your assitance.
>>
>> Luca
>>
>> 2015-03-21 13:18 GMT+01:00 Bert Gunter <gunter.berton at
gene.com>:
>>>
>>> 1. Still not sure what you mean, but maybe look at ?ave and
?tapply,
>>> for which ave() is a wrapper.
>>>
>>> 2. You still need to heed the rest of Jeff's advice.
>>>
>>> Cheers,
>>> Bert
>>>
>>> Bert Gunter
>>> Genentech Nonclinical Biostatistics
>>> (650) 467-7374
>>>
>>> "Data is not information. Information is not knowledge. And
knowledge
>>> is certainly not wisdom."
>>> Clifford Stoll
>>>
>>>
>>>
>>>
>>> On Sat, Mar 21, 2015 at 4:53 AM, Luca Meyer <lucam1968 at
gmail.com> wrote:
>>> > Hi Jeff & other R-experts,
>>> >
>>> > Thank you for your note. I have tried myself to solve the
issue without
>>> > success.
>>> >
>>> > Following your suggestion, I am providing a sample of the
dataset I am
>>> > using below (also downloadble in plain text from
>>> >
https://www.dropbox.com/s/qhmpkkrejjkpbkx/sample_code.txt?dl=0):
>>> >
>>> > #this is an extract of the overall dataset (n=1200 cases)
>>> > f1 <- structure(list(v1 = c("A", "A",
"A", "A", "A", "A", "B",
"B",
>>> > "B", "B", "B", "B"),
v2 = c("A", "B", "C", "A",
"B", "C", "A",
>>> > "B", "C", "A", "B",
"C"), v3 = c("B", "B", "B",
"C", "C", "C",
>>> > "B", "B", "B", "C",
"C", "C"), v4 = c(18.1853007621835,
>>> > 3.43806581506388,
>>> > 0.002733567617055, 1.42917483425029, 1.05786640463504,
>>> > 0.000420548864162308,
>>> > 2.37232740842861, 3.01835841813241, 0, 1.13430282139936,
>>> > 0.928725667117666,
>>> > 0)), .Names = c("v1", "v2",
"v3", "v4"), class = "data.frame", row.names
>>> > >>> > c(2L,
>>> > 9L, 11L, 41L, 48L, 50L, 158L, 165L, 167L, 197L, 204L, 206L))
>>> >
>>> > I need to find a automated procedure that allows me to adjust
v3
>>> > marginals
>>> > while maintaining v1xv2 marginals unchanged.
>>> >
>>> > That is: modify the v4 values you can find by running:
>>> >
>>> > aggregate(f1[,c("v4")],list(f1$v3),sum)
>>> >
>>> > while maintaining costant the values you can find by running:
>>> >
>>> > aggregate(f1[,c("v4")],list(f1$v1,f1$v2),sum)
>>> >
>>> > Now does it make sense?
>>> >
>>> > Please notice I have tried to build some syntax that tries to
modify
>>> > values
>>> > within each v1xv2 combination by computing sum of v4, row
percentage in
>>> > terms of v4, and there is where my effort is blocked. Not
really sure
>>> > how I
>>> > should proceed. Any suggestion?
>>> >
>>> > Thanks,
>>> >
>>> > Luca
>>> >
>>> >
>>> > 2015-03-19 2:38 GMT+01:00 Jeff Newmiller <jdnewmil at
dcn.davis.ca.us>:
>>> >
>>> >> I don't understand your description. The standard
practice on this list
>>> >> is
>>> >> to provide a reproducible R example [1] of the kind of
data you are
>>> >> working
>>> >> with (and any code you have tried) to go along with your
description.
>>> >> In
>>> >> this case, that would be two dputs of your input data
frames and a dput
>>> >> of
>>> >> an output data frame (generated by hand from your input
data frame).
>>> >> (Probably best to not use the full number of input values
just to keep
>>> >> the
>>> >> size down.) We could then make an attempt to generate code
that goes
>>> >> from
>>> >> input to output.
>>> >>
>>> >> Of course, if you post that hard work using HTML then it
will get
>>> >> corrupted (much like the text below from your earlier
emails) and we
>>> >> won't
>>> >> be able to use it. Please learn to post from your email
software using
>>> >> plain text when corresponding with this mailing list.
>>> >>
>>> >> [1]
>>> >>
>>> >>
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
>>> >>
>>> >>
---------------------------------------------------------------------------
>>> >> Jeff Newmiller                        The     .....      
.....  Go
>>> >> Live...
>>> >> DCN:<jdnewmil at dcn.davis.ca.us>        Basics:
##.#.       ##.#.  Live
>>> >> Go...
>>> >>                                       Live:   OO#.. Dead:
OO#..
>>> >> Playing
>>> >> Research Engineer (Solar/Batteries            O.O#.      
#.O#.  with
>>> >> /Software/Embedded Controllers)               .OO#.      
.OO#.
>>> >> rocks...1k
>>> >>
>>> >>
---------------------------------------------------------------------------
>>> >> Sent from my phone. Please excuse my brevity.
>>> >>
>>> >> On March 18, 2015 9:05:37 AM PDT, Luca Meyer <lucam1968
at gmail.com>
>>> >> wrote:
>>> >> >Thanks for you input Michael,
>>> >> >
>>> >> >The continuous variable I have measures quantities
(down to the 3rd
>>> >> >decimal level) so unfortunately are not frequencies.
>>> >> >
>>> >> >Any more specific suggestions on how that could be
tackled?
>>> >> >
>>> >> >Thanks & kind regards,
>>> >> >
>>> >> >Luca
>>> >> >
>>> >> >
>>> >> >==>>> >> >
>>> >> >Michael Friendly wrote:
>>> >> >I'm not sure I understand completely what you want
to do, but
>>> >> >if the data were frequencies, it sounds like task for
fitting a
>>> >> >loglinear model with the model formula
>>> >> >
>>> >> >~ V1*V2 + V3
>>> >> >
>>> >> >On 3/18/2015 2:17 AM, Luca Meyer wrote:
>>> >> >>* Hello,
>>> >> >*>>* I am facing a quite challenging task (at
least to me) and I was
>>> >> >wondering
>>> >> >*>* if someone could advise how R could assist me
to speed the task
>>> >> > up.
>>> >> >*>>* I am dealing with a dataset with 3 discrete
variables and one
>>> >> >continuous
>>> >> >*>* variable. The discrete variables are:
>>> >> >*>>* V1: 8 modalities
>>> >> >*>* V2: 13 modalities
>>> >> >*>* V3: 13 modalities
>>> >> >*>>* The continuous variable V4 is a decimal
number always greater
>>> >> > than
>>> >> >zero in
>>> >> >*>* the marginals of each of the 3 variables but it
is sometimes equal
>>> >> >to zero
>>> >> >*>* (and sometimes negative) in the joint tables.
>>> >> >*>>* I have got 2 files:
>>> >> >*>>* => one with distribution of all possible
combinations of V1xV2
>>> >> >(some of
>>> >> >*>* which are zero or neagtive) and
>>> >> >*>* => one with the marginal distribution of V3.
>>> >> >*>>* I am trying to build the long and narrow
dataset V1xV2xV3 in such
>>> >> >a way
>>> >> >*>* that each V1xV2 cell does not get modified and
V3 fits as closely
>>> >> >as
>>> >> >*>* possible to its marginal distribution. Does it
make sense?
>>> >> >*>>* To be even more specific, my 2 input files
look like the
>>> >> >following.
>>> >> >*>>* FILE 1
>>> >> >*>* V1,V2,V4
>>> >> >*>* A, A, 24.251
>>> >> >*>* A, B, 1.065
>>> >> >*>* (...)
>>> >> >*>* B, C, 0.294
>>> >> >*>* B, D, 2.731
>>> >> >*>* (...)
>>> >> >*>* H, L, 0.345
>>> >> >*>* H, M, 0.000
>>> >> >*>>* FILE 2
>>> >> >*>* V3, V4
>>> >> >*>* A, 1.575
>>> >> >*>* B, 4.294
>>> >> >*>* C, 10.044
>>> >> >*>* (...)
>>> >> >*>* L, 5.123
>>> >> >*>* M, 3.334
>>> >> >*>>* What I need to achieve is a file such as
the following
>>> >> >*>>* FILE 3
>>> >> >*>* V1, V2, V3, V4
>>> >> >*>* A, A, A, ???
>>> >> >*>* A, A, B, ???
>>> >> >*>* (...)
>>> >> >*>* D, D, E, ???
>>> >> >*>* D, D, F, ???
>>> >> >*>* (...)
>>> >> >*>* H, M, L, ???
>>> >> >*>* H, M, M, ???
>>> >> >*>>* Please notice that FILE 3 need to be such
that if I aggregate on
>>> >> >V1+V2 I
>>> >> >*>* recover exactly FILE 1 and that if I aggregate
on V3 I can recover
>>> >> >a file
>>> >> >*>* as close as possible to FILE 3 (ideally the
same file).
>>> >> >*>>* Can anyone suggest how I could do that with
R?
>>> >> >*>>* Thank you very much indeed for any
assistance you are able to
>>> >> >provide.
>>> >> >*>>* Kind regards,
>>> >> >*>>* Luca*
>>> >> >
>>> >> >       [[alternative HTML version deleted]]
>>> >> >
>>> >> >______________________________________________
>>> >> >R-help at r-project.org mailing list -- To UNSUBSCRIBE
and more, see
>>> >> >https://stat.ethz.ch/mailman/listinfo/r-help
>>> >> >PLEASE do read the posting guide
>>> >> >http://www.R-project.org/posting-guide.html
>>> >> >and provide commented, minimal, self-contained,
reproducible code.
>>> >>
>>> >>
>>> >
>>> >         [[alternative HTML version deleted]]
>>> >
>>> > ______________________________________________
>>> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and
more, see
>>> > https://stat.ethz.ch/mailman/listinfo/r-help
>>> > PLEASE do read the posting guide
>>> > http://www.R-project.org/posting-guide.html
>>> > and provide commented, minimal, self-contained, reproducible
code.
>>
>>

Luca Meyer

2015-Mar-22 09:00 UTC

head link

[R] Joining two datasets - recursive procedure?

Hi Bert, hello R-experts,

I am close to a solution but I still need one hint w.r.t. the following
procedure (available also from
https://www.dropbox.com/s/qhmpkkrejjkpbkx/sample_code.txt?dl=0)

rm(list=ls())

# this is (an extract of) the INPUT file I have:
f1 <- structure(list(v1 = c("A", "A", "A",
"A", "A", "A", "B", "B",
"B",
"B", "B", "B"), v2 = c("A",
"B", "C", "A", "B", "C",
"A", "B", "C", "A",
"B", "C"), v3 = c("B", "B",
"B", "C", "C", "C", "B",
"B", "B", "C", "C",
"C"), v4 = c(18.18530, 3.43806,0.00273, 1.42917, 1.05786, 0.00042,
2.37232,
3.01835, 0, 1.13430, 0.92872, 0)), .Names = c("v1", "v2",
"v3", "v4"),
class = "data.frame", row.names = c(2L, 9L, 11L, 41L, 48L, 50L, 158L,
165L,
167L, 197L, 204L, 206L))

# this is the procedure that Bert suggested (slightly adjusted):
z <- rnorm(nrow(f1)) ## or anything you want
z1 <- round(with(f1,v4 + z -ave(z,v1,v2,FUN=mean)), digits=5)
aggregate(v4~v1*v2,f1,sum)
aggregate(z1~v1*v2,f1,sum)
aggregate(v4~v3,f1,sum)
aggregate(z1~v3,f1,sum)

My question to you is: how can I set z so that I can obtain specific values
for z1-v4 in the v3 aggregation?
In other words, how can I configure the procedure so that e.g. B=29 and
C=2.56723 after running the procedure:
aggregate(z1~v3,f1,sum)

Thank you,

Luca

PS: to avoid any doubts you might have about who I am the following is my
web page: http://lucameyer.wordpress.com/


2015-03-21 18:13 GMT+01:00 Bert Gunter <gunter.berton at gene.com>:
> ... or cleaner:
>
> z1 <- with(f1,v4 + z -ave(z,v1,v2,FUN=mean))
>
>
> Just for curiosity, was this homework? (in which case I should
> probably have not provided you an answer -- that is, assuming that I
> HAVE provided an answer).
>
> Cheers,
> Bert
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
> (650) 467-7374
>
> "Data is not information. Information is not knowledge. And knowledge
> is certainly not wisdom."
> Clifford Stoll
>
>
>
>
> On Sat, Mar 21, 2015 at 7:53 AM, Bert Gunter <bgunter at gene.com>
wrote:
> > z <- rnorm(nrow(f1)) ## or anything you want
> > z1 <- f1$v4 + z - with(f1,ave(z,v1,v2,FUN=mean))
> >
> >
> > aggregate(v4~v1,f1,sum)
> > aggregate(z1~v1,f1,sum)
> > aggregate(v4~v2,f1,sum)
> > aggregate(z1~v2,f1,sum)
> > aggregate(v4~v3,f1,sum)
> > aggregate(z1~v3,f1,sum)
> >
> >
> > Cheers,
> > Bert
> >
> > Bert Gunter
> > Genentech Nonclinical Biostatistics
> > (650) 467-7374
> >
> > "Data is not information. Information is not knowledge. And
knowledge
> > is certainly not wisdom."
> > Clifford Stoll
> >
> >
> >
> >
> > On Sat, Mar 21, 2015 at 6:49 AM, Luca Meyer <lucam1968 at
gmail.com> wrote:
> >> Hi Bert,
> >>
> >> Thank you for your message. I am looking into ave() and tapply()
as you
> >> suggested but at the same time I have prepared a example of input
and
> output
> >> files, just in case you or someone else would like to make an
attempt to
> >> generate a code that goes from input to output.
> >>
> >> Please see below or download it from
> >> https://www.dropbox.com/s/qhmpkkrejjkpbkx/sample_code.txt?dl=0
> >>
> >> # this is (an extract of) the INPUT file I have:
> >> f1 <- structure(list(v1 = c("A", "A",
"A", "A", "A", "A", "B",
"B",
> >> "B", "B", "B", "B"), v2 =
c("A", "B", "C", "A", "B",
"C", "A",
> >> "B", "C", "A", "B",
"C"), v3 = c("B", "B", "B",
"C", "C", "C",
> >> "B", "B", "B", "C",
"C", "C"), v4 = c(18.18530, 3.43806,0.00273,
> 1.42917,
> >> 1.05786, 0.00042, 2.37232, 3.01835, 0, 1.13430, 0.92872,
> >> 0)), .Names = c("v1", "v2", "v3",
"v4"), class = "data.frame",
> row.names > >> c(2L,
> >> 9L, 11L, 41L, 48L, 50L, 158L, 165L, 167L, 197L, 204L, 206L))
> >>
> >> # this is (an extract of) the OUTPUT file I would like to obtain:
> >> f2 <- structure(list(v1 = c("A", "A",
"A", "A", "A", "A", "B",
"B",
> >> "B", "B", "B", "B"), v2 =
c("A", "B", "C", "A", "B",
"C", "A",
> >> "B", "C", "A", "B",
"C"), v3 = c("B", "B", "B",
"C", "C", "C",
> >> "B", "B", "B", "C",
"C", "C"), v4 = c(17.83529, 3.43806,0.00295,
> 1.77918,
> >> 1.05786, 0.0002, 2.37232, 3.01835, 0, 1.13430, 0.92872,
> >> 0)), .Names = c("v1", "v2", "v3",
"v4"), class = "data.frame",
> row.names > >> c(2L,
> >> 9L, 11L, 41L, 48L, 50L, 158L, 165L, 167L, 197L, 204L, 206L))
> >>
> >> # please notice that while the aggregated v4 on v3 has changed ?
> >> aggregate(f1[,c("v4")],list(f1$v3),sum)
> >> aggregate(f2[,c("v4")],list(f2$v3),sum)
> >>
> >> # ? the aggregated v4 over v1xv2 has remained unchanged:
> >> aggregate(f1[,c("v4")],list(f1$v1,f1$v2),sum)
> >> aggregate(f2[,c("v4")],list(f2$v1,f2$v2),sum)
> >>
> >> Thank you very much in advance for your assitance.
> >>
> >> Luca
> >>
> >> 2015-03-21 13:18 GMT+01:00 Bert Gunter <gunter.berton at
gene.com>:
> >>>
> >>> 1. Still not sure what you mean, but maybe look at ?ave and
?tapply,
> >>> for which ave() is a wrapper.
> >>>
> >>> 2. You still need to heed the rest of Jeff's advice.
> >>>
> >>> Cheers,
> >>> Bert
> >>>
> >>> Bert Gunter
> >>> Genentech Nonclinical Biostatistics
> >>> (650) 467-7374
> >>>
> >>> "Data is not information. Information is not knowledge.
And knowledge
> >>> is certainly not wisdom."
> >>> Clifford Stoll
> >>>
> >>>
> >>>
> >>>
> >>> On Sat, Mar 21, 2015 at 4:53 AM, Luca Meyer <lucam1968 at
gmail.com>
> wrote:
> >>> > Hi Jeff & other R-experts,
> >>> >
> >>> > Thank you for your note. I have tried myself to solve the
issue
> without
> >>> > success.
> >>> >
> >>> > Following your suggestion, I am providing a sample of the
dataset I
> am
> >>> > using below (also downloadble in plain text from
> >>> >
https://www.dropbox.com/s/qhmpkkrejjkpbkx/sample_code.txt?dl=0):
> >>> >
> >>> > #this is an extract of the overall dataset (n=1200 cases)
> >>> > f1 <- structure(list(v1 = c("A",
"A", "A", "A", "A", "A",
"B", "B",
> >>> > "B", "B", "B",
"B"), v2 = c("A", "B", "C",
"A", "B", "C", "A",
> >>> > "B", "C", "A",
"B", "C"), v3 = c("B", "B",
"B", "C", "C", "C",
> >>> > "B", "B", "B",
"C", "C", "C"), v4 = c(18.1853007621835,
> >>> > 3.43806581506388,
> >>> > 0.002733567617055, 1.42917483425029, 1.05786640463504,
> >>> > 0.000420548864162308,
> >>> > 2.37232740842861, 3.01835841813241, 0, 1.13430282139936,
> >>> > 0.928725667117666,
> >>> > 0)), .Names = c("v1", "v2",
"v3", "v4"), class = "data.frame",
> row.names
> >>> > > >>> > c(2L,
> >>> > 9L, 11L, 41L, 48L, 50L, 158L, 165L, 167L, 197L, 204L,
206L))
> >>> >
> >>> > I need to find a automated procedure that allows me to
adjust v3
> >>> > marginals
> >>> > while maintaining v1xv2 marginals unchanged.
> >>> >
> >>> > That is: modify the v4 values you can find by running:
> >>> >
> >>> > aggregate(f1[,c("v4")],list(f1$v3),sum)
> >>> >
> >>> > while maintaining costant the values you can find by
running:
> >>> >
> >>> > aggregate(f1[,c("v4")],list(f1$v1,f1$v2),sum)
> >>> >
> >>> > Now does it make sense?
> >>> >
> >>> > Please notice I have tried to build some syntax that
tries to modify
> >>> > values
> >>> > within each v1xv2 combination by computing sum of v4, row
percentage
> in
> >>> > terms of v4, and there is where my effort is blocked. Not
really sure
> >>> > how I
> >>> > should proceed. Any suggestion?
> >>> >
> >>> > Thanks,
> >>> >
> >>> > Luca
> >>> >
> >>> >
> >>> > 2015-03-19 2:38 GMT+01:00 Jeff Newmiller <jdnewmil at
dcn.davis.ca.us>:
> >>> >
> >>> >> I don't understand your description. The standard
practice on this
> list
> >>> >> is
> >>> >> to provide a reproducible R example [1] of the kind
of data you are
> >>> >> working
> >>> >> with (and any code you have tried) to go along with
your
> description.
> >>> >> In
> >>> >> this case, that would be two dputs of your input data
frames and a
> dput
> >>> >> of
> >>> >> an output data frame (generated by hand from your
input data frame).
> >>> >> (Probably best to not use the full number of input
values just to
> keep
> >>> >> the
> >>> >> size down.) We could then make an attempt to generate
code that goes
> >>> >> from
> >>> >> input to output.
> >>> >>
> >>> >> Of course, if you post that hard work using HTML then
it will get
> >>> >> corrupted (much like the text below from your earlier
emails) and we
> >>> >> won't
> >>> >> be able to use it. Please learn to post from your
email software
> using
> >>> >> plain text when corresponding with this mailing list.
> >>> >>
> >>> >> [1]
> >>> >>
> >>> >>
>
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
> >>> >>
> >>> >>
> ---------------------------------------------------------------------------
> >>> >> Jeff Newmiller                        The     .....  
.....  Go
> >>> >> Live...
> >>> >> DCN:<jdnewmil at dcn.davis.ca.us>       
Basics: ##.#.       ##.#.
> Live
> >>> >> Go...
> >>> >>                                       Live:   OO#..
Dead: OO#..
> >>> >> Playing
> >>> >> Research Engineer (Solar/Batteries            O.O#.  
#.O#.
> with
> >>> >> /Software/Embedded Controllers)               .OO#.  
.OO#.
> >>> >> rocks...1k
> >>> >>
> >>> >>
> ---------------------------------------------------------------------------
> >>> >> Sent from my phone. Please excuse my brevity.
> >>> >>
> >>> >> On March 18, 2015 9:05:37 AM PDT, Luca Meyer
<lucam1968 at gmail.com>
> >>> >> wrote:
> >>> >> >Thanks for you input Michael,
> >>> >> >
> >>> >> >The continuous variable I have measures
quantities (down to the 3rd
> >>> >> >decimal level) so unfortunately are not
frequencies.
> >>> >> >
> >>> >> >Any more specific suggestions on how that could
be tackled?
> >>> >> >
> >>> >> >Thanks & kind regards,
> >>> >> >
> >>> >> >Luca
> >>> >> >
> >>> >> >
> >>> >> >==> >>> >> >
> >>> >> >Michael Friendly wrote:
> >>> >> >I'm not sure I understand completely what you
want to do, but
> >>> >> >if the data were frequencies, it sounds like task
for fitting a
> >>> >> >loglinear model with the model formula
> >>> >> >
> >>> >> >~ V1*V2 + V3
> >>> >> >
> >>> >> >On 3/18/2015 2:17 AM, Luca Meyer wrote:
> >>> >> >>* Hello,
> >>> >> >*>>* I am facing a quite challenging task
(at least to me) and I
> was
> >>> >> >wondering
> >>> >> >*>* if someone could advise how R could assist
me to speed the task
> >>> >> > up.
> >>> >> >*>>* I am dealing with a dataset with 3
discrete variables and one
> >>> >> >continuous
> >>> >> >*>* variable. The discrete variables are:
> >>> >> >*>>* V1: 8 modalities
> >>> >> >*>* V2: 13 modalities
> >>> >> >*>* V3: 13 modalities
> >>> >> >*>>* The continuous variable V4 is a
decimal number always greater
> >>> >> > than
> >>> >> >zero in
> >>> >> >*>* the marginals of each of the 3 variables
but it is sometimes
> equal
> >>> >> >to zero
> >>> >> >*>* (and sometimes negative) in the joint
tables.
> >>> >> >*>>* I have got 2 files:
> >>> >> >*>>* => one with distribution of all
possible combinations of V1xV2
> >>> >> >(some of
> >>> >> >*>* which are zero or neagtive) and
> >>> >> >*>* => one with the marginal distribution
of V3.
> >>> >> >*>>* I am trying to build the long and
narrow dataset V1xV2xV3 in
> such
> >>> >> >a way
> >>> >> >*>* that each V1xV2 cell does not get modified
and V3 fits as
> closely
> >>> >> >as
> >>> >> >*>* possible to its marginal distribution.
Does it make sense?
> >>> >> >*>>* To be even more specific, my 2 input
files look like the
> >>> >> >following.
> >>> >> >*>>* FILE 1
> >>> >> >*>* V1,V2,V4
> >>> >> >*>* A, A, 24.251
> >>> >> >*>* A, B, 1.065
> >>> >> >*>* (...)
> >>> >> >*>* B, C, 0.294
> >>> >> >*>* B, D, 2.731
> >>> >> >*>* (...)
> >>> >> >*>* H, L, 0.345
> >>> >> >*>* H, M, 0.000
> >>> >> >*>>* FILE 2
> >>> >> >*>* V3, V4
> >>> >> >*>* A, 1.575
> >>> >> >*>* B, 4.294
> >>> >> >*>* C, 10.044
> >>> >> >*>* (...)
> >>> >> >*>* L, 5.123
> >>> >> >*>* M, 3.334
> >>> >> >*>>* What I need to achieve is a file such
as the following
> >>> >> >*>>* FILE 3
> >>> >> >*>* V1, V2, V3, V4
> >>> >> >*>* A, A, A, ???
> >>> >> >*>* A, A, B, ???
> >>> >> >*>* (...)
> >>> >> >*>* D, D, E, ???
> >>> >> >*>* D, D, F, ???
> >>> >> >*>* (...)
> >>> >> >*>* H, M, L, ???
> >>> >> >*>* H, M, M, ???
> >>> >> >*>>* Please notice that FILE 3 need to be
such that if I aggregate
> on
> >>> >> >V1+V2 I
> >>> >> >*>* recover exactly FILE 1 and that if I
aggregate on V3 I can
> recover
> >>> >> >a file
> >>> >> >*>* as close as possible to FILE 3 (ideally
the same file).
> >>> >> >*>>* Can anyone suggest how I could do that
with R?
> >>> >> >*>>* Thank you very much indeed for any
assistance you are able to
> >>> >> >provide.
> >>> >> >*>>* Kind regards,
> >>> >> >*>>* Luca*
> >>> >> >
> >>> >> >       [[alternative HTML version deleted]]
> >>> >> >
> >>> >> >______________________________________________
> >>> >> >R-help at r-project.org mailing list -- To
UNSUBSCRIBE and more, see
> >>> >> >https://stat.ethz.ch/mailman/listinfo/r-help
> >>> >> >PLEASE do read the posting guide
> >>> >> >http://www.R-project.org/posting-guide.html
> >>> >> >and provide commented, minimal, self-contained,
reproducible code.
> >>> >>
> >>> >>
> >>> >
> >>> >         [[alternative HTML version deleted]]
> >>> >
> >>> > ______________________________________________
> >>> > R-help at r-project.org mailing list -- To UNSUBSCRIBE
and more, see
> >>> > https://stat.ethz.ch/mailman/listinfo/r-help
> >>> > PLEASE do read the posting guide
> >>> > http://www.R-project.org/posting-guide.html
> >>> > and provide commented, minimal, self-contained,
reproducible code.
> >>
> >>
>
	[[alternative HTML version deleted]]

Bert Gunter

2015-Mar-22 14:55 UTC

head link

[R] Joining two datasets - recursive procedure?

I would have thought that this is straightforward given my previous email...

Just set z to what you want -- e,g, all B values to 29/number of B's,
and all C values to 2.567/number of C's (etc. for more categories).

A slick but sort of cheat way to do this programmatically -- in the
sense that it relies on the implementation of factor() rather than its
API -- is:

y <- f1$v3  ## to simplify the notation; could be done using with()
z <- (c(29,2.567)/table(y))[c(y)]

Then proceed to z1 as I previously described

-- Bert


Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
Clifford Stoll




On Sun, Mar 22, 2015 at 2:00 AM, Luca Meyer <lucam1968 at gmail.com>
wrote:> Hi Bert, hello R-experts,
>
> I am close to a solution but I still need one hint w.r.t. the following
> procedure (available also from
> https://www.dropbox.com/s/qhmpkkrejjkpbkx/sample_code.txt?dl=0)
>
> rm(list=ls())
>
> # this is (an extract of) the INPUT file I have:
> f1 <- structure(list(v1 = c("A", "A", "A",
"A", "A", "A", "B", "B",
"B",
> "B", "B", "B"), v2 = c("A",
"B", "C", "A", "B", "C",
"A", "B", "C", "A",
> "B", "C"), v3 = c("B", "B",
"B", "C", "C", "C", "B",
"B", "B", "C", "C",
> "C"), v4 = c(18.18530, 3.43806,0.00273, 1.42917, 1.05786,
0.00042, 2.37232,
> 3.01835, 0, 1.13430, 0.92872, 0)), .Names = c("v1",
"v2", "v3", "v4"), class
> = "data.frame", row.names = c(2L, 9L, 11L, 41L, 48L, 50L, 158L,
165L, 167L,
> 197L, 204L, 206L))
>
> # this is the procedure that Bert suggested (slightly adjusted):
> z <- rnorm(nrow(f1)) ## or anything you want
> z1 <- round(with(f1,v4 + z -ave(z,v1,v2,FUN=mean)), digits=5)
> aggregate(v4~v1*v2,f1,sum)
> aggregate(z1~v1*v2,f1,sum)
> aggregate(v4~v3,f1,sum)
> aggregate(z1~v3,f1,sum)
>
> My question to you is: how can I set z so that I can obtain specific values
> for z1-v4 in the v3 aggregation?
> In other words, how can I configure the procedure so that e.g. B=29 and
> C=2.56723 after running the procedure:
> aggregate(z1~v3,f1,sum)
>
> Thank you,
>
> Luca
>
> PS: to avoid any doubts you might have about who I am the following is my
> web page: http://lucameyer.wordpress.com/
>
>
> 2015-03-21 18:13 GMT+01:00 Bert Gunter <gunter.berton at gene.com>:
>>
>> ... or cleaner:
>>
>> z1 <- with(f1,v4 + z -ave(z,v1,v2,FUN=mean))
>>
>>
>> Just for curiosity, was this homework? (in which case I should
>> probably have not provided you an answer -- that is, assuming that I
>> HAVE provided an answer).
>>
>> Cheers,
>> Bert
>>
>> Bert Gunter
>> Genentech Nonclinical Biostatistics
>> (650) 467-7374
>>
>> "Data is not information. Information is not knowledge. And
knowledge
>> is certainly not wisdom."
>> Clifford Stoll
>>
>>
>>
>>
>> On Sat, Mar 21, 2015 at 7:53 AM, Bert Gunter <bgunter at
gene.com> wrote:
>> > z <- rnorm(nrow(f1)) ## or anything you want
>> > z1 <- f1$v4 + z - with(f1,ave(z,v1,v2,FUN=mean))
>> >
>> >
>> > aggregate(v4~v1,f1,sum)
>> > aggregate(z1~v1,f1,sum)
>> > aggregate(v4~v2,f1,sum)
>> > aggregate(z1~v2,f1,sum)
>> > aggregate(v4~v3,f1,sum)
>> > aggregate(z1~v3,f1,sum)
>> >
>> >
>> > Cheers,
>> > Bert
>> >
>> > Bert Gunter
>> > Genentech Nonclinical Biostatistics
>> > (650) 467-7374
>> >
>> > "Data is not information. Information is not knowledge. And
knowledge
>> > is certainly not wisdom."
>> > Clifford Stoll
>> >
>> >
>> >
>> >
>> > On Sat, Mar 21, 2015 at 6:49 AM, Luca Meyer <lucam1968 at
gmail.com> wrote:
>> >> Hi Bert,
>> >>
>> >> Thank you for your message. I am looking into ave() and
tapply() as you
>> >> suggested but at the same time I have prepared a example of
input and
>> >> output
>> >> files, just in case you or someone else would like to make an
attempt
>> >> to
>> >> generate a code that goes from input to output.
>> >>
>> >> Please see below or download it from
>> >> https://www.dropbox.com/s/qhmpkkrejjkpbkx/sample_code.txt?dl=0
>> >>
>> >> # this is (an extract of) the INPUT file I have:
>> >> f1 <- structure(list(v1 = c("A", "A",
"A", "A", "A", "A", "B",
"B",
>> >> "B", "B", "B", "B"),
v2 = c("A", "B", "C", "A",
"B", "C", "A",
>> >> "B", "C", "A", "B",
"C"), v3 = c("B", "B", "B",
"C", "C", "C",
>> >> "B", "B", "B", "C",
"C", "C"), v4 = c(18.18530, 3.43806,0.00273,
>> >> 1.42917,
>> >> 1.05786, 0.00042, 2.37232, 3.01835, 0, 1.13430, 0.92872,
>> >> 0)), .Names = c("v1", "v2",
"v3", "v4"), class = "data.frame",
>> >> row.names >> >> c(2L,
>> >> 9L, 11L, 41L, 48L, 50L, 158L, 165L, 167L, 197L, 204L, 206L))
>> >>
>> >> # this is (an extract of) the OUTPUT file I would like to
obtain:
>> >> f2 <- structure(list(v1 = c("A", "A",
"A", "A", "A", "A", "B",
"B",
>> >> "B", "B", "B", "B"),
v2 = c("A", "B", "C", "A",
"B", "C", "A",
>> >> "B", "C", "A", "B",
"C"), v3 = c("B", "B", "B",
"C", "C", "C",
>> >> "B", "B", "B", "C",
"C", "C"), v4 = c(17.83529, 3.43806,0.00295,
>> >> 1.77918,
>> >> 1.05786, 0.0002, 2.37232, 3.01835, 0, 1.13430, 0.92872,
>> >> 0)), .Names = c("v1", "v2",
"v3", "v4"), class = "data.frame",
>> >> row.names >> >> c(2L,
>> >> 9L, 11L, 41L, 48L, 50L, 158L, 165L, 167L, 197L, 204L, 206L))
>> >>
>> >> # please notice that while the aggregated v4 on v3 has changed
?
>> >> aggregate(f1[,c("v4")],list(f1$v3),sum)
>> >> aggregate(f2[,c("v4")],list(f2$v3),sum)
>> >>
>> >> # ? the aggregated v4 over v1xv2 has remained unchanged:
>> >> aggregate(f1[,c("v4")],list(f1$v1,f1$v2),sum)
>> >> aggregate(f2[,c("v4")],list(f2$v1,f2$v2),sum)
>> >>
>> >> Thank you very much in advance for your assitance.
>> >>
>> >> Luca
>> >>
>> >> 2015-03-21 13:18 GMT+01:00 Bert Gunter <gunter.berton at
gene.com>:
>> >>>
>> >>> 1. Still not sure what you mean, but maybe look at ?ave
and ?tapply,
>> >>> for which ave() is a wrapper.
>> >>>
>> >>> 2. You still need to heed the rest of Jeff's advice.
>> >>>
>> >>> Cheers,
>> >>> Bert
>> >>>
>> >>> Bert Gunter
>> >>> Genentech Nonclinical Biostatistics
>> >>> (650) 467-7374
>> >>>
>> >>> "Data is not information. Information is not
knowledge. And knowledge
>> >>> is certainly not wisdom."
>> >>> Clifford Stoll
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> On Sat, Mar 21, 2015 at 4:53 AM, Luca Meyer <lucam1968
at gmail.com>
>> >>> wrote:
>> >>> > Hi Jeff & other R-experts,
>> >>> >
>> >>> > Thank you for your note. I have tried myself to solve
the issue
>> >>> > without
>> >>> > success.
>> >>> >
>> >>> > Following your suggestion, I am providing a sample of
the dataset I
>> >>> > am
>> >>> > using below (also downloadble in plain text from
>> >>> >
https://www.dropbox.com/s/qhmpkkrejjkpbkx/sample_code.txt?dl=0):
>> >>> >
>> >>> > #this is an extract of the overall dataset (n=1200
cases)
>> >>> > f1 <- structure(list(v1 = c("A",
"A", "A", "A", "A", "A",
"B", "B",
>> >>> > "B", "B", "B",
"B"), v2 = c("A", "B", "C",
"A", "B", "C", "A",
>> >>> > "B", "C", "A",
"B", "C"), v3 = c("B", "B",
"B", "C", "C", "C",
>> >>> > "B", "B", "B",
"C", "C", "C"), v4 = c(18.1853007621835,
>> >>> > 3.43806581506388,
>> >>> > 0.002733567617055, 1.42917483425029,
1.05786640463504,
>> >>> > 0.000420548864162308,
>> >>> > 2.37232740842861, 3.01835841813241, 0,
1.13430282139936,
>> >>> > 0.928725667117666,
>> >>> > 0)), .Names = c("v1", "v2",
"v3", "v4"), class = "data.frame",
>> >>> > row.names
>> >>> > >> >>> > c(2L,
>> >>> > 9L, 11L, 41L, 48L, 50L, 158L, 165L, 167L, 197L, 204L,
206L))
>> >>> >
>> >>> > I need to find a automated procedure that allows me
to adjust v3
>> >>> > marginals
>> >>> > while maintaining v1xv2 marginals unchanged.
>> >>> >
>> >>> > That is: modify the v4 values you can find by
running:
>> >>> >
>> >>> > aggregate(f1[,c("v4")],list(f1$v3),sum)
>> >>> >
>> >>> > while maintaining costant the values you can find by
running:
>> >>> >
>> >>> >
aggregate(f1[,c("v4")],list(f1$v1,f1$v2),sum)
>> >>> >
>> >>> > Now does it make sense?
>> >>> >
>> >>> > Please notice I have tried to build some syntax that
tries to modify
>> >>> > values
>> >>> > within each v1xv2 combination by computing sum of v4,
row percentage
>> >>> > in
>> >>> > terms of v4, and there is where my effort is blocked.
Not really
>> >>> > sure
>> >>> > how I
>> >>> > should proceed. Any suggestion?
>> >>> >
>> >>> > Thanks,
>> >>> >
>> >>> > Luca
>> >>> >
>> >>> >
>> >>> > 2015-03-19 2:38 GMT+01:00 Jeff Newmiller <jdnewmil
at dcn.davis.ca.us>:
>> >>> >
>> >>> >> I don't understand your description. The
standard practice on this
>> >>> >> list
>> >>> >> is
>> >>> >> to provide a reproducible R example [1] of the
kind of data you are
>> >>> >> working
>> >>> >> with (and any code you have tried) to go along
with your
>> >>> >> description.
>> >>> >> In
>> >>> >> this case, that would be two dputs of your input
data frames and a
>> >>> >> dput
>> >>> >> of
>> >>> >> an output data frame (generated by hand from your
input data
>> >>> >> frame).
>> >>> >> (Probably best to not use the full number of
input values just to
>> >>> >> keep
>> >>> >> the
>> >>> >> size down.) We could then make an attempt to
generate code that
>> >>> >> goes
>> >>> >> from
>> >>> >> input to output.
>> >>> >>
>> >>> >> Of course, if you post that hard work using HTML
then it will get
>> >>> >> corrupted (much like the text below from your
earlier emails) and
>> >>> >> we
>> >>> >> won't
>> >>> >> be able to use it. Please learn to post from your
email software
>> >>> >> using
>> >>> >> plain text when corresponding with this mailing
list.
>> >>> >>
>> >>> >> [1]
>> >>> >>
>> >>> >>
>> >>> >>
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
>> >>> >>
>> >>> >>
>> >>> >>
---------------------------------------------------------------------------
>> >>> >> Jeff Newmiller                        The    
.....       .....  Go
>> >>> >> Live...
>> >>> >> DCN:<jdnewmil at dcn.davis.ca.us>       
Basics: ##.#.       ##.#.
>> >>> >> Live
>> >>> >> Go...
>> >>> >>                                       Live:  
OO#.. Dead: OO#..
>> >>> >> Playing
>> >>> >> Research Engineer (Solar/Batteries           
O.O#.       #.O#.
>> >>> >> with
>> >>> >> /Software/Embedded Controllers)              
.OO#.       .OO#.
>> >>> >> rocks...1k
>> >>> >>
>> >>> >>
>> >>> >>
---------------------------------------------------------------------------
>> >>> >> Sent from my phone. Please excuse my brevity.
>> >>> >>
>> >>> >> On March 18, 2015 9:05:37 AM PDT, Luca Meyer
<lucam1968 at gmail.com>
>> >>> >> wrote:
>> >>> >> >Thanks for you input Michael,
>> >>> >> >
>> >>> >> >The continuous variable I have measures
quantities (down to the
>> >>> >> > 3rd
>> >>> >> >decimal level) so unfortunately are not
frequencies.
>> >>> >> >
>> >>> >> >Any more specific suggestions on how that
could be tackled?
>> >>> >> >
>> >>> >> >Thanks & kind regards,
>> >>> >> >
>> >>> >> >Luca
>> >>> >> >
>> >>> >> >
>> >>> >> >==>> >>> >> >
>> >>> >> >Michael Friendly wrote:
>> >>> >> >I'm not sure I understand completely what
you want to do, but
>> >>> >> >if the data were frequencies, it sounds like
task for fitting a
>> >>> >> >loglinear model with the model formula
>> >>> >> >
>> >>> >> >~ V1*V2 + V3
>> >>> >> >
>> >>> >> >On 3/18/2015 2:17 AM, Luca Meyer wrote:
>> >>> >> >>* Hello,
>> >>> >> >*>>* I am facing a quite challenging
task (at least to me) and I
>> >>> >> > was
>> >>> >> >wondering
>> >>> >> >*>* if someone could advise how R could
assist me to speed the
>> >>> >> > task
>> >>> >> > up.
>> >>> >> >*>>* I am dealing with a dataset with 3
discrete variables and one
>> >>> >> >continuous
>> >>> >> >*>* variable. The discrete variables are:
>> >>> >> >*>>* V1: 8 modalities
>> >>> >> >*>* V2: 13 modalities
>> >>> >> >*>* V3: 13 modalities
>> >>> >> >*>>* The continuous variable V4 is a
decimal number always greater
>> >>> >> > than
>> >>> >> >zero in
>> >>> >> >*>* the marginals of each of the 3
variables but it is sometimes
>> >>> >> > equal
>> >>> >> >to zero
>> >>> >> >*>* (and sometimes negative) in the joint
tables.
>> >>> >> >*>>* I have got 2 files:
>> >>> >> >*>>* => one with distribution of all
possible combinations of
>> >>> >> > V1xV2
>> >>> >> >(some of
>> >>> >> >*>* which are zero or neagtive) and
>> >>> >> >*>* => one with the marginal
distribution of V3.
>> >>> >> >*>>* I am trying to build the long and
narrow dataset V1xV2xV3 in
>> >>> >> > such
>> >>> >> >a way
>> >>> >> >*>* that each V1xV2 cell does not get
modified and V3 fits as
>> >>> >> > closely
>> >>> >> >as
>> >>> >> >*>* possible to its marginal distribution.
Does it make sense?
>> >>> >> >*>>* To be even more specific, my 2
input files look like the
>> >>> >> >following.
>> >>> >> >*>>* FILE 1
>> >>> >> >*>* V1,V2,V4
>> >>> >> >*>* A, A, 24.251
>> >>> >> >*>* A, B, 1.065
>> >>> >> >*>* (...)
>> >>> >> >*>* B, C, 0.294
>> >>> >> >*>* B, D, 2.731
>> >>> >> >*>* (...)
>> >>> >> >*>* H, L, 0.345
>> >>> >> >*>* H, M, 0.000
>> >>> >> >*>>* FILE 2
>> >>> >> >*>* V3, V4
>> >>> >> >*>* A, 1.575
>> >>> >> >*>* B, 4.294
>> >>> >> >*>* C, 10.044
>> >>> >> >*>* (...)
>> >>> >> >*>* L, 5.123
>> >>> >> >*>* M, 3.334
>> >>> >> >*>>* What I need to achieve is a file
such as the following
>> >>> >> >*>>* FILE 3
>> >>> >> >*>* V1, V2, V3, V4
>> >>> >> >*>* A, A, A, ???
>> >>> >> >*>* A, A, B, ???
>> >>> >> >*>* (...)
>> >>> >> >*>* D, D, E, ???
>> >>> >> >*>* D, D, F, ???
>> >>> >> >*>* (...)
>> >>> >> >*>* H, M, L, ???
>> >>> >> >*>* H, M, M, ???
>> >>> >> >*>>* Please notice that FILE 3 need to
be such that if I aggregate
>> >>> >> > on
>> >>> >> >V1+V2 I
>> >>> >> >*>* recover exactly FILE 1 and that if I
aggregate on V3 I can
>> >>> >> > recover
>> >>> >> >a file
>> >>> >> >*>* as close as possible to FILE 3
(ideally the same file).
>> >>> >> >*>>* Can anyone suggest how I could do
that with R?
>> >>> >> >*>>* Thank you very much indeed for any
assistance you are able to
>> >>> >> >provide.
>> >>> >> >*>>* Kind regards,
>> >>> >> >*>>* Luca*
>> >>> >> >
>> >>> >> >       [[alternative HTML version deleted]]
>> >>> >> >
>> >>> >>
>______________________________________________
>> >>> >> >R-help at r-project.org mailing list -- To
UNSUBSCRIBE and more, see
>> >>> >> >https://stat.ethz.ch/mailman/listinfo/r-help
>> >>> >> >PLEASE do read the posting guide
>> >>> >> >http://www.R-project.org/posting-guide.html
>> >>> >> >and provide commented, minimal,
self-contained, reproducible code.
>> >>> >>
>> >>> >>
>> >>> >
>> >>> >         [[alternative HTML version deleted]]
>> >>> >
>> >>> > ______________________________________________
>> >>> > R-help at r-project.org mailing list -- To
UNSUBSCRIBE and more, see
>> >>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> >>> > PLEASE do read the posting guide
>> >>> > http://www.R-project.org/posting-guide.html
>> >>> > and provide commented, minimal, self-contained,
reproducible code.
>> >>
>> >>
>
>

R help - Mar 2015 - Joining two datasets - recursive procedure?

[R] Joining two datasets - recursive procedure?

[R] Joining two datasets - recursive procedure?

[R] Joining two datasets - recursive procedure?