thr3ads.net - R help - [R] Joining two datasets - recursive procedure? [Mar 2015]


Luca Meyer

2015-Mar-21 11:53 UTC

[R] Joining two datasets - recursive procedure?

Hi Jeff & other R-experts,

Thank you for your note. I have tried to solve the issue myself, without
success.

Following your suggestion, I am providing a sample of the dataset I am
using below (also downloadable in plain text from
https://www.dropbox.com/s/qhmpkkrejjkpbkx/sample_code.txt?dl=0):

# this is an extract of the overall dataset (n=1200 cases)
f1 <- structure(list(
  v1 = c("A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B"),
  v2 = c("A", "B", "C", "A", "B", "C", "A", "B", "C", "A", "B", "C"),
  v3 = c("B", "B", "B", "C", "C", "C", "B", "B", "B", "C", "C", "C"),
  v4 = c(18.1853007621835, 3.43806581506388, 0.002733567617055,
         1.42917483425029, 1.05786640463504, 0.000420548864162308,
         2.37232740842861, 3.01835841813241, 0,
         1.13430282139936, 0.928725667117666, 0)),
  .Names = c("v1", "v2", "v3", "v4"), class = "data.frame",
  row.names = c(2L, 9L, 11L, 41L, 48L, 50L, 158L, 165L, 167L, 197L, 204L, 206L))

I need to find an automated procedure that allows me to adjust the v3 marginals
while keeping the v1xv2 marginals unchanged.

That is: modify the v4 values you can find by running:

aggregate(f1[,c("v4")],list(f1$v3),sum)

while keeping constant the values you can find by running:

aggregate(f1[,c("v4")],list(f1$v1,f1$v2),sum)

Does it make sense now?

Please note that I have tried to write some code that modifies the values
within each v1xv2 combination by computing the sum of v4 and the row
percentages in terms of v4, and that is where I am stuck. I am not really
sure how I should proceed. Any suggestions?

Thanks,

Luca
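
Editorial sketch of the step described above (the sum of v4 within each v1xv2 combination, then each row's share of that sum). It is a minimal illustration, not the poster's own code: the data frame is a copy of the extract rounded to five decimals, and the names cell_totals, f1s, cell_total and share are assumptions introduced here.

```r
# Copy of the f1 extract, rounded to five decimals (illustrative only).
f1 <- data.frame(
  v1 = rep(c("A", "B"), each = 6),
  v2 = rep(c("A", "B", "C"), times = 4),
  v3 = rep(rep(c("B", "C"), each = 3), times = 2),
  v4 = c(18.18530, 3.43806, 0.00273, 1.42917, 1.05786, 0.00042,
         2.37232, 3.01835, 0, 1.13430, 0.92872, 0),
  stringsAsFactors = FALSE
)

# v1 x v2 totals: the margin that has to stay fixed.
cell_totals <- aggregate(v4 ~ v1 + v2, data = f1, FUN = sum)
names(cell_totals)[3] <- "cell_total"

# Attach each row's cell total, then compute the row's share of it.
# Cells whose total is zero get a share of 0 instead of 0/0 = NaN.
f1s <- merge(f1, cell_totals, by = c("v1", "v2"))
f1s$share <- ifelse(f1s$cell_total == 0, 0, f1s$v4 / f1s$cell_total)
```

Within every v1xv2 cell with a non-zero total the shares sum to 1, so any adjustment that only changes the shares leaves the v1xv2 totals untouched.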


2015-03-19 2:38 GMT+01:00 Jeff Newmiller <jdnewmil at dcn.davis.ca.us>:
> I don't understand your description. The standard practice on this list is
> to provide a reproducible R example [1] of the kind of data you are working
> with (and any code you have tried) to go along with your description. In
> this case, that would be two dputs of your input data frames and a dput of
> an output data frame (generated by hand from your input data frame).
> (Probably best to not use the full number of input values just to keep the
> size down.) We could then make an attempt to generate code that goes from
> input to output.
>
> Of course, if you post that hard work using HTML then it will get
> corrupted (much like the text below from your earlier emails) and we won't
> be able to use it. Please learn to post from your email software using
> plain text when corresponding with this mailing list.
>
> [1] http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
> ---------------------------------------------------------------------------
> Jeff Newmiller                        The     .....       .....  Go Live...
> DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>                                       Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> ---------------------------------------------------------------------------
> Sent from my phone. Please excuse my brevity.
>
> On March 18, 2015 9:05:37 AM PDT, Luca Meyer <lucam1968 at gmail.com> wrote:
> >Thanks for your input Michael,
> >
> >The continuous variable I have measures quantities (down to the 3rd
> >decimal level), so unfortunately they are not frequencies.
> >
> >Any more specific suggestions on how that could be tackled?
> >
> >Thanks & kind regards,
> >
> >Luca
> >
> >
> >Michael Friendly wrote:
> >I'm not sure I understand completely what you want to do, but
> >if the data were frequencies, it sounds like a task for fitting a
> >loglinear model with the model formula
> >
> >~ V1*V2 + V3
> >
> >On 3/18/2015 2:17 AM, Luca Meyer wrote:
> >> Hello,
> >>
> >> I am facing a quite challenging task (at least to me) and I was wondering
> >> if someone could advise how R could assist me to speed the task up.
> >>
> >> I am dealing with a dataset with 3 discrete variables and one continuous
> >> variable. The discrete variables are:
> >>
> >> V1: 8 modalities
> >> V2: 13 modalities
> >> V3: 13 modalities
> >>
> >> The continuous variable V4 is a decimal number always greater than zero in
> >> the marginals of each of the 3 variables but it is sometimes equal to zero
> >> (and sometimes negative) in the joint tables.
> >>
> >> I have got 2 files:
> >>
> >> => one with the distribution of all possible combinations of V1xV2 (some of
> >> which are zero or negative) and
> >> => one with the marginal distribution of V3.
> >>
> >> I am trying to build the long and narrow dataset V1xV2xV3 in such a way
> >> that each V1xV2 cell does not get modified and V3 fits as closely as
> >> possible to its marginal distribution. Does it make sense?
> >>
> >> To be even more specific, my 2 input files look like the following.
> >>
> >> FILE 1
> >> V1,V2,V4
> >> A, A, 24.251
> >> A, B, 1.065
> >> (...)
> >> B, C, 0.294
> >> B, D, 2.731
> >> (...)
> >> H, L, 0.345
> >> H, M, 0.000
> >>
> >> FILE 2
> >> V3, V4
> >> A, 1.575
> >> B, 4.294
> >> C, 10.044
> >> (...)
> >> L, 5.123
> >> M, 3.334
> >>
> >> What I need to achieve is a file such as the following
> >>
> >> FILE 3
> >> V1, V2, V3, V4
> >> A, A, A, ???
> >> A, A, B, ???
> >> (...)
> >> D, D, E, ???
> >> D, D, F, ???
> >> (...)
> >> H, M, L, ???
> >> H, M, M, ???
> >>
> >> Please notice that FILE 3 needs to be such that if I aggregate on V1+V2 I
> >> recover exactly FILE 1 and that if I aggregate on V3 I can recover a file
> >> as close as possible to FILE 2 (ideally the same file).
> >>
> >> Can anyone suggest how I could do that with R?
> >>
> >> Thank you very much indeed for any assistance you are able to provide.
> >>
> >> Kind regards,
> >>
> >> Luca
> >
> >       [[alternative HTML version deleted]]
> >
> >______________________________________________
> >R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.

Bert Gunter

2015-Mar-21 12:18 UTC


1. Still not sure what you mean, but maybe look at ?ave and ?tapply,
for which ave() is a wrapper.

2. You still need to heed the rest of Jeff's advice.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
Clifford Stoll
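
Editorial sketch of the two functions suggested above, applied to a rounded copy of the poster's f1 extract. The object names totals and cell_total are assumptions introduced here; nothing in the thread prescribes them.

```r
# Copy of the f1 extract, rounded to five decimals (illustrative only).
f1 <- data.frame(
  v1 = rep(c("A", "B"), each = 6),
  v2 = rep(c("A", "B", "C"), times = 4),
  v3 = rep(rep(c("B", "C"), each = 3), times = 2),
  v4 = c(18.18530, 3.43806, 0.00273, 1.42917, 1.05786, 0.00042,
         2.37232, 3.01835, 0, 1.13430, 0.92872, 0),
  stringsAsFactors = FALSE
)

# tapply() returns one value per group: here a v1 x v2 matrix of v4 totals.
totals <- tapply(f1$v4, list(f1$v1, f1$v2), sum)

# ave() returns one value per row: each row receives the total of its own
# v1 x v2 group, so the result lines up with the rows of f1.
f1$cell_total <- ave(f1$v4, f1$v1, f1$v2, FUN = sum)
```

Because ave() keeps the row layout, its result can be used directly in row-wise arithmetic such as f1$v4 / f1$cell_total, whereas the tapply() matrix would first have to be looked up by group.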





Luca Meyer

2015-Mar-21 13:49 UTC


Hi Bert,

Thank you for your message. I am looking into ave() and tapply() as you
suggested, but in the meantime I have prepared an example of input and
output files, in case you or someone else would like to try to write
code that goes from input to output.

Please see below or download it from
https://www.dropbox.com/s/qhmpkkrejjkpbkx/sample_code.txt?dl=0

# this is (an extract of) the INPUT file I have:
f1 <- structure(list(
  v1 = c("A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B"),
  v2 = c("A", "B", "C", "A", "B", "C", "A", "B", "C", "A", "B", "C"),
  v3 = c("B", "B", "B", "C", "C", "C", "B", "B", "B", "C", "C", "C"),
  v4 = c(18.18530, 3.43806, 0.00273, 1.42917, 1.05786, 0.00042,
         2.37232, 3.01835, 0, 1.13430, 0.92872, 0)),
  .Names = c("v1", "v2", "v3", "v4"), class = "data.frame",
  row.names = c(2L, 9L, 11L, 41L, 48L, 50L, 158L, 165L, 167L, 197L, 204L, 206L))

# this is (an extract of) the OUTPUT file I would like to obtain:
f2 <- structure(list(
  v1 = c("A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B"),
  v2 = c("A", "B", "C", "A", "B", "C", "A", "B", "C", "A", "B", "C"),
  v3 = c("B", "B", "B", "C", "C", "C", "B", "B", "B", "C", "C", "C"),
  v4 = c(17.83529, 3.43806, 0.00295, 1.77918, 1.05786, 0.0002,
         2.37232, 3.01835, 0, 1.13430, 0.92872, 0)),
  .Names = c("v1", "v2", "v3", "v4"), class = "data.frame",
  row.names = c(2L, 9L, 11L, 41L, 48L, 50L, 158L, 165L, 167L, 197L, 204L, 206L))

# please notice that while the aggregated v4 on v3 has changed ...
aggregate(f1[,c("v4")],list(f1$v3),sum)
aggregate(f2[,c("v4")],list(f2$v3),sum)

# ... the aggregated v4 over v1xv2 has remained unchanged:
aggregate(f1[,c("v4")],list(f1$v1,f1$v2),sum)
aggregate(f2[,c("v4")],list(f2$v1,f2$v2),sum)

Thank you very much in advance for your assistance.

Luca
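
Editorial sketch of one possible way to go from the input to an output with the requested properties: iterative proportional fitting, which alternately rescales v4 to hit the v3 totals and then restores the v1xv2 totals. This is an assumption about a suitable technique, not a solution given in the thread. The target v3 totals are taken from the f2 example above; the names target_v3, cell_total, rescale and f3 are hypothetical. The sketch assumes non-negative v4 values and targets that add up to the same grand total as the input, so it does not cover the negative cells mentioned for the full dataset.

```r
# Copy of the f1 input extract from the post.
f1 <- data.frame(
  v1 = rep(c("A", "B"), each = 6),
  v2 = rep(c("A", "B", "C"), times = 4),
  v3 = rep(rep(c("B", "C"), each = 3), times = 2),
  v4 = c(18.18530, 3.43806, 0.00273, 1.42917, 1.05786, 0.00042,
         2.37232, 3.01835, 0, 1.13430, 0.92872, 0),
  stringsAsFactors = FALSE
)

# Assumed target v3 totals: the v3 sums of the f2 example in the post.
target_v3 <- c(B = 26.66697, C = 4.90026)

# v1 x v2 totals that must be preserved (one value per row).
cell_total <- ave(f1$v4, f1$v1, f1$v2, FUN = sum)

# Scale x so that groups summing to `current` sum to `target` instead;
# groups that currently sum to zero are left at zero.
rescale <- function(x, current, target) {
  ifelse(current == 0, 0, x * target / current)
}

x <- f1$v4
for (i in seq_len(1000)) {
  # Step 1: match the v3 margin.
  x <- rescale(x, ave(x, f1$v3, FUN = sum), target_v3[f1$v3])
  # Step 2: restore the v1 x v2 totals.
  x <- rescale(x, ave(x, f1$v1, f1$v2, FUN = sum), cell_total)
  # Stop once the v3 margin still holds after step 2.
  if (max(abs(tapply(x, f1$v3, sum) - target_v3)) < 1e-8) break
}

f3 <- transform(f1, v4 = x)
```

Many tables satisfy both sets of totals, so f3 will generally differ from the hand-made f2: the fitting spreads the adjustment over all v1xv2 cells in proportion to their current values, whereas f2 changes only two of them.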

