I'm not sure I completely understand what you want to do, but
if the data were frequencies, this sounds like a task for fitting a
loglinear model with the model formula

    ~ V1*V2 + V3
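
For a model of the form ~ V1*V2 + V3 (V3 independent of the V1xV2
combination), the fitted value for a cell is the V1xV2 value times the
V3 share of the total, so no iteration is needed. Below is a minimal
sketch of that arithmetic, written in Python only to show the
calculation. The toy numbers and the names expand_joint and aggregate
are hypothetical, not taken from your files. A loglinear model strictly
assumes non-negative frequencies; with zero or negative cells the
product below still reproduces the V1xV2 sums, but it is no longer a
proper model fit.

```python
from collections import defaultdict


def expand_joint(joint_v1v2, marginal_v3):
    """Spread each V1xV2 value over V3 in proportion to the V3 marginal.

    joint_v1v2:  dict mapping (v1, v2) -> value   (like FILE 1)
    marginal_v3: dict mapping v3 -> value         (like FILE 2)
    Returns a dict mapping (v1, v2, v3) -> value  (like FILE 3).
    """
    total_v3 = sum(marginal_v3.values())
    if total_v3 == 0:
        raise ValueError("V3 marginal sums to zero; shares are undefined")
    # Shares sum to 1, so every V1xV2 cell is preserved when re-aggregated.
    shares = {v3: value / total_v3 for v3, value in marginal_v3.items()}
    return {
        (v1, v2, v3): cell * share
        for (v1, v2), cell in joint_v1v2.items()
        for v3, share in shares.items()
    }


def aggregate(full, positions):
    """Sum the three-way table over all key positions not in `positions`."""
    out = defaultdict(float)
    for key, value in full.items():
        out[tuple(key[p] for p in positions)] += value
    return dict(out)


# Hypothetical toy inputs, including a zero and a negative cell.
joint = {("A", "A"): 6.0, ("A", "B"): -1.0, ("B", "A"): 0.0, ("B", "B"): 5.0}
marginal = {"X": 2.0, "Y": 3.0, "Z": 5.0}

full = expand_joint(joint, marginal)
back_to_v1v2 = aggregate(full, (0, 1))  # should match `joint`
back_to_v3 = aggregate(full, (2,))      # matches `marginal` if totals agree
```

Aggregating over V3 returns the V1xV2 table up to floating-point
rounding. Aggregating over V1 and V2 returns the V3 marginal only when
both input files have the same grand total; otherwise the result is the
V3 marginal rescaled to the V1xV2 total.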

On 3/18/2015 2:17 AM, Luca Meyer wrote:
> Hello,
>
> I am facing a quite challenging task (at least to me) and I was wondering
> if someone could advise how R could assist me to speed the task up.
>
> I am dealing with a dataset with 3 discrete variables and one continuous
> variable. The discrete variables are:
>
> V1: 8 modalities
> V2: 13 modalities
> V3: 13 modalities
>
> The continuous variable V4 is a decimal number always greater than zero in
> the marginals of each of the 3 variables but it is sometimes equal to zero
> (and sometimes negative) in the joint tables.
>
> I have got 2 files:
>
> => one with distribution of all possible combinations of V1xV2 (some of
> which are zero or negative) and
> => one with the marginal distribution of V3.
>
> I am trying to build the long and narrow dataset V1xV2xV3 in such a way
> that each V1xV2 cell does not get modified and V3 fits as closely as
> possible to its marginal distribution. Does it make sense?
>
> To be even more specific, my 2 input files look like the following.
>
> FILE 1
> V1,V2,V4
> A, A, 24.251
> A, B, 1.065
> (...)
> B, C, 0.294
> B, D, 2.731
> (...)
> H, L, 0.345
> H, M, 0.000
>
> FILE 2
> V3, V4
> A, 1.575
> B, 4.294
> C, 10.044
> (...)
> L, 5.123
> M, 3.334
>
> What I need to achieve is a file such as the following
>
> FILE 3
> V1, V2, V3, V4
> A, A, A, ???
> A, A, B, ???
> (...)
> D, D, E, ???
> D, D, F, ???
> (...)
> H, M, L, ???
> H, M, M, ???
>
> Please notice that FILE 3 needs to be such that if I aggregate on V1+V2 I
> recover exactly FILE 1 and that if I aggregate on V3 I can recover a file
> as close as possible to FILE 2 (ideally the same file).
>
> Can anyone suggest how I could do that with R?
>
> Thank you very much indeed for any assistance you are able to provide.
>
> Kind regards,
>
> Luca