thr3ads.net - R help - [R] Three-component Negative Binomial Mixture: R code [Nov 2016]

If this information is useful, please help other people find it:
Share via:

danilo.carita at uniparthenope.it

2016-Nov-08 15:18 UTC

[R] Three-component Negative Binomial Mixture: R code

I tried the function flexmix() with the driver FLXMRnegbin() with two  
components first, in order to compare its results with those provided  
by my function mixnbinom(). In particular, I ran the following code:

> fm0 <- flexmix(y ~ 1, data = data.frame(y), k = 2, model =
FLXMRnegbin())

where "y" is my vector of counts. The previous function provided me  
the following parameters:

>                    Comp.1   Comp.2
> coef.(Intercept) 1.2746536 1.788578
> theta            0.1418201 5.028766

with priors 0.342874 and 0.657126, respectively. I assume that the  
coefficients "Intercept" represent the two means of the model (mu1 and
mu2), while the "theta" coefficients are the size parameters (size1  
and size2).

Unfortunately, unlike my function mixnbinom(), the model computed with  
flexmix() did not provide a good fit to my data (p-value ~0).

Is there something wrong in the process above?

Achim Zeileis <Achim.Zeileis at uibk.ac.at> ha scritto:
> On Mon, 7 Nov 2016, danilo.carita at uniparthenope.it wrote:
>
>> I need a function for R software which computes a mixture of Negative
>> Binomial distributions with at least three components.
>
> The package "countreg" on R-Forge provides a driver FLXMRnegbin()
> that can be combined with the "flexmix" package (i.e., functions
> flexmix() and stepFlexmix()). The manual page provides some worked  
> illustrations in example("FLXMRnegbin", package =
"countreg").
>
> Note that the driver is mainly designed for negative binomial  
> _regression_ models. But if you just regress on a constant (y ~ 1)  
> you can also get negative binomial mixture distributions without  
> covariates.

-------------------------------------------------------------
Danilo Carit?

PhD Candidate
University of Naples "Parthenope"
Dipartimento di Studi Aziendali e Quantitativi
via G. Parisi, 13, 80132 Napoli - Italy

Achim Zeileis

2016-Nov-08 16:05 UTC

head link

[R] Three-component Negative Binomial Mixture: R code

On Tue, 8 Nov 2016, danilo.carita at uniparthenope.it wrote:
> I tried the function flexmix() with the driver FLXMRnegbin() with two 
> components first, in order to compare its results with those provided by my
> function mixnbinom(). In particular, I ran the following code:
>
>
>> fm0 <- flexmix(y ~ 1, data = data.frame(y), k = 2, model =
FLXMRnegbin())
>
>
> where "y" is my vector of counts. The previous function provided
me the
> following parameters:
>
>
>>                   Comp.1   Comp.2
>> coef.(Intercept) 1.2746536 1.788578
>> theta            0.1418201 5.028766
>
>
> with priors 0.342874 and 0.657126, respectively. I assume that the 
> coefficients "Intercept" represent the two means of the model
(mu1 and mu2),
No, a log link is employed, i.e., exp(1.2746536) and exp(1.788578) are the 
means.
> while the "theta" coefficients are the size parameters (size1 and
size2).
Yes.
> Unfortunately, unlike my function mixnbinom(), the model computed with 
> flexmix() did not provide a good fit to my data (p-value ~0).
>
> Is there something wrong in the process above?
Hard to say without a reproducible example. Using parameter values similar 
to the ones you cite above, the following seems to do a reasonable job:

## packages
library("countreg")
library("flexmix")

## artificial data from two NB distributions:
## 1/3 is NB(mu = 3.5, theta = 0.2) and
## 2/3 is NB(mu = 6.0, theta = 5.0)
set.seed(1)
y <- c(rnbinom(200, mu = 3.5, size = 0.2), rnbinom(400, mu = 6, size = 5))

## fit 2-component mixture model
set.seed(1)
fm <- flexmix(y ~ 1, k = 2, model = FLXMRnegbin())

## inspect estimated parameters -> look acceptable
parameters(fm)
exp(parameters(fm)[1,])

My experience was that finding good starting values may be a problem for 
flexmix(). So maybe setting these in some better way would be beneficial.
> Achim Zeileis <Achim.Zeileis at uibk.ac.at> ha scritto:
>
>> On Mon, 7 Nov 2016, danilo.carita at uniparthenope.it wrote:
>> 
>>> I need a function for R software which computes a mixture of
Negative
>>> Binomial distributions with at least three components.
>> 
>> The package "countreg" on R-Forge provides a driver
FLXMRnegbin() that can
>> be combined with the "flexmix" package (i.e., functions
flexmix() and
>> stepFlexmix()). The manual page provides some worked illustrations in 
>> example("FLXMRnegbin", package = "countreg").
>> 
>> Note that the driver is mainly designed for negative binomial
_regression_
>> models. But if you just regress on a constant (y ~ 1) you can also get 
>> negative binomial mixture distributions without covariates.
>
>
> -------------------------------------------------------------
> Danilo Carit?
>
> PhD Candidate
> University of Naples "Parthenope"
> Dipartimento di Studi Aziendali e Quantitativi
> via G. Parisi, 13, 80132 Napoli - Italy
> -------------------------------------------------------------
>

danilo.carita at uniparthenope.it

2016-Nov-10 09:55 UTC

head link

[R] Three-component Negative Binomial Mixture: R code

Thank you for your hints, now the goodness of fit test provides me  
good results, but surprisingly for me the three-component model turns  
out to be worse than the two-component one (indeed, I focused on the  
three-component mixture because the two-component one exhibits a low  
p-value).

In addition, I have noticed that for some data the function fails to  
find good starting values, as you have mentioned in your previuous  
answer. The problem is that the driver FLXMRnegbin() allows to specify  
only the theta parameter (and only one value, even in the event of  
mixtures of two or more components).

I have read the description of flexmix() function too, but it seems  
that it does not allow to set starting values for the parameters of  
the model. Am I right? Or is there a way to do it?

Achim Zeileis <Achim.Zeileis at uibk.ac.at> ha scritto:
> On Tue, 8 Nov 2016, danilo.carita at uniparthenope.it wrote:
>
>> I tried the function flexmix() with the driver FLXMRnegbin() with  
>> two components first, in order to compare its results with those  
>> provided by my function mixnbinom(). In particular, I ran the  
>> following code:
>>
>>
>>> fm0 <- flexmix(y ~ 1, data = data.frame(y), k = 2, model =
FLXMRnegbin())
>>
>>
>> where "y" is my vector of counts. The previous function
provided me
>> the following parameters:
>>
>>
>>>                  Comp.1   Comp.2
>>> coef.(Intercept) 1.2746536 1.788578
>>> theta            0.1418201 5.028766
>>
>>
>> with priors 0.342874 and 0.657126, respectively. I assume that the  
>> coefficients "Intercept" represent the two means of the model
(mu1
>> and mu2),
>
> No, a log link is employed, i.e., exp(1.2746536) and exp(1.788578)  
> are the means.
>
>> while the "theta" coefficients are the size parameters (size1
and size2).
>
> Yes.
>
>> Unfortunately, unlike my function mixnbinom(), the model computed  
>> with flexmix() did not provide a good fit to my data (p-value ~0).
>>
>> Is there something wrong in the process above?
>
> Hard to say without a reproducible example. Using parameter values  
> similar to the ones you cite above, the following seems to do a  
> reasonable job:
>
> ## packages
> library("countreg")
> library("flexmix")
>
> ## artificial data from two NB distributions:
> ## 1/3 is NB(mu = 3.5, theta = 0.2) and
> ## 2/3 is NB(mu = 6.0, theta = 5.0)
> set.seed(1)
> y <- c(rnbinom(200, mu = 3.5, size = 0.2), rnbinom(400, mu = 6, size =
5))
>
> ## fit 2-component mixture model
> set.seed(1)
> fm <- flexmix(y ~ 1, k = 2, model = FLXMRnegbin())
>
> ## inspect estimated parameters -> look acceptable
> parameters(fm)
> exp(parameters(fm)[1,])
>
> My experience was that finding good starting values may be a problem  
> for flexmix(). So maybe setting these in some better way would be  
> beneficial.

-------------------------------------------------------------
Danilo Carit?

PhD Candidate
University of Naples "Parthenope"
Dipartimento di Studi Aziendali e Quantitativi
via G. Parisi, 13, 80132 Napoli - Italy

R help - Nov 2016 - Three-component Negative Binomial Mixture: R code

[R] Three-component Negative Binomial Mixture: R code

[R] Three-component Negative Binomial Mixture: R code

[R] Three-component Negative Binomial Mixture: R code