On Thu, 6 Jun 2002, vito muggeo wrote:
> Hi there,
> This is not strictly an R question; I apologize for this.
>
> I'm interested in simulating the sampling distribution of the LRT for
> testing trend among the levels of some categorical variable X in a
> regression model
> (in practice this is achieved by assigning scores to the levels of X and
> fitting the resulting numeric variable).
> To simulate the sampling distribution the steps are:
>
> 1. Simulate the model *without* trend
> 2. For every sample, compare the models with and without the "X by score"
> variable.
>
> My question is: which scores should I use? It is well known that the scores
> affect the test, so which scores do I have to use to get the LRT? Can different
> values lead to different null distributions? Any suggestions?
The maximum likelihood ratio test of constant vs non-decreasing trend is
*not* equivalent to any set of scores and has a different null
distribution (asymptotically a mixture of chi-squared variables with
different degrees of freedom).
That's why most people instead assign scores, which is much simpler and
has good power against most interesting alternatives. If you want to use
a test with scores then use the scores you want to use.
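
As an illustration of the fixed-score case, here is a minimal simulation
sketch. It is not from the original exchange and assumes a deliberately
simple setting: k groups of equal size, normal responses with known
variance 1, and no other covariates. The function names (trend_stat,
simulate_null, rejection_rate) are made up for this example. In this
setting the LRT for adding the scored variable has a closed form, and for
any fixed non-constant score vector its null distribution is chi-squared
with 1 degree of freedom. The scores change the value of the statistic on
a given dataset, not its null distribution.

```python
import random


def trend_stat(group_means, n_per_group, scores):
    """LRT statistic for a linear trend in fixed `scores`.

    Assumes normal data with known variance 1 and equal group sizes,
    so the statistic is n * (sum c_i * ybar_i)^2 / sum c_i^2,
    where c_i are the centered scores.
    """
    k = len(scores)
    s_bar = sum(scores) / k
    centered = [s - s_bar for s in scores]
    contrast = sum(c * m for c, m in zip(centered, group_means))
    return n_per_group * contrast ** 2 / sum(c * c for c in centered)


def simulate_null(scores, n_per_group=20, n_sim=5000, seed=1):
    """Simulate the statistic when all groups share the same mean (no trend)."""
    rng = random.Random(seed)
    k = len(scores)
    stats = []
    for _ in range(n_sim):
        means = [
            sum(rng.gauss(0.0, 1.0) for _ in range(n_per_group)) / n_per_group
            for _ in range(k)
        ]
        stats.append(trend_stat(means, n_per_group, scores))
    return stats


def rejection_rate(stats, crit=3.841):
    """Fraction of simulated statistics above the 5% chi-squared(1) critical value."""
    return sum(s > crit for s in stats) / len(stats)


if __name__ == "__main__":
    for scores in ([1, 2, 3, 4], [1, 2, 3, 10]):
        rate = rejection_rate(simulate_null(scores))
        print(scores, "rejection rate under the null:", round(rate, 3))
```

Both score sets should reject at close to the nominal 5% rate under the
null. Shifting or rescaling the scores (say 10, 20, 30, 40 instead of
1, 2, 3, 4) leaves the statistic unchanged, so only the relative spacing
matters.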
The LRT is equivalent to using the best non-decreasing set of scores
(found by isotonic regression) for each dataset, and the reason for the
strange limiting distribution is to take account of this adaptive choice
of scores.
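
To make the adaptive choice concrete, here is a rough sketch of the
order-restricted LRT under the same simplifying assumptions as before
(equal group sizes, normal responses, known variance 1). These
assumptions and the function names (pava, isotonic_lrt,
simulate_isotonic_null) are illustrative only. The isotonic fit is
computed with the pool-adjacent-violators algorithm, and the statistic
is the weighted squared distance between that fit and the grand mean.

```python
import random


def pava(values, weights):
    """Pool-adjacent-violators: weighted least-squares non-decreasing fit."""
    blocks = []  # each block holds [mean, total weight, number of points]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            w_tot = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / w_tot, w_tot, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def isotonic_lrt(group_means, n_per_group):
    """LRT of constant vs non-decreasing means (known variance 1, equal n)."""
    k = len(group_means)
    grand = sum(group_means) / k
    fit = pava(group_means, [n_per_group] * k)
    return n_per_group * sum((f - grand) ** 2 for f in fit)


def simulate_isotonic_null(k=4, n_per_group=20, n_sim=4000, seed=1):
    """Simulate the statistic when all k group means are equal."""
    rng = random.Random(seed)
    sd = (1.0 / n_per_group) ** 0.5  # standard deviation of a group mean
    stats = []
    for _ in range(n_sim):
        means = [rng.gauss(0.0, sd) for _ in range(k)]
        stats.append(isotonic_lrt(means, n_per_group))
    return stats


if __name__ == "__main__":
    stats = simulate_isotonic_null()
    at_zero = sum(s < 1e-9 for s in stats) / len(stats)
    above = sum(s > 3.841 for s in stats) / len(stats)
    print("fraction with statistic equal to zero:", round(at_zero, 3))
    print("fraction above the chi-squared(1) 5% cutoff:", round(above, 3))
```

Whenever the isotonic fit pools all groups into one level, the statistic
is exactly zero, so the simulated null distribution has a point mass at
zero (about 1/k of the draws in this equal-weight setting). That point
mass is one visible sign of the chi-squared mixture: comparing against a
single chi-squared reference distribution would not give the right
rejection rate.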
-thomas
Thomas Lumley Asst. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle