thr3ads.net - R help - [R] [Not R question]: Better fit for order probit model [Jun 2007]

adschai at optonline.net

2007-Jun-16 01:31 UTC

[R] [Not R question]: Better fit for order probit model

Hi,

I have a model which tries to fit a set of data with 10-level ordered responses.
In my data, the majority of the observations are from levels 6-10, leaving only
about 1-5% of the total observations in levels 1-5. As a result, my model tends
to perform badly on points that have a level lower than 6.

I would like to ask whether there is any way to circumvent this problem. I was
thinking of the following ideas, but I am open to any suggestions.

1. Bootstrapping with a small sample each time. However, in each sample
basket, I intentionally sample so that there is a good mix of observations from
each level. Then I have to do this many times. But I don't know how to obtain
the true standard error of the estimated parameters after all the bootstrapping
has been done. Is it simply the average of the standard errors estimated each
time?

2. Weighting points with level 1-6 more. But it's unclear to me how to put
these weights into the likelihood when estimating the parameters by maximum
likelihood. It's unlike OLS, where your objective is to minimize error or, if
you'd like, a penalty function. But MLE is obviously not a penalty function.

3. Do a two-step regression. I will segment the data into two regions: first
the points with response less than 6, and then the rest. The first step is a
binary regression to determine which of the two groups a point belongs to. Then
in the second step, I estimate an ordered probit model for each group
separately. The question here is then: why choose 6 as the cut point instead of
some other level?
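
On the question in idea 1: with an ordinary bootstrap, the standard error of a parameter is usually taken as the standard deviation of that parameter's estimates across the resamples, not the average of the standard errors reported in each resample. Below is a minimal Python sketch under stated assumptions: it uses plain resampling with replacement at the original sample size (not the deliberately balanced small samples described above), and it uses the sample mean as a stand-in for a model fit. The name `bootstrap_se` and the data are hypothetical.

```python
import random
import statistics


def bootstrap_se(data, estimator, n_boot=1000, seed=0):
    """Bootstrap standard error of `estimator` applied to `data`.

    Draws `n_boot` resamples with replacement, each the same size as
    `data`, re-estimates on every resample, and returns the standard
    deviation of those estimates.
    """
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_boot):
        # One bootstrap resample: n draws with replacement.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    # The spread of the replicate estimates is the bootstrap SE.
    return statistics.stdev(estimates)


# Hypothetical data; the estimator here is just the sample mean.
values = list(range(1, 101))
se_of_mean = bootstrap_se(values, statistics.mean)
print(se_of_mean)
```

For a real model, `estimator` would refit the model on each resample and return one coefficient. Forcing a fixed mix of levels in each resample changes the level frequencies the model sees, so that scheme is not covered by this sketch.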

Any suggestions would be really appreciated. Thank you.

- adschai
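
A note on idea 2 in the post above: a common way to bring observation weights into maximum likelihood is to multiply each observation's log-likelihood contribution by its weight and maximize the weighted sum. The Python sketch below evaluates such a weighted ordered-probit log-likelihood for a single predictor. It is only an illustration under assumptions: the function names, the one-predictor setup, and the numbers are hypothetical, and no optimizer is included.

```python
import math


def norm_cdf(z):
    """Standard normal CDF, written with math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def weighted_loglik(beta, cuts, x, y, w):
    """Weighted ordered-probit log-likelihood with one predictor.

    beta : slope on the single predictor
    cuts : increasing thresholds, length K - 1 for K levels
    x    : predictor values
    y    : observed levels, integers in 1..K
    w    : non-negative observation weights
    """
    bounds = [-math.inf] + list(cuts) + [math.inf]
    total = 0.0
    for xi, yi, wi in zip(x, y, w):
        eta = beta * xi
        # P(y = k) = Phi(c_k - eta) - Phi(c_{k-1} - eta)
        p = norm_cdf(bounds[yi] - eta) - norm_cdf(bounds[yi - 1] - eta)
        # The weight multiplies this observation's log-likelihood term.
        total += wi * math.log(p)
    return total


# Hypothetical three-level example; the first observation gets weight 2.
print(weighted_loglik(0.8, [-0.5, 0.7], [0.5, -1.0], [1, 3], [2.0, 1.0]))
```

An integer weight of 2 gives the same value as entering that observation twice, which is one way to read what the weights do. Standard errors taken from a weighted fit generally need separate care (for example a robust variance estimate), which this sketch does not address.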

Robert A LaBudde

2007-Jun-16 02:51 UTC

[R] [Not R question]: Better fit for order probit model

At 09:31 PM 6/15/2007, adschai wrote:
>I have a model which tries to fit a set of data with 10-level
>ordered responses. In my data, the majority of the observations
>are from levels 6-10, leaving only about 1-5% of the total
>observations in levels 1-5. As a result, my model tends to perform
>badly on points that have a level lower than 6.
>
>I would like to ask whether there is any way to circumvent this
>problem. I was thinking of the following ideas, but I am open to
>any suggestions.
>
>1. Bootstrapping with a small sample each time. However, in each
>sample basket, I intentionally sample so that there is a good mix
>of observations from each level. Then I have to do this many times.
>But I don't know how to obtain the true standard error of the
>estimated parameters after all the bootstrapping has been done. Is
>it simply the average of the standard errors estimated each time?
>
>2. Weighting points with level 1-6 more. But it's unclear to me how
>to put these weights into the likelihood when estimating the
>parameters by maximum likelihood. It's unlike OLS, where your
>objective is to minimize error or, if you'd like, a penalty
>function. But MLE is obviously not a penalty function.
>
>3. Do a two-step regression. I will segment the data into two
>regions: first the points with response less than 6, and then the
>rest. The first step is a binary regression to determine which of
>the two groups a point belongs to. Then in the second step, I
>estimate an ordered probit model for each group separately. The
>question here is then: why choose 6 as the cut point instead of
>some other level?
>
>Any suggestions would be really appreciated. Thank you.

You could do the obvious, and lump categories such as 1-6 or 1-7
together to make a composite category.
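
As a rough illustration of the lumping suggested above, here is a short Python sketch that merges levels 1 through a chosen top level into one composite category and renumbers the remaining levels. The function name `lump_low_levels` and the sample responses are hypothetical.

```python
from collections import Counter


def lump_low_levels(levels, top_of_lump):
    """Merge levels 1..top_of_lump into level 1 and shift the rest down.

    With top_of_lump=6 on a 10-level scale, levels 1-6 become 1 and
    levels 7, 8, 9, 10 become 2, 3, 4, 5.
    """
    return [1 if y <= top_of_lump else y - top_of_lump + 1 for y in levels]


# Hypothetical responses, concentrated in the upper levels.
responses = [2, 5, 6, 6, 7, 7, 8, 8, 8, 9, 9, 10, 10, 10]
lumped = lump_low_levels(responses, top_of_lump=6)

print(sorted(Counter(responses).items()))  # counts per original level
print(sorted(Counter(lumped).items()))     # counts per lumped level
```

The ordering of the categories is preserved, so the lumped response can still be used in an ordered model with fewer thresholds.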

You don't mention the size of your dataset. If there are 10,000
observations, you might live with a 1% category. If you only have 100
observations, you have too many categories.
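
To put numbers on this: a category holding 1% of 10,000 observations contains about 100 of them, while 1% of 100 observations is a single observation. A tiny Python sketch of that arithmetic follows; the helper name `expected_count` is hypothetical.

```python
def expected_count(n_obs, share):
    """Expected number of observations in a category holding `share` of the data."""
    return n_obs * share


# The two dataset sizes mentioned above, each with a 1% category.
for n_obs in (10_000, 100):
    print(n_obs, expected_count(n_obs, 0.01))
```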

Also, next time plan your study and training better so that your
categories are fully utilized. And don't use so many categories.
People have trouble even selecting responses on a 5-level scale.

================================================================
Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: ral at lcfltd.com
Least Cost Formulations, Ltd.            URL: http://lcfltd.com/
824 Timberlake Drive                     Tel: 757-467-0954
Virginia Beach, VA 23464-3239            Fax: 757-467-2947

"Vere scire est per causas scire"
