Paul Livingstone
2004-Dec-20 22:30 UTC
[R] why use profile likelihood for Box Cox transformation?
Hi All,

I'm analysing some data that is conventionally modelled as

    log(Y) = a + bX + e

However, using the boxcox function, it appears that the optimum value of lambda is approximately 0.05. I have 40 data sets of differing sizes, and for about half of these lambda is significantly non-zero. So it is worth looking into.

The alternative model,

    Y^lambda = a + bX + e

has been explored before by non-statistician colleagues. But instead of using boxcox and maximising the profile likelihood, the model has been twisted, shuffled, differenced and logged to get

    ln(dY/dX) = A + B.ln(Y) + E

and lambda ( = f(B) ) estimated via LS regression. Note: the RHS contains Y, not X. This relationship has some physical justification.

I assume that these two approaches are not equivalent. Is this correct?

I assume the Box-Cox approach (profile likelihood) is better. Is this correct, and why?

Any help would be greatly appreciated.

Thanks,
Paul.

Paul Livingstone
Statistical Analyst
AeroStructures
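For readers following the thread, the profile-likelihood idea mentioned in the question can be sketched as follows. This is a minimal illustration in plain Python, not the code behind R's boxcox function: the helper names, the simulated data and the lambda grid are all assumptions made for the example. For each candidate lambda, Y is Box-Cox transformed, a straight line in X is fitted by least squares, and the log-likelihood (including the Jacobian term of the transformation) is evaluated with a, b and the error variance replaced by their estimates. The lambda that maximises this profile is the estimate.

```python
import math
import random


def boxcox_transform(y, lam):
    """Box-Cox transform: (y**lam - 1) / lam, or log(y) when lam is 0."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]


def rss_simple_regression(x, z):
    """Residual sum of squares from the least-squares fit z = a + b*x."""
    n = len(x)
    xbar = sum(x) / n
    zbar = sum(z) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxz = sum((xi - xbar) * (zi - zbar) for xi, zi in zip(x, z))
    b = sxz / sxx
    a = zbar - b * xbar
    return sum((zi - a - b * xi) ** 2 for xi, zi in zip(x, z))


def profile_loglik(x, y, lam):
    """Profile log-likelihood of lambda, up to an additive constant.

    a, b and the error variance are profiled out; the (lam - 1) * sum(log y)
    term is the log-Jacobian of the transformation.
    """
    n = len(y)
    rss = rss_simple_regression(x, boxcox_transform(y, lam))
    log_jacobian = (lam - 1.0) * sum(math.log(v) for v in y)
    return -0.5 * n * math.log(rss / n) + log_jacobian


def best_lambda(x, y, grid):
    """Grid value of lambda with the highest profile log-likelihood."""
    return max(grid, key=lambda lam: profile_loglik(x, y, lam))


# Hypothetical data simulated from the log model, so the true lambda is 0.
rng = random.Random(1)
x = [i / 10 for i in range(1, 201)]
y = [math.exp(0.5 + 0.1 * xi + rng.gauss(0, 0.2)) for xi in x]

grid = [k / 100 for k in range(-100, 101)]  # assumed search range [-1, 1]
lam_hat = best_lambda(x, y, grid)
print("estimated lambda:", lam_hat)
```

Because the data are simulated from the log model, the estimate should land near zero. An approximate confidence interval can be read off the same curve as the set of lambda values whose profile log-likelihood lies within half a chi-squared(1) quantile of the maximum.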
Peter Dalgaard
2004-Dec-20 23:53 UTC
[R] why use profile likelihood for Box Cox transformation?
"Paul Livingstone" <paul.livingstone at aerostructures.com.au> writes:

> The alternative model, Y^lambda = a + bX + e, has been explored
> before by non-statistician colleagues. But instead of using boxcox
> and maximising the profile likelihood, the model has been twisted,
> shuffled, differenced and logged, to get
>
> ln(dY/dX) = A + B.ln(Y) + E
>
> and lambda ( =f(B) ) estimated via LS regression. Note: RHS contains
> Y, not X. This relationship has some physical justification.
>
> I assume that these two approaches are not equivalent, is this correct?

Correct.

> I assume the Box Cox approach (profile likelihood) is better, is
> this correct and why?

This is sort of similar to the issue of output least squares vs. system least squares in inverse problems theory. If what you have is a relation between Y and X, and (only) Y is measured with errors, you'd be getting a bias towards zero in the estimated B by using the "shuffled" equation, because the noisy Y then appears on the right-hand side as a regressor.

Then again, it's not really obvious that Box-Cox is right either, because it mixes up the functional relation and the error characteristics. Y^lambda should be linear in X _and_ have normally distributed errors with a constant variance. You might need one lambda to linearize and another to stabilize the variance.

If the errors really enter at the systems level (you have a stochastic differential equation), it's a different story altogether!

--
Peter Dalgaard
Dept. of Biostatistics, University of Copenhagen
Blegdamsvej 3, 2200 Cph. N, Denmark
(p.dalgaard at biostat.ku.dk)
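The bias towards zero described in the reply can be illustrated with a small simulation. This is a generic errors-in-variables sketch rather than the differenced model from the thread: the variable names, the true slope of 2 and the noise variances are all assumptions chosen for the example. A response is generated from a clean regressor, and the least-squares slope is then computed twice, once against the clean regressor and once against a copy contaminated with measurement error.

```python
import random


def ols_slope(u, v):
    """Least-squares slope from regressing v on u (with an intercept)."""
    n = len(u)
    ubar = sum(u) / n
    vbar = sum(v) / n
    suu = sum((ui - ubar) ** 2 for ui in u)
    suv = sum((ui - ubar) * (vi - vbar) for ui, vi in zip(u, v))
    return suv / suu


rng = random.Random(42)
n = 5000
true_slope = 2.0  # assumed value for the illustration

# Clean regressor and a response that depends on it linearly.
w_true = [rng.gauss(0, 1) for _ in range(n)]
response = [1.0 + true_slope * w + rng.gauss(0, 0.1) for w in w_true]

# The same regressor observed with measurement error of variance 1.
w_noisy = [w + rng.gauss(0, 1) for w in w_true]

slope_clean = ols_slope(w_true, response)
slope_noisy = ols_slope(w_noisy, response)
print("slope using clean regressor:", slope_clean)
print("slope using noisy regressor:", slope_noisy)
```

With these assumed variances, the theoretical attenuation factor is var(w) / (var(w) + var(error)) = 1 / (1 + 1) = 0.5, so the slope against the noisy regressor should come out near half the true value while the clean fit recovers it. In the "shuffled" equation the regressor is ln(Y), which carries the measurement error of Y, so the same mechanism would pull the estimated B towards zero.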