Hi,
I hope this does not sound too ignorant.
I am trying to detrend and transform variables to achieve normality and
stationarity (for time series use, namely spectral analysis). I am using the
Box-Cox transformation.
As my dataset contains zeros, I found I need to add a constant to it in
order to run "boxcox". I have run tests adding several constants,
from 0.0001 (my unit of measurement) to 10 (still way below my maximum
value). Most of my data are concentrated at low values. I found the estimate of
lambda changes drastically (from positive to negative) across the successive
constants. I also found that normality (evaluated by running the Shapiro-Wilk
and Jarque-Bera tests) after performing the transformation suggested by
"boxcox" is maximized for constant=0.0001 (p-values > 0.7, lambda=0.2) and
not for constant=1 (p-values < 0.01, lambda=-0.94). Curiously, running the
Shapiro-Wilk test with constant=1 across a range of lambda values, the highest
p-value was obtained with lambda=0.2 (and not -0.94!).
Why does "boxcox" return lambda values that are so far from "creating"
normality in the data? What type of best estimates are they? How should I
choose the constant?
I have skimmed through the Box and Cox (1964) paper (I am not a statistician)
and found no reference to this.
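In case it helps, here is a rough sketch of the procedure described above,
written in Python rather than with "boxcox" itself. It assumes that lambda is
chosen by maximizing the normal profile log-likelihood over a grid, which is
my understanding of the method and may not match exactly what "boxcox" does.
The data are simulated stand-ins (not my real data), and the function names
are made up for illustration.

```python
import math
import random


def boxcox_transform(values, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def profile_loglik(values, lam):
    """Normal profile log-likelihood of lambda, up to an additive constant."""
    n = len(values)
    z = boxcox_transform(values, lam)
    mean = sum(z) / n
    var = sum((zi - mean) ** 2 for zi in z) / n  # ML variance estimate
    # The second term is the log-Jacobian of the transformation.
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(v) for v in values)


def estimate_lambda(values, grid):
    """Grid value of lambda with the highest profile log-likelihood."""
    return max(grid, key=lambda lam: profile_loglik(values, lam))


random.seed(1)
# Hypothetical stand-in data: right-skewed, mostly small, about 10% exact zeros.
data = [
    0.0 if random.random() < 0.1 else random.lognormvariate(0.0, 1.0)
    for _ in range(200)
]

grid = [i / 100.0 for i in range(-200, 201)]  # lambda from -2 to 2, step 0.01

estimates = {}
for constant in (0.0001, 0.01, 1.0, 10.0):
    shifted = [v + constant for v in data]  # the zeros become `constant`
    estimates[constant] = estimate_lambda(shifted, grid)
    print(f"constant={constant:<7} lambda_hat={estimates[constant]:+.2f}")
```

Running it prints the lambda estimate for each constant, so the sensitivity to
the constant can be checked directly. Note that the criterion here is the
likelihood, not the p-value of a normality test.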
So, any suggestions would be much appreciated,
Nuno