On 16/04/2013 1:19 PM, Noah Silverman wrote:
> Hi,
>
> I have some data that, when plotted, looks very close to a log-normal
> distribution. My goal is to build a regression model to test how this variable
> responds to several independent variables.
>
> To do this, I want to use the fitdistr tool from the MASS package to see
> how well my data fits the actual distribution, and also build a generalized
> linear model using the glm command.
>
>
> The summary of my data is:
>
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>  0.0000  0.0000  0.0000  0.8617  0.8332 55.5600
>
> So, no missing values, no negative values.
>
> When I try to use the fitdistr command, I get an error that I don't
> understand:
> m <- fitdistr(y, densfun="lognormal")
>
> Error in fitdistr(y, densfun = "lognormal") : need positive
> values to fit a log-Normal

You have zeros in your data. The lognormal distribution never takes on
the value zero.

If they are zero because of rounding (e.g. 0.001 would be recorded as
zero), and there aren't too many of them, then replacing the zeros with
a small positive value (e.g. half the smallest non-zero value) might
make sense. But your median is zero, so at least half of your
observations are zero.

You need to come up with a better model than "lognormal".
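
To see how much of the data the zeros account for, something like the
following may help. It is only a sketch on simulated data (the vector y
below is made up, not your data), and fitting a lognormal to the positive
part alone is one possible way to look at the problem, not a complete
model: the zeros still have to be described separately.

```r
library(MASS)  # provides fitdistr()

set.seed(42)
# Simulated stand-in for y (an assumption): about 60% exact zeros,
# the remaining values drawn from a log-normal distribution.
n <- 500
y <- ifelse(runif(n) < 0.6, 0, rlnorm(n, meanlog = 0, sdlog = 1))

p_zero <- mean(y == 0)   # share of observations that are exactly zero
y_pos  <- y[y > 0]       # strictly positive observations only

# fitdistr() accepts the positive part, since every value is > 0.
fit_pos <- fitdistr(y_pos, densfun = "lognormal")

p_zero
fit_pos
```

If p_zero is large, as your summary suggests, the proportion of zeros is
itself a quantity the model has to explain.
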
Duncan Murdoch
>
>
> When I try to build a simple model, I also get an error:
>
> l <- glm(y~ x, family=gaussian(link="log"))
>
> Error in eval(expr, envir, enclos) : cannot find valid starting values:
> please specify some
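
The zeros are behind this error as well: with gaussian(link="log"), glm
by default starts the iterations from mu = y, and it stops with this
message if any y <= 0. Supplying starting coefficients through the start
argument avoids that check. The sketch below uses simulated x and y (both
made up here), and the choice of starting values is just one plausible
option. Note that it only gets glm to run: this family models the log of
the mean with Gaussian errors, which is not a lognormal model.

```r
set.seed(1)
# Simulated stand-ins for x and y (assumptions): many exact zeros in y.
n <- 200
x <- runif(n)
y <- ifelse(runif(n) < 0.6, 0, rlnorm(n, meanlog = 0.5 * x, sdlog = 0.5))

# Start from a constant mean on the log scale:
# intercept = log(mean(y)), slope = 0.
fit <- glm(y ~ x, family = gaussian(link = "log"),
           start = c(log(mean(y)), 0))

coef(fit)
```
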
>
>
>
> Can anyone offer some suggestions?
>
>
> Thanks!
>
> --
> Noah Silverman, M.S.
> UCLA Department of Statistics
> 8117 Math Sciences Building
> Los Angeles, CA 90095
>