I wouldn't go quite so far as to say there's absolutely nothing else
-- one could, e.g., also fit a lognormal, gamma, beta, or most any other
two-parameter distribution to the supplied data [assuming the
support matches].

What I did say is that you need domain-specific knowledge to pick a
distribution to fit: then, if the moments are known in closed
form as functions of the parameters, moment matching comes down to solving
simultaneous non-linear equations. I'm not aware of a unified infrastructure
for this in R [so I'm cc'ing the list in case someone else is], but it's
not a terribly difficult problem for the low dimensions we're talking
about.
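
To give a flavour of the numerical route when the equations can't simply be inverted by hand, here is a rough sketch. The Weibull is my own choice for illustration (it is not mentioned elsewhere in this thread), and weibull_from_moments is a made-up helper name, not a function from any package. For the Weibull the two moment equations collapse to a single equation in the shape parameter, because the squared coefficient of variation depends on the shape alone, so base R's uniroot() is enough:

```r
# Sketch only: match a Weibull(shape = k, scale = lambda) to a target mean m
# and variance v. The helper name and the search interval are assumptions
# made for this illustration.
weibull_from_moments <- function(m, v) {
  # Squared coefficient of variation as a function of the shape only:
  # Gamma(1 + 2/k) / Gamma(1 + 1/k)^2 - 1  (computed on the log scale)
  cv2 <- function(k) exp(lgamma(1 + 2 / k) - 2 * lgamma(1 + 1 / k)) - 1

  # Solve cv2(k) = v / m^2 for k; cv2 is decreasing in k on this interval
  k <- uniroot(function(k) cv2(k) - v / m^2,
               interval = c(0.1, 50), tol = 1e-10)$root

  # With the shape fixed, the scale follows from mean = lambda * Gamma(1 + 1/k)
  lambda <- m / gamma(1 + 1 / k)
  c(shape = k, scale = lambda)
}

p <- weibull_from_moments(m = 10, v = 20)
p
```

For a distribution where the equations don't reduce to one unknown, the same idea should carry over by minimising the squared mismatch in the moments with optim().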

E.g.,

If you know your data has a gamma distribution with mean 10 and
variance 20, you look at the Wikipedia gamma distribution page to find

Mean = k * theta
Variance = k * theta * theta

So Variance / Mean = theta --> theta = 2 for your problem. Then
k = Mean / theta = 5.
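
In R, the gamma arithmetic above might be sketched like this. gamma_from_moments is a hypothetical helper name; rgamma() and its shape/scale arguments are base R. The simulation at the end is only a sanity check that the fitted parameters reproduce the target moments:

```r
# Sketch only: closed-form moment matching for a gamma distribution,
# using mean = k * theta and variance = k * theta^2.
gamma_from_moments <- function(m, v) {
  theta <- v / m      # scale: variance / mean
  k <- m / theta      # shape: mean / scale
  c(shape = k, scale = theta)
}

p <- gamma_from_moments(m = 10, v = 20)
p  # shape 5, scale 2

# Sanity check by simulation: sample mean and variance should be near 10 and 20
set.seed(1)
x <- rgamma(1e5, shape = p[["shape"]], scale = p[["scale"]])
c(mean = mean(x), var = var(x))
```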

Similarly, the all-great Wikipedians provide closed-form solutions to
get the lognormal parameters back from observed sample moments:

http://en.wikipedia.org/wiki/Lognormal_distribution#Arithmetic_moments
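
As a sketch of what that looks like for the numbers posted in this thread (mean 0.007, SD 0.011), assuming purely for illustration that a lognormal is appropriate, which nothing in the reported summary can confirm. lnorm_from_moments is a hypothetical helper name; the meanlog/sdlog names follow base R's dlnorm/rlnorm:

```r
# Sketch only: recover lognormal parameters from an arithmetic mean m and
# standard deviation s, using
#   sigma^2 = log(1 + s^2 / m^2),  mu = log(m) - sigma^2 / 2
lnorm_from_moments <- function(m, s) {
  sigma2 <- log(1 + s^2 / m^2)
  mu <- log(m) - sigma2 / 2
  c(meanlog = mu, sdlog = sqrt(sigma2))
}

p <- lnorm_from_moments(m = 0.007, s = 0.011)
p

# The median of a lognormal is exp(meanlog); comparing it with the reported
# median (0.003) gives a crude check on whether the lognormal is plausible.
implied_median <- exp(p[["meanlog"]])
implied_median
```

The implied median comes out at roughly 0.0038 against the reported 0.003, so the lognormal is not wildly off here, but that is a weak check and no substitute for seeing the data.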
As Bert rightly cautions, this is far outside the realm of good
practice and your energies would be better served if you could get a
better picture of the underlying data.
Best,
Michael
On Fri, Jun 8, 2012 at 9:13 AM, Bert Gunter <gunter.berton at gene.com> wrote:
>
> Andras:
> I realize my comment was rather cryptic, but which part of Michael's
> "You can't" did you not understand? Other than
>
> ?dnorm
>
> which, as Michael said, is probably not a good thing, you can do nothing.
> You need to refocus your efforts on changing the system to get useful data,
> not trying to make a silk purse out of a sow's ear. Or, as John Tukey said
> many years ago:
>
> "The combination of some data and an aching desire for an answer does
> not ensure that a reasonable answer can be extracted from a given body
> of data."
> -- John Tukey
>
> -- Bert
>
>
>
>
> On Fri, Jun 8, 2012 at 5:14 AM, Andras Farkas <motyocska at yahoo.com> wrote:
>>
>>
>> Dear Bert and Michael
>>
>> thank you for your note below. Based on Michael's input and the
>> lack of a covariance matrix available to me (for the most part), moment
>> matching sounds like the best option. I have searched the internet for
>> discussions on this using R but did not find much useful information. I
>> also have to apologize, but I am somewhat new to the software and this
>> level of statistics. I am usually pretty good at figuring things out, but
>> this one is probably way over my head. I was wondering if you could point
>> me in the right direction using R to "re-build" the distribution that
>> has the following parameters:
>>
>> mean: 0.007, median: 0.003, SD: 0.011.
>>
>> I greatly appreciate your help,
>>
>> Sincerely,
>>
>> Andras
>>
>> gunter.berton at gene.com> wrote:
>>
>>
>> From: Bert Gunter <gunter.berton at gene.com>
>> Subject: Re: [R] "Re-creating" distributions
>> To: "R. Michael Weylandt" <michael.weylandt at gmail.com>
>> Cc: "Andras Farkas" <motyocska at yahoo.com>, r-help at r-project.org
>> Date: Friday, June 8, 2012, 12:29 AM
>>
>> Related comment:
>>
>> "Even the data aren't sufficient." -- Brian Joiner (some years ago).
>>
>> Explanation: See W.E. Deming on "analytic" vs "enumerative" statistics.
>>
>> --- Bert
>>
>> On Thu, Jun 7, 2012 at 8:06 PM, R. Michael Weylandt
>> <michael.weylandt at gmail.com> wrote:
>> > Short answer: no, those are (in general) insufficient parameters to
>> > characterize a distribution.
>> >
>> > Long answer: unfortunately, it's not uncommon that those "summary
>> > statistics" are the only ones reported based on someone or other's
>> > limited experience with the Gaussian. There are a few things you could
>> > try, but each of them has problems:
>> >
>> > i) Pretend like your data is in fact normal and use those parameters
>> > because they do uniquely characterize a normal distribution. MASS
>> > (among others) provides a multivariate normal distribution [mvrnorm]
>> > if you have a covariance matrix available.
>> >
>> > ii) If you have reason to imagine another distribution [guided by
>> > domain knowledge], try to get its parameters in so far as possible by
>> > moment matching. Covariance structures are much harder for the general
>> > case though.
>> >
>> > iii) If you can get something that resembles original data, simply
>> > work by bootstrapping / imputation.
>> >
>> > Hope this helps,
>> > Michael
>> >
>> > On Thu, Jun 7, 2012 at 3:34 PM, Andras Farkas <motyocska at yahoo.com> wrote:
>> >> Dear All,
>> >>
>> >> I often have to work with certain models in which I try to
>> >> "reproduce" a distribution the best I can with very little known
>> >> information available. Is there a package or function in R that could
>> >> best reproduce a probability distribution using only the mean, median
>> >> and SD values available, without knowing the actual distribution type
>> >> to begin with and/or the covariance matrix (for more than 1 data set)?
>> >> All I usually have reported is the mean, median and SD. I hope I made
>> >> my question clear enough...
>> >>
>> >> thanks,
>> >>
>> >> Andras
>> >>
>> >>