Ted Byers
2008-Sep-22 16:26 UTC
[R] Statistical question re assessing fit of distribution functions.
I am in a situation where I have to fit a distribution, such as Cauchy or normal, to an empirical dataset. Well and good, that is easy.

But I wanted to assess just how good the fit is, using ks.test.

I am concerned about the following note in the docs (about the example provided): "Note that the distribution theory is not valid here as we have estimated the parameters of the normal distribution from the same sample"

This implies I should not use ks.test(x,"pnorm",mean =1.187, sd =0.917), where the numbers shown are estimated from 'x'. If this is so, how do I get a correct test? I know I cannot use different samples because of just how different the parameters are from one sample to the next, so using parameters estimated from the sample from week one to define the distribution function for ks.test will give a poor fit for the data from week two. And the sample size is small enough that I would not have confidence in the parameters estimated from a portion of a sample to fit against the remainder of the sample.

Thanks

Ted
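The note quoted from the ks.test documentation can be checked with a small simulation. The sketch below is only an illustration, not part of the original post: the sample size, number of simulations, and variable names are all assumptions. It draws samples that really are normal and compares how often ks.test rejects at the 5% level when the mean and sd are estimated from the same sample versus when the true values are supplied.

```r
# Illustrative sketch: behaviour of ks.test p-values when the parameters
# are estimated from the sample being tested. All settings are assumed.
set.seed(1)
n_sims <- 2000   # number of simulated samples (assumed)
n_obs  <- 30     # observations per sample (assumed)

# Case 1: parameters estimated from the same sample that is tested
p_est <- replicate(n_sims, {
  x <- rnorm(n_obs)
  ks.test(x, "pnorm", mean = mean(x), sd = sd(x))$p.value
})

# Case 2: true parameters supplied, so the usual distribution theory holds
p_known <- replicate(n_sims, {
  x <- rnorm(n_obs)
  ks.test(x, "pnorm", mean = 0, sd = 1)$p.value
})

# Fraction of truly normal samples rejected at the 5% level in each case
reject_est   <- mean(p_est < 0.05)
reject_known <- mean(p_known < 0.05)
print(c(estimated = reject_est, known = reject_known))
```

With known parameters the rejection rate should sit near the nominal 5%. With estimated parameters it falls well below that, because the fitted curve is pulled toward the data. The reported p-values are therefore too large, and the test has little power to detect a poor fit.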
Timur Shtatland
2008-Sep-22 18:00 UTC
[R] Statistical question re assessing fit of distribution functions.
If one of the goals is a normality test, then there may be better alternatives to the Kolmogorov-Smirnov test. See an explanation on:
http://graphpad.com/FAQ/viewfaq.cfm?faq=959

The R implementation:
?shapiro.test

A casual search also turned this up:
http://tolstoy.newcastle.edu.au/R/help/04/09/3201.html
http://tolstoy.newcastle.edu.au/R/help/04/08/3121.html
http://www.karlin.mff.cuni.cz/~pawlas/2008/MAI061/dagost.R

Best,
Timur

--
Timur Shtatland, Ph.D.
Senior Bioinformatics Scientist
Agencourt Bioscience Corporation - A Beckman Coulter Company
500 Cummings Center, Suite 2450
Beverly, MA 01915
www.agencourt.com
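For the normality case, the shapiro.test function mentioned in the reply can be called directly on the data. It tests normality with the mean and variance left unspecified, so no estimated parameters need to be supplied. The sketch below uses simulated stand-in data, since the original dataset is not available: the sample size of 40 and the variable names are assumptions, and the mean and sd are simply borrowed from the ks.test call in the question.

```r
# Illustrative sketch: Shapiro-Wilk normality test on simulated stand-in data.
# The sample size and variable names are assumptions, not from the thread.
set.seed(42)
x_normal <- rnorm(40, mean = 1.187, sd = 0.917)  # stand-in for normal-like data
x_cauchy <- rcauchy(40)                          # stand-in for heavy-tailed data

# No mean or sd is passed: the null hypothesis is "some normal distribution"
sw_normal <- shapiro.test(x_normal)
sw_cauchy <- shapiro.test(x_cauchy)

print(sw_normal)
print(sw_cauchy)
```

This only addresses the normal distribution. shapiro.test does not assess the fit of a Cauchy or any other family.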