Alex Gutteridge
2006-Apr-28 09:39 UTC
[R] Checking Goodness of Fit With Kolmogorov-Smirnov
Hi,

I'm using the power.law.fit function from the igraph package to fit a power law distribution to some data. This function returns the power law exponent as its only result. I would like to have some sort of goodness-of-fit and/or error estimate of the exponent returned. This paper:

http://www.edpsciences.org/articles/epjb/pdf/2004/18/b04111.pdf

suggests using the Kolmogorov-Smirnov test to measure the goodness-of-fit. However, after reading the ks.test help page I'm not sure how to proceed. The ks.test help page (and other stats texts) seem to say that for the test to be valid, the parameters of the distribution (the power law exponent in this case) cannot be estimated from the data. If that's the case, how do I test how good the fit is? Do I need to use something else instead of KS? I've seen other mentions of using KS to measure the goodness-of-fit to power laws, though, so perhaps I've misunderstood something basic here (I'm a biologist, not a statistician).

If KS is the right tool to use, how would I go about specifying the continuous distribution function for a power law? The help page only shows how to use it with the built-in 'pgamma' function. Is there an analogous function for power law distributions? I couldn't find one.

Thanks for any help in steering me in the right direction.

Dr Alex Gutteridge
Post-Doctoral Researcher

Bioinformatics Center
Institute for Chemical Research
Kyoto University
Gokasho, Uji, Kyoto 611-0011
Japan
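To make the last question concrete: ks.test accepts any cumulative distribution function as its second argument and passes extra named arguments on to it, so a power-law CDF can be written by hand. The following is only a minimal sketch, assuming a continuous power law with density proportional to x^(-alpha) for x >= xmin and alpha > 1. The names ppowerlaw and rpowerlaw are hypothetical helpers, not functions from base R or igraph, and the exponent here is fixed in advance rather than estimated from the sample, which is the case the ks.test help page covers.

```r
# Hypothetical helpers (not part of base R or igraph).
# Continuous power law: density proportional to x^(-alpha), x >= xmin, alpha > 1.

# CDF: F(q) = 1 - (q / xmin)^(1 - alpha) for q >= xmin, and 0 below xmin.
ppowerlaw <- function(q, xmin, alpha) {
  1 - (pmax(q, xmin) / xmin)^(1 - alpha)
}

# Random draws by inverse-transform sampling.
rpowerlaw <- function(n, xmin, alpha) {
  xmin * runif(n)^(-1 / (alpha - 1))
}

set.seed(42)
x <- rpowerlaw(5000, xmin = 1, alpha = 2.4)

# Extra named arguments (xmin, alpha) are passed on to ppowerlaw.
ok  <- ks.test(x, ppowerlaw, xmin = 1, alpha = 2.4)  # exponent matches the data
bad <- ks.test(x, ppowerlaw, xmin = 1, alpha = 5)    # deliberately wrong exponent

print(ok$p.value)
print(bad$p.value)
```

The p-value from such a call is only trustworthy when xmin and alpha were chosen without looking at the same sample. If the exponent was fitted to the data being tested, the reported p-value tends to be too large.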
Hi,

this function uses maximum likelihood estimation to fit the exponent, and returns an mle object. See ?"mle-class" for the details. Just to give you an example for getting confidence intervals:

    library(igraph)
    # 100000 integer draws with P(k) proportional to k^-2.4
    data <- sample(1:100000, 100000, replace=TRUE, prob=(1:100000)^-2.4)
    res <- power.law.fit(data)
    # confidence interval for the fitted exponent
    confint(res)

You may want to check these as well:

http://arxiv.org/abs/cond-mat/0412004
http://cran.r-project.org/doc/contrib/Ricci-distributions-en.pdf

Hope this helps,
Gabor

-- 
Csardi Gabor <csardi at rmki.kfki.hu>
MTA RMKI, ELTE TTK
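For the goodness-of-fit part of the question, one common workaround when the exponent has been estimated from the same data is a parametric bootstrap: compute the KS distance between the data and the fitted power law, then repeat the fit-and-measure step on many synthetic samples drawn from the fitted law, and report how often the synthetic distance is at least as large. The sketch below is only an illustration under stated assumptions. It assumes a continuous power law with a known lower cutoff xmin, it uses the closed-form maximum likelihood estimator alpha = 1 + n / sum(log(x / xmin)) with approximate standard error (alpha - 1) / sqrt(n) rather than power.law.fit, and every function name in it (rpowerlaw, ppowerlaw, fit_alpha, ks_stat, ks_bootstrap) is hypothetical.

```r
# Hypothetical helpers (not part of base R or igraph); continuous case, known xmin.
rpowerlaw <- function(n, xmin, alpha) xmin * runif(n)^(-1 / (alpha - 1))
ppowerlaw <- function(q, xmin, alpha) 1 - (pmax(q, xmin) / xmin)^(1 - alpha)

# Closed-form maximum likelihood estimate of the exponent.
fit_alpha <- function(x, xmin) 1 + length(x) / sum(log(x / xmin))

# KS distance between a sample and a power law with the given parameters.
ks_stat <- function(x, xmin, alpha) {
  unname(ks.test(x, ppowerlaw, xmin = xmin, alpha = alpha)$statistic)
}

# Parametric bootstrap: refit the exponent on every synthetic sample so the
# reference distribution of D accounts for the estimation step.
ks_bootstrap <- function(x, xmin, B = 200) {
  n <- length(x)
  alpha_hat <- fit_alpha(x, xmin)
  d_obs <- ks_stat(x, xmin, alpha_hat)
  d_sim <- replicate(B, {
    xs <- rpowerlaw(n, xmin, alpha_hat)
    ks_stat(xs, xmin, fit_alpha(xs, xmin))
  })
  list(alpha = alpha_hat,
       se = (alpha_hat - 1) / sqrt(n),   # approximate standard error
       D = d_obs,
       p.value = mean(d_sim >= d_obs))   # share of synthetic D >= observed D
}

set.seed(1)
x <- rpowerlaw(1000, xmin = 1, alpha = 2.4)
res <- ks_bootstrap(x, xmin = 1)
print(res)
```

A small bootstrap p-value suggests the power law is a poor description of the data; a large one only means the test found no evidence against it. The integer-valued data in the example above would need a discrete power-law CDF and sampler instead, since the KS test as used here assumes a continuous distribution.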