Jochen1980
2011-Nov-03 00:08 UTC
[R] Kolmogorov-Smirnov-Test on binned data, I guess gumbel-distributed data
Hi R-Users,

I have read several texts on KS tests. Most authors state that the KS test is not suitable for binned data, but some of them refer to 'other' authors who claim that it is fine for binned data. I searched for sources and could not find any example confirming that a KS test may be used on binned data. Do you have any links to articles or tutorials?

In any case, I am looking for a test that backs up my assumption that my data is Gumbel-distributed. I estimated the Gumbel parameters mu and beta, and judging by the resulting plots the fit looks quite good. You can find the plot, the related data, and the R script here:

www.jochen-bauer.net/downloads/kstest/Rplots-1000.pdf
http://www.jochen-bauer.net/downloads/kstest/rm2700-1000.txt
http://www.jochen-bauer.net/downloads/kstest/rcalc.R

The story about the data: which test should I choose if the KS test is not appropriate? I get very high p-values when I compare the histogram heights of data row 1 with the fitted Gumbel distribution function evaluated at the bin midpoints. Most of the time the KS test gives distances of 0.01 and p-values of 0.99 or 1. That seems strange to me, too high. On the other hand, my plots look good, and as you can see, I sampled 1000 values in my first experiment.

In a second experiment I generated only 50 random values for the Gumbel parameter estimation. I am trying to reduce the number of permutations so that I can produce results faster, but I have to find out at what point the data no longer follows a Gumbel distribution. The results surprised me: I expected my tests and plots to get worse, but I still got high p-values for the KS test and a nice-looking plot.

www.jochen-bauer.net/downloads/kstest/Rplots-0050.pdf
http://www.jochen-bauer.net/downloads/kstest/rm2700-0050.txt

Moreover, besides the shuffled data of my randomisation test there are real data values. I calculated the p-value for my real data point occurring under the estimated Gumbel distribution.
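
To make that last step concrete, here is a minimal sketch of how such a p-value could be computed. It is not the code in rcalc.R: it is written in Python rather than R, the data are simulated, and the method-of-moments fit as well as the names `fit_gumbel_moments`, `gumbel_upper_tail` and `simulate_gumbel` are assumptions made for illustration only. The value `observed = 18.0` is a made-up stand-in for a real data point.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(values):
    """Method-of-moments estimates (mu, beta) for a Gumbel (maximum) distribution.

    Assumption: the original script may use a different estimator,
    for example maximum likelihood.
    """
    beta = statistics.stdev(values) * math.sqrt(6.0) / math.pi
    mu = statistics.mean(values) - EULER_GAMMA * beta
    return mu, beta


def gumbel_upper_tail(x, mu, beta):
    """P(X >= x) for X ~ Gumbel(mu, beta), i.e. 1 - exp(-exp(-(x - mu) / beta))."""
    z = (x - mu) / beta
    # Cap the exponent so math.exp cannot overflow for values far below mu.
    t = math.exp(min(-z, 700.0))
    # -expm1(-t) equals 1 - exp(-t) but stays accurate when t is tiny.
    return -math.expm1(-t)


def simulate_gumbel(n, mu, beta, rng):
    """Draw n Gumbel(mu, beta) values; -log(E) is standard Gumbel for E ~ Exp(1)."""
    return [mu - beta * math.log(rng.expovariate(1.0)) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(2011)
    null_1000 = simulate_gumbel(1000, 10.0, 2.0, rng)  # stand-in for 1000 permutations
    null_50 = null_1000[:50]                           # stand-in for 50 permutations
    observed = 18.0                                    # hypothetical real data value
    for label, null in (("1000", null_1000), ("50", null_50)):
        mu, beta = fit_gumbel_moments(null)
        p = gumbel_upper_tail(observed, mu, beta)
        print(f"n={label}: mu={mu:.3f} beta={beta:.3f} p={p:.3g}")
```

Running the script prints the fitted parameters and the tail probability for both sample sizes, which is the kind of comparison between the 1000-value and the 50-value experiment described here.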
The real-data p-values from the 1000-permutation experiment and the 50-permutation experiment are very strongly correlated, around 0.98 according to both the Pearson and the Spearman correlation coefficients. I guess that backs up the observation that neither my plots nor the KS tests are getting worse.

I hope I was able to describe my current situation and that you can give me some hints on literature or other tests, or back me up in my guess that my data is Gumbel-distributed.

Thanks in advance.

Jochen

--
View this message in context: http://r.789695.n4.nabble.com/Kolmogorov-Smirnov-Test-on-binned-data-I-guess-gumbel-distributed-data-tp3983781p3983781.html
Sent from the R help mailing list archive at Nabble.com.
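
For reference, here is a minimal sketch of a KS check on the raw, unbinned values instead of histogram heights, assuming the individual permutation values are available. Again this is Python rather than R and not taken from rcalc.R. The method-of-moments fit and all function names are assumptions for illustration. Because mu and beta are estimated from the same sample that is then tested, the standard KS p-value tends to come out too large, so the sketch swaps in a parametric bootstrap to calibrate the KS distance. That is one possible approach, not necessarily the right one for this data.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(values):
    """Method-of-moments estimates (mu, beta); an assumed estimator."""
    beta = statistics.stdev(values) * math.sqrt(6.0) / math.pi
    return statistics.mean(values) - EULER_GAMMA * beta, beta


def gumbel_cdf(x, mu, beta):
    """F(x) = exp(-exp(-(x - mu) / beta)); the exponent is capped to avoid overflow."""
    return math.exp(-math.exp(min(-(x - mu) / beta, 700.0)))


def ks_distance(values, cdf):
    """Largest gap between the empirical CDF of the raw values and cdf."""
    xs = sorted(values)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def fitted_ks_distance(values):
    """KS distance between the values and a Gumbel fitted to those same values."""
    mu, beta = fit_gumbel_moments(values)
    return ks_distance(values, lambda x: gumbel_cdf(x, mu, beta))


def simulate_gumbel(n, mu, beta, rng):
    """Draw n Gumbel(mu, beta) values; -log(E) is standard Gumbel for E ~ Exp(1)."""
    return [mu - beta * math.log(rng.expovariate(1.0)) for _ in range(n)]


def bootstrap_pvalue(values, n_boot=200, seed=0):
    """Parametric bootstrap: refit and recompute the distance on simulated samples."""
    rng = random.Random(seed)
    mu, beta = fit_gumbel_moments(values)
    d_obs = fitted_ks_distance(values)
    exceed = 0
    for _ in range(n_boot):
        fake = simulate_gumbel(len(values), mu, beta, rng)
        if fitted_ks_distance(fake) >= d_obs:
            exceed += 1
    return d_obs, (1 + exceed) / (n_boot + 1)


if __name__ == "__main__":
    data = simulate_gumbel(50, 10.0, 2.0, random.Random(42))  # simulated stand-in
    d, p = bootstrap_pvalue(data)
    print(f"KS distance = {d:.4f}, bootstrap p-value = {p:.3f}")
```

The same call could be run on the 1000-value and the 50-value samples to see whether the smaller sample still gives enough power to detect a departure from the Gumbel shape.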