Johannes Radinger
2012-Sep-28 10:10 UTC
[R] Crosstable-like analysis (ks test) of dataframe
Hi,

I have a dataframe with multiple (approx. 20) columns containing
vectors of different values (different distributions).
Now I'd like to create a crosstable where I compare the distribution
of each vector (df-column) with every other one. For the comparison
I want to use ks.test(). The result should have the column names of
the input dataframe as its row and column names, and the cells should
be populated with the p-value of the ks.test for each pairwise
comparison.

My data.frame looks like:
df <- data.frame(X=rnorm(1000,2),Y=rnorm(1000,1),Z=rnorm(1000,2))

And the test for one single case is:
ks <- ks.test(df$X,df$Z)

where the p-value is:
ks[2]

How can I automate this pairwise analysis?
Any suggestions? I guess this is quite a common analysis (probably
with other tests as well).

cheers,
Johannes
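A note on extracting the p-value: `ks[2]` keeps the list wrapper, while `ks$p.value` gives the bare number that a table cell needs. A small sketch with the same simulated columns (the seed is an assumption, added only so the example is reproducible):

```r
set.seed(1)  # assumed seed, not part of the original post
df <- data.frame(X = rnorm(1000, 2), Y = rnorm(1000, 1), Z = rnorm(1000, 2))

ks <- ks.test(df$X, df$Z)

ks[2]        # a one-element list named "p.value"
ks$p.value   # the p-value as a plain numeric value
```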
Hello,
Try the following.
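# Return only the p-value of a two-sample Kolmogorov-Smirnov test.
f <- function(x, y, ...,
              alternative = c("two.sided", "less", "greater"),
              exact = NULL) {
    alternative <- match.arg(alternative)
    # Uncomment the options() lines to silence ks.test() warnings.
    #w <- getOption("warn")
    #options(warn = -1) # ignore warnings
    p <- ks.test(x, y, ..., alternative = alternative, exact = exact)$p.value
    #options(warn = w)
    p
}

n <- 1e1  # 10 observations per column
dat <- data.frame(X = rnorm(n), Y = runif(n), Z = rchisq(n, df = 3))

# Matrix of p-values; rows and columns are named after the columns of dat.
apply(dat, 2, function(x) apply(dat, 2, function(y) f(x, y)))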
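The nested apply() tests every ordered pair, so each comparison runs twice and each column is also tested against itself. A minimal sketch of a variant that tests each unordered pair once, using the data frame from the question, might look like this. `pairwise_ks` is a hypothetical helper name and the seed is an assumption; neither comes from the thread.

```r
# Hypothetical helper: pairwise KS p-values for all columns of a data
# frame, returned as a square matrix named after the columns.
pairwise_ks <- function(d, ...) {
  vars <- names(d)
  out <- matrix(NA_real_, length(vars), length(vars),
                dimnames = list(vars, vars))
  # Each unordered pair is tested once and the result is copied into both
  # triangles. This is only valid for the default two-sided alternative.
  for (pair in combn(vars, 2, simplify = FALSE)) {
    p <- ks.test(d[[pair[1]]], d[[pair[2]]], ...)$p.value
    out[pair[1], pair[2]] <- p
    out[pair[2], pair[1]] <- p
  }
  diag(out) <- 1  # a sample against itself has distance 0, so p-value 1
  out
}

set.seed(42)  # assumed seed, only for reproducibility
df <- data.frame(X = rnorm(1000, 2), Y = rnorm(1000, 1), Z = rnorm(1000, 2))
round(pairwise_ks(df), 4)
```

With about 20 columns this runs 190 tests instead of 400. For one-sided alternatives the matrix is not symmetric, so the apply() version above would be the safer choice there.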
Hope this helps,
Rui Barradas