pguilha
2011-Jul-04 18:22 UTC
[R] clustering based on most significant pvalues does not separate the groups!
Hi all,

I have some microarray data on 40 samples that fall into two groups, with a value for each of 480k probes per sample. I performed a t-test (rowttests) on each row, giving the indices of the columns for each group, then used p.adjust() to adjust the p-values for the number of tests performed. I then selected only the probes with adjusted p-value <= 0.05.

I end up with roughly 2000 probes to do the clustering on, but with both pvclust and hclust the samples do not split into the two groups. I would have imagined that, using only those probes that are significantly different between the two groups, the clustering would surely reflect that?

Please, what am I missing?

Thanks!

Paul

PS: I am hoping I have just thought this through in the wrong way and there is a simple explanation, but I can provide the code I am using for clustering if necessary!

--
View this message in context: http://r.789695.n4.nabble.com/clustering-based-on-most-significant-pvalues-does-not-separate-the-groups-tp3644249p3644249.html
Sent from the R help mailing list archive at Nabble.com.
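For readers following along, the workflow described in this post could look roughly like the sketch below. It is not the poster's actual code: the data are simulated, the matrix size, the 100 shifted probes and the 1.5 shift are invented for illustration, base R's t.test() with var.equal = TRUE stands in for rowttests(), and method = "BH" is an assumption because the post does not say which p.adjust() method was used.

```r
# Hypothetical stand-in for the real data: 1000 probes x 40 samples
set.seed(1)
n_probes <- 1000
group <- factor(rep(c("A", "B"), each = 20))
mat <- matrix(rnorm(n_probes * 40), nrow = n_probes)

# Assumption: the first 100 probes carry a true shift in group B
shifted <- 1:100
mat[shifted, group == "B"] <- mat[shifted, group == "B"] + 1.5

# Per-probe two-sample t-test (base R stand-in for rowttests)
p_raw <- apply(mat, 1, function(v) {
  t.test(v[group == "A"], v[group == "B"], var.equal = TRUE)$p.value
})

# Multiple-testing adjustment; "BH" is an assumed choice
p_adj <- p.adjust(p_raw, method = "BH")

# Keep only the probes that pass the adjusted threshold
sel <- mat[p_adj <= 0.05, , drop = FALSE]

# Cluster the samples (columns) on the selected probes
hc <- hclust(dist(t(sel)))
clusters <- cutree(hc, k = 2)

# Cross-tabulate the two clusters against the known groups
print(table(cluster = clusters, group = group))
```

The cross-tabulation at the end is one simple way to check whether the two clusters line up with the known groups. Whether they do on real data depends on how large the per-probe differences are relative to the other sources of variation between samples.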
S Ellison
2011-Jul-06 09:13 UTC
[R] clustering based on most significant pvalues does not separate the groups!
t-tests and the like test for a difference in mean value, not for non-overlapping populations or data sets. The fact that the mean of one data set differs significantly from the mean of the other does not mean that the ranges of the individual points in each data set are disjoint.

    set.seed(1023)
    x <- rnorm(60, 10)
    y <- x + 0.75
    boxplot(x, y)  # Lots of overlap for individual points
    t.test(x, y)   # Strongly significant difference

Does that correspond to your situation well enough to account for your puzzlement?

S Ellison
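As a rough follow-up to the x/y example in this reply, the overlap can be put into numbers. The sketch below reuses the same simulated data. The two summary measures (the fraction of x inside the observed range of y, and the accuracy of a midpoint cutoff) are illustrative choices that do not appear in the original message.

```r
set.seed(1023)
x <- rnorm(60, 10)
y <- x + 0.75

# Two-sample t-test on the means, as in the example
p_value <- t.test(x, y)$p.value

# Fraction of x values that fall inside the observed range of y
overlap_fraction <- mean(x >= min(y) & x <= max(y))

# Assign each point to a group using the midpoint between the two means,
# then measure how often that assignment is correct
cutoff <- (mean(x) + mean(y)) / 2
accuracy <- mean(c(x < cutoff, y >= cutoff))

print(c(p_value = p_value,
        overlap_fraction = overlap_fraction,
        accuracy = accuracy))
```

For two normal populations with unit standard deviation whose means differ by 0.75, a midpoint cutoff is expected to be right only about 65% of the time (pnorm(0.375)), even though the difference in means is clearly detectable with 60 points per group.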
pguilha
2011-Jul-06 16:04 UTC
[R] clustering based on most significant pvalues does not separate the groups!
Yes absolutely, your explanation makes sense. Thanks very much.

rgds
Paul