Here's a function Josh Wiley provided in another thread:
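spec.cor <- function(dat, r, ...) {
  x <- cor(dat, ...)
  # Blank the upper triangle and the diagonal so each pair is reported once
  x[upper.tri(x, TRUE)] <- NA
  # Row/column positions of the correlations at or beyond the cutoff r
  i <- which(abs(x) >= r, arr.ind = TRUE)
  # One row per pair: the two variable names and their correlation
  data.frame(matrix(colnames(x)[as.vector(i)], ncol = 2), value = x[i])
}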
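
For instance, here is a minimal sketch of how it might be called. The data are made up (the gene names, the values, and the forced gene1/gene2 relationship are all hypothetical, not taken from your file), and it assumes genes are in rows, so the matrix is transposed before it reaches cor():

```r
# spec.cor() as given above, repeated so this runs on its own
spec.cor <- function(dat, r, ...) {
  x <- cor(dat, ...)
  x[upper.tri(x, TRUE)] <- NA
  i <- which(abs(x) >= r, arr.ind = TRUE)
  data.frame(matrix(colnames(x)[as.vector(i)], ncol = 2), value = x[i])
}

# Made-up expression matrix: 5 hypothetical genes (rows) x 11 arrays (columns)
set.seed(1)
expr <- matrix(rnorm(55), nrow = 5,
               dimnames = list(paste0("gene", 1:5), paste0("Array", 1:11)))
# Make gene2 track gene1 closely so at least one pair clears the cutoff
expr["gene2", ] <- 2 * expr["gene1", ] + rnorm(11, sd = 0.01)

# cor() correlates columns, so transpose to put the genes in columns
res <- spec.cor(t(expr), 0.8)
print(res)
```

The first two columns of the result hold the names of each pair and the value column holds their correlation. Anything passed through ... goes to cor(), so method = "spearman" should work the same way.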
Michael
On Thu, Nov 17, 2011 at 4:08 PM, Musa Hassan <musahass at gmail.com> wrote:
> Hi Michael,
> I was able to solve this. I just used the WGCNA library, which allows
> stringsAsFactors to be defined in the workspace so that everything stored
> as strings remains strings. My problem now is parsing the results to
> pull out only the significant correlations, defined by a Pearson
> correlation cutoff of, say, 0.8.
>
> On 17 November 2011 15:32, R. Michael Weylandt <michael.weylandt at gmail.com> wrote:
>>
>> I can't see how it's stored like that and the email servers garble it
>> up. Use dput() to create a plain text representation and paste that
>> back in.
>>
>> Thanks,
>> Michael
>>
>> On Thu, Nov 17, 2011 at 9:37 AM, muzz56 <musahass at gmail.com> wrote:
>> > Hi Michael,
>> > Here is a sample of the data.
>> >
>> > Gene    Array1    Array2    Array3    Array4    Array5    Array6    Array7    Array8    Array9    Array10   Array11
>> > Fth1    26016.01  23134.66  17445.71  39856.04  27245.45  23622.98  37887.75  49857.46  25864.73  21852.51  29198.4
>> > B2m     7573.64   7768.52   6608.24   8571.65   6380.78   6242.76   6903.92   7330.63   7256.18   5678.21   10937.05
>> > Tmsb4x  6192.44   4277.22   5024.59   4851.51   3062.55   4562.43   7948.1    5018.58   3200.17   2855.77   6139.23
>> > H2-D1   3141.41   3986.06   3328.62   4726.6    3589.89   2885.95   7509.88   5257.62   4742.26   3431.33   5300.72
>> > Prdx5   3935.7    3938.9    3401.68   4193.14   4028.95   3438.19   6640.15   5486.61   4424.57   3368.83   5265.92
>> >
>> > I want to retain the gene names in the data. What you've proposed will
>> > take them out, and I'll have to append them back to the results after
>> > the cor().
>> >
>> > On 17 November 2011 09:33, Michael Weylandt [via R] <
>> > ml-node+s789695n4080177h34 at n4.nabble.com> wrote:
>> >
>> >> I think something like this should do it, but I can't test without
>> >> data:
>> >>
>> >> # Put the elements in the first column as rownames
>> >> rownames(mydata) <- mydata[,1]
>> >> # Drop the things that are now rownames
>> >> mydata <- mydata[,-1]
>> >>
>> >> Michael
>> >>
>> >> On Thu, Nov 17, 2011 at 9:23 AM, Musa Hassan <[hidden email]> wrote:
>> >>
>> >> > Hi Michael,
>> >> > Thanks for the response. I have noticed that the error occurred during
>> >> > my data read. It appears that the rownames (which become my colnames
>> >> > when the data is transposed) were converted to numbers instead of
>> >> > staying strings, as they should. The original header names don't
>> >> > change, just the rownames. I have to figure out how to import the data
>> >> > without the strings being converted.
>> >> > Right now I am using:
>> >> > mydata = read.csv("mydata.csv", header=T, stringsAsFactors=F)
>> >> >
>> >> > then to convert the data frame to matrix
>> >> > mydata=data.matrix(mydata)
>> >> >
>> >> > Then I just do the correlation as Peter suggested.
>> >> >
>> >> > expression=cor(t(expression))
>> >> >
>> >> > Thanks.
>> >> >
>> >> > On 17 November 2011 08:51, R. Michael Weylandt <[hidden email]> wrote:
>> >> >>
>> >> >> On Wed, Nov 16, 2011 at 11:22 PM, muzz56 <[hidden email]> wrote:
>> >> >> > Thanks to everyone who replied to my post, I finally got it to
>> >> >> > work. I am, however, not sure how well it worked since it ran so
>> >> >> > quickly, but it seems like I have a 2000 x 2000 data set.
>> >> >>
>> >> >> Behold the great and mighty power that is R! Don't worry -- on a
>> >> >> decent machine the correlation of a 2k x 2k data set should be pretty
>> >> >> fast. (It's about 9 seconds on my old-ish laptop with a bunch of other
>> >> >> junk running)
>> >> >>
>> >> >> > My follow-up question would be: how do I get only the pairs with,
>> >> >> > say, a certain Pearson correlation value? Additionally, it seems
>> >> >> > like my output didn't retain the headers but instead replaced them
>> >> >> > with numbers, making it hard to know which gene pairs correlate.
>> >> >>
>> >> >> This is a little worrisome: R carries column names through cor(), so
>> >> >> this would suggest you weren't using them. Were your headers listed as
>> >> >> part of your data (instead of being names)? If so, they would have
>> >> >> been taken as numbers.
>> >> >>
>> >> >> Take a look at dimnames(NAMEOFDATA) -- if your headers aren't there,
>> >> >> then they are being treated as data instead of names. If they are,
>> >> >> can you provide some reproducible code so we can debug more fully?
>> >> >> The easiest way to send data is to use the dput() function to get a
>> >> >> copy-pasteable plain text representation. It would also be great if
>> >> >> you could restrict it to a subset of your data rather than the full 4M
>> >> >> data points, but if that's hard to do, don't worry.
>> >> >>
>> >> >> You should have expected behavior like
>> >> >>
>> >> >> X <- matrix(1:9,3)
>> >> >> colnames(X) <- c("A","B","C")
>> >> >> cor(X) # Prints with labels
>> >> >>
>> >> >> Michael
>> >> >>
>> >> >> >
>> >> >> > On 16 November 2011 17:11, Nordlund, Dan (DSHS/RDA) [via R] <[hidden email]> wrote:
>> >> >> >
>> >> >> >> > -----Original Message-----
>> >> >> >> > From: [hidden email] [mailto:r-help-bounces at r-project.org] On Behalf Of muzz56
>> >> >> >> > Sent: Wednesday, November 16, 2011 12:28 PM
>> >> >> >> > To: [hidden email]
>> >> >> >> > Subject: Re: [R] Pairwise correlation
>> >> >> >> >
>> >> >> >> > Thanks Peter. I tried this after reading in the csv (read.csv) and
>> >> >> >> > converting the data to a matrix (as.matrix). But when I tried the
>> >> >> >> > correlation, I kept getting the error (x must be numeric), yet
>> >> >> >> > when I view the data, it's numeric.
>> >> >> >> >
>> >> >> >>
>> >> >> >> What does R tell you if you execute the following?
>> >> >> >>
>> >> >> >> str(x)
>> >> >> >>
>> >> >> >> Just because the data looks like it is numeric when it prints doesn't
>> >> >> >> mean it is.
>> >> >> >>
>> >> >> >>
>> >> >> >> Dan
>> >> >> >>
>> >> >> >> Daniel J. Nordlund
>> >> >> >> Washington State Department of Social and Health Services
>> >> >> >> Planning, Performance, and Accountability
>> >> >> >> Research and Data Analysis Division
>> >> >> >> Olympia, WA 98504-5204
>> >> >> >>
>> >> >> >>
>> >> >> >> ______________________________________________
>> >> >> >> [hidden email] mailing list
>> >> >> >> https://stat.ethz.ch/mailman/listinfo/r-help
>> >> >> >> PLEASE do read the posting guide
>> >> >> >> http://www.R-project.org/posting-guide.html
>> >> >> >> and provide commented, minimal, self-contained, reproducible code.