thr3ads.net - R help - [R] Re; Getting SNPS from PLINK to R [Jun 2011]

Jim Silverton

2011-Jun-21 03:32 UTC

[R] Re; Getting SNPS from PLINK to R

I am using plink on a large SNP dataset with a .map and .ped file.
I want to get some sort of file, say a list of all the SNPs that plink is
saying that I have. Any ideas on how to do this?

-- 
Thanks,
Jim.

Clemontina Alexander

2011-Jun-21 13:23 UTC

Hi,
If you go to this site:
http://pngu.mgh.harvard.edu/~purcell/plink/res.shtml#teach

And download the teaching.zip file, I think there was information in
the word document about reading plink data into R, though I am not
100% sure. I think a read.table("filename.ped", header=FALSE) command may
be enough (a .ped file has no header row). The word document is for plink
beginners so it may not be
what you are looking for.

Tina

Natalie Van Zuydam

2011-Jun-21 13:39 UTC

Hi Jim,

If you convert the ped and map files to binary plink files, the .bim file
will tell you the name and the position of each SNP.  This would be the
easiest method.  Alternatively, packages like GenABEL and genetics have
functions to read in PLINK formatted data for analysis in R.
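
For illustration, here is a minimal sketch of pulling the SNP list out of a
.bim file in R. It assumes the usual six-column .bim layout (chromosome, SNP
identifier, genetic distance, base-pair position, allele 1, allele 2) with no
header row; the function name read_bim and the column names are made up for
this example.

```r
# Read a PLINK .bim file into a data frame.
# Assumes six whitespace-delimited columns and no header row.
# colClasses keeps an allele coded "T" from being read as logical TRUE.
read_bim <- function(path) {
  read.table(path,
             header = FALSE,
             col.names = c("chr", "snp", "cm", "bp", "a1", "a2"),
             colClasses = c("character", "character", "numeric",
                            "integer", "character", "character"),
             stringsAsFactors = FALSE)
}

# Tiny two-SNP file written to a temporary location, standing in for a
# real .bim file.
tmp <- tempfile(fileext = ".bim")
writeLines(c("1 rs123 0 1000 A G",
             "1 rs456 0 2000 C T"), tmp)

bim <- read_bim(tmp)
bim$snp   # SNP names
bim$bp    # base-pair positions
```

The first four columns of a .map file come in the same order, so the same
approach should work there with only the first four column names.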

Best wishes,
Natalie

Mike Miller

2011-Jun-23 01:23 UTC

Resending to correct bad subject line...

On Mon, 20 Jun 2011, Jim Silverton wrote:
> I am using plink on a large SNP dataset with a .map and .ped file. I want
> to get some sort of file, say a list of all the SNPs that plink is saying
> that I have. Any ideas on how to do this?

All the SNPs you have are listed in the .map file.  An easy way to put the
data into R, if there isn't too much, is to do this:

plink --file whatever --out whatever --recodeA

That will make a file called whatever.raw, single space delimited, 
consisting of minor allele counts (0, 1, 2, NA) that you can bring into R 
like this:

data <- read.table("whatever.raw", sep=" ", header=TRUE)
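
Once the table is loaded, the SNP names can be taken from the column names.
The sketch below assumes the .raw header has six leading columns (FID, IID,
PAT, MAT, SEX, PHENOTYPE) followed by one column per SNP named with the SNP
identifier, an underscore and the counted allele; split_raw is a made-up
helper name, and the tiny file only stands in for whatever.raw.

```r
# Split a PLINK .raw table into sample information and a genotype matrix.
# Assumes the first six columns are FID, IID, PAT, MAT, SEX, PHENOTYPE and
# every remaining column is one SNP named <snp>_<counted allele>.
split_raw <- function(raw) {
  geno <- as.matrix(raw[, -(1:6), drop = FALSE])
  storage.mode(geno) <- "integer"            # counts are 0, 1, 2 or NA
  rownames(geno) <- raw$IID
  list(samples = raw[, 1:6],
       geno    = geno,
       snps    = sub("_[^_]*$", "", colnames(geno)))  # drop allele suffix
}

# Tiny example file standing in for whatever.raw
tmp <- tempfile(fileext = ".raw")
writeLines(c("FID IID PAT MAT SEX PHENOTYPE rs123_A rs456_C",
             "F1 I1 0 0 1 1 0 2",
             "F2 I2 0 0 2 1 1 NA"), tmp)

raw   <- read.table(tmp, header = TRUE, stringsAsFactors = FALSE)
parts <- split_raw(raw)

parts$snps                               # SNP names
colMeans(parts$geno, na.rm = TRUE) / 2   # frequency of the counted allele
```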

If you have tons of data, you'll want to work with the compact binary 
format (four genotypes per byte):

plink --file whatever --out whatever --make-bed

Then see David Duffy's reply.  However, I'm not sure if R can work with 
the compact format in memory.  It might expand those genotypes (minor 
allele counts) from two-bit integers to double-precision floats.  What 
does read.plink() create in memory?

There is another package I've been meaning to look at that is supposed to 
help with the memory management problem for large genotype files:

http://cran.r-project.org/web/packages/ff/

I haven't used it yet, but I am hopeful.  Maybe David Duffy or someone 
else here will know more about it.

If you have a lot of data, also consider chopping the data into pieces
before loading it into R.  That's what we do.  With a 100 core system, I
break the data into 100 files (I use the GNU/Linux "split" command and a
few other tricks) and have all 100 cores run at once to analyze the data.
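
The same divide-and-conquer idea can be sketched inside R. This is not the
"split"-based workflow described above, only an illustration of chunking:
the SNP columns are divided into contiguous blocks and each block is analyzed
on its own. The helper name chunk_indices, the toy genotype matrix and the
allele-frequency "analysis" are all assumptions made for the example.

```r
# Divide n_snps column indices into up to n_chunks contiguous blocks of
# near-equal size.
chunk_indices <- function(n_snps, n_chunks) {
  idx <- seq_len(n_snps)
  split(idx, ceiling(idx * n_chunks / n_snps))
}

# Toy genotype matrix: 4 individuals by 10 SNPs of allele counts.
set.seed(1)
geno <- matrix(sample(0:2, 40, replace = TRUE), nrow = 4)

chunks <- chunk_indices(ncol(geno), 3)

# Analyze each block separately; here the analysis is just allele frequency.
# On a Unix-like multi-core machine, parallel::mclapply could replace lapply.
freqs <- unlist(
  lapply(chunks, function(idx) colMeans(geno[, idx, drop = FALSE]) / 2),
  use.names = FALSE
)
```

With real data each block would more likely be read from its own file, so
that no single R process ever holds the full genotype table.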

When I work with genotype data as allele counts using Octave, I store the 
data, both in files and in memory, as unsigned 8-bit integers, using 3 as 
the missing value.  That's still inefficient compared to the PLINK system, 
but it is way better than using doubles.
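
A rough R analogue of this storage scheme, as a sketch only: R's raw type
holds one unsigned byte per element, so allele counts can be kept in a raw
matrix with 3 standing for missing. The function names encode_geno and
decode_geno are made up for the example, and the data are simulated.

```r
# Store allele counts (0, 1, 2) one byte each in a raw matrix,
# using 3 as the missing-value code.
encode_geno <- function(counts) {
  codes <- counts
  codes[is.na(codes)] <- 3
  out <- as.raw(codes)
  dim(out) <- dim(counts)     # restore the matrix shape
  out
}

# Convert back to integer allele counts with NA for missing.
decode_geno <- function(codes) {
  counts <- as.integer(codes)
  counts[counts == 3L] <- NA
  dim(counts) <- dim(codes)
  counts
}

# Simulated genotypes: 100 individuals by 1000 SNPs, stored as doubles.
set.seed(1)
counts <- matrix(sample(c(0, 1, 2, NA), 1e5, replace = TRUE), nrow = 100)
packed <- encode_geno(counts)

object.size(counts)   # doubles: about 8 bytes per genotype
object.size(packed)   # raw: about 1 byte per genotype
```

Raw values need to be decoded (or at least converted with as.integer) before
doing arithmetic, so in practice one would decode a block of SNPs at a time.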

Best,
Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota
