thr3ads.net - similar to: "How to import the large data into R"

Displaying 20 results from an estimated 1000 matches similar to: "How to import the large data into R"

Reshaping genetic data from long to wide

2006 Apr 06

Bottom Line Up Front: How does one reshape genetic data from long to wide? I currently have a lot of data: about 180 individuals (some probands/patients, some parents, rare siblings) and SNP data from 6000 loci on each. The standard formats seem to be something along the lines of Famid, pid, fatid, motid, affected, sex, locus1Allele1, locus1Allele2, locus2Allele1, locus2Allele2, etc. In other
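A minimal sketch of the long-to-wide step in base R, assuming the long data has one row per individual per locus with the two alleles in separate columns. The column names (pid, locus, allele1, allele2) and the toy values are assumptions for illustration and are not taken from the post.

```r
# Toy long-format genotype data: one row per individual per locus.
# All names and values here are hypothetical.
long <- data.frame(
  pid     = rep(c("p1", "p2", "p3"), each = 2),
  locus   = rep(c("locus1", "locus2"), times = 3),
  allele1 = c("A", "C", "A", "T", "G", "C"),
  allele2 = c("G", "C", "A", "T", "G", "T"),
  stringsAsFactors = FALSE
)

# reshape() spreads every non-id column across the levels of 'timevar',
# producing columns such as allele1.locus1 and allele2.locus1.
wide <- reshape(long, idvar = "pid", timevar = "locus", direction = "wide")

print(wide)
```

With 6000 loci this yields 12000 allele columns per individual, so family columns such as Famid or sex would probably be merged in afterwards from a separate per-individual table.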

bug in codetools/R CMD check?

2011 Feb 03

Hi Mr Tierney, I have noticed an error message from R 2.12.x's CMD check for a while (apparently Prof. Ripley completely rewrote CMD check in R 2.12+), e.g.: http://bioconductor.org/checkResults/2.7/bioc-LATEST/snpMatrix/lamb2-checksrc.html ---------------- * checking R code for possible problems ... NOTE Warning: non-unique value when setting 'row.names': 'new' Error in

extract p-value from mixed model in kinship package

2011 Apr 14

Dear R experts, I was using the kinship package to fit a mixed model with a kinship matrix. The package looks like lme4, but I could not find a way to extract the p-value from it. I need to extract it because I need to analyse a large number of variables (> 10000). Please help me: require(kinship) Generating random example data id <- 1:100 dadid <- c(rep(0, 5), rep(1, 5), rep(3, 5), rep(5, 5), rep(7,

Qtl - package - Question

2009 Nov 06

Dear R-Helpers, I am using the qtl package to analyze QTL data from QTL Cartographer. I have the map file and the cro file from QTL Cartographer, and I was trying to import these two files into R using the qtl package. data=read.cross("qtlcart", ".", "crofile.txt", "mapfile.txt") ### I have matched the file structure with the one on the website of qtl package - It matches

Using PCA to correct p-values from snpMatrix

2011 Jan 03

Hi R-help folks, I have been doing some single SNP association work using snpMatrix. This works well, but produces a lot of false positives because of population structure in my data. I would like to correct the p-values (which snpMatrix gives me) for population structure, possibly using principal component analysis (PCA). My data is complicated, so here's a simple example of what

Sweave, cairo_pdf, CJK, ghostscript

2011 Oct 22

I have had some fun in the last few days trying to put together an annotated map of China with R and some public GIS data: http://sourceforge.net/projects/outmodedbonsai/files/snpMatrix%20next/1.17.7.11/China_Choropleth_Maps.pdf/download It is done, and rather nice... there are a few issues: - the default pdf() device cannot do CJK with embedded fonts - and cairo_pdf() is not hooked up to

Installing ncdf package

2004 Mar 13

Hi, I used the command R CMD INSTALL ncdf to install the ncdf package on Linux. Can somebody explain to me what might be wrong? Thanks. Below is the error. * Installing *source* package 'ncdf' ... Special note: checking for gcc... gcc checking for C compiler default output... a.out checking whether the C compiler works... yes checking whether we are cross compiling... no checking for

trouble with netCDF (PR#824)

2001 Jan 24

Having trouble with netCDF since I upgraded to 1.2.1 (did directly from 1.1.1). Package was recompiled after the upgrade. Symptoms: open.netCDF returns cleanly, but read.netCDF causes R to segfault and dump core. What else do you need me to write to help? --please do not edit the information below-- Version: platform = i586-pc-linux-gnu arch = i586 os = linux-gnu system = i586, linux-gnu

Memory problem on a linux cluster using a large data set

2006 Dec 18

Hello, I have a large data set with 320,000 rows and 1000 columns. All the data has the values 0, 1, 2. I wrote a script to remove all the rows with more than 46 missing values. This works perfectly on a smaller dataset, but when I try to run it on the larger data set I get an error “cannot allocate vector size 1240 kb”. I’ve searched through previous posts and found out that it might
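One way the filtering step described above could be written, assuming missing genotypes are coded as NA (the post does not say how they are coded). The helper name drop_sparse_rows and the toy matrix are hypothetical. Keeping the 0/1/2 codes as integers rather than doubles halves the storage: 320,000 x 1000 values take roughly 1.28 GB as integers and roughly 2.56 GB as doubles.

```r
# Hypothetical helper: keep only rows with at most 'max_missing' NA entries.
# Assumes missing values are coded as NA, which the post does not state.
drop_sparse_rows <- function(x, max_missing = 46) {
  n_missing <- rowSums(is.na(x))               # NA count per row
  x[n_missing <= max_missing, , drop = FALSE]  # drop = FALSE keeps a matrix
}

# Toy integer genotype matrix with 0, 2 and 3 missing values per row.
geno <- matrix(c(0L, 1L, 2L, 1L,
                 NA, 1L, NA, 0L,
                 NA, NA, NA, 2L),
               nrow = 3, byrow = TRUE)

kept <- drop_sparse_rows(geno, max_missing = 2)
print(kept)
```

The vectorised rowSums(is.na(x)) call avoids an explicit loop over rows, though is.na(x) still allocates a logical matrix of the same dimensions, so on the full data set it may be worth processing blocks of rows at a time.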

NetCDF within R: installation assistance

2008 Dec 23

Greetings. I am attempting to add NetCDF libraries within R, and have failed. We have R version 2.8, and are running on a 64-bit Redhat Linux 2.6.18 kernel: Red Hat Enterprise Linux Client release 5.2 (Tikanga) Linux halfmoon.ncdc.noaa.gov 2.6.18-92.1.22.el5 #1 SMP Fri Dec 5 09:28:22 EST 2008 x86_64 x86_64 x86_64 GNU/Linux I have run the installation instructions found at

2 x 3 Probability under the null

2011 Oct 27

I have a 2 x 3 matrix called snp and I want to compute the following probability: choose(sum(snp[,1]), snp[1,1]) * choose(sum(snp[,2]), snp[1,2]) * choose(sum(snp[,3]), snp[1,3])/choose(sum(snp), sum(snp[1,])) but I keep getting Infs and NaNs. Is there a function that can do this in R? -- Thanks, Jim.
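A sketch of one common way to sidestep the overflow: evaluate the same expression on the log scale with base R's lchoose() and exponentiate only at the end. The function name table_prob and the example matrices are assumptions for illustration; the formula itself is the one quoted in the post.

```r
# Hypothetical helper: same formula as in the post, computed in log space.
# lchoose(n, k) returns log(choose(n, k)) without forming the huge value.
table_prob <- function(snp, log = FALSE) {
  col_totals <- colSums(snp)
  log_p <- sum(lchoose(col_totals, snp[1, ])) -
    lchoose(sum(snp), sum(snp[1, ]))
  if (log) log_p else exp(log_p)
}

# Small table: the direct and log-space versions should agree.
snp_small <- matrix(c(3, 2, 4, 1, 2, 5), nrow = 2)
direct <- choose(5, 3) * choose(5, 4) * choose(7, 2) / choose(17, 9)
print(all.equal(table_prob(snp_small), direct))

# Large counts: choose() overflows, but the log probability stays finite.
snp_big <- snp_small * 1000
print(table_prob(snp_big, log = TRUE))
```

For very large tables the probability itself can underflow to zero after exp(), so it may be preferable to keep working with log = TRUE.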

[ncdf4] error converting GEIA data to netCDF

2012 Aug 28

summary: I can successfully ncvar_put(...) data to a file, but when I try to ncvar_get(...) the same data, I get > Error in if (nc$var[[li]]$hasAddOffset) addOffset = nc$var[[li]]$addOffset else addOffset = 0 : > argument is of length zero How to fix or debug? details: R code @ https://github.com/TomRoche/GEIA_to_NetCDF successfully (if crudely) uses R packages={ncdf4, maps, fields}

efficient code. how to reduce running time?

2007 Jan 21

Hi, I am new to R, and even though I've got my code to run and do what it needs to, it is taking forever and I can't use it like this. I was wondering if you could help me find ways to make the code run faster. Here is my code. The data set is a bunch of 0s and 1s in a data.frame. What I am doing is this: I pick a column and make up a new column Y with values associated with that

help in R

2006 Apr 26

Hi, I can't understand where I am going wrong. Below is my code. I would really appreciate your help. Thanks. > genfile<-read.table("c:/tina/phd/bs871/hw/genfile.txt",skip=1) > > #read in SNP data > snp.dat <- as.matrix(genfile) > snp.name <- scan("c:/tina/phd/bs871/hw/genfile.txt",nline=1,what="character") Read 100 items

A package to read and write NetCDF?

2004 Apr 06

I am looking for a package to read and write NetCDF files. NetCDF package says it can only read, not write. Another package for the standard binary file format? Daehyok Shin

Index out SNP position

2013 Jan 03

Dear R experts, I have 2 matrices: A & B. I am trying to index B against A - (1) find the B rows that fall between col 1 and col 2 of A & put them into a new vector SNP. I wrote the code below, but I cannot think of the right way to do it. Could anyone help me with the code? Thanks, Jiang ---- A <-

splitting multiple data in one column into multiple rows with one entry per column

2009 Jul 26

Dear R colleagues, I annotated a list of single nucleotide polymorphisms (SNPs) with the corresponding genes using biomaRt. The result is the following data.frame (pasted from R): snp ensembl_gene_id 1 rs8032583 2 rs1071600 ENSG00000101605 3 rs13406898 ENSG00000167165 4 rs7030479

Fw: Memory problem on a linux cluster using a large data set [Broadcast]

2007 Jan 10

Hi, I listened to all your advice and ran my data on a computer with a 64-bit processor, but I still get the same error saying "it cannot allocate a vector of that size 1240 kb". I don't want to cut my data into smaller pieces because we are looking at interaction. So are there any other options for me to try out, or should I wait for the development of more advanced computers!

Errors melt()ing data...

2008 Feb 28

Hi, I'm trying to melt() some data for subsequent cast()ing and am encountering errors. The overall process requires a couple of cast()s and melt()s. ########Start Session 1########## ## I have the data in a (fully) melted format and can cast it fine... > norm1[1:10,] Pool SNP Sample.Name variable value 1 1 rs1045485 CA0092 Height.1 0.003488853 2 1 rs1045485

netCDF, ncdf library

2003 Nov 20

Dear all, I would like to use data in .netcdf format, and for that I have to use the netCDF or ncdf packages. Problem: these packages don't seem to work: "Error in testRversion(descfields) : This package has not been installed properly See the Note in ?library" (with the netCDF package). Does anybody have the same problems with these packages? Is there any
