Hello all,

I am working with a very large data set in R, and I have no interest in reviving my SAS skills. Given the size of the data file, I will need to drop unwanted variables. The most common strategy seems to be subsetting the data after it is read into R. Unfortunately, given the size of the data set, I can't get the file read in and then subsequently do the subset procedure. I would be appreciative of help on the following:

1. What are the possibilities of reading in just a small set of variables in the read.table() call (or another 'read' function)? That is, is it possible to specify just the variables that I want to keep?

2. Can I randomly select a set of observations during the 'read' step?

I have searched various R resources for this information, so if I am simply overlooking a key resource on this issue, pointing that out to me would be greatly appreciated.

Thanks in advance.

Brian
On Jan 3, 2008 9:00 AM, BEP <perronbe at gmail.com> wrote:

> 1. What are the possibilities of reading in just a small set of variables
> in the read.table() call (or another 'read' function)? That is, is it
> possible to specify just the variables that I want to keep?

read.table() can skip columns. Specify the relevant components of colClasses as "NULL" (the character string, not the NULL object).

> 2. Can I randomly select a set of observations during the 'read' step?

The development version of sqldf can do all of the above (i.e. read in a subset of columns, a subset of rows, or a random subset of rows), subject to certain limitations on the input format. See Example 6 on the home page:

http://sqldf.googlecode.com

readTable() in the R.utils package can also read in a subset of rows and columns.
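For example, a minimal colClasses sketch (the file name "big.txt", the five-column layout, and the classes are made up for illustration); it keeps only columns 2 and 5:

# "NULL" entries are skipped entirely; the kept columns are given their
# real classes so read.table() does not have to guess them.
dat <- read.table("big.txt", header = TRUE,
                  colClasses = c("NULL", "numeric", "NULL", "NULL", "character"))

And a sketch of the sqldf route, assuming a csv input, a sqldf version that provides read.csv.sql(), and made-up column names a and b; SQLite's random() makes the WHERE clause keep roughly 1% of the rows:

library(sqldf)
# 'file' is the table name sqldf gives the data read from big.csv
samp <- read.csv.sql("big.csv",
                     sql = "select a, b from file where random() % 100 = 0")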
BEP wrote:

> 1. What are the possibilities of reading in just a small set of variables
> in the read.table() call (or another 'read' function)? That is, is it
> possible to specify just the variables that I want to keep?

Check this for input of specific columns from a large data matrix:

mysubsetdata <- do.call("cbind",
    scan(file = "location and name of your file",
         what = list(NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
                     0, 0, NULL, NULL),
         flush = TRUE))

This will input only columns 10 and 11 into 'mysubsetdata': scan() skips every column whose entry in 'what' is NULL and reads each column given a template value (here 0, i.e. numeric). With this method you can work out the way to select any set of columns; for the random-rows part of the question, see the sketch below.

HTH
Rubén
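One plain-R sketch for random rows along the same lines (the chunk size, the 1% sampling fraction, and the 13-column file layout are all made up, and the file is assumed to have no header line): read the file in chunks with scan() on an open connection and keep a random fraction of each chunk.

con <- file("location and name of your file", open = "r")
template <- list(NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
                 0, 0, NULL, NULL)
keep <- list()
repeat {
    # read up to 10000 records, continuing where the previous call stopped
    chunk <- scan(con, what = template, flush = TRUE,
                  nmax = 10000, quiet = TRUE)
    m <- do.call("cbind", chunk)
    if (is.null(m) || nrow(m) == 0) break
    # retain roughly 1% of the rows in this chunk
    keep[[length(keep) + 1]] <- m[runif(nrow(m)) < 0.01, , drop = FALSE]
    if (nrow(m) < 10000) break   # a short chunk means end of file
}
close(con)
mysample <- do.call("rbind", keep)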