thr3ads.net - R help - [R] (no subject) [Aug 2013]

Babu Guha

2013-Aug-02 03:29 UTC

[R] (no subject)

I have a comma-delimited file with 62 fields, some of which are comments.
There are about 1.5 million records/lines. Some of the comment fields,
which I do not need, have 40 characters. Of the 62 fields, I
will need at most 12. What is the best way to read in only the fields I need?
If I read the entire file at once I will run out of memory. Could anyone
please suggest a solution?

Thanks,
Babu.

Uwe Ligges

2013-Aug-02 08:42 UTC

[R] skipping columns in read.table; was: (no subject)

On 02.08.2013 05:29, Babu Guha wrote:
> I have a comma-delimited file with 62 fields, some of which are comments.
> There are about 1.5 million records/lines. Some of the comment fields,
> which I do not need, have 40 characters. Of the 62 fields, I
> will need at most 12. What is the best way to read in only the fields I need?
> If I read the entire file at once I will run out of memory. Could anyone
> please suggest a solution?

See ?read.table and its argument colClasses:

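read.table(file, sep=",", colClasses=c("numeric", "NULL", "factor"))

This will read the first column as numeric, skip the second column, and
take the third one as a factor. sep="," is needed for a comma-delimited
file, because read.table splits on whitespace by default.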
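
For a file with 62 fields it is easier to build the colClasses vector than to type it out. The following is only a sketch under assumptions: the temporary file, the `keep` positions and the variable names are made up for illustration and are not from the original post.

```r
# Hypothetical stand-in file: 5 rows, 62 comma-separated fields.
tmp <- tempfile(fileext = ".csv")
demo <- as.data.frame(matrix(seq_len(5 * 62), nrow = 5))
write.table(demo, tmp, sep = ",", row.names = FALSE, col.names = FALSE)

# Assumed positions of the 12 wanted fields; replace with the real ones.
keep <- c(1, 2, 5, 8, 13, 21, 30, 34, 40, 47, 55, 62)

# The string "NULL" tells read.table to skip a column entirely.
classes <- rep("NULL", 62)
# NA lets read.table guess the type of each kept column.
classes[keep] <- NA

wanted <- read.table(tmp, sep = ",", header = FALSE, colClasses = classes)
dim(wanted)
```

Skipped columns are never stored, so only the 12 kept fields take up memory in the result. If the comment fields can contain "#" or quote characters, the comment.char and quote arguments of read.table may also need adjusting.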

Best,
Uwe Ligges


Jim Lemon

2013-Aug-02 22:57 UTC

[R] (no subject)

On 08/02/2013 01:29 PM, Babu Guha wrote:
> I have a comma-delimited file with 62 fields, some of which are comments.
> There are about 1.5 million records/lines. Some of the comment fields,
> which I do not need, have 40 characters. Of the 62 fields, I
> will need at most 12. What is the best way to read in only the fields I need?
> If I read the entire file at once I will run out of memory. Could anyone
> please suggest a solution?

Hi Babu,
Assuming that you know which fields you want, you could process the file 
line by line:

# say your file is "mydata.csv" and you want fields 1 to 12
mycon<-file("mydata.csv",open="r")
# assume you have exactly 1.5 million lines
mydata<-matrix(NA_character_,nrow=1500000,ncol=12)
lineindex<-1
repeat {
  # read one line; readLines returns character(0) at end of file
  inputline<-readLines(mycon,1)
  if(length(inputline)==0) break
  # skip empty lines
  if(nchar(inputline)) {
   # split on commas and keep the first 12 fields (stored as text);
   # note that a plain split breaks on commas inside quoted fields
   mydata[lineindex,]<-strsplit(inputline,",",fixed=TRUE)[[1]][1:12]
   lineindex<-lineindex+1
  }
}
close(mycon)
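
Reading one line at a time can be slow for 1.5 million lines. A possible variant, sketched here under assumptions, reads blocks of lines from the open connection with read.table instead of readLines and strsplit, and drops the unwanted fields with colClasses. The temporary file, the chunk size and the choice of the first 12 fields as numeric are illustrative only.

```r
# Hypothetical stand-in file: 10 rows, 62 comma-separated fields.
tmp <- tempfile(fileext = ".csv")
write.table(as.data.frame(matrix(seq_len(10 * 62), nrow = 10)), tmp,
            sep = ",", row.names = FALSE, col.names = FALSE)

classes <- rep("NULL", 62)
classes[1:12] <- "numeric"   # assumed: the first 12 fields are numeric

con <- file(tmp, open = "r")
chunks <- list()
chunk_rows <- 4              # tiny here; something like 1e5 for a real file
repeat {
  # read.table continues from the connection's current position and
  # raises an error once no lines are left, which ends the loop.
  chunk <- tryCatch(
    read.table(con, sep = ",", header = FALSE, nrows = chunk_rows,
               colClasses = classes),
    error = function(e) NULL
  )
  if (is.null(chunk)) break
  chunks[[length(chunks) + 1]] <- chunk
  # a short chunk means the end of the file was reached
  if (nrow(chunk) < chunk_rows) break
}
close(con)

# combine the chunks into one data frame with 12 columns
mydata <- do.call(rbind, chunks)
```

Each chunk could also be filtered or summarised before it is stored, which keeps memory use lower than holding every row at once.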

Jim
