Displaying 20 results from an estimated 1000 matches similar to: "Loading large files in R"
2007 Feb 27
1
read.csv size limits
I have been using the read.csv function for a while now without any problems.
My files are usually 20-50 MB and they take up to a minute to import. They
have all been under 50,000 rows and under 100 columns.
Recently, I tried importing a file of a similar size (which means about the
same amount of data), but with ~500,000 columns and ~20 rows. The process is
taking forever (~1 hour so far). In
2002 Aug 28
4
Huge data frames?
A friend of mine recently mentioned that he had painlessly imported a
data file with 8 columns and 500,000 rows into matlab. When I tried
the same thing in R (both Unix and Windows variants) I had little
success. The Windows version hung for a very long time, until I
eventually more or less ran out of virtual memory; I tried to set the
proper memory allocations for the Unix version, but it never
2009 Jun 14
2
read.csv
If read.csv's colClasses= argument is NOT used then read.csv accepts
double quoted numerics:
1: > read.csv(stdin())
0: A,B
1: "1",1
2: "2",2
3:
  A B
1 1 1
2 2 2
However, if colClasses is used then it seems that it does not:
> read.csv(stdin(), colClasses = "numeric")
0: A,B
1: "1",1
2: "2",2
3:
Error in scan(file, what, nmax, sep,
2004 Oct 11
4
colClasses
Hi
I am trying to read a data frame from a text editor into R. I want some
of the columns to be read in as "character" not numeric.
I figured that I can do that by using "colClasses" in "read.table"
command. However, I couldn't find out how to use
"colClasses". E.g., say I have 5 columns in the data file. I want the 1st and
3rd column to be read in as
2017 Oct 24
2
read.table(..., header == FALSE, colClasses = <vector with names attribute>)
Jeff,
Thank you for your reply. The intent was to construct a minimum
reproducible example. The same warning occurs when the 'file' argument
points to a file on disk with a million lines. But you are correct, my
example was slightly malformed and in fact gives an error under R
version 3.2.2. Please allow me to try again; in older versions of R,
  > read.table(file =
2006 Jun 21
5
colClasses
Hi Folks!
I'm reading in some data from a .csv file that has a date column.
How do I use colClasses to get read.csv to recognize the date column?
The documentation on this seems to be nil -
And yes, I've read help and R Data Import/Export and can't figure out
what the colClasses syntax is.
Thanks,
john
2013 Sep 30
4
read.table() with quoted integers
Hi!
It seems that read.table() in R 3.0.1 (Linux 64-bit) does not consider
quoted integers as an acceptable value for columns for which
colClasses="integer". But when colClasses is omitted, these columns are
read as integer anyway.
For example, let's consider a file named file.dat, containing:
"1"
"2"
> read.table("file.dat",
2006 Sep 26
2
colClasses: supressed 'NA'
Hi,
The colClasses argument seems to be suppressing 'NA' values. How do I fix this?
R script and first 5 lines of output is below.
File "test2.dat" has blanks that are read as "NA" when I do not use
'colClasses', but as blanks when I use 'colClasses'.
temp.df <- read.fwf("test2.dat", width=c(10,1,1,1,1,2,2,3,3,1),
2010 Feb 11
2
trouble with read.table and colClasses='raw'
Hi all,
First off, it is surprising that there are no examples of how to use
read.table() under ?read.table !
I am trying to read in a flat file of type 'raw'. It has 1000 rows and 600K
columns. I have the RAM to accomplish this, but can't get the data into R
using read.table:
x <- read.table("data",header=TRUE,colClasses=rep(,600000))
#returns error: no method or
2017 Oct 24
0
read.table(..., header == FALSE, colClasses = <vector with names attribute>)
>>>>> Benjamin Tyner <btyner at gmail.com>
>>>>> on Tue, 24 Oct 2017 07:21:33 -0400 writes:
> Jeff,
> Thank you for your reply. The intent was to construct a minimum
> reproducible example. The same warning occurs when the 'file' argument
> points to a file on disk with a million lines. But you are correct, my
>
2006 Mar 07
7
reading in only one column from text file
How do I manipulate the read.table function to read in only the 2nd
column???
2004 Sep 08
1
A couple of issues with colClasses/setAs
Consider this:
$ cat test.dat
1 a
2 b
Now, we want to read the 2nd column as a factor and ignore the first
(since it's just a sequential ID). We can't just put "factor" among
the colClasses (would have been nice), so let's try this instead
> setAs("character","factor",as.factor)
Arguments in definition changed from (x) to (from)
>
2012 Sep 14
1
Any way to get read.table.ffdf() (in the ff package) to pass colClasses or comment.char parameters through to read.fwf() ?
Hi everyone, my apologies if I'm overlooking something obvious in the
documentation. I'm relatively inexperienced with the (awesome) ff package.
My goal is to use the read.table.ffdf() function to call the read.fwf()
function and pass through the colClasses and comment.char arguments. The
code below shows exactly what doesn't work for me.
If the colClasses and comment.char
2011 Mar 09
4
Help with read.csv
Hello,
I have a file that looks like this:
Date,Hour,DA_DMD,DMD,DA_RTP,RTP,,
1/1/2006,1,3393.9,3412,76.65,105.04,,
1/1/2006,2,3173.3,3202,69.20,67.67,,
1/1/2006,3,3040.0,3051,69.20,77.67,,
1/1/2006,4,2998.2,2979,67.32,69.10,,
1/1/2006,5,3005.8,2958,65.20,68.34,,
where the ',' is the separator and I tried to read it into R, but...
> y <- read.csv("Data/Data_tmp.csv",
2008 Feb 26
3
using eval-parse-paste in a loop
R-helpers
I have 120 small Excel sheets to read and I am using
library(xlsReadWrite): one example below.
I had hoped to read sheets by looping over a list of numbers in their
name (eg Book1.xls, Book2.xls, etc).
I thought I had seen examples which used eval-parse-paste in this way.
However, I have not been able to get it to work..
1. is this a feasible approach?
2. if not
2009 Sep 26
1
questions on csv reading
Hi,
Is there any official way to determine the colClasses of a data.frame?
Why has POSIXct such a strange class structure?
Why is colClasses "ordered" not allowed (and doesn't work)?
Background
==========
I am writing a chunked csv reader that provides the functionality of read.table for large files (in the next version of package ff). In chunked reading, one wants to learn the
2009 Aug 12
3
Zoo and numeric data
Hi,
I have a csv file with different datatypes:
2009-01-01, character1, 10, 20.1
2009-01-02, character2, 11, 21.1
(I have attached the file to this post)
I read this file with read.zoo as I want a zoo/xts timeseries:
> t = read.zoo("./data.txt", sep=",", dec = ".", header=FALSE)
If I look at the zoo data all integer/numeric columns are read as
character:
>
2011 Aug 22
3
automatic file input
Dear all,
I have 100 files that are used as input, and I have to type the file names again and again. The files are named 1.out, 2.out, ..., 100.out.
I want to know if R has anything like Perl's loop, so that I can use something like this:
for($f = 1; $f <= 100; $f++) {
$file = $f.".out";
I have tried this in R but it does not work. Can somebody please help me?
2017 Oct 23
2
read.table(..., header == FALSE, colClasses = <vector with names attribute>)
Hello
I noticed that starting with R version 3.3.0 onward, this generates a
warning:
  > txt <- c("a", "3.14")
  > read.table(file = textConnection(txt), header = FALSE, colClasses = c(x = "character", y = "numeric"))
the warning is "not all columns named in 'colClasses' exist" and I guess
the change was made in response
2012 Apr 23
1
Can I specify POSIX[cl]t column classes inside read.csv?
I'm loading a nicely formatted csv file.
    #!/usr/bin/env Rscript
    kpi <- read.csv(
      # This is a dump of the username, date_joined and last_login columns
      # from the auth_user Django table.
      'data/2012-04-23.csv',
      colClasses = c('character')
    )
    print(kpi[sample(nrow(kpi), 3),2:3])
Here's what the three rows I printed look like.