similar to: How to import BIG csv files with separate "map"?

Displaying 20 results from an estimated 6000 matches similar to: "How to import BIG csv files with separate "map"?"

2010 Jan 19
2
Memory usage in read.csv()
I'm sure this has gotten some attention before, but I have two CSV files generated from vmstat and free that are roughly 6-8 MB (about 80,000 lines) each. When I try to use read.csv(), R allocates all available memory (about 4.9 GB) when loading the files, which is over 300 times the size of the raw data. Here are the scripts used to generate the CSV files as well as the R code: Scripts (run
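For context on the memory question above, a minimal sketch of the read.csv() arguments usually suggested for keeping memory use down: declaring colClasses so R does not have to guess column types, and giving nrows as a known or mildly overestimated row count. The file and the column names (time, free_kb) are made up for illustration and are not taken from the poster's vmstat/free data.

```r
# Hypothetical stand-in for a vmstat-style CSV; column names are assumptions.
path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(time = 1:5, free_kb = c(10.5, 11, 9.25, 8, 12)),
  path,
  row.names = FALSE
)

dat <- read.csv(
  path,
  colClasses = c("integer", "numeric"),  # declare types instead of letting R guess
  nrows = 5,                             # known (or mildly overestimated) row count
  comment.char = ""                      # already the read.csv default; shown for clarity
)

# Report how much memory the resulting data frame occupies.
print(object.size(dat), units = "auto")
```

Whether this helps in the poster's case depends on what is actually in the files; the snippet is cut off before the code is shown.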
2007 Apr 18
10
importing excel-file
Dear R-experts, It is quite a stupid question, but please help me. I am very confused. I am able to import normal txt and mat-files into R but unable to import an .xls-file, and I do not understand the online help. Can anyone please send me the corresponding command lines? The .xls-file is attached. In my file we use commas for the decimal format (example: 0,712), so changes might be needed. Thanks, Corinna
2009 Jun 14
2
read.csv
If read.csv's colClasses= argument is NOT used then read.csv accepts double quoted numerics: 1: > read.csv(stdin()) 0: A,B 1: "1",1 2: "2",2 3: A B 1 1 1 2 2 2 However, if colClasses is used then it seems that it does not: > read.csv(stdin(), colClasses = "numeric") 0: A,B 1: "1",1 2: "2",2 3: Error in scan(file, what, nmax, sep,
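One workaround that does not depend on how read.csv() treats quoted numerics under colClasses is to read every column as character and convert afterwards. This is only a sketch using the two-row example from the post, not a statement about what the poster ended up doing.

```r
# The sample data from the post: column A holds quoted numerics.
csv_text <- 'A,B\n"1",1\n"2",2\n'

# Read everything as character so the quotes are simply stripped.
df <- read.csv(text = csv_text, colClasses = "character")

# Convert each column to numeric; df[] keeps the data.frame structure.
df[] <- lapply(df, as.numeric)
str(df)
```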
2009 Sep 26
1
questions on csv reading
Hi, Is there any official way to determine the colClasses of a data.frame? Why has POSIXct such a strange class structure? Why is colClasses "ordered" not allowed (and doesn't work)? Background ========== I am writing a chunked csv reader that provides the functionality of read.table for large files (in the next version of package ff). In chunked reading, one wants to learn the
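To illustrate what chunked reading means here, a rough base-R sketch (it does not use the ff package and is not the author's implementation): the file is opened once as a connection and read.csv() is called repeatedly with nrows, so only one chunk is in memory at a time. The demo file, chunk_size and the running totals are assumptions made for the example.

```r
# Small demo file standing in for a large CSV.
path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:10, value = (1:10) * 2), path, row.names = FALSE)

col_names <- names(read.csv(path, nrows = 1))  # peek at the header only
chunk_size <- 4                                # hypothetical; tune to available memory

con <- file(path, open = "r")
invisible(readLines(con, n = 1))               # skip the header line
rows_seen <- 0
value_sum <- 0
repeat {
  chunk <- tryCatch(
    read.csv(con, header = FALSE, nrows = chunk_size, col.names = col_names),
    error = function(e) NULL                   # read.csv errors once no lines remain
  )
  if (is.null(chunk)) break
  rows_seen <- rows_seen + nrow(chunk)
  value_sum <- value_sum + sum(chunk$value)
  if (nrow(chunk) < chunk_size) break          # a short chunk means end of file
}
close(con)
```

Note that column types are inferred per chunk in this sketch, which is exactly the colClasses problem the post is asking about; the tryCatch also hides genuine read errors, so a real reader would want something stricter.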
2012 Apr 23
1
Can I specify POSIX[cl]t column classes inside read.csv?
I'm loading a nicely formatted csv file. #!/usr/bin/env Rscript kpi <- read.csv( # This is a dump of the username, date_joined and last_login columns # from the auth_user Django table. 'data/2012-04-23.csv', colClasses = c('character') ) print(kpi[sample(nrow(kpi), 3),2:3]) Here's what the three rows I printed look like.
2011 Mar 09
4
Help with read.csv
Hello, I have a file that looks like this: Date,Hour,DA_DMD,DMD,DA_RTP,RTP,, 1/1/2006,1,3393.9,3412,76.65,105.04,, 1/1/2006,2,3173.3,3202,69.20,67.67,, 1/1/2006,3,3040.0,3051,69.20,77.67,, 1/1/2006,4,2998.2,2979,67.32,69.10,, 1/1/2006,5,3005.8,2958,65.20,68.34,, where the ',' is the separator and I tried to read it into R, but... > y <- read.csv("Data/Data_tmp.csv",
2013 Nov 18
1
Reading in csv data with ff package
I've spent some time trying to wrap my head around reading in large csv files with the ff-package. I think I know how to do it, but am bumping into some problems. I've tried to recreate the issues as best as I can with a smaller example and maybe someone can help explain the problems. The following code just creates a csv file with an integer column, character column and logical column.
2010 Feb 11
3
read numeric values with thousands separator from csv file
Hello, Is there an easy way to read a csv file with numeric values that contain thousands separators? The file looks like this: Date;opening;High;Low;closing;Volume 12/02/08;4,764.95;4,897.62;4,729.13;4,895.31;- 13/02/08;4,868.02;4,927.81;4,833.85;4,898.60;- 14/02/08;4,942.18;4,962.43;4,877.88;4,895.99;- I want to get the numeric values as..., well, numeric values, and not as character strings.
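A minimal sketch of one common approach to the question above, using two of the sample rows from the post: read the price columns as character, strip the commas, then convert. The choice of which columns to convert (num_cols) is an assumption based on the header shown.

```r
# Two rows copied from the post; ';' is the field separator.
csv_text <- "Date;opening;High;Low;closing;Volume
12/02/08;4,764.95;4,897.62;4,729.13;4,895.31;-
13/02/08;4,868.02;4,927.81;4,833.85;4,898.60;-"

# Read everything as character so '4,764.95' survives untouched.
prices <- read.csv(text = csv_text, sep = ";", colClasses = "character")

# Assumed set of numeric columns; Volume is '-' in the sample and is left alone.
num_cols <- c("opening", "High", "Low", "closing")
prices[num_cols] <- lapply(
  prices[num_cols],
  function(x) as.numeric(gsub(",", "", x, fixed = TRUE))  # drop thousands separators
)
str(prices)
```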
2012 Jul 25
3
ff package: reading selected columns from csv
Dear R users, I've just started using the ff package. There is a csv file (~4Gb) with 7 columns and 6e+7 rows. I want to read only column from the file, skipping the first 100 rows. Below I've provided different outcomes, which will clarify my problem > sessionInfo() R version 2.14.2 (2012-02-29) Platform: x86_64-pc-mingw32/x64 (64-bit) locale: ... attached base packages: [1] tools
2011 Nov 08
3
Reading a specific column of a csv file in a loop
Dear all: I have two large files with 2000 columns. For each file I am performing a loop to extract the "i"th element of each file and create a data frame with both "i"th elements in order to perform further analysis. I am not extracting all the "i"th elements, only certain ones, which I indicate in a vector called "d". See an example of my code below
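For reading a single column without loading the rest, read.csv() accepts "NULL" in colClasses to skip a column. A small sketch with a made-up three-column file; the index in wanted is hypothetical and would come from the poster's vector "d" in their loop.

```r
# Made-up file standing in for one of the wide CSVs.
path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(a = 1:3, b = c(2.5, 3.5, 4.5), c = letters[1:3]),
  path,
  row.names = FALSE
)

wanted <- 2                                  # hypothetical: index of the column to keep
n_cols <- length(read.csv(path, nrows = 1))  # count columns from the first data row
classes <- rep("NULL", n_cols)               # "NULL" tells read.csv to skip a column
classes[wanted] <- NA                        # NA lets read.csv infer that column's type
col_b <- read.csv(path, colClasses = classes)
```

The file is still scanned from start to end on every call, so for many columns it may be cheaper to read the needed set in one pass.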
2005 Feb 03
3
Reading Dates in a csv File
Hi all. I'm reading in a flat, comma-delimited file using read.csv. It works marvelously for the most part. I am using the colClasses argument to, basically, create numeric, factor and character classes for the columns I'm reading in. However, a couple of the fields in the file are date fields. I'm fully aware that POSIXct can be used as a class, however the field must obey,
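One way around format restrictions on date columns is to read them as character and convert explicitly. A minimal sketch; the sample data and the day/month/year layout are assumptions, since the post is cut off before it shows the actual format.

```r
# Made-up sample; the date layout (dd/mm/YYYY) is an assumption.
csv_text <- "id,start\n1,02/03/2005\n2,15/11/2004\n"

# Read the date field as plain text first.
dat <- read.csv(text = csv_text, colClasses = c("integer", "character"))

# Convert with an explicit format string so nothing is guessed.
dat$start <- as.Date(dat$start, format = "%d/%m/%Y")
str(dat)
```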
2013 Apr 26
4
Read big data (>3G ) methods ?
Hi all scientists, Recently, I have been dealing with big data ( >3G txt or csv format ) on my desktop (windows 7 - 64 bit version), but I cannot read them quickly, though I have searched the internet. [define colClasses for read.table, cobycol and limma packages; I have used them, but it is not so fast]. Could you share your methods to read big data into R faster? Though this is an odd question, but we
2011 Dec 06
1
About summary in linear models
Hello!!, for fitting linear models I use Gretl, but now I'm starting to use R, and I would like to know if there is some function to obtain an extended summary like in Gretl. I will write an example in Gretl Model 1: OLS, using observations 1968-1982 (T = 15) Dependent variable: Invest Coeficient St error t-ratio p-value const 377,631 35,0955 10,7601 <0,00001 *** GNP
2023 Jan 05
1
R 'arima' discrepancies
Rob J Hyndman gives a great explanation here (https://robjhyndman.com/hyndsight/estimation/) of the reasons why results from R's arima may differ from other software. @iacobus, to cite one, 'Major discrepancies between R and Stata for ARIMA' (https://stackoverflow.com/questions/22443395/major-discrepancies-between-r-and-stata-for-arima), assign the, sometimes, big differences from R
2012 Jan 19
1
Help with .csv file reading!
Hello, Here's my problem: I have a csv file which I have to read with the read.table() function (or read.csv). The file has about 60000 lines whose data are written this way: character;character;character;character 14/10/2010 13:10;0;49;0;49; 14/10/2010 13:20;0;49;0;49; 14/10/2010 13:30;0;49;0;49; I tried to use the function this way: read.csv("file.csv",sep =
2008 Jul 23
1
Time series reliability questions
Hello all, I have been using R's time series capabilities to perform analysis for quite some time now and I have some questions regarding its reliability. In several cases I have had substantial disagreement between R and other packages (such as gretl and the commercial EViews package). I have just encountered another problem and thought I'd post it to the list. In this case,
2007 Feb 27
1
read.csv size limits
I have been using the read.csv function for a while now without any problems. My files are usually 20-50 MB and they take up to a minute to import. They have all been under 50,000 rows and under 100 columns. Recently, I tried importing a file of a similar size (which means about the same amount of data), but with ~500,000 columns and ~20 rows. The process is taking forever (~1 hour so far). In
2004 Nov 17
4
R/S-related projects on Sourceforge? Trove Categorization
Hi R-Users and Developers, Several months ago I made a request on Sourceforge to add the R/S - programming language to the _Trove_ categorization. ("The Trove is a means to convey basic metainformation about your project.") Today I got the following response of one of the sourceforge admins. <SNIP> SourceForge.net will consider the inclusion of a programming language within
2010 Sep 08
2
big data
Hello, I searched the internet but I didn't find the answer to the following problem: I want to do a glm on a csv file consisting of 25 columns and 4 million rows. Not all the columns are relevant. My problem is how to read the data into R, manipulate the data and then do a glm. I've tried with: dd<-scan("myfile.csv",colClasses=classes) dat<-as.data.frame(dd) My question is: what