thr3ads.net - similar to: "Help with .csv file reading !"

Displaying 20 results from an estimated 6000 matches similar to: "Help with .csv file reading !"

2009 Sep 26

questions on csv reading

Hi, Is there any official way to determine the colClasses of a data.frame? Why has POSIXct such a strange class structure? Why is colClasses "ordered" not allowed (and doesn't work)?

Background
==========
I am writing a chunked csv reader that provides the functionality of read.table for large files (in the next version of package ff). In chunked reading, one wants to learn the
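For the first question in this snippet, one common approach is to take the first element of each column's class vector. This is only a sketch, not necessarily what the ff author settled on; the data frame `df` and the helper `first_class` are made up for illustration.

```r
# Hypothetical example data; the column names are illustrative only.
df <- data.frame(
  n = 1:2,
  s = c("a", "b"),
  t = as.POSIXct(c("2009-09-26 10:00:00", "2009-09-26 11:00:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# A POSIXct column carries two classes, c("POSIXct", "POSIXt"), so
# sapply(df, class) would return a list here. Taking only the first
# class yields exactly one string per column.
first_class <- function(col) class(col)[1]
col_classes <- vapply(df, first_class, character(1))
print(col_classes)
```

Whether the first class is always the right one to feed back into colClasses is an assumption; it holds for the basic types shown here.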

2013 Nov 18

Reading in csv data with ff package

I've spent some time trying to wrap my head around reading in large csv files with the ff-package. I think I know how to do it, but am bumping into some problems. I've tried to recreate the issues as best as I can with a smaller example and maybe someone can help explain the problems. The following code just creates a csv file with an integer column, character column and logical column.

2009 Jun 14

read.csv

If read.csv's colClasses= argument is NOT used then read.csv accepts double quoted numerics:

> read.csv(stdin())
0: A,B
1: "1",1
2: "2",2
3:
  A B
1 1 1
2 2 2

However, if colClasses is used then it seems that it does not:

> read.csv(stdin(), colClasses = "numeric")
0: A,B
1: "1",1
2: "2",2
3:
Error in scan(file, what, nmax, sep,
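One possible workaround for the behaviour described in this post, offered as a sketch rather than as the thread's accepted answer, is to read every column as character (quotes are then handled by the reader) and convert afterwards. The input is passed through `text=` instead of `stdin()` so the example is self-contained.

```r
# Same input as in the post, supplied via text= instead of stdin().
txt <- 'A,B\n"1",1\n"2",2'

# Read everything as character so the reader strips the quotes...
df <- read.csv(text = txt, colClasses = "character")

# ...then convert each column to numeric explicitly.
df[] <- lapply(df, as.numeric)
str(df)
```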

2012 Jul 25

ff package: reading selected columns from csv

Dear R users, I've just started using the ff package. There is a csv file (~4Gb) with 7 columns and 6e+7 rows. I want to read only column from the file, skipping the first 100 rows. Below I've provided different outcomes, which will clarify my problem.

> sessionInfo()
R version 2.14.2 (2012-02-29)
Platform: x86_64-pc-mingw32/x64 (64-bit)
locale: ...
attached base packages:
[1] tools

2012 Apr 23

Can I specify POSIX[cl]t column classes inside read.csv?

I'm loading a nicely formatted csv file.

    #!/usr/bin/env Rscript
    kpi <- read.csv(
      # This is a dump of the username, date_joined and last_login columns
      # from the auth_user Django table.
      'data/2012-04-23.csv',
      colClasses = c('character')
    )
    print(kpi[sample(nrow(kpi), 3),2:3])

Here's what the three rows I printed look like.

2011 Mar 09

Help with read.csv

Hello, I have a file that looks like this:

Date,Hour,DA_DMD,DMD,DA_RTP,RTP,,
1/1/2006,1,3393.9,3412,76.65,105.04,,
1/1/2006,2,3173.3,3202,69.20,67.67,,
1/1/2006,3,3040.0,3051,69.20,77.67,,
1/1/2006,4,2998.2,2979,67.32,69.10,,
1/1/2006,5,3005.8,2958,65.20,68.34,,

where the ',' is the separator and I tried to read it into R, but...

> y <- read.csv("Data/Data_tmp.csv",

2010 Jan 19

Memory usage in read.csv()

I'm sure this has gotten some attention before, but I have two CSV files generated from vmstat and free that are roughly 6-8 Mb (about 80,000 lines) each. When I try to use read.csv(), R allocates all available memory (about 4.9 Gb) when loading the files, which is over 300 times the size of the raw data. Here are the scripts used to generate the CSV files as well as the R code: Scripts (run

2011 Nov 08

Reading a specific column of a csv file in a loop

Dear all: I have two large files with 2000 columns. For each file I am performing a loop to extract the "i"th element of each file and create a data frame with both "i"th elements in order to perform further analysis. I am not extracting all the "i"th elements, only certain ones, which I indicate in a vector called "d". See an example of my code below

2005 Feb 03

Reading Dates in a csv File

Hi all. I'm reading in a flat, comma-delimited file using read.csv. It works marvelously for the most part. I am using the colClasses argument to, basically, create numeric, factor and character classes for the columns I'm reading in. However, a couple of the fields in the file are date fields. I'm fully aware that POSIXct can be used as a class, however the field must obey,
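A simple way around non-standard date formats, sketched here under the assumption that the dates are written as month/day/year (the post does not say), is to read the field as character and convert it with `as.Date()` afterwards. The column names and sample values are invented for illustration.

```r
# Hypothetical two-column file; the mm/dd/yyyy date format is an assumption.
txt <- "id,when\n1,02/03/2005\n2,12/25/2005"

# Read the date field as plain character first...
d <- read.csv(text = txt, colClasses = c("integer", "character"))

# ...then convert it with an explicit format string.
d$when <- as.Date(d$when, format = "%m/%d/%Y")
str(d)
```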

2010 Feb 11

read numeric values with thousands separator from csv file

Hello, Is there an easy way to read a csv file with numeric values that contain thousands separators? The file looks like this:

Date;opening;High;Low;closing;Volume
12/02/08;4,764.95;4,897.62;4,729.13;4,895.31;-
13/02/08;4,868.02;4,927.81;4,833.85;4,898.60;-
14/02/08;4,942.18;4,962.43;4,877.88;4,895.99;-

I want to get the numeric values as..., well, numeric values, and not as character strings.
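One way to handle the file shown in this post, given as a minimal sketch and not as the answer the thread arrived at, is to read all fields as character, strip the commas, and convert. The variable names `prices` and `num_cols` are illustrative.

```r
# The sample rows from the post, supplied inline.
txt <- "Date;opening;High;Low;closing;Volume
12/02/08;4,764.95;4,897.62;4,729.13;4,895.31;-
13/02/08;4,868.02;4,927.81;4,833.85;4,898.60;-
14/02/08;4,942.18;4,962.43;4,877.88;4,895.99;-"

# Read everything as character so the commas survive the import.
prices <- read.csv(text = txt, sep = ";", colClasses = "character")

# Remove the thousands separators, then convert the price columns.
num_cols <- c("opening", "High", "Low", "closing")
prices[num_cols] <- lapply(prices[num_cols], function(x) {
  as.numeric(gsub(",", "", x, fixed = TRUE))
})
str(prices)
```

The Volume column is left as character because it contains only "-" in the sample.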

2004 Jul 28

read.table() and NULL for colClasses

Hi, is there a reason for not supporting NULL or "NULL" values for argument colClasses in read.table(), much like you can use NULL values for argument 'what' in scan()? This would help quite a bit when reading large data files where only a few columns are of interest. I've modified read.table() so it calls scan(what=...) also with NULLs for the fields to be skipped.
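Current versions of R document "NULL" as an accepted colClasses value, so the request in this 2004 post can now be met without modifying read.table(). A small sketch with made-up column names:

```r
# Hypothetical three-column input; only "value" is of interest.
txt <- "id,value,note\n1,3.5,x\n2,4.5,y"

# "NULL" (as a string) tells the reader to skip that column entirely.
slim <- read.csv(text = txt, colClasses = c("NULL", "numeric", "NULL"))
print(slim)
```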

2010 Aug 31

Reading multiple text files where some files are empty

Hi All, I have a problem with reading in multiple text files where some of the files have no data and was hoping someone may be able to help me find a solution. Each text file is a daily log of fish movement. However, on some occasions no movements will be recorded on a particular day and therefore the text file for that day is empty. I'm currently using the following code to read the
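The poster's own code is cut off, so the following is only a sketch of one way to skip empty files: filter the file list by size before reading. The directory, file names, and column layout are all invented for the example.

```r
# Build a throwaway directory with two non-empty logs and one empty one.
log_dir <- tempfile("fishlogs")
dir.create(log_dir)
writeLines(c("tag time", "A1 10", "A2 11"), file.path(log_dir, "day1.txt"))
file.create(file.path(log_dir, "day2.txt"))  # a day with no movements
writeLines(c("tag time", "B7 9"), file.path(log_dir, "day3.txt"))

# Keep only files that actually contain data.
files <- list.files(log_dir, pattern = "\\.txt$", full.names = TRUE)
non_empty <- files[file.info(files)$size > 0]

# Read the remaining files and stack them into one data frame.
logs <- do.call(
  rbind,
  lapply(non_empty, read.table, header = TRUE, stringsAsFactors = FALSE)
)
print(logs)
```

Wrapping each read in `tryCatch()` would be an alternative when a file can be non-empty yet still unreadable.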

2002 May 08

Difference in 'read.table' between R.1.4.1 and R1.5.0

This sequence of commands worked fine in R.1.4.1. The data file was the same in both instances:

> acct.log <- read.table(file, col.names=c('cmd', 'user', 'start', 'end',
+     'elapsed', 'sys', 'usr', 'cpu', 'char', 'blocks'),
+     colClasses=c('NA', 'NA', rep('numeric',

2017 Oct 24

read.table(..., header == FALSE, colClasses = <vector with names attribute>)

>>>>> Benjamin Tyner <btyner at gmail.com> >>>>> on Tue, 24 Oct 2017 07:21:33 -0400 writes: > Jeff, > Thank you for your reply. The intent was to construct a minimum > reproducible example. The same warning occurs when the 'file' argument > points to a file on disk with a million lines. But you are correct, my >

2012 Sep 14

Any way to get read.table.ffdf() (in the ff package) to pass colClasses or comment.char parameters through to read.fwf() ?

Hi everyone, my apologies if I'm overlooking something obvious in the documentation. I'm relatively inexperienced with the (awesome) ff package. My goal is to use the read.table.ffdf() function to call the read.fwf() function and pass through the colClasses and comment.char arguments. The code below shows exactly what doesn't work for me. If the colClasses and comment.char

2004 Sep 08

A couple of issues with colClasses/setAs

Consider this:

$ cat test.dat
1 a
2 b

Now, we want to read the 2nd column as a factor and ignore the first (since it's just a sequential ID). We can't just put "factor" among the colClasses (would have been nice), so let's try this instead

> setAs("character","factor",as.factor)
Arguments in definition changed from (x) to (from)
>
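This post dates from 2004. Current R documentation lists both "factor" and "NULL" among the accepted colClasses values, so the setAs() detour should no longer be needed for this case. A sketch using the same two-line file, supplied inline:

```r
# The contents of test.dat from the post.
txt <- "1 a\n2 b"

# Skip the sequential ID and read the second column as a factor.
d <- read.table(text = txt, colClasses = c("NULL", "factor"))
str(d)
```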

2006 Sep 26

colClasses: suppressed 'NA'

Hi, The colClasses argument seems to be suppressing 'NA' values. How do I fix this? The R script and the first 5 lines of output are below. File "test2.dat" has blanks that are read as "NA" when I do not use 'colClasses', but as blanks when I use 'colClasses'.

temp.df <- read.fwf("test2.dat", width=c(10,1,1,1,1,2,2,3,3,1),

2011 Dec 07

reading partial data set

Hi all, I'm trying to read a data set into R, but the file is messy, so I have to do it partially. The whole data is in a .txt file, and the values are separated by a space. So far ok. The problem is that in this file, not all the lines have the same number of elements, and the reading stops. And I lose the reading of the previous lines. ex. of data set: 11 12 13 21 22 23 31 32
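For lines of unequal length, one option is `fill = TRUE` combined with explicit column names sized to the longest line. This is a sketch on invented data; the row layout of the poster's file is not recoverable from the snippet.

```r
# Hypothetical ragged file: the third line is one field short.
path <- tempfile(fileext = ".txt")
writeLines(c("11 12 13", "21 22 23", "31 32", "41 42 43"), path)

# count.fields() reports the number of fields on each line.
n_cols <- max(count.fields(path))

# fill = TRUE pads short rows with NA instead of stopping with an error;
# naming all columns up front makes the width explicit.
ragged <- read.table(path, fill = TRUE,
                     col.names = paste0("V", seq_len(n_cols)))
print(ragged)
```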

2010 Mar 19

how to drop fields by name when reading in data?

I have a number of space separated files of weather data, with some equivalent column names, and differing number of fields in each file. Some of the files have 40 or more vars, but I only want a subset of the fields. I can use colClasses with read.table to drop some of the fields, but only if I know where those columns are in the first place, and they're not always in the same place. So I
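When the positions of the wanted columns vary between files, the header can be read first and turned into a colClasses vector by name. The sketch below assumes each file has a header row; the field names and the `wanted` vector are invented.

```r
# Hypothetical weather file with a header row.
txt <- "station temp wind rain\nS1 12.5 3 0.0\nS2 14.1 5 1.2"
wanted <- c("temp", "rain")

# Read just the header (plus one data row) to learn the column names.
hdr <- names(read.table(text = txt, header = TRUE, nrows = 1))

# NA means "guess the type as usual"; "NULL" means "skip this column".
cc <- ifelse(hdr %in% wanted, NA_character_, "NULL")

wx <- read.table(text = txt, header = TRUE, colClasses = cc)
print(wx)
```

With files on disk, `text = txt` would be replaced by the file path in both calls.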

2013 Nov 04

A warning message generated from 'read.csv'

Hi, I'm using R version 3.0.2. When I executed the following command

filedata <- read.csv(file, header=TRUE, colClasses="character")

I got the warning message:

In scan(file, what, nmax, sep, dec, quote, skip, nlines, ... : EOF within quoted string

I'd like to know what this means, and how I can fix the problem. Thank you for your help. Best, Chia-Chieh Lin
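The warning generally means the reader reached the end of the file while still inside an opening quote, typically because a field contains an unmatched quote character. The cause in the poster's file is unknown; the sketch below uses invented data with one unmatched quote and shows one common remedy, disabling quote processing with `quote = ""`.

```r
# Hypothetical input: the first data row has an opening quote with no partner.
txt <- 'id,name\n1,"unterminated\n2,Smith\n'

# With quote = "" the quote character is treated as ordinary text,
# so each line is split on commas only.
fixed <- read.csv(text = txt, header = TRUE,
                  colClasses = "character", quote = "")
print(fixed)
```

Disabling quotes is only appropriate when no field legitimately contains the separator inside quotes; otherwise the stray quote has to be repaired in the data itself.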
