thr3ads.net - similar to: "Scanning data files line-by-line"

Displaying 20 results from an estimated 20000 matches similar to: "Scanning data files line-by-line"

List of lists? Data frames? (Or other data structures?)

2003 May 01

Hi, I'm faced with the following problem and would appreciate some advice. I could have a data frame x that looks like this:

  aa bb
a  1 "A"
b  2 "B"

The advantage of this is that I could access all the individual components easily. Also I could access all the rows and columns easily. Alternatively, I could have a list of

reading in data with variable length

2005 Dec 06

I have very large CSV files (up to 1GB each of ASCII text). I'd like to be able to read them directly into R. The problem I am having is with the variable length of the data in each record. Here's a (simplified) example:

$ cat foo.csv
Name,Start Month,Data
Foo,10,-0.5615,2.3065,0.1589,-0.3649,1.5955

Value Lookup from File without Slurping

2009 Jan 16

Dear all, I have a repository file (let's call it repo.txt) that contains two columns like this:

# tag value
AAA 0.2
AAT 0.3
AAC 0.02
AAG 0.02
ATA 0.3
ATT 0.7

Given another query vector

> qr <- c("AAC", "ATT")

I would like to find the corresponding value for each query above, yielding:

0.02
0.7

However, I want to avoid slurping whole repo.txt

how to read a freetext line ?

2011 Nov 17

Hi everyone. I have a text file that contains some integer and string variables, but I cannot read them with readLines or scan. The text is:

weight ;30;130
food:2;1;12
color:white;black

The first column holds the names of the variables and the others are their values. The number of columns differs from line to line. Can anyone help me? -- TANG Jie Email: totangjie@gmail.com Tel: 0086-2154896104

File reading.

2001 Oct 17

Hi all, Apologies for the rather basic IO question but I am rather new to R... Migrated from IDL/Matlab recently. I have a rather simple Fortran control file (sigh...) that I am trying to parse and read using R. My problem is that the file's format is somewhat flexible. Imagine:

---
1> 39 1901
2> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
3> 22 24 26 28 30 32 34 36

read.table issue with "#"

2012 Mar 01

Hello,
> > The problem is that I get the following error because anything after the # is ignored.
> >
> > Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : line 6 did not have 500 elements
> >
> > R thinks that line 6 has only 2 elements because of the #.
>
> Use 'readLines' instead, followed by 'strsplit'. In the

spliting strings ...

2007 Dec 13

Hi everyone, I have a vector of strings, each string made up of a different number of words. I want to get a new vector which has only the first word of each string in the first vector. I came up with this:

str <- c('aaa bbb', 'cc', 'd eee aa', 'mmm o n')
str1 <- rep(1, length(str))
for (i in 1:length(str)) { str1[i] <- strsplit(str, "

Question regarding subsetting

2005 Jul 22

I run R 2.1.1 in a Linux environment (RedHat 9) although my question is not platform-specific. Consider the following:

> A <- c("Prefix-aaa", "Prefix-bbb", "Prefix-ccc")
> B <- strsplit(A, "-")
> B
[[1]]
[1] "Prefix" "aaa"

[[2]]
[1] "Prefix" "bbb"

[[3]]
[1] "Prefix" "ccc"

How

(no subject)

2011 Mar 15

I was wondering if there is a way to read in a single keystroke at a time in R as a string, akin to ncurses-style interfaces. I looked into readLines, readChar, etc. using stdin, but these all require an end of line. Has anyone ever needed to do this, or does anyone have ideas on how to do it? Thanks, Jon PS I apologize if this double-sends, but I am having mail client issues.

Issue with Control-Z in a text file on Windows - readLines() appears to truncate

2013 Apr 10

Working on Windows I have had to deal with CSV files that, unfortunately, contain embedded Control-Zs, i.e. ASCII character 26 in decimal, and the readLines() function in R on Windows (2.15.2 and 3.0.0) appears to truncate at the Control-Z. There is no problem at all on Ubuntu Linux with R 3.0.0. Am I mistaken or is this genuine?

# Create a small file with embedded Control-Z
h3 <-

degree-min-sec data

2004 Jan 09

Hello - Have both astronomic and geodetic data sets with values in the form "ddd:mm:ss.sssss", where ddd is an integer between -180 and 180, mm is an integer between 0 and 60, and ss.sssss is a floating-point number between 0 and 60.0. In order to do anything useful with these values they need to be turned into their "decimal degree" equivalent. Assuming the data is a vector y, the

Read every second line from ASCII file

2007 Apr 30

Dear all, I have an ASCII file where records are separated by a blank. I would like to read those data; however, only the data in rows 1, 3, 5, 7, ... are important; the other lines (2, 4, 6, 8, ...) contain no useful information for me. So far I used awk/gawk to do it:

gawk '{if ((FNR % 2) != 0) {print $0}}' infile.txt > outfile.txt

What is the recommended way to accomplish this in R?

scan html: sep = "<td>"

2005 Apr 04

Hi, I am trying to import HTML text and I need to split the fields at each <td> or </td> entry. How can I do this? sep = '<td>' doesn't yield the right result. Thanks for any hints.

problem with packages

2011 Jan 10

Hello, I am on a laptop with Win7, running R-2.12.1. If I click on Packages/InstallPackages I get:

> utils:::menuInstallPkgs()
Warning: unable to access index for repository http://cran.skazkaforyou.com/bin/windows/contrib/2.12
Warning: unable to access index for repository http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/2.12
Error in install.packages(NULL, .libPaths()[1L], dependencies

Import graph object

2010 Jul 14

Dear all, I have a txt file of the following format that describes the relationships between a network of a certain number of nodes.

{4, 2, 3}
{3, 4, 1}
{4, 2, 1}
{2, 1, 3}
{2, 3}
{}
{2, 5, 1}
{3, 5, 4}
{3, 4}
{2, 5, 3}

For example the first line {4, 2, 3} implies that there is a connection between Node 1 and Node 4, a connection between Node 1 and Node 2 and a connection between Node 1 and

Incremental ReadLines

2009 Nov 02

I've been trying to figure out how to read in a large file for a few days now, and after extensive research I'm still not sure what to do. I have a large comma-delimited text file that contains 59 fields in each record. There is also a header every 121 records. This function works well for smallish records:

getcsv=function(fname){
ff=file(description = fname)
x <- readLines(ff)

Convert char vector to numeric table

2003 Mar 31

I'm a great fan of read.table(), but this time the data had a lot of cruft. So I used readLines() and edited the char vector to eventually get something like this:

" 23.4 1.5 4.2"
" 19.1 2.2 4.1"

and so on. To get that into a 3 col numeric table, I first just used:

writeLines(data,"tempfile")

Reading a file line by line - separating lines VS separating columns

2009 Mar 18

Hello all. I wish to read a large data set into R. My current issue is in getting the data into a form that R can access. Using read.table won't work since the data is over 1GB in size (and I am using Windows XP), so my plan was to read the file chunk by chunk and each time move it into bigmemory (I'll play with that when the time comes, maybe ff is better?!). I encountered

reading formatted txt file into a data frame

2010 May 06

Dear all, let's say I have a plain text file as follows:

> cat(c("[ID: 001 ] [Writer: Steven Moffat ] [Rating: 8.9 ] Doctor Who",
+ "[ID: 002 ] [Writer: Joss Whedon ] [Rating: 8.8 ] Buffy",
+ "[ID: 003 ] [Writer: J. Michael Straczynski ] [Rating: 7.4 ] Babylon [5]"),
+ sep = "\n", file = "tmp.txt")

I would somehow like to read

problem with read.table

2004 Feb 03

Any ideas why read.table complains about not correct number of elements in line while readLine/strsplit indicate that all lines have the same number of elements ? R > tbl <- read.table('tmp', header = T, sep = '\t') Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : line 32 did not have 27 elements > lines <-
