thr3ads.net - similar to: "read in large data file (tsv) with inline filter?"

Displaying 20 results from an estimated 5000 matches similar to: "read in large data file (tsv) with inline filter?"

2007 Aug 23

read big text file into R

Dear Rs: Hi, I am trying to read a big text file (nrows=243440, ncols=144). It seems the computational time of all the read methods (scan, read.table, read.delim) is not linear in the number of rows I want to read in: things became really slow once I tried to read in 100000 lines compared to 10000 lines. If I am reading the profiling result right, I guess scan wouldn't help either. My

importing data to SQLite database with sqldf

2009 Feb 20


Hi all, I am attempting to learn SQL through sqldf... One task I am particularly interested in is merging separate (presumably large) files into a single table without loading these files into R as an intermediate step (by loading them into SQLite and merging them there). Taking a step back, I've considered these alternatives: 1) I know if I use straight SQLite commands I might use the

Help troubleshooting silent failure reading huge file with read.delim

2010 Oct 06


I am trying to read a tab-delimited 1.25 GB file of 4,115,119 records each with 52 fields. I am using R 2.11.0 on a 64-bit Windows 7 machine with 8 GB memory. I have tried the two following statements with the same results: d <- read.delim(filename, as.is=TRUE); d <- read.delim(filename, as.is=TRUE, nrows=4200000). I have tried starting R with this parameter but that changed

Date-Time-Stamp input method for user-specific formats

2009 Oct 05


Date-Time-Stamp input method to correctly interpret user-specific formats: coding is 90% there, based on the example at http://tolstoy.newcastle.edu.au/R/help/05/02/12003.html ...anyone got the last 10% please? CONTEXT: Data is received where one of the columns is a date-time stamp. At midnight, the value represented as text in this column consists of just the date part, e.g.

Delete query in sqldf?

2007 Sep 07


Dear All, Is sqldf equipped with delete queries? I have tried delete queries but with no success. Thanks in advance, Paul

read.delim very slow in reading files with lots of columns

2009 Sep 23


Hi, I am trying to read a tab-delimited file into R (Ver. 2.8). The machine I am using is 64-bit Linux with 16 GB. The file is basically a matrix (~600x700000) and as large as 3 GB. read.delim() ran extremely slowly (hours) even with a subset of the file (31 MB with 6x700000). I monitored the memory usage and found that it consistently used less than 1% of the 16 GB of memory. Does read.delim()

summarize

2012 Jan 19


Hello! I have some currency data ordered by date (days), which I have converted to YYYY-MM-DD format where DD is always 01: EUR.resto$date <- as.Date(EUR.resto$date) EUR.resto$mo <- substr(EUR.resto$date, 6, 7) EUR.resto$yr <- substr(EUR.resto$date, 1, 4)

recursive methods for concatenating sets of files

2006 Sep 13


Hello, I would like to read sets of files within a folder, perhaps using recursive methods. Right now, I rename the files before import. It would be even better to do this without renaming files, without providing explicit filenames, perhaps by importing files based on chronology, and translating each filename into a header? Please excuse my ignorance, and help cure my clunky programming

reading tables from url

2007 Nov 14


I'm trying to read some web tables directly into R. These are both genome sequencing projects (eukaryotes and metagenomes) from NCBI and look very similar; however, only the first one works. http://www.ncbi.nlm.nih.gov/genomes/leuks.cgi http://www.ncbi.nlm.nih.gov/genomes/lenvs.cgi I added ?dump=selected to the end of the url string to get a tab-delimited file (which is what happens

How to import specific column(s) using "read.table"?

2004 Aug 09


Dear R people, I have a very big tab-delimited txt file with a header and I only want to import several columns into R. I checked the options for "read.table" and only found "nrows", which lets you specify the maximum number of rows to read in. Although I can use some text editors (e.g., wordpad) to edit the txt file first before running R, I feel it's not very convenient. The
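
For the column-selection question above, one option that base R does provide is the colClasses argument of read.table: an entry of "NULL" (the string) makes the reader skip that column. The following is only a minimal sketch; the temporary file and the column names id, junk and value are made up for illustration and do not come from the original post.

```r
# Hypothetical example data: a tab-delimited file with a header and three columns.
path <- tempfile(fileext = ".txt")
writeLines(c("id\tjunk\tvalue",
             "1\tabc\t3.5",
             "2\tdef\t4.25"), path)

# "NULL" (as a string) in colClasses tells read.table to skip that column;
# NA lets R guess the type of the columns that are kept.
wanted <- read.table(path, header = TRUE, sep = "\t",
                     colClasses = c(NA, "NULL", NA))

print(names(wanted))
```

The skipped column is still parsed past on every line, so this saves memory rather than read time.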

large dataset import, aggregation and reshape

2005 Apr 24


Dear useRs, We have a data set (comma delimited) with 12 million rows and 5 columns (in fact many more, but we need only 4 of them): id, factor 'a' (5 levels), factor 'b' (15 levels), date-stamp, numeric measurement. We run R on SUSE Linux 9.1 with 2 GB RAM (and a 3.5 GB swap file). On average we have 30 obs. per id. We want to aggregate (e.g. sum of the measurements under

Problems reading tab-delim files using read.table and read.delim

2012 Feb 08


Hello, I used read.xlsx to read in Excel files but for large files it turned out to be not very efficient. For that reason I use a programme which writes each sheet in an Excel file into tab-delim txt files. After that I tried using read.table and read.delim to read in those txt files. Unfortunately, the results are not as expected. To show you what I mean I created a tiny Excel sheet with some

Exceptional slowness with read.csv

2024 Apr 08


Hi Dave, That's rather frustrating. I've found vroom (from the package vroom) to be helpful with large files like this. Does the following give you any better luck? vroom(file_name, delim = ",", skip = 2459465, n_max = 5) Of course, when you know you've got errors & the files are big like that it can take a bit of work resolving things. The command line tools awk
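
The vroom call in the reply above reads a small window of rows from deep inside the file. A comparable window can be read in base R with the skip and nrows arguments of read.csv. This is only a sketch of that alternative, not the poster's method; the ten-row file and the column names a and b are hypothetical stand-ins for the large file.

```r
# Hypothetical example data: a small CSV standing in for the large file.
path <- tempfile(fileext = ".csv")
writeLines(c("a,b", paste(1:10, (1:10)^2, sep = ",")), path)

# Skip the header line plus the first 6 data rows, then read only 3 rows.
# header = FALSE because the header line is among the skipped lines,
# so the column names are supplied by hand.
window <- read.csv(path, skip = 7, nrows = 3, header = FALSE,
                   col.names = c("a", "b"))
print(window)
```

Reading a narrow window like this is a quick way to look at the lines around a suspected malformed record without loading the whole file.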

Sqldf INSERT INTO

2011 Apr 20


Hi, I am new to R and trying to migrate from SAS. I am trying to copy data from one table to another table with the same columns using sqldf, but it is not working and returns NULL. I wrote the statement as sqldf("INSERT INTO new select * from data"), but it shows NULL. Please help me in this regard. Thank you

Ways to work with R and Postgres

2010 Jun 27


Hi, I am posting this message to the general r-help list hoping that someone in a wider audience has suggestions. There are three ways to integrate R and Postgres, especially on the 64-bit Microsoft Windows platform: 1. via the RODBC package, which has 32-bit and 64-bit versions for Windows; 2. via the RPostgres interface, which currently only has a 32-bit version; 3. via plr for Greenplum, which only supports a

Wishlist: write.delim()

2005 Sep 08


Hi, It would be great if someone would add write.delim() as an adjunct to write.table(), just as with write.csv(). I store a lot of data in tab-delimited files and can read it in easily with: read.delim("text.txt", as.is=TRUE) and would love to be able to write it out as easily when I create these files. The obvious setting needed for write.delim() is sep = "\t", but in
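
The wish above can be approximated with a thin wrapper around write.table. The sketch below is an assumption about what such a helper might look like: base R does not ship a function named write.delim, and the choices of quote = FALSE and row.names = FALSE are mine, not the poster's.

```r
# Hypothetical helper; base R does not provide a function called write.delim.
write.delim <- function(x, file = "", ...) {
  # Tab separator, no quoting, no row names: settings assumed here so that
  # read.delim() reads the file back without an extra column.
  write.table(x, file = file, sep = "\t", quote = FALSE,
              row.names = FALSE, ...)
}

# Round trip through a temporary file.
df <- data.frame(name = c("a", "b"), score = c(1.5, 2),
                 stringsAsFactors = FALSE)
path <- tempfile(fileext = ".txt")
write.delim(df, path)
back <- read.delim(path, as.is = TRUE)
print(back)
```

With quote = FALSE, fields that themselves contain tabs or newlines would corrupt the file, so quoting may need to stay on for free-text columns.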

function case in sqldf (data from Oracle) with a null value

2012 Aug 20


I use sqldf to join 2 data frames from 2 distinct databases: a and b come from earlier sqldf calls. sqldf("select a.*, b.*, case a.QTY when null then b.QTY else a.QTY end as NEW_QTY from a inner join b on a.OBJECT=b.OBJECT") R doesn't understand "when null". I tried with "when NA", "when '' ", "when ' ' " but it doesn't

read.table: fill=T for header?

2011 Apr 27


Dear ExpeRts, I am trying to read tab-delimited data produced by somewhat brain-dead software that seems to think it's a good idea to have an extra tab character after the last column, except for the header line. As explained in the help page, read.delim now assumes that the first column contains the row.names (which is not even wrong), and now all col.names get shifted by one column.
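
One possible workaround for the trailing-tab layout described above is to read the header line and the body separately and then drop the extra empty column. This is a sketch under assumptions, not the answer given in the thread; the two-column file with columns x and y is hypothetical.

```r
# Hypothetical example data: the header has 2 fields, each data row ends
# with an extra tab and therefore parses as 3 fields.
path <- tempfile(fileext = ".txt")
writeLines(c("x\ty",
             "1\t2\t",
             "3\t4\t"), path)

# Read the header line on its own, then the body without a header.
hdr  <- scan(path, what = "", sep = "\t", nlines = 1, quiet = TRUE)
body <- read.delim(path, header = FALSE, skip = 1)

# The trailing tab produces an extra, all-NA last column; keep only the
# columns that have a name in the header.
dat <- body[, seq_along(hdr), drop = FALSE]
names(dat) <- hdr
print(dat)
```

Because the header and body are read independently, read.delim never sees the mismatch in field counts that triggers the row.names assumption.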

Comparing two matrices

2006 Jul 06


hi: I have a matrix with dimensions (200 x 20,000). I have another file, a tab-delimited file where the first-column values are row names and the second-column values are column names. For instance:
> tmat
  Apple Orange Mango Grape Star
A     0      0     0     0    0
O     0      0     0     0    0
M     0      0     0     0    0
G     0      0     0     0    0
S     0      0     0     0    0

How to print the frequency table (produced by the command "table") to Excel

2016 Apr 26


Hi jpm miao, You can get CSV files that can be imported into Excel like this:
 library(prettyR)
 sink("excel_table1.csv")
 delim.table(table(df[,c("y","z")]))
 sink()
 sink("excel_table2.csv")
 delim.table(as.data.frame(table(df[,c("y","z")])),label="")
 sink()
 sink("excel_table3.csv")
