similar to: read in large data file (tsv) with inline filter?

Displaying 20 results from an estimated 5000 matches similar to: "read in large data file (tsv) with inline filter?"

2007 Aug 23
2
read big text file into R
Dear Rs: Hi, I am trying to read a big text file (nrows=243440, ncols=144). It seems the computational time of all the read methods (scan, read.table, read.delim) is not linear in the number of rows I want to read in: things became really slow once I tried to read in 100000 lines compared to 10000 lines. If I am reading the profiling result right, I guess scan wouldn't help either. My
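A minimal sketch of the usual first things to try for the slow-read problem described above, assuming the column types are known in advance. The help page for read.table notes that supplying colClasses and nrows reduces the work the reader has to do; quote = "" is an extra assumption here (it is only safe when the file contains no quoted fields). The file, its size, and its columns are made up for illustration.

```r
# Hypothetical tab-delimited file standing in for the large one.
tmp <- tempfile(fileext = ".txt")
n <- 1000
write.table(data.frame(id = seq_len(n),
                       value = rnorm(n),
                       tag = sample(letters, n, replace = TRUE)),
            tmp, sep = "\t", quote = FALSE, row.names = FALSE)

# Declaring the column types avoids type guessing, nrows lets R
# allocate storage up front, and quote = "" disables quote processing.
d <- read.delim(tmp,
                colClasses = c("integer", "numeric", "character"),
                nrows = n,
                quote = "")
str(d)
```

Whether this removes the non-linear slowdown for a given file is not guaranteed; it is only the cheapest thing to rule out before moving to other tools.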
2009 Feb 20
2
importing data to SQLite database with sqldf
Hi all, I am attempting to learn SQL through sqldf... One task I am particularly interested in is merging separate (presumably large) files into a single table without loading these files into R as an intermediate step (by loading them into SQLite and merging them there). Taking a step back, I've considered these alternatives: 1) I know if I use straight SQLite commands I might use the
2010 Oct 06
3
Help troubleshooting silent failure reading huge file with read.delim
I am trying to read a tab-delimited 1.25 GB file of 4,115,119 records each with 52 fields. I am using R 2.11.0 on a 64-bit Windows 7 machine with 8 GB memory. I have tried the two following statements with the same results: d <- read.delim(filename, as.is=TRUE) d <- read.delim(filename, as.is=TRUE, nrows=4200000) I have tried starting R with this parameter but that changed
2009 Oct 05
6
Date-Time-Stamp input method for user-specific formats
Date-Time-Stamp input method to correctly interpret user-specific formats: coding is 90% there - based on example at http://tolstoy.newcastle.edu.au/R/help/05/02/12003.html ...anyone got the last 10% please? CONTEXT: Data is received where one of the columns is a datetimestamp. At midnight, the value represented as text in this column consists of just the date part, e.g.
2007 Sep 07
3
Delete query in sqldf?
Dear All, Is sqldf equipped with delete queries? I have tried delete queries but with no success. Thanks in advance, Paul
2009 Sep 23
1
read.delim very slow in reading files with lots of columns
Hi, I am trying to read a tab-delimited file into R (Ver. 2.8). The machine I am using is 64bit Linux with 16 GB. The file is basically a matrix (~600x700000) and as large as 3GB. read.delim() ran extremely slowly (hours) even with a subset of the file (31 MB with 6x700000). I monitored the memory usage and found it constantly took less than 1% of the 16GB memory. Does read.delim()
2012 Jan 19
8
Summarize
Hello! I have some currency data ordered by date (days), which I have converted to the format YYYY-MM-DD where DD is always 01: EUR.resto$date<-as.Date(EUR.resto$date) EUR.resto$mo <- substr(EUR.resto$date,6,7) EUR.resto$yr <- substr(EUR.resto$date, 1,4)
2006 Sep 13
2
recursive methods for concatenating sets of files
Hello, I would like to read sets of files within a folder, perhaps using recursive methods. Right now, I rename the files before import. It would be even better to do this without renaming files, without providing explicit filenames, perhaps by importing files based on chronology, and translating each filename into a header? Please excuse my ignorance, and help cure my clunky programming
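One way the request above could be approached, sketched with base R only: list.files() finds the files without naming them explicitly, file.mtime() gives a chronological order, and the file name is recorded in a column. The folder, file names, and the `source` column are hypothetical.

```r
# Hypothetical folder with two small tab-delimited files.
folder <- tempfile("files_")
dir.create(folder)
write.table(data.frame(x = 1:2), file.path(folder, "first.txt"),
            sep = "\t", quote = FALSE, row.names = FALSE)
write.table(data.frame(x = 3:4), file.path(folder, "second.txt"),
            sep = "\t", quote = FALSE, row.names = FALSE)

# Find every .txt file below the folder, including subfolders.
files <- list.files(folder, pattern = "\\.txt$",
                    full.names = TRUE, recursive = TRUE)

# Order by modification time, oldest first.
files <- files[order(file.mtime(files))]

# Read each file and tag its rows with the file name (without extension).
parts <- lapply(files, function(f) {
  d <- read.delim(f)
  d$source <- tools::file_path_sans_ext(basename(f))
  d
})

# Stack the pieces; this assumes all files share the same columns.
all_data <- do.call(rbind, parts)
```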
2007 Nov 14
1
reading tables from url
I'm trying to read some web tables directly into R. These are both genome sequencing projects (eukaryotes and metagenomes) from NCBI and look very similar; however, only the first one works. http://www.ncbi.nlm.nih.gov/genomes/leuks.cgi http://www.ncbi.nlm.nih.gov/genomes/lenvs.cgi I added ?dump=selected to the end of the url string to get a tab-delimited file (which is what happens
2004 Aug 09
5
How to import specific column(s) using "read.table"?
Dear R people, I have a very big tab-delim txt file with header and I only want to import several columns into R. I checked the options for "read.table" and only found "nrows" which lets you specify the maximum number of rows to read in. Although I can use some text editors (e.g., wordpad) to edit the txt file first before running R, I feel it's not very convenient. The
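For the column-selection question above, a minimal sketch: read.table and read.delim skip any column whose colClasses entry is the string "NULL". The file contents and the column names `id` and `score` are made up for illustration.

```r
# Hypothetical tab-delimited file with a header and four columns.
tmp <- tempfile(fileext = ".txt")
writeLines(c("id\tname\tscore\tnote",
             "1\ta\t3.5\tx",
             "2\tb\t4.0\ty"), tmp)

# Read a single row just to learn the column names.
hdr <- names(read.delim(tmp, nrows = 1))

# "NULL" drops a column; NA lets read.delim guess its type.
keep <- c("id", "score")
classes <- ifelse(hdr %in% keep, NA, "NULL")

d <- read.delim(tmp, colClasses = classes)
print(names(d))
```

Skipped columns are still parsed line by line, so this saves memory more than it saves time.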
2005 Apr 24
1
large dataset import, aggregation and reshape
Dear useRs We have a data-set (comma delimited) with 12 million rows and 5 columns (in fact many more, but we need only 4 of them): id, factor 'a' (5 levels), factor 'b' (15 levels), date-stamp, numeric measurement. We run R on suse-linux 9.1 with 2GB RAM (and a 3.5GB swap file). On average we have 30 obs. per id. We want to aggregate (eg. sum of the measurements under
2012 Feb 08
2
Problems reading tab-delim files using read.table and read.delim
Hello, I used read.xlsx to read in Excel files but for large files it turned out to be not very efficient. For that reason I use a programme which writes each sheet in an Excel file into tab-delim txt files. After that I tried using read.table and read.delim to read in those txt files. Unfortunately, the results are not as expected. To show you what I mean I created a tiny Excel sheet with some
2024 Apr 08
2
Exceptional slowness with read.csv
Hi Dave, That's rather frustrating. I've found vroom (from the package vroom) to be helpful with large files like this. Does the following give you any better luck? vroom(file_name, delim = ",", skip = 2459465, n_max = 5) Of course, when you know you've got errors & the files are big like that it can take a bit of work resolving things. The command line tools awk
2011 Apr 20
1
Sqldf INSERT INTO
Hi, I am new to R and trying to migrate from SAS. I am trying to copy data from one table to another table with the same columns using sqldf, but it is not working and shows "NULL". I wrote the statement as sqldf("INSERT INTO new select * from data") but it shows NULL. Please help me in this regard. Thank you
2010 Jun 27
2
Ways to work with R and Postgres
Hi, I post this message to the general r-help list hoping anyone within a wider range has suggestions: There are three ways to integrate R and postgres, especially on the 64-bit Microsoft Windows platform: 1. via the RODBC package, which has 32-bit and 64-bit versions for Windows 2. via the RPostgres interface, which only has a 32-bit version currently 3. via plr for Greenplum, which only supports a
2005 Sep 08
1
Wishlist: write.delim()
Hi, It would be great if someone would add write.delim() as an adjunct to write.table(), just as with write.csv(). I store a lot of data in tab-delimited files and can read it in easily with: read.delim("text.txt", as.is=TRUE) and would love to be able to write it out as easily when I create these files. The obvious setting needed for write.delim() is sep = "\t", but in
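A sketch of what such a wrapper might look like. write.delim() is not part of base R; the function below is a hypothetical helper built on write.table(). Only sep = "\t" comes from the post; the quote = FALSE and row.names = FALSE defaults are assumptions about what a tab-delimited writer would usually want.

```r
# Hypothetical helper; not a base R function.
write.delim <- function(x, file = "", sep = "\t", quote = FALSE,
                        row.names = FALSE, ...) {
  write.table(x, file = file, sep = sep, quote = quote,
              row.names = row.names, ...)
}

# Round trip: write a small data frame and read it back.
df <- data.frame(id = 1:2, label = c("a", "b"), stringsAsFactors = FALSE)
tmp <- tempfile(fileext = ".txt")
write.delim(df, tmp)
back <- read.delim(tmp, as.is = TRUE)
```

With quote = FALSE, fields that themselves contain tabs or newlines would not survive the round trip, so the default may need changing for such data.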
2012 Aug 20
1
function case in sqldf (datas from oracle) with a null value
I use sqldf to join 2 dataframes from 2 distinct databases: a and b come from old sqldf's. sqldf("select a.*, b.*, case a.QTY when null then b.QTY else a.QTY end as NEW_QTY from a inner join b on a.OBJECT=b.OBJECT") R doesn't understand "when null". I tried with "when NA", "when '' ", "when ' ' " but it doesn't
2011 Apr 27
1
read.table: fill=T for header?
Dear ExpeRts, I am trying to read tab-delimited data produced by somewhat brain dead software that seems to think it's a good idea to have an extra tab character after the last column - except for the header line. As explained in the help page, read.delim now assumes that the first column contains the row.names (which is not even wrong) but now all col.names get shifted by one column.
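One possible workaround for the trailing-tab layout described above, sketched with base R: read the header line separately, read the body without a header, then drop the extra empty column. The sample file is made up to mimic the described format.

```r
# Hypothetical file: data lines end with an extra tab, the header does not.
tmp <- tempfile(fileext = ".txt")
writeLines(c("a\tb\tc",
             "1\t2\t3\t",
             "4\t5\t6\t"), tmp)

# Read the header line on its own.
hdr <- scan(tmp, what = "", sep = "\t", nlines = 1, quiet = TRUE)

# Read the body without a header; the trailing tab yields one extra column.
body <- read.delim(tmp, header = FALSE, skip = 1)

# Keep only as many columns as there are header names, then attach them.
d <- body[, seq_along(hdr), drop = FALSE]
names(d) <- hdr
```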
2006 Jul 06
3
Comparing two matrices
hi: I have a matrix with dimensions (200 x 20,000). I have another file, a tab-delim file where first column variables are row names and second column variables are column names. For instance:
> tmat
  Apple Orange Mango Grape Star
A     0      0     0     0    0
O     0      0     0     0    0
M     0      0     0     0    0
G     0      0     0     0    0
S     0      0     0     0    0
2016 Apr 26
0
How to print the frequency table (produced by the command "table") to Excel
Hi jpm miao, You can get CSV files that can be imported into Excel like this: library(prettyR) sink("excel_table1.csv") delim.table(table(df[,c("y","z")])) sink() sink("excel_table2.csv") delim.table(as.data.frame(table(df[,c("y","z")])),label="") sink() sink("excel_table3.csv")