thr3ads.net - similar to: "read.delim very slow in reading files with lots of columns"

Displaying 20 results from an estimated 3000 matches similar to: "read.delim very slow in reading files with lots of columns"

memory usage grows too fast

2009 May 14

Hi All, I have a 1000x1000000 matrix. The calculation I would like to do is actually very simple: for each row, calculate the frequency of a given pattern. For example, a toy dataset is as follows.

Col1 Col2 Col3 Col4
01 02 02 00 => Freq of "02" is 0.5
02 02 02 01 => Freq of "02" is 0.75
00 02 01 01 ...

My code to find the pattern "02" is quite simple, as follows.
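
A vectorized way to compute such a per-row frequency in R is to compare the whole matrix against the pattern and average each row of the resulting logical matrix. The sketch below rebuilds the toy dataset from the post; the names `m` and `freq` are illustrative, and this is not the poster's own code, which the excerpt cuts off.

```r
# Toy dataset from the post, stored as character so "02" keeps its leading zero.
m <- matrix(c("01", "02", "02", "00",
              "02", "02", "02", "01",
              "00", "02", "01", "01"),
            nrow = 3, byrow = TRUE,
            dimnames = list(NULL, paste0("Col", 1:4)))

# m == "02" is a logical matrix; the mean of each row is the share of TRUE cells.
freq <- rowMeans(m == "02")
print(freq)
```

On a matrix as large as the one described, the comparison allocates a logical matrix of the same dimensions, so working through blocks of columns and accumulating `rowSums()` may be needed if memory is tight.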

sequencially merge multiple files in a folder

2008 Nov 18

Dear all, If the question is too easy, please forgive me since I am only a few weeks old in R. I have worked on this question for a few days and still cannot figure it out. Here I have a folder with more than 50 tab-delimited files. Each file has a few hundred thousand rows/subjects, and the number of columns/variables of each file varies. The 1st row consists of all the variable names. Now I

lrm in Design package--missing value where TRUE/FALSE needed

2009 Sep 04

Hi, An error message arose while I was trying to fit an ordinal model with lrm(). I am using R 2.8 with the Design package. Here is a small set of mydata: RC RS Sex CovA CovB CovC CovD CovE 2 1 0 1 1 0 -0.005575280 2 2 1 0 1 0 1 -0.001959580 2 3 0 0 0 1 0 -0.004725880 2 0 0 0 1 0 0 -0.005504850 2 2 1 1 0 0 0 -0.003880170 1 2 1 0 0 1 0 -0.006074230 2 2 1 0 0 1 1 -0.003963920 2 2 1 0 0 1 0

Problems reading tab-delim files using read.table and read.delim

2012 Feb 08

Hello, I used read.xlsx to read in Excel files but for large files it turned out to be not very efficient. For that reason I use a programme which writes each sheet in an Excel file into tab-delim txt files. After that I tried using read.table and read.delim to read in those txt files. Unfortunately, the results are not as expected. To show you what I mean I created a tiny Excel sheet with some

Wishlist: write.delim()

2005 Sep 08

Hi, It would be great if someone would add write.delim() as an adjunct to write.table(), just as with write.csv(). I store a lot of data in tab-delimited files and can read it in easily with: read.delim("text.txt", as.is=TRUE) and would love to be able to write it out as easily when I create these files. The obvious setting needed for write.delim() is sep = "\t", but in
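
Until such a function exists, a thin wrapper around `write.table()` can fill the gap. The sketch below is one possible definition, not an official one: only `sep = "\t"` comes from the post, while `quote = FALSE` and `row.names = FALSE` are assumptions chosen so the output reads back cleanly with `read.delim()`.

```r
# Hypothetical write.delim(): write.table() with tab-delimited settings.
# quote = FALSE and row.names = FALSE are assumed defaults, not part of the post.
write.delim <- function(x, file = "", quote = FALSE, row.names = FALSE, ...) {
  write.table(x, file = file, sep = "\t", quote = quote,
              row.names = row.names, ...)
}

# Round trip through a temporary file.
df <- data.frame(id = 1:3, name = c("a", "b", "c"), stringsAsFactors = FALSE)
path <- tempfile(fileext = ".txt")
write.delim(df, path)
back <- read.delim(path, as.is = TRUE)
print(back)
```

Because `sep` is fixed inside the wrapper, passing `sep` again through `...` raises an error, which mirrors how `write.csv()` guards its own separator.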

combining data from multiple read.delim() invocations.

2014 Jul 01

Is there a better way to do the following? I have data in a number of tab delimited files. I am using read.delim() to read them, in a loop. I am invoking my code on Linux Fedora 20, from the BASH command line, using Rscript. The code I'm using looks like:

arguments <- commandArgs(trailingOnly=TRUE);
# initialize the capped_data data.frame
capped_data <- data.frame(lpar="NULL",
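
A common alternative to growing a data frame inside a loop is to read every file into a list and bind the pieces once at the end. The sketch below assumes all files share the same columns; the file names, the `cpu` column, and the values are made up for illustration, and only the `lpar` column name comes from the post.

```r
# Two small tab-delimited files stand in for the real inputs (hypothetical data).
demo_dir <- tempfile("delim_demo")
dir.create(demo_dir)
writeLines(c("lpar\tcpu", "A\t1.5", "B\t2.0"), file.path(demo_dir, "day1.txt"))
writeLines(c("lpar\tcpu", "C\t0.5"), file.path(demo_dir, "day2.txt"))

files <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)

# Read each file into its own data frame, then bind them in a single step.
pieces <- lapply(files, read.delim, as.is = TRUE)
combined <- do.call(rbind, pieces)
print(combined)
```

This avoids the placeholder first row that an "initialize, then append" loop usually needs, and `rbind()` is called once instead of once per file.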

read.delim()

2010 Jul 28

I am reading in a very large file with names in it and R is truncating the number of rows it reads in. The separator in this file is a pipe '|' and so I use dat <- read.delim('pathToMyFile', header= TRUE, sep='|') It turns out that it is reading up to row 61145 and stopping and I think I see why, but am not sure of the best solution to this problem. I see the name of

The try() function with read.delim().

2008 May 14

I have written a function which reads data from files using read.delim(). The names of these files are complicated and are built using arguments to the wrapper function. Since the files in question may or may not exist, I thought to enclose the read.delim() call in a try():

file <- <complicated expression>
temp <- try(read.delim(file))
if(inherits(temp,"try-error")) {
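
The pattern described here can be sketched as a small helper that returns `NULL` when the file is missing or unreadable. The function name `read_if_present` and the explicit `file.exists()` check are assumptions added for illustration; the excerpt ends before showing how the poster handled the error branch.

```r
# Hypothetical helper: returns a data frame, or NULL if the file cannot be read.
read_if_present <- function(path) {
  # Checking first avoids the "cannot open file" warning that try() does not suppress.
  if (!file.exists(path)) return(NULL)
  result <- try(read.delim(path), silent = TRUE)
  if (inherits(result, "try-error")) return(NULL)
  result
}

# A path that does not exist, and one that does.
missing_path <- tempfile(fileext = ".txt")
present_path <- tempfile(fileext = ".txt")
writeLines(c("a\tb", "1\t2"), present_path)

print(is.null(read_if_present(missing_path)))
print(read_if_present(present_path))
```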

Reading clipboard with read.delim("clipboard") crash (PR#13957)

2009 Sep 18

Full_Name: Liam Gretton
Version: 2.9.2
OS: openSUSE 11.1 (x86_64)
Submission from: (NULL) (143.210.13.77)

Reading a large number of rows of delimited data via the clipboard results in a segfault or double free error. I've tested copying from various applications, but gedit will do. This problem exists in the openSUSE-supplied 2.8.1, I've just built 2.9.2 to see if it's still there,

read.table() vs read.delim() any difference??

2012 May 04

Hi, I have a tab-separated file with 206 rows and 30 columns. I read the file into R using the read.table() function. I checked the dim() of the data frame created in R; it had only 103 rows (exactly half) and 30 columns. Then I tried reading in the file using the read.delim() function, and this time dim() showed 206 rows and 30 columns, as expected. Reading the read.table() R-help documentation, I
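
For reference, `read.delim()` is a wrapper around `read.table()` that changes several defaults: `header = TRUE`, `sep = "\t"`, `quote = "\""`, `fill = TRUE` and `comment.char = ""`, where `read.table()` uses `header = FALSE`, `sep = ""`, `quote = "\"'"` and `comment.char = "#"`. The sketch below uses a made-up file containing an apostrophe and a `#` to show that spelling out those arguments makes the two calls equivalent. Whether these defaults explain the halved row count in the post is not something the excerpt confirms.

```r
# Hypothetical file whose fields contain an apostrophe and a '#'.
path <- tempfile(fileext = ".txt")
writeLines(c("name\tnote", "O'Brien\tfirst", "Smith\t#2 seed"), path)

via_delim <- read.delim(path)

# read.table() with read.delim()'s defaults written out explicitly.
via_table <- read.table(path, header = TRUE, sep = "\t", quote = "\"",
                        dec = ".", fill = TRUE, comment.char = "")

print(identical(via_delim, via_table))
print(via_delim)
```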

UTF-16 input and read.delim/scan

2012 May 18

Hi all, I am running 64-bit R 2.15.0 on Windows 7. I am trying to use read.delim to read from a file that has 2-byte Unicode (CJK) characters. Here is an example of the data (it is tab-delimited if that gets messed up): HITId HITTypeId Title 2Q69Z6KW4ZMAGKKFRT6Q4ONO6MJF68 2LVJ1LY58B72OP36GNBHH16YF7RS7Z 看看句子，写写想法请看以下的句子，再回答问 So read.delim (code below) doesn't read in correctly. It reads

Help troubleshooting silent failure reading huge file with read.delim

2010 Oct 06

I am trying to read a tab-delimited 1.25 GB file of 4,115,119 records each with 52 fields. I am using R 2.11.0 on a 64-bit Windows 7 machine with 8 GB memory. I have tried the two following statements with the same results:

d <- read.delim(filename, as.is=TRUE)
d <- read.delim(filename, as.is=TRUE, nrows=4200000)

I have tried starting R with this parameter but that changed

Inconsistency, may be bug in read.delim ?

2018 Mar 19

Dear friends, I stumbled into behaviour of read.delim which I would consider a bug or at least an inconsistency that should be improved upon. Recently we had to work with data that used "", two double quotes, as the symbol to start and end character input. Essentially the data looked like this:

data.csv
========
V1, V2, V3
""data"", 3, """"

The

superimpose multiple scatterplot matrices

2009 Jul 02

Dear R experts, I am trying to superimpose two or more scatterplot matrices generated by pairs() to visualize the differences between the datasets, but have not been very successful. Two data frames, df1 and df2 for example, each have the same five variables in columns. My goal is to paint each dataset with a color on the same plot panel for each pair of the five variables. That is, in this

Feature request: extend functionality of 'unlist()' by args 'delim=c("/", "_", etc.)' and 'keep.special=TRUE/FALSE'

2011 May 19

Dear list, I hope this is the right place to post a feature request. If there exists a more formal channel (e.g. as for bug reports), I'd appreciate a pointer. I work a lot with named nested lists with arbitrary degrees of "nestedness". In order to retrieve the names and/or values of "bottom layer/bottom tier", I love the functionality of 'unlist()', or

colClasses = "Date" in read.delim, how to pass date-format?

2005 Apr 18

Hi, I have a huge data-set with one column being of type date. Of course I can import the data using this column as "factor" and then convert it later to dates, using:

sws.bezuege$FaktDat <- dates(as.character(sws.bezuege$FaktDat), format = c(dates = "d.m.y"))

But the conversion requires a huge amount of memory (and time), therefore I would
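
One known way to attach a date format to `colClasses` is to register a conversion from character to a custom class with `setAs()`, which `read.table()` and its wrappers then apply during import. The sketch below is an assumption-laden illustration: it uses base R `Date` rather than the `dates()` function shown in the post, the class name `dmyDate` and the `Betrag` column are hypothetical, and the two sample rows are invented.

```r
library(methods)

# "dmyDate" is a hypothetical class name used only to carry a conversion rule.
setClass("dmyDate")
setAs("character", "dmyDate",
      function(from) as.Date(from, format = "%d.%m.%y"))

# Invented sample file with a day.month.year date column.
path <- tempfile(fileext = ".txt")
writeLines(c("FaktDat\tBetrag", "18.04.05\t10.5", "19.04.05\t7.25"), path)

# The date column is converted while reading; no factor detour is needed.
x <- read.delim(path, colClasses = c("dmyDate", "numeric"))
str(x)
```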

read.delim skips first column (why?)

2009 Jul 13

Hi people, I have a text file like this one posted:

snp_id gene chromosome distance_from_gene_center position pop1 pop2 pop3 pop4 pop5 pop6 pop7
rs2129081 RAPT2 3 -129993 "upstream" 0.439009 1.169210 NA 0.233020 0.093042 NA -0.902596
rs1202698 RAPT2 3 -128695 "upstream" NA

1.9.0-devel: _ in read.delim and make.names

2004 Mar 04

In R 1.9.0, make.names will accept "_" as a valid character for a syntactically valid name. I would appreciate having an option in "read.delim" (etc.) that would change "_" in headers of input files to "." for compatibility with code and data written for R 1.8.1 and before. Wolfram Fischer

Pipe delimiter ( | ) in "read.delim"

2006 Apr 12

Hi R folks, Can anyone tell me how to read in a pipe ("|") delimited text file? I've tried the following:

read.delim("c:/junk/junk.txt",sep="|", skip=7, check.names=FALSE,quote = "", header=F)

The file looks something like the following:

RD|I|04|013|9997|68103|5|7|017|830|20000221|00:00|12.6||6|||||||||||||
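
As a point of comparison, the sketch below reads the sample record from the post out of a temporary file. It is an illustration under assumptions rather than a diagnosis of the poster's problem, which the excerpt does not state: `skip = 7` is omitted because the temporary file has no preamble lines, and `colClasses = "character"` is an added choice that keeps codes such as `04` and `013` from losing their leading zeros.

```r
# Write the sample record from the post to a temporary file.
path <- tempfile(fileext = ".txt")
writeLines("RD|I|04|013|9997|68103|5|7|017|830|20000221|00:00|12.6||6|||||||||||||",
           path)

# quote = "" disables quote handling; every column is kept as text.
x <- read.delim(path, sep = "|", header = FALSE, quote = "",
                colClasses = "character")
print(x[1, 1:13])
```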

Error with read.delim & read.csv

2007 Nov 15

Hi - I'm reading in a tab-delimited file that is causing issues with read.delim. Specifically, for a certain set of lines, the last entry of the line is misread and considered to be the first entry of a new row (which is then padded with NAs). For example: tmp <- read.delim( "trouble.txt", header=F ) produces a data.frame, tmp, where if I call tmp[,1],
