similar to: read.delim very slow in reading files with lots of columns

2009 May 14
3
memory usage grows too fast
Hi All, I have a 1000x1000000 matrix. The calculation I would like to do is actually very simple: for each row, calculate the frequency of a given pattern. For example, a toy dataset is as follows. Col1 Col2 Col3 Col4 01 02 02 00 => Freq of "02" is 0.5 02 02 02 01 => Freq of "02" is 0.75 00 02 01 01 ... My code is quite simple as the following to find the pattern "02".
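The poster's own code is cut off in this excerpt, so the following is only a minimal sketch of one way to compute such a row-wise frequency, not the original approach. It assumes the values are held as character strings (so the leading zeros survive); the names m and freq02 are made up for illustration.

```r
# Toy matrix from the post, stored as character so "02" keeps its leading zero.
m <- matrix(c("01", "02", "02", "00",
              "02", "02", "02", "01",
              "00", "02", "01", "01"),
            nrow = 3, byrow = TRUE,
            dimnames = list(NULL, paste0("Col", 1:4)))

# m == "02" is a logical matrix; rowMeans() returns the share of TRUE per row,
# which is the frequency of the pattern in that row.
freq02 <- rowMeans(m == "02")
print(freq02)
```

Note that the comparison builds a full logical matrix of the same dimensions, so for a matrix as large as the one described it may be necessary to apply this to blocks of rows rather than to the whole matrix at once.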
2008 Nov 18
2
sequencially merge multiple files in a folder
Dear all, If the question is too easy, please forgive me since I am only a few weeks old in R. I have worked on this question for a few days and still cannot figure it out. Here I have a folder with more than 50 tab-delimited files. Each file has a few hundred thousand rows/subjects, and the number of columns/variables of each file varies. The 1st row consists of all the variable names. Now I
2009 Sep 04
2
lrm in Design package--missing value where TRUE/FALSE needed
Hi, An error message arose while I was trying to fit an ordinal model with lrm(). I am using R 2.8 with the Design package. Here is a small set of mydata: RC RS Sex CovA CovB CovC CovD CovE 2 1 0 1 1 0 -0.005575280 2 2 1 0 1 0 1 -0.001959580 2 3 0 0 0 1 0 -0.004725880 2 0 0 0 1 0 0 -0.005504850 2 2 1 1 0 0 0 -0.003880170 1 2 1 0 0 1 0 -0.006074230 2 2 1 0 0 1 1 -0.003963920 2 2 1 0 0 1 0
2012 Feb 08
2
Problems reading tab-delim files using read.table and read.delim
Hello, I used read.xlsx to read in Excel files but for large files it turned out to be not very efficient. For that reason I use a programme which writes each sheet in an Excel file into tab-delim txt files. After that I tried using read.table and read.delim to read in those txt files. Unfortunately, the results are not as expected. To show you what I mean I created a tiny Excel sheet with some
2005 Sep 08
1
Wishlist: write.delim()
Hi, It would be great if someone would add write.delim() as an adjunct to write.table(), just as with write.csv(). I store a lot of data in tab-delimited files and can read it in easily with: read.delim("text.txt", as.is=TRUE) and would love to be able to write it out as easily when I create these files. The obvious setting needed for write.delim() is sep = "\t", but in
2014 Jul 01
1
combining data from multiple read.delim() invocations.
Is there a better way to do the following? I have data in a number of tab delimited files. I am using read.delim() to read them, in a loop. I am invoking my code on Linux Fedora 20, from the BASH command line, using Rscript. The code I'm using looks like: arguments <- commandArgs(trailingOnly=TRUE); # initialize the capped_data data.frame capped_data <- data.frame(lpar="NULL",
2010 Jul 28
2
read.delim()
I am reading in a very large file with names in it and R is truncating the number of rows it reads in. The separator in this file is a pipe '|' and so I use dat <- read.delim('pathToMyFile', header= TRUE, sep='|') It turns out that it is reading up to row 61145 and stopping and I think I see why, but am not sure of the best solution to this problem. I see the name of
2008 May 14
2
The try() function with read.delim().
I have written a function which reads data from files using read.delim(). The names of these files are complicated and are built using arguments to the wrapper function. Since the files in question may or may not exist, I thought to enclose the read.delim() call in a try(): file <- <complicated expression> temp <- try(read.delim(file)) if(inherits(temp,"try-error")) {
2009 Sep 18
1
Reading clipboard with read.delim("clipboard") crash (PR#13957)
Full_Name: Liam Gretton Version: 2.9.2 OS: openSUSE 11.1 (x86_64) Submission from: (NULL) (143.210.13.77) Reading a large number of rows of delimited data via the clipboard results in a segfault or double free error. I've tested copying from various applications, but gedit will do. This problem exists in the openSUSE-supplied 2.8.1, I've just built 2.9.2 to see if it's still there,
2012 May 04
2
read.table() vs read.delim() any difference??
Hi, I have a tab-separated file with 206 rows and 30 columns. I read the file into R using the read.table() function. I checked the dim() of the data frame created in R; it had only 103 rows (exactly half) and 30 columns. Then I tried reading in the file using the read.delim() function, and this time dim() showed 206 rows and 30 columns, as expected. Reading the read.table() R-help documentation, I
2012 May 18
1
UTF-16 input and read.delim/scan
Hi all, I am running 64-bit R 2.15.0 on windows 7. I am trying to use read.delim to read from a file that has 2-byte unicode (CJK) characters. Here is an example of the data (it is tab-delimited if that gets messed up): HITId HITTypeId Title 2Q69Z6KW4ZMAGKKFRT6Q4ONO6MJF68 2LVJ1LY58B72OP36GNBHH16YF7RS7Z 看看句子,写写想法 请看以下的句子,再回答问 So read.delim (code below) doesn't read in correctly. It reads
2010 Oct 06
3
Help troubleshooting silent failure reading huge file with read.delim
I am trying to read a tab-delimited 1.25 GB file of 4,115,119 records each with 52 fields. I am using R 2.11.0 on a 64-bit Windows 7 machine with 8 GB memory. I have tried the two following statements with the same results: d <- read.delim(filename, as.is=TRUE) d <- read.delim(filename, as.is=TRUE, nrows=4200000) I have tried starting R with this parameter but that changed
2018 Mar 19
1
Inconsistency, may be bug in read.delim ?
Dear friends, I stumbled upon behaviour of read.delim which I would consider a bug or at least an inconsistency that should be improved upon. Recently we had to work with data that used "", two double quotes, as the symbol to start and end character input. Essentially the data looked like this data.csv ======== V1, V2, V3 ""data"", 3, """" The
2011 May 19
1
Feature request: extend functionality of 'unlist()' by args 'delim=c("/", "_", etc.)' and 'keep.special=TRUE/FALSE'
Dear list, I hope this is the right place to post a feature request. If there exists a more formal channel (e.g. as for bug reports), I'd appreciate a pointer. I work a lot with named nested lists with arbitrary degrees of "nestedness". In order to retrieve the names and/or values of "bottom layer/bottom tier", I love the functionality of 'unlist()', or
2009 Jul 02
1
superimpose multiple scatterplot matrices
Dear R-experts, I am trying to superimpose two or more scatterplot matrices generated by pairs() to visualize the differences between the datasets, but have not been very successful. Two data frames, df1 and df2 for example, each have the same five variables in columns. My goal is to paint each dataset with a color on the same plot panel for each pair of the five variables. That is, in this
2005 Apr 18
2
colClasses = "Date" in read.delim, how to pass date-format?
Hi, I have a huge data set with one column being of type date. Of course I can import the data using this column as "factor" and then convert it later to dates, using: sws.bezuege$FaktDat <- dates(as.character(sws.bezuege$FaktDat), format = c(dates = "d.m.y")) But the conversion requires a huge amount of memory (and time), therefore I would
2009 Jul 13
3
read.delim skips first column (why?)
Hi people, I have a text file like this one posted: snp_id gene chromosome distance_from_gene_center position pop1 pop2 pop3 pop4 pop5 pop6 pop7 rs2129081 RAPT2 3 -129993 "upstream" 0.439009 1.169210 NA 0.233020 0.093042 NA -0.902596 rs1202698 RAPT2 3 -128695 "upstream" NA
2004 Mar 04
1
1.9.0-devel: _ in read.delim and make.names
In R 1.9.0, make.names will accept "_" as a valid character for a syntactically valid name. I would appreciate to have an option in ``read.delim'' (etc.) that would change "_" in headers of input files to "." for compatibility with code and data written for R 1.8.1 and before. Wolfram Fischer
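No such option is shown in this excerpt, but the renaming asked for can be done after reading. The following is a minimal sketch under that assumption; the sample text txt and the column names in it are hypothetical.

```r
# Hypothetical tab-delimited input whose header contains underscores.
txt <- "snp_id\tgene_name\tpop_1\nrs1\tA\t0.5\n"

# Read as usual; underscores in the header are kept as they are.
dat <- read.delim(text = txt)

# Replace every "_" in the column names with "." to match the older naming.
# fixed = TRUE treats "_" as a literal string rather than a regular expression.
names(dat) <- gsub("_", ".", names(dat), fixed = TRUE)
print(names(dat))
```

Doing the substitution on names() after the read keeps it independent of how the file was parsed, so the same line works with read.table() or read.csv() as well.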
2006 Apr 12
1
Pipe delimiter ( | ) in "read.delim"
Hi R folks, Can anyone tell me how to read in a pipe ("|") delimited text file? I've tried the following: read.delim("c:/junk/junk.txt",sep="|", skip=7, check.names=FALSE,quote = "", header=F) The file looks something like the following: RD|I|04|013|9997|68103|5|7|017|830|20000221|00:00|12.6||6|||||||||||||
2007 Nov 15
0
Error with read.delim & read.csv
Hi - I'm reading in a tab delimited file that is causing issues with read.delim. Specifically, for a specific set of lines, the last entry of the line is misread and considered to be the first entry of a new row (which is then padded with 'NA's' ). Specifically: tmp <- read.delim( "trouble.txt", header=F ) produces a data.frame, tmp where if I call tmp[,1],