similar to: Help troubleshooting silent failure reading huge file with read.delim


2005 Nov 02
2
RODBC and Excel: Wrong Data Type Assumed on Import
The first column in my Excel sheet has mostly numbers but I need to treat it as character data: > library(RODBC) > channel <- odbcConnectExcel("U:/efg/lab/R/Plasmid/construct list.xls") > plasmid <- sqlFetch(channel,"Sheet1", as.is=TRUE) > odbcClose(channel) > names(plasmid) [1] "Plasmid Number" "Plasmid"
2008 Dec 21
3
Globbing Files in R
Dear all, For example, I want to process a set of files. Typically Perl's idiom would be: __BEGIN__ @files = glob("/mydir/*.txt"); foreach my $file (@files) { # process the file } __END__ What's the R way to do that? - Gundala Viswanath Jakarta - Indonesia
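A minimal sketch of one R counterpart to the Perl idiom above, not taken from the thread itself: `Sys.glob()` expands wildcard patterns in much the same way as Perl's `glob`. The temporary directory and file names are made up so that the example is self-contained.

```r
# Hypothetical scratch directory with a few files (illustration only).
demo_dir <- tempfile("globdemo")
dir.create(demo_dir)
file.create(file.path(demo_dir, c("a.txt", "b.txt", "c.csv")))

# Expand the wildcard pattern into a character vector of matching paths.
files <- Sys.glob(file.path(demo_dir, "*.txt"))

for (f in files) {
  # process the file; here we only print its name
  cat("processing", basename(f), "\n")
}
```

`list.files(demo_dir, pattern = glob2rx("*.txt"), full.names = TRUE)` is an alternative when a regular expression is more convenient than a wildcard.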
2010 Dec 03
2
Replacing a period in a string
Hello, I have a string of the form "12.084.547,17" which I would like R to understand as a number which has "," as the decimal separator. Does anybody know how to do this? Thank you, Felipe Parra
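One possible approach, sketched here under the assumption that "." is only ever a thousands separator and "," only ever the decimal mark: drop the periods, swap the comma for a period, then convert. The variable names are illustrative.

```r
x <- "12.084.547,17"

# fixed = TRUE treats "." literally instead of as a regex wildcard.
no_thousands <- gsub(".", "", x, fixed = TRUE)

# Replace the decimal comma with a period so as.numeric() can parse it.
num <- as.numeric(sub(",", ".", no_thousands, fixed = TRUE))
```

When the values come from a file rather than a single string, the `dec = ","` argument of `read.delim()` and `read.table()` handles the decimal comma at read time, although it does not remove thousands separators.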
2011 Feb 06
2
Fortran and long integers
Hi all, I'm hoping someone more knowledgeable in Fortran than I am can chime in with an opinion. I'm the maintainer of the flashClust package that implements fast hierarchical clustering. The Fortran code fails when the number of clustered objects is larger than about 46300. My guess is that this is because the code uses the following construct: IOFFSET=J+(I-1)*N-(I*(I+1))/2 where N is the
2012 Feb 08
2
Problems reading tab-delim files using read.table and read.delim
Hello, I used read.xlsx to read in Excel files but for large files it turned out to be not very efficient. For that reason I use a programme which writes each sheet in an Excel file into tab-delim txt files. After that I tried using read.table and read.delim to read in those txt files. Unfortunately, the results are not as expected. To show you what I mean I created a tiny Excel sheet with some
2005 Sep 08
1
Wishlist: write.delim()
Hi, It would be great if someone would add write.delim() as an adjunct to write.table(), just as with write.csv(). I store a lot of data in tab-delimited files and can read it in easily with: read.delim("text.txt", as.is=TRUE) and would love to be able to write it out as easily when I create these files. The obvious setting needed for write.delim() is sep = "\t", but in
2010 Jul 28
2
read.delim()
I am reading in a very large file with names in it and R is truncating the number of rows it reads in. The separator in this file is a pipe '|' and so I use dat <- read.delim('pathToMyFile', header= TRUE, sep='|') It turns out that it is reading up to row 61145 and stopping and I think I see why, but am not sure of the best solution to this problem. I see the name of
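The snippet is cut off before the poster names the cause, so the following is only a sketch of one common diagnosis, not necessarily what happened here. `read.delim()` treats `"` as a quote character by default, so an unbalanced quote in a field can make later lines disappear into one long "quoted" value. Passing `quote = ""` disables quote handling, and `count.fields()` helps locate irregular lines. The sample data and file are invented for illustration.

```r
# Made-up pipe-delimited data with a stray double quote in one name.
sample_lines <- c("id|name", "1|Alice", "2|\"Bob", "3|Carol", "4|Dave")
tf <- tempfile(fileext = ".txt")
writeLines(sample_lines, tf)

# Disable quote processing so every physical line becomes one row.
dat <- read.delim(tf, header = TRUE, sep = "|", quote = "")

# Count fields per line (header included) to spot lines that look irregular.
n_fields <- count.fields(tf, sep = "|", quote = "")
```

Comparing `nrow(dat) + 1` with the number of lines in the file is a quick check that nothing was silently dropped.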
2012 May 04
2
read.table() vs read.delim() any difference??
Hi, I have a tab-separated file with 206 rows and 30 columns. I read the file into R using the read.table() function. I checked the dim() of the data frame created in R; it had only 103 rows (exactly half) and 30 columns. Then I tried reading in the file using the read.delim() function, and this time dim() showed 206 rows and 30 columns, as expected. Reading the read.table() R-help documentation, I
2008 May 14
2
The try() function with read.delim().
I have written a function which reads data from files using read.delim(). The names of these files are complicated and are built using arguments to the wrapper function. Since the files in question may or may not exist, I thought to enclose the read.delim() call in a try(): file <- <complicated expression> temp <- try(read.delim(file)) if(inherits(temp,"try-error")) {
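A hedged sketch of one way to guard a read against a missing or unreadable file; it is not the poster's code, which is truncated above. The helper name `read_or_null` is hypothetical. It checks `file.exists()` first and uses `tryCatch()` to turn any remaining read error into `NULL`.

```r
# Hypothetical helper: return a data frame, or NULL if the file cannot be read.
read_or_null <- function(path) {
  if (!file.exists(path)) {
    return(NULL)
  }
  tryCatch(
    read.delim(path),
    error = function(e) NULL
  )
}

# A path that does not exist yields NULL instead of an error.
absent <- read_or_null(file.path(tempdir(), "no_such_file.txt"))

# A small tab-delimited file created on the fly is read normally.
tf <- tempfile(fileext = ".txt")
writeLines(c("a\tb", "1\t2"), tf)
present <- read_or_null(tf)
```

Returning `NULL` keeps the calling code simple: `if (is.null(result))` replaces the `inherits(temp, "try-error")` test.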
2014 Jul 01
1
combining data from multiple read.delim() invocations.
Is there a better way to do the following? I have data in a number of tab delimited files. I am using read.delim() to read them, in a loop. I am invoking my code on Linux Fedora 20, from the BASH command line, using Rscript. The code I'm using looks like: arguments <- commandArgs(trailingOnly=TRUE); # initialize the capped_data data.frame capped_data <- data.frame(lpar="NULL",
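One commonly used alternative to growing a data frame inside a loop, sketched here with invented files and values (only the column name `lpar` comes from the snippet above): read every file into a list with `lapply()`, then bind the pieces once with `do.call(rbind, ...)`.

```r
# Create three small tab-delimited files (illustrative data only).
paths <- vapply(1:3, function(i) {
  p <- tempfile(fileext = ".txt")
  write.table(data.frame(lpar = paste0("L", i), value = i), p,
              sep = "\t", row.names = FALSE, quote = FALSE)
  p
}, character(1))

# Read each file into a list element, keeping strings as character.
pieces <- lapply(paths, read.delim, as.is = TRUE)

# Bind all pieces in a single step instead of rbind-ing inside a loop.
combined <- do.call(rbind, pieces)
```

With Rscript, `paths` could instead come from `commandArgs(trailingOnly = TRUE)`, as in the original post.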
2012 May 18
1
UTF-16 input and read.delim/scan
Hi all, I am running 64-bit R 2.15.0 on windows 7. I am trying to use read.delim to read from a file that has 2-byte unicode (CJK) characters. Here is an example of the data (it is tab-delimited if that gets messed up): HITId HITTypeId Title 2Q69Z6KW4ZMAGKKFRT6Q4ONO6MJF68 2LVJ1LY58B72OP36GNBHH16YF7RS7Z 看看句子,写写想法 请看以下的句子,再回答问 So read.delim (code below) doesn't read in correctly. It reads
2009 Sep 18
1
Reading clipboard with read.delim("clipboard") crash (PR#13957)
Full_Name: Liam Gretton Version: 2.9.2 OS: openSUSE 11.1 (x86_64) Submission from: (NULL) (143.210.13.77) Reading a large number of rows of delimited data via the clipboard results in a segfault or double free error. I've tested copying from various applications, but gedit will do. This problem exists in the openSUSE-supplied 2.8.1, I've just built 2.9.2 to see if it's still there,
2018 Mar 19
1
Inconsistency, may be bug in read.delim ?
Dear friends, I stumbled into behaviour of read.delim which I would consider a bug or at least an inconsistency that should be improved upon. Recently we had to work with data that used "", two double quotes, as the symbol to start and end character input. Essentially the data looked like this data.csv ======== V1, V2, V3 ""data"", 3, """" The
2009 Sep 23
1
read.delim very slow in reading files with lots of columns
Hi, I am trying to read a tab-delimited file into R (Ver. 2.8). The machine I am using is 64-bit Linux with 16 GB. The file is basically a matrix (~600x700000) and as large as 3 GB. read.delim() ran extremely slowly (hours) even with a subset of the file (31 MB with 6x700000). I monitored the memory usage and found it constantly took less than 1% of the 16 GB of memory. Does read.delim()
2005 Apr 18
2
colClasses = "Date" in read.delim, how to pass date-format?
Hi I have a huge data-set with one column being of type date. Of course I can import the data using this column as "factor" and then convert it later to dates, using: sws.bezuege$FaktDat <- dates(as.character(sws.bezuege$FaktDat), format = c(dates = "d.m.y")) But the conversion requires a huge amount of memory (and time), therefore I would
2011 May 19
1
Feature request: extend functionality of 'unlist()' by args 'delim=c("/", "_", etc.)' and 'keep.special=TRUE/FALSE'
Dear list, I hope this is the right place to post a feature request. If there exists a more formal channel (e.g. as for bug reports), I'd appreciate a pointer. I work a lot with named nested lists with arbitrary degrees of "nestedness". In order to retrieve the names and/or values of "bottom layer/bottom tier", I love the functionality of 'unlist()', or
2009 Jul 13
3
read.delim skips first column (why?)
Hi people, I have a text file like this one posted: snp_id gene chromosome distance_from_gene_center position pop1 pop2 pop3 pop4 pop5 pop6 pop7 rs2129081 RAPT2 3 -129993 "upstream" 0.439009 1.169210 NA 0.233020 0.093042 NA -0.902596 rs1202698 RAPT2 3 -128695 "upstream" NA
2004 Oct 06
3
read.delim problem with trailing spaces
I'm trying to read a comma delimited dataset that uses '.' for NA. I found that if the last field on a line was a missing '.' it was not read as NA, but just a '.', and the life variable was made a factor. The data looks like this, income,imr,region,oilexprt,imr80,gnp80,life Afghanistan,75,400.0,4,0,185.0,.,37.5 Algeria,400,86.3,2,1,20.5,1920,50.7
2004 Mar 04
1
1.9.0-devel: _ in read.delim and make.names
In R 1.9.0, make.names will accept "_" as a valid character for a syntactically valid name. I would appreciate to have an option in ``read.delim'' (etc.) that would change "_" in headers of input files to "." for compatibility with code and data written for R 1.8.1 and before. Wolfram Fischer
2006 Apr 12
1
Pipe delimiter ( | ) in "read.delim"
Hi R folks, Can anyone tell me how to read in a pipe ("|") delimited text file? I've tried the following: read.delim("c:/junk/junk.txt",sep="|", skip=7, check.names=FALSE,quote = "", header=F) The file looks something like the following: RD|I|04|013|9997|68103|5|7|017|830|20000221|00:00|12.6||6|||||||||||||