similar to: how to get how many lines there are in a file.

Displaying 20 results from an estimated 7000 matches similar to: "how to get how many lines there are in a file."

2013 Oct 04
2
Tab Separated File Reading Error
Hello, I have a seemingly simple problem: a tab-delimited file can't be read in.

> annoTranscripts <- read.table("matched.txt", sep = '\t', stringsAsFactors = FALSE)
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  line 5933 did not have 12 elements

However, all lines do have 12 columns.

> lines <-
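This error usually means scan() saw fewer fields on that line than expected, most often because of a stray quote character or '#' inside a field rather than a genuinely short line. A minimal diagnostic sketch, reusing the file name from the post and assuming tab-delimited data:

    # Find lines whose apparent field count differs from 12
    n <- count.fields("matched.txt", sep = "\t", quote = "", comment.char = "")
    which(n != 12)

    # Disabling quote and comment parsing often makes the phantom short lines disappear
    annoTranscripts <- read.table("matched.txt", sep = "\t",
                                  stringsAsFactors = FALSE,
                                  quote = "", comment.char = "")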
2005 Aug 05
6
Computing sums of the columns of an array
Hi, I have a 5x731 array A, and I want to compute the sums of the columns. Currently I do:

apply(A, 2, sum)

But it turns out this is slow: 70% of my CPU time is spent here, even though there are many complicated steps in my computation. Is there a faster way? Thanks, Martin
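colSums() is the usual answer here: it is implemented in C and avoids the per-column R function calls that apply() makes. A quick sketch on an array of the same shape as in the post:

    A <- matrix(rnorm(5 * 731), nrow = 5)
    s1 <- apply(A, 2, sum)   # calls sum() 731 times from R
    s2 <- colSums(A)         # single vectorised pass in C
    all.equal(s1, s2)        # TRUE, same result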
2011 Feb 08
3
intervals {nlme} lower CI greater than upper CI !!!????
Hi folks... check this out:

> GLU <- lme(gluc ~ rt*cd4 + sex + age + rf + nadir + pharmac + factor(hcv) + factor(hbs) +
+            haartd + hivdur + factor(arv),
+            random = ~ rt | id, na.action = na.omit)
> intervals(GLU)$fixed
                  lower          est.          upper
(Intercept)  67.3467070345  7.362307e+01   7.989944e+01
rt          *0.0148050160*  6.249304e-02   1.101811e-01
cd4
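One quick check, assuming the fitted model GLU from the post is available, is to ask which rows of the interval table are actually inverted; a sketch:

    fi <- intervals(GLU)$fixed
    fi[fi[, "lower"] > fi[, "upper"], ]   # rows where the CI bounds are flipped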
2007 Dec 19
1
unexpected behavior from gzfile and unz
I get unexpected behavior from readLines() and scan() depending on how the file is opened with gzfile or unz. More specifically:

> file <- gzfile("file.gz")
> readLines(file, 1)
[1] "a\tb\tc"
> readLines(file, 1)
[1] "a\tb\tc"
> close(file)

It seems that the stream is rewound between calls to readLines.
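This is documented behavior rather than a rewind: when a connection is not already open, readLines() opens it, reads, and closes it again, so every call starts from the top of the file. Opening the connection explicitly keeps the read position between calls; a sketch:

    con <- gzfile("file.gz", open = "r")  # open once; position is now kept
    readLines(con, 1)   # first line
    readLines(con, 1)   # second line, not a repeat
    close(con)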
2004 Jun 30
2
Slow IO: was [R] naive question
I believe IO in R is slow because of the way it is implemented, not because it has to do some extra work for the user. I compared scan() with the 'what' argument set (which is, AFAIK, the fastest way to read a CSV file) to equivalent C code. It turned out to be 20 - 50 times slower. I can see at least two main reasons why R's IO is so slow (I didn't profile this though): A) it
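For reference, this is the kind of call the comparison describes: giving scan() a 'what' template fixes the column types up front, so no type guessing is done per field. A sketch with a hypothetical three-column file data.csv:

    cols <- scan("data.csv", sep = ",",
                 what = list(id = integer(), x = double(), label = character()))
    df <- as.data.frame(cols)   # list of typed columns -> data frame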
2005 Feb 25
3
Loops and dataframes
Hi, I am experiencing a long delay when using data frames inside loops and was wondering if this is a bug or not. Example code:

> st <- rep(1,100000)
> ed <- rep(2,100000)
> for(i in 1:length(st)) st[i] <- ed[i]      # works fine
> df <- data.frame(start=st, end=ed)
> for(i in 1:dim(df)[1]) df[i,1] <- df[i,2]  # takes forever

R: R 2.0.0 (2004-10-04)
OS: Linux, Fedora Core 2
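This is not a bug: each df[i,1] <- df[i,2] replacement goes through the data frame's [<- method and copies data, so the cost grows with both the number of iterations and the size of the frame. Working on plain vectors and assigning columns back once avoids it; a sketch:

    df$start <- df$end            # vectorised, one pass, no loop at all
    # or, if a loop really is needed, loop over an extracted vector:
    st <- df$start
    ed <- df$end
    for (i in seq_along(st)) st[i] <- ed[i]
    df$start <- st                # write back once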
2003 Nov 10
3
Reading an upper triangular matrix
Hello! I have data in the form of a symmetric distance matrix; in the file I have recorded only the upper triangular part, with diagonal. The matrix is 21x21, and the file has row and col names, and some other information. I am trying to read it with the following code (I tried many variations on it, but all give the same error). The items in the data file are delimited by white space. (Part
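If the row/column labels and other header material are stripped or skipped first (an assumption, since the post's file also carries names), the 231 numbers of a 21x21 upper triangle with diagonal can be scanned and unpacked directly; a sketch with a hypothetical file name:

    vals <- scan("dist.txt")                  # 21*22/2 = 231 numbers, row by row
    m <- matrix(0, 21, 21)
    m[lower.tri(m, diag = TRUE)] <- vals      # column-wise lower = row-wise upper
    m <- m + t(m) - diag(diag(m))             # symmetrise; don't double the diagonal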
2005 Apr 15
5
Pearson corelation and p-value for matrix
Hi, I was trying to evaluate the Pearson correlation and the p-values for an nxm matrix, where each row represents a vector. One way to do it would be to iterate through each row and find its correlation value (and the p-value) with respect to the other rows. Is there some function by which I can use the matrix as input? Ideally, the output would be an nxn matrix, containing the p-values
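cor() already accepts a whole matrix, and the p-values can be derived from the correlations via the usual t statistic. A sketch, with rows as the vectors as in the post (the diagonal is degenerate and comes out as p = 0):

    m <- matrix(rnorm(10 * 50), nrow = 10)   # hypothetical 10 x 50 input
    r <- cor(t(m))                           # n x n Pearson correlations
    k <- ncol(m)                             # observations per vector
    tstat <- r * sqrt((k - 2) / (1 - r^2))   # t statistic for each r
    p <- 2 * pt(-abs(tstat), df = k - 2)     # two-sided p-values, n x n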
2004 May 01
5
skip lines on a connection
Hi, I am looking for an efficient way of skipping big chunks of lines on a connection (not necessarily at the beginning of the file). One way is to use readLines, e.g. readLines(con, 1e6), but a) this incurs the overhead of constructing the returned character vector and b) has a (fairly remote) potential to blow up the memory. Another way would be to use scan(), e.g. scan(con, skip=1e6, nmax=0)
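A compromise between the two is to discard lines in bounded-size chunks, which keeps memory flat and still detects end of file. A sketch of such a helper (the name skip.lines is hypothetical):

    skip.lines <- function(con, n, chunk = 1e5) {
      while (n > 0) {
        got <- length(readLines(con, min(n, chunk)))
        if (got == 0) break            # end of file reached early
        n <- n - got
      }
    }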
2008 Mar 10
1
crossprod is slower than t(AA)%*BB
Dear R developers, The background for this email is that I was helping a PhD student to improve the speed of her R code. I suggested replacing calls like t(AA) %*% BB by crossprod(AA, BB), since I expected this to be faster. The surprising result to me was that this change actually made her code slower.

> ## Examples:
> AA <- matrix(rnorm(3000*1000),3000,1000)
> BB <-
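Whether crossprod() wins depends heavily on the BLAS R is linked against, so the honest first step is timing both forms on the actual sizes; a sketch reproducing the post's setup:

    AA <- matrix(rnorm(3000 * 1000), 3000, 1000)
    BB <- matrix(rnorm(3000 * 1000), 3000, 1000)
    system.time(t(AA) %*% BB)        # explicit transpose, then multiply
    system.time(crossprod(AA, BB))   # should avoid forming t(AA)
    all.equal(t(AA) %*% BB, crossprod(AA, BB))   # same answer either way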
2014 Mar 12
3
Reading text
Hello everyone, I would like to read the text located at http://dl.dropboxusercontent.com/u/9601860/txt.txt. I have tried:

txt <- 'http://dl.dropboxusercontent.com/u/9601860/txt.txt'
r <- scan(txt)
# Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
#   invalid multibyte string at '<ff><fe>M'
r <- read.table(txt, header = FALSE)
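The bytes <ff><fe> at the start of the file are a UTF-16 little-endian byte-order mark, so the failure is an encoding problem, not a download problem. Declaring the encoding on the connection should fix it, assuming the URL is still live; a sketch:

    txt <- 'http://dl.dropboxusercontent.com/u/9601860/txt.txt'
    con <- file(txt, encoding = "UTF-16")   # the BOM tells the decoder the byte order
    r <- readLines(con)
    close(con)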
2005 Jan 24
1
Weighted.mean(x,wt) vs. t(x) %*% wt
What is the difference between the above two operations?
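They differ by the normalisation: weighted.mean() divides by the sum of the weights, while the inner product does not, so they agree only when the weights sum to 1. A small sketch:

    x  <- c(1, 2, 3)
    wt <- c(2, 1, 1)
    weighted.mean(x, wt)             # sum(x * wt) / sum(wt) = 1.75
    drop(t(x) %*% wt)                # sum(x * wt)           = 7
    drop(t(x) %*% (wt / sum(wt)))    # normalised weights reproduce weighted.mean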
2011 Feb 06
1
random interaction effect in lmer
Hi dears, while modeling an interaction random effect in lmer I immediately receive this warning message:

> ldlM4 <- lmer(ldl ~ rt*cd4 + age + rf + pharmac + factor(hcv) +
+               hivdur + (rt:cd4|id), na.action = na.omit, REML = F)
Warning message:
In mer_finalize(ans) : false convergence (8)

I think the matter lies in the syntax, because I systematically receive the same message even when changing the response... PS:
2001 Dec 29
1
Slow 'read.table' in R 1.4.0 (PR#1232)
The 'read.table' function appears to be up to 10X slower in R 1.4.0 than in R 1.3.1 for some of the data sets I read in. I was comparing the source code for the two versions and saw that it was rewritten in R 1.4.0. I think I found what part of the problem might be: comparing the R 1.3.1 and R 1.4.0 code, it appears that a statement is missing in some of the code for R 1.4. This is
2004 Nov 23
2
sorting without order
Hello, In order to increase the performance of a script I'd like to sort very large vectors containing repeated integer values. I'm not interested in having the values sorted, but only grouped. I also need the equivalent of index.return from the standard sort function:

f(c(10,1,10,100,1,10))
=> grouped: c(10,10,10,1,1,100)
   ix:      c(1,3,6,2,5,4)

is there a way
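Ordering by index of first appearance reproduces exactly the grouping and index vector in the example, since order() is stable; a sketch:

    x  <- c(10, 1, 10, 100, 1, 10)
    ix <- order(match(x, unique(x)))   # group by first appearance, original order kept
    x[ix]   # 10 10 10 1 1 100
    ix      # 1 3 6 2 5 4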
2008 Jun 28
2
Parallel R
Hello, The problem I'm working on now requires operating on big matrices. I've noticed that there are some packages that allow running some commands in parallel. I've tried snow and NetWorkSpaces, without much success (they are far slower than the normal functions). My problem is very simple, it doesn't require any communication between parallel tasks; only that it divides
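For a job with no communication between tasks, splitting the matrix into column blocks and farming the blocks out is usually enough; the sketch below uses the parallel package, which ships with R 2.14.0 and later (so after this 2008 post; mclapply forks and runs serially on Windows):

    library(parallel)
    M <- matrix(rnorm(1000 * 1000), 1000)              # stand-in for the big matrix
    blocks <- splitIndices(ncol(M), 4)                 # 4 column chunks
    res <- mclapply(blocks, function(j) colSums(M[, j]), mc.cores = 4)
    out <- unlist(res)                                 # reassemble the results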
2010 Feb 08
5
Fast way to determine number of lines in a file
Hi all, Is there a fast way to determine the number of lines in a file? I'm looking for something like count.lines analogous to count.fields. Hadley -- http://had.co.nz/
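There is no such function in base R, but counting newline bytes in fixed-size raw chunks is fast and memory-flat; a sketch of a count.lines (the name is the poster's wish, not an existing function):

    count.lines <- function(path, chunk = 1e6) {
      con <- file(path, open = "rb")
      on.exit(close(con))
      n <- 0L
      repeat {
        bytes <- readBin(con, "raw", chunk)
        if (length(bytes) == 0) break
        n <- n + sum(bytes == as.raw(10L))   # 0x0a is '\n'
      }
      n
    }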
2008 Apr 17
1
Couldn't (and shouldn't) is.unsorted() be faster?
Hi, Couldn't is.unsorted() bail out immediately here (after comparing the first 2 elements):

> x <- 20000000:1
> system.time(is.unsorted(x), gcFirst=TRUE)
   user  system elapsed
  0.084   0.040   0.124
> x <- 200000000:1
> system.time(is.unsorted(x), gcFirst=TRUE)
   user  system elapsed
  0.772   0.440   1.214

Thanks! H.
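For illustration, the early exit being asked for looks like this in plain R (only a sketch of the idea; an interpreted loop is far slower per element than the internal C code):

    is.unsorted.early <- function(x) {
      for (i in seq_len(length(x) - 1L))
        if (x[i + 1L] < x[i]) return(TRUE)   # first descent: stop immediately
      FALSE
    }
    x <- 20000000:1
    is.unsorted.early(x)   # TRUE after comparing just the first two elements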
2002 Feb 22
1
Summary: read.table on Mac OS X, CARBON vs. DARWIN
Thanks a lot, James!! The problem is fixed. On version 1.4.0 for Mac/darwin (the latest available version for this system), the function read.table (which is called from read.delim etc. too) has the bug you explained. Inserting the line nlines <- nlines+1 after lines <- c(lines, line) removes this bug. M. On Friday, February 22, 2002, at 02:33 PM, james.holtman at convergys.com
2007 Sep 06
2
problems in read.table
Dear R-users, I have encountered the following problem every now and then. But I was dealing with a very small dataset before, so it wasn't a problem (I just edited the dataset in an OpenOffice spreadsheet). This time I have to deal with many large datasets containing commuting flow data. I would appreciate it if anyone could give me a hint or clue to get out of this problem. I have a .dat file
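The excerpt is cut off before the actual error, but for large flat files the usual first aids are the same: check the apparent field counts, disable quote and comment parsing, and declare colClasses so read.table() need not guess types. A sketch with a hypothetical file name and column layout:

    n <- count.fields("flows.dat", quote = "", comment.char = "")
    table(n)                                   # do all rows have the same width?
    dat <- read.table("flows.dat", quote = "", comment.char = "",
                      colClasses = c("character", "character", "numeric"))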