Displaying 20 results from an estimated 7000 matches similar to: "how to get how many lines there are in a file."
2013 Oct 04
2
Tab Separated File Reading Error
Hello,
I have a seemingly simple problem: a tab-delimited file cannot be read in.
> annoTranscripts <- read.table("matched.txt", sep = '\t', stringsAsFactors = FALSE)
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 5933 did not have 12 elements
However, all lines do have 12 columns.
> lines <-
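A common cause (an assumption on my part, not confirmed in the thread) is an unbalanced quote or a '#' character on an earlier line, which makes scan() merge lines so a later line appears short. A minimal diagnostic sketch, reusing the file name from the post:
count.fields("matched.txt", sep = "\t") -> n   # apparent field count per line
table(n)                                       # distribution of field counts
which(n != 12)                                 # candidate problem lines
# if quoting/comment characters were the culprit, disabling both should help:
annoTranscripts <- read.table("matched.txt", sep = "\t", quote = "",
                              comment.char = "", stringsAsFactors = FALSE)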
2005 Aug 05
6
Computing sums of the columns of an array
Hi,
I have a 5x731 array A, and I want to compute the sums of the columns.
Currently I do:
apply(A, 2, sum)
But it turns out, this is slow: 70% of my CPU time is spent here, even
though there are many complicated steps in my computation.
Is there a faster way?
Thanks,
Martin
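The usual answer (standard base R, not quoted from the thread): colSums() is implemented in C and avoids the per-column function calls that apply() makes:
A <- matrix(rnorm(5 * 731), nrow = 5)   # stand-in for the 5x731 array
s1 <- apply(A, 2, sum)
s2 <- colSums(A)                        # vectorized, typically much faster
all.equal(s1, s2)                       # same result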
2011 Feb 08
3
intervals {nlme} lower CI greater than upper CI !!!????
Hi folks...
check this out..
> GLU<-lme(gluc~rt*cd4+sex+age+rf+nadir+pharmac+factor(hcv)+factor(hbs)+
+ haartd+hivdur+factor(arv),
+ random= ~rt|id, na.action=na.omit)
> intervals(GLU)$fixed
               lower           est.          upper
(Intercept)  67.3467070345  7.362307e+01  7.989944e+01
rt            0.0148050160  6.249304e-02  1.101811e-01
cd4
2007 Dec 19
1
unexpected behavior from gzfile and unz
I get unexpected behavior from "readLines()" and
"scan()" depending on how the file is opened with
"gzfile" or "unz". More specifically:
> file <- gzfile("file.gz")
> readLines(file,1)
[1] "a\tb\tc"
> readLines(file,1)
[1] "a\tb\tc"
> close(file)
It seems that the stream is rewound between calls to
readLines.
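This matches documented connection semantics: a connection that has not been explicitly opened is opened and closed again by each readLines() call, so every call starts from the top of the file. A sketch of the usual fix, keeping the names from the post:
file <- gzfile("file.gz")
open(file, "r")        # keep the connection open across calls
readLines(file, 1)     # first line
readLines(file, 1)     # now the second line; the position is preserved
close(file)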
2004 Jun 30
2
Slow IO: was [R] naive question
I believe IO in R is slow because of the way it is implemented, not
because it has to do some extra work for the user.
I compared scan() with the 'what' argument set (which is, AFAIK, the
fastest way to read a CSV file) to equivalent C code. It turned out
to be 20-50 times slower.
I can see at least two main reasons why R's IO is so slow (I didn't
profile this though):
A) it
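For reference, the scan() idiom being benchmarked looks like this (a sketch; the file name and column types are assumptions, not from the post):
# read a two-column CSV with known types in one pass
dat <- scan("data.csv", sep = ",",
            what = list(id = integer(), value = numeric()),
            quiet = TRUE)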
2005 Feb 25
3
Loops and dataframes
Hi,
I am experiencing a long delay when using data frames inside loops and was
wondering if this is a bug or not.
Example code:
> st <- rep(1,100000)
> ed <- rep(2,100000)
> for(i in 1:length(st)) st[i] <- ed[i] # works fine
> df <- data.frame(start=st,end=ed)
> for(i in 1:dim(df)[1]) df[i,1] <- df[i,2] #takes for ever
R: R 2.0.0 (2004-10-04)
OS: Linux, Fedora Core 2
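The standard explanation (not quoted from a reply): each df[i,1] <- df[i,2] goes through the data-frame replacement method and copies the object, so element-wise loops over data frames are effectively quadratic. Extracting the columns, working on plain vectors, and writing back once avoids this:
df <- data.frame(start = rep(1, 100000), end = rep(2, 100000))
start <- df$start                    # plain numeric vectors
ed <- df$end
for (i in seq_along(start)) start[i] <- ed[i]   # fast: no data-frame dispatch
df$start <- start                    # write back in a single step
# (here the whole loop is of course just df$start <- df$end)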
2003 Nov 10
3
Reading an upper triangular matrix
Hello!
I have data in the form of a symmetric distance matrix; in the file I
have recorded only the upper triangular part, with the diagonal. The
matrix is 21x21, and the file has row and column names, and some other
information. I am trying to read it with the following code (I tried
many variations on it, but all give the same error). The items in the
data file are delimited by white space.
(Part
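One base-R approach, with heavy caveats: the file name is hypothetical, the row/column names mentioned in the post would need extra handling, and it assumes the upper triangle (with diagonal) is stored row by row after one header line. Row-wise upper-triangle order equals column-wise lower-triangle order, hence the transpose trick:
n <- 21
vals <- scan("dist.txt", skip = 1)      # hypothetical plain numeric file
m <- matrix(0, n, n)
m[lower.tri(m, diag = TRUE)] <- vals    # column-major fill matches row-wise order
m <- t(m)                               # m now holds the upper triangle
m <- m + t(m) - diag(diag(m))           # symmetrize: copy upper into lower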
2005 Apr 15
5
Pearson correlation and p-value for matrix
Hi,
I was trying to evaluate the Pearson correlation and the p-values for an nxm matrix, where each row represents a vector. One way to do it would be to iterate through each row and find its correlation value (and the p-value) with respect to the other rows. Is there some function by which I can use the matrix as input? Ideally, the output would be an nxn matrix containing the p-values
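cor() already takes a matrix: cor(t(X)) gives the n x n Pearson matrix when the vectors are rows. Base R has no matrix version of the p-value, but it follows from the usual t statistic; a sketch (all names are illustrative):
X <- matrix(rnorm(5 * 20), nrow = 5)       # 5 vectors of length 20
r <- cor(t(X))                             # n x n correlation matrix
m <- ncol(X)                               # observations per vector
tstat <- r * sqrt((m - 2) / (1 - r^2))
p <- 2 * pt(-abs(tstat), df = m - 2)       # two-sided p-values
diag(p) <- NA                              # self-correlation has no p-value
The Hmisc package's rcorr() bundles the same computation, returning the correlations and p-values together.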
2004 May 01
5
skip lines on a connection
Hi,
I am looking for an efficient way of skipping big chunks of lines on a
connection (not necessarily at the beginning of the file). One way is to
use readLines(), e.g. readLines(con, 1e6), but a) this incurs the overhead of
construction of the return char vector and b) has a (fairly remote)
potential to blow up the memory.
Another way would be to use scan(), e.g.
scan(con, skip=1e6, nmax=0)
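One memory-bounded alternative (my sketch, not from the thread): discard lines in fixed-size chunks so no million-element vector is ever built:
con <- file("big.txt", open = "r")       # hypothetical file
left <- 1e6                              # lines still to skip
while (left > 0) {
  chunk <- readLines(con, n = min(left, 10000))
  if (length(chunk) == 0) break          # reached end of file early
  left <- left - length(chunk)
}
rest <- readLines(con)                   # continues after the skipped block
close(con)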
2008 Mar 10
1
crossprod is slower than t(AA)%*%BB
Dear R developers,
The background for this email is that I was helping a PhD student
improve the speed of her R code. I suggested replacing calls like
t(AA) %*% BB with crossprod(AA, BB), since I expected this to be faster. The
surprising result to me was that this change actually made her code
slower.
> ## Examples :
>
> AA <- matrix(rnorm(3000*1000),3000,1000)
> BB <-
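A self-contained way to reproduce the comparison (the first size follows the post; BB's definition is truncated there, so its shape is my assumption, and the timings depend heavily on the BLAS in use):
AA <- matrix(rnorm(3000 * 1000), 3000, 1000)
BB <- matrix(rnorm(3000 * 1000), 3000, 1000)   # assumed shape
system.time(r1 <- t(AA) %*% BB)
system.time(r2 <- crossprod(AA, BB))           # avoids the explicit transpose
all.equal(r1, r2)                              # identical results either way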
2014 Mar 12
3
Reading text
Hello everyone,
I would like to read the text located at
http://dl.dropboxusercontent.com/u/9601860/txt.txt
I have tried
txt <- 'http://dl.dropboxusercontent.com/u/9601860/txt.txt'
r <- scan(txt)
# Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
#   invalid multibyte string at '<ff><fe>M'
r <- read.table(txt, header = FALSE)
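The '<ff><fe>' bytes are a UTF-16 little-endian byte order mark, so the file needs an explicit encoding on the connection. A sketch (the encoding is inferred from the BOM, not stated in the thread, and encoding names can vary slightly by platform):
txt <- 'http://dl.dropboxusercontent.com/u/9601860/txt.txt'
con <- file(txt, encoding = "UTF-16")   # re-encode the stream while reading
r <- readLines(con)
close(con)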
2005 Jan 24
1
Weighted.mean(x,wt) vs. t(x) %*% wt
What is the difference between the above two operations?
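The short answer, from the documented definitions rather than a quoted reply: weighted.mean(x, wt) divides by the total weight, while t(x) %*% wt is the raw inner product (and returns a 1x1 matrix):
x  <- c(1, 2, 3)
wt <- c(1, 1, 2)
weighted.mean(x, wt)           # sum(x * wt) / sum(wt) = 2.25
drop(t(x) %*% wt)              # sum(x * wt)           = 9
drop(t(x) %*% wt) / sum(wt)    # matches weighted.mean = 2.25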
2011 Feb 06
1
random interaction effect in lmer
Hi all, while modeling a random interaction effect in lmer I immediately receive the
following message:
> ldlM4<-lmer(ldl~rt*cd4+age+rf+pharmac+factor(hcv)+
+ hivdur+(rt:cd4|id),na.action=na.omit,REML=F)
Warning message:
In mer_finalize(ans) : false convergence (8)
I think the matter lies in the syntax, because I systematically receive the same
message even when changing the response...
PS:
2001 Dec 29
1
Slow 'read.table' in R 1.4.0 (PR#1232)
The 'read.table' function appears to be up to 10X slower in R 1.4.0 than in R
1.3.1 for some of the data sets I read in. I was comparing the source code
for the two versions and see that it was rewritten in R 1.4.0.
I think I found what part of the problem might be. I was comparing the
R 1.3.1 and R 1.4.0 code, and it appears that a statement is missing in some
of the code for R 1.4. This is
2004 Nov 23
2
sorting without order
Hello,
In order to increase the performance of a script, I'd like to sort very large vectors containing repeated integer values.
I'm not interested in having the values sorted, but only grouped.
I also need the equivalent of index.return from the standard "sort" function:
f(c(10,1,10,100,1,10))
=>
grouped: c(10,10,10,1,1,100)
ix: c(1,3,6,2,5,4)
is there a way
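One base-R way to get exactly that output (my sketch, not a reply from the thread): order() on match() keys groups values by first appearance and yields the index vector directly:
x <- c(10, 1, 10, 100, 1, 10)
ix <- order(match(x, unique(x)))   # stable: groups in first-appearance order
x[ix]                              # 10 10 10   1   1 100
ix                                 #  1  3  6   2   5   4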
2008 Jun 28
2
Parallel R
Hello,
The problem I'm working on now requires operating on big matrices.
I've noticed that there are some packages that allow running
commands in parallel. I've tried snow and NetWorkSpaces, without much
success (they are far slower than the normal functions).
My problem is very simple: it doesn't require any communication
between parallel tasks; only that it divides
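For a task with no inter-worker communication, the snow idiom of that era looks roughly like this (a sketch; the cluster size and worker function are assumptions). For cheap per-element work, the cost of shipping the matrix to the workers easily exceeds the computation itself, which would explain the observed slowdown:
library(snow)
cl <- makeCluster(4, type = "SOCK")   # four local workers
M <- matrix(rnorm(1000 * 1000), 1000)
res <- parApply(cl, M, 1, sum)        # rows split across the workers
stopCluster(cl)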
2010 Feb 08
5
Fast way to determine number of lines in a file
Hi all,
Is there a fast way to determine the number of lines in a file? I'm
looking for something like count.lines analogous to count.fields.
Hadley
--
http://had.co.nz/
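One common base-R answer (not necessarily the one the thread settled on): read in fixed-size chunks so memory stays bounded, and count as you go:
count.lines <- function(path, chunk = 65536L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  n <- 0L
  repeat {
    got <- length(readLines(con, n = chunk))
    if (got == 0L) break               # end of file
    n <- n + got
  }
  n
}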
2008 Apr 17
1
Couldn't (and shouldn't) is.unsorted() be faster?
Hi,
Couldn't is.unsorted() bail out immediately here (after comparing
the first 2 elements):
> x <- 20000000:1
> system.time(is.unsorted(x), gcFirst=TRUE)
user system elapsed
0.084 0.040 0.124
> x <- 200000000:1
> system.time(is.unsorted(x), gcFirst=TRUE)
user system elapsed
0.772 0.440 1.214
Thanks!
H.
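For illustration, the early-exit semantics being asked for, written in plain R (the loop itself is slow in R, so the real gain would come from doing this in C; the point is that one inversion is enough to answer):
unsorted_early <- function(x) {
  for (i in seq_len(length(x) - 1L))
    if (x[i] > x[i + 1L]) return(TRUE)   # bail out at the first inversion
  FALSE
}
unsorted_early(20000000:1)               # TRUE after a single comparison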
2002 Feb 22
1
Summary: read.table on Mac OS X, CARBON vs. DARWIN
Thanks a lot, James!!
The problem is fixed. In version 1.4.0 for Mac/darwin (the latest
available version for this system), the function read.table (which is
also called from read.delim etc.) has the bug you explained.
Inserting the line
nlines <- nlines+1
after
lines <- c(lines, line)
removes this bug.
M.
On Friday, February 22, 2002, at 02:33 PM, james.holtman at convergys.com
2007 Sep 06
2
problems in read.table
Dear R-users,
I have encountered the following problem every now and then, but I was
dealing with very small datasets before, so it wasn't a problem (I
just edited the dataset in an OpenOffice spreadsheet). This time I have to
deal with many large datasets containing commuting-flow data. I'd
appreciate it if anyone could give me a hint or clue to get out of this
problem.
I have a .dat file