Displaying 20 results from an estimated 40 matches for "gcfirst".
2008 Mar 10
1
crossprod is slower than t(AA)%*%BB
...*% BB by crossprod(AA,BB) since I expected this to be faster. The
surprising result to me was that this change actually made her code
slower.
> ## Examples :
>
> AA <- matrix(rnorm(3000*1000),3000,1000)
> BB <- matrix(rnorm(3000^2),3000,3000)
> system.time(crossprod(AA,BB),gcFirst=TRUE)
user system elapsed
24.58 0.06 24.69
> system.time(t(AA)%*%BB,gcFirst=TRUE)
user system elapsed
23.25 0.04 23.32
>
>
> AA <- matrix(rnorm(2000^2),2000,2000)
> BB <- matrix(rnorm(2000^2),2000,2000)
> system.time(crossprod(AA,BB),gcFirst=TRUE)...
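For reference, crossprod(AA, BB) computes t(AA) %*% BB without materialising the transpose, so whether it wins depends largely on the BLAS R is linked against. A minimal sketch (same matrix sizes as above) for checking the two forms against each other:

set.seed(1)
AA <- matrix(rnorm(3000 * 1000), 3000, 1000)
BB <- matrix(rnorm(3000^2), 3000, 3000)

system.time(C1 <- crossprod(AA, BB), gcFirst = TRUE)  # t(AA) %*% BB without the explicit transpose
system.time(C2 <- t(AA) %*% BB, gcFirst = TRUE)       # forms t(AA) first, then multiplies

all.equal(C1, C2)  # should be TRUE up to numerical tolerance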
2008 Apr 17
1
Couldn't (and shouldn't) is.unsorted() be faster?
Hi,
Couldn't is.unsorted() bail out immediately here (after comparing
the first 2 elements):
> x <- 20000000:1
> system.time(is.unsorted(x), gcFirst=TRUE)
user system elapsed
0.084 0.040 0.124
> x <- 200000000:1
> system.time(is.unsorted(x), gcFirst=TRUE)
user system elapsed
0.772 0.440 1.214
Thanks!
H.
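The point of the question is that a reverse-sorted vector is detectable after the first comparison. is.unsorted() is implemented in C, but a rough R-level sketch of the early-exit idea (ignoring NA handling) looks like this:

# Illustrative only: stop at the first out-of-order pair
unsorted_early <- function(x) {
  for (i in seq_len(length(x) - 1L)) {
    if (x[i + 1L] < x[i]) return(TRUE)  # found a descending pair, no need to look further
  }
  FALSE
}

x <- 20000000:1
unsorted_early(x)  # TRUE after comparing x[1] and x[2]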
2008 Jun 28
2
Parallel R
Hello,
The problem I'm working on now requires operating on big matrices.
I've noticed that there are some packages that allow running some
commands in parallel. I've tried snow and NetWorkSpaces, without much
success (they are far slower than the normal functions).
My problem is very simple: it doesn't require any communication
between parallel tasks; only that it divides
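For an embarrassingly parallel job over a big matrix, one option today is the parallel package (rather than snow or NetWorkSpaces, so treat the details as an assumption): split the column indices into one chunk per worker and process the chunks independently. Note that shipping the matrix to each worker is itself expensive, which is often why the parallel version loses for cheap per-column work.

library(parallel)

M  <- matrix(rnorm(2000 * 2000), 2000, 2000)   # assumed example size
cl <- makeCluster(4)                           # 4 local workers

chunks <- splitIndices(ncol(M), length(cl))    # one block of columns per worker
res <- parLapply(cl, chunks,
                 function(idx, M) colSums(M[, idx, drop = FALSE]), M = M)
stopCluster(cl)

col_sums <- unlist(res)                        # reassemble in column order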
2005 Aug 05
6
Computing sums of the columns of an array
Hi,
I have a 5x731 array A, and I want to compute the sums of the columns.
Currently I do:
apply(A, 2, sum)
But it turns out, this is slow: 70% of my CPU time is spent here, even
though there are many complicated steps in my computation.
Is there a faster way?
Thanks,
Martin
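For plain column sums, the dedicated colSums() is the usual answer: apply(A, 2, sum) loops over columns at the R level, while colSums() does a single pass in C. A small sketch with the dimensions from the post (the difference only becomes visible when the call sits inside a hot loop, as described above):

A <- matrix(rnorm(5 * 731), nrow = 5, ncol = 731)

s1 <- apply(A, 2, sum)   # R-level loop over 731 columns
s2 <- colSums(A)         # single C-level pass

all.equal(s1, s2)        # TRUE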
2005 Feb 25
3
Loops and dataframes
Hi,
I am experiencing a long delay when using dataframes inside loops and was
wondering if this is a bug or not.
Example code:
> st <- rep(1,100000)
> ed <- rep(2,100000)
> for(i in 1:length(st)) st[i] <- ed[i] # works fine
> df <- data.frame(start=st,end=ed)
> for(i in 1:dim(df)[1]) df[i,1] <- df[i,2] # takes forever
R: R 2.0.0 (2004-10-04)
OS: Linux, Fedora Core 2
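Row-by-row replacement in a data.frame dispatches [<-.data.frame on every iteration, which is what makes the second loop crawl; working on plain vectors (or doing the whole assignment in one vectorized step) avoids it. A small sketch:

st <- rep(1, 100000)
ed <- rep(2, 100000)
df <- data.frame(start = st, end = ed)

# vectorized: one assignment instead of 100000 data.frame replacements
df$start <- df$end

# or: pull the columns out, loop over plain vectors, put the result back
s <- df$start; e <- df$end
for (i in seq_along(s)) s[i] <- e[i]
df$start <- s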
2004 Dec 06
6
how to get how many lines there are in a file.
hi all
If I want to get the total number of lines in a big file without reading
the file's content into R as a matrix or data frame, are there any methods
or functions I can use?
thanks in advance.
Regards
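One way to count lines without parsing the file into a matrix or data frame is to stream it in raw chunks and count newline bytes. A sketch, assuming a plain text file whose last line ends with a newline (the file name is a placeholder):

count_lines <- function(path, chunk = 1e6) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  n <- 0
  repeat {
    bytes <- readBin(con, what = "raw", n = chunk)
    if (length(bytes) == 0L) break
    n <- n + sum(bytes == as.raw(10L))   # 10 is the byte value of "\n"
  }
  n
}

# count_lines("big_file.txt")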
2005 Apr 15
5
Pearson correlation and p-value for matrix
Hi,
I was trying to evaluate the Pearson correlation and the p-values for an nxm matrix, where each row represents a vector. One way to do it would be to iterate through each row and find its correlation value (and the p-value) with respect to the other rows. Is there some function by which I can use the matrix as input? Ideally, the output would be an nxn matrix containing the p-values
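Since each row is a vector, cor(t(m)) gives the full n x n Pearson correlation matrix in one call, and the p-values can be derived from the usual t statistic for a correlation coefficient (two-sided test, complete data assumed; the matrix below is made up):

m <- matrix(rnorm(10 * 50), nrow = 10)    # 10 vectors of length 50
k <- ncol(m)                              # observations per vector

r <- cor(t(m))                            # n x n Pearson correlations
t_stat <- r * sqrt(k - 2) / sqrt(1 - r^2) # t statistic for each correlation
p <- 2 * pt(-abs(t_stat), df = k - 2)     # two-sided p-values
diag(p) <- NA                             # self-correlations are not meaningful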
2008 Mar 10
2
write.table with row.names=FALSE unnecessarily slow?
write.table with large data frames takes quite a long time
> system.time({
+ write.table(df, '/tmp/dftest.txt', row.names=FALSE)
+ }, gcFirst=TRUE)
user system elapsed
97.302 1.532 98.837
One reason is that dimnames is always called, causing 'anonymous' row
names to be created as character vectors. Avoiding this in
src/library/utils, along the lines of
Index: write.table.R
============================================...
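Short of patching write.table itself, a rough user-level workaround when no quoting is needed is to paste the rows into strings and write them with writeLines, which never touches the row names (numeric formatting will differ slightly from write.table, so treat this as a sketch):

write_plain <- function(df, file, sep = " ") {
  body <- do.call(paste, c(unname(as.list(df)), sep = sep))  # one string per row
  writeLines(c(paste(names(df), collapse = sep), body), file)
}

# write_plain(df, "/tmp/dftest.txt")   # df as in the timing above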
2005 Jan 24
1
Weighted.mean(x,wt) vs. t(x) %*% wt
What is the difference between the above two operations?
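The two differ by normalisation: t(x) %*% wt is just the weighted sum (returned as a 1 x 1 matrix), while weighted.mean(x, wt) divides that sum by sum(wt). A tiny illustration:

x  <- c(1, 2, 3)
wt <- c(1, 1, 2)

drop(t(x) %*% wt)      # 9    : weighted sum only
weighted.mean(x, wt)   # 2.25 : sum(x * wt) / sum(wt)
sum(x * wt) / sum(wt)  # same value without the matrix machinery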
2005 Jan 20
2
Creating a custom connection to read from multiple files
Hello,
Is it possible to create my own connection that I could use with
read.table or scan? I would like to create a connection that would read
from multiple files in sequence (as if they were concatenated),
possibly with an option to skip the first n lines of each file. I would like
to avoid using platform-specific scripts for that... (currently I invoke
"/bin/cat" from R to create a
2020 Oct 10
1
which() vs. just logical selection in df
...# overestimate
dat <- dat[,1:3] # select just the first 3 columns
head(dat, 10) # print the first 10 rows
# Select using which() as the final step ~ 90ms total time on my macbook air
system.time(
  head(
    dat[which(dat$gender2=="other"), ]),
  gcFirst=TRUE)
# Select skipping which() ~130ms total time
system.time(
  head(
    dat[dat$gender2=="other", ]),
  gcFirst=TRUE)
Now I would think that the second one, without which(), would be more
efficient. However, every time I run these, the first version, with
which(), is more efficient by a...
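Two things are worth keeping in mind when comparing these: single sub-100ms timings are dominated by noise, and which() returns a short integer index while the logical index is full length and has to be checked for NA during subsetting. Repeating the subset inside one system.time() call (dat as above) gives a steadier comparison:

system.time(for (i in 1:50) dat[which(dat$gender2 == "other"), ], gcFirst = TRUE)
system.time(for (i in 1:50) dat[dat$gender2 == "other", ],        gcFirst = TRUE)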
2004 Nov 23
2
sorting without order
Hello,
In order to increase the performance of a script I'd like to sort very large vectors containing repeated integer values.
I'm not interested in having the values sorted, only grouped.
I also need the equivalent of index.return from the standard "sort" function:
f(c(10,1,10,100,1,10))
=>
grouped: c(10,10,10,1,1,100)
ix: c(1,3,6,2,5,4)
is there a way
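Grouping by first appearance (rather than sorting by value) can be done with order() over match(x, unique(x)), using the original positions to break ties; this reproduces both the grouped vector and the index vector from the example:

x <- c(10, 1, 10, 100, 1, 10)

ix <- order(match(x, unique(x)), seq_along(x))  # group ids in order of first appearance
x[ix]  # 10 10 10   1   1 100
ix     #  1  3  6   2   5   4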
2005 May 04
1
Cost of method dispatching: was: when can we expect Prof Tierney's compiled R?
...The guess is based on reading the code and on the following timing on R
level:
> n = 1e6; iA = seq(2,n); x = double(n);
> f1 <- function(x, iA) for (i in iA) x[i] = c(1.0)
> f2 <- function(x, iA) for (i in iA) x = c(1.0)
> last.gc.time = gc.time(TRUE)
> system.time(f1(x, iA), gcFirst=TRUE)
[1] 3.50 0.01 3.52 0.00 0.00
> print(gc.time() - last.gc.time); last.gc.time = gc.time()
[1] 1.25 0.82 1.24 0.00 0.00
> system.time(f2(x, iA), gcFirst=TRUE)
[1] 0.76 0.00 0.77 0.00 0.00
> print(gc.time() - last.gc.time); last.gc.time = gc.time()
[1] 0.25 0.18 0.23 0.00 0.00
f1 and f...
2010 Nov 06
1
Hashing and environments
....wfreqs) <- as.character(1:length(sample.wfreqs))
lex <- new("Lexicon",wfreqs=sample.wfreqs)
words.to.lookup <- trunc(runif(100,min=1,max=1e5))
## look up the words directly from the sample.wfreqs vector
system.time({
for(i in words.to.lookup)
sample.wfreqs[as.character(i)]
},gcFirst=TRUE)
## look up the words through the wfreq() function; time approx the same
system.time({
for(i in words.to.lookup)
wfreq(lex,as.character(i))
},gcFirst=TRUE)
***
I'm guessing that the problem is that the indexing of the wfreqs vector in my wfreq() function is not happening inside the a...
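Subsetting a named vector by name does a linear search over the names, so per-word lookups slow down as the lexicon grows; an environment created with hash = TRUE gives (amortised) constant-time lookup by name. A sketch of the idea, leaving out the Lexicon class:

set.seed(1)
wfreqs <- runif(1e5)
names(wfreqs) <- as.character(seq_along(wfreqs))

# copy the named vector into a hashed environment once
ht <- list2env(as.list(wfreqs), envir = new.env(hash = TRUE, size = length(wfreqs)))

words <- as.character(trunc(runif(100, min = 1, max = 1e5)))

system.time(for (w in words) wfreqs[w],          gcFirst = TRUE)  # linear name search
system.time(for (w in words) get(w, envir = ht), gcFirst = TRUE)  # hashed lookup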
2005 May 08
3
Light-weight data.frame class: was: how to add method to .Primitive function
...ssess the magnitude of the problem.
Thanks,
Vadim
Now the transcript itself:
> # the motivation: subscription of a data.frame is *much* (almost 20
times) slower than that of a list
> # compare
> n = 1e6
> i = seq(n)
>
> x = data.frame(a=seq(n), b=seq(n))
> system.time(x[i,], gcFirst=TRUE)
[1] 1.01 0.14 1.14 0.00 0.00
>
> x = list(a=seq(n), b=seq(n))
> system.time(lapply(x, function(col) col[i]), gcFirst=TRUE)
[1] 0.06 0.00 0.06 0.00 0.00
>
>
> # the solution: define methods for the light-weight data.frame class
> lwdf = function(...) structure(list(...)...
2008 Feb 04
2
maybe a bug in the system.time() function? (PR#10696)
...ive simulations for the testing of a Population Monte
Carlo algorithm. This also involves a study of the CPU times in two different
cases.
What I am trying to measure is the "real" CPU time, the one that is independent
of the %CPU.
I'm using the "system.time" function with gcFirst=TRUE and I realized that all
of the output values (user, system and elapsed) depend on the percentage of the
CPU, meaning that if your program is the only one running on the machine,
system.time() gives you certain values, and if there are many programs running
at the same time, for the exact same...
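One distinction that helps here: in recent versions of R the vector returned by system.time() is named, and the "user"/"sys" components are CPU time charged to the R process, which should be far less sensitive to machine load than the wall-clock "elapsed" component. A small sketch for pulling out just the CPU time:

st <- system.time(for (i in 1:1e6) sqrt(i), gcFirst = TRUE)

st[["user.self"]] + st[["sys.self"]]  # CPU time used by this R process
st[["elapsed"]]                       # wall-clock time, load-dependent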
2006 May 14
1
Suggestion for system.time()
Hi, people. A tiny suggestion for the system.time function.
Could the returned vector have names? These could be like:
c("User", "System", "Elapsed", "Sub.User", "Sub.System")
That would then produce self-documenting output.
--
François Pinard http://pinard.progiciels-bpi.ca
2010 Jun 04
5
R Newbie, please help!
Hello Everyone,
I just started a new job & it requires heavy use of R to analyze datasets.
I have a data.table that looks like this. It is sorted by ID & Date; there
are about 150 different IDs & the dataset spans 3 million rows. The main
columns of concern are ID, date, and totret. What I need to do is to derive
daily returns for each ID from totret, which is simply totret at time
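Without the original data.table, the per-ID daily return is today's totret divided by the previous row's totret within the same ID, minus 1; in base R that is one ave() call over the already-sorted data. A sketch with made-up columns:

dat <- data.frame(
  ID     = rep(c("A", "B"), each = 4),
  Date   = rep(seq(as.Date("2010-01-01"), by = "day", length.out = 4), 2),
  totret = c(100, 101, 103, 102, 50, 51, 50, 52)
)

# lagged ratio within each ID; the first day of each ID has no return (NA)
dat$ret <- ave(dat$totret, dat$ID,
               FUN = function(x) x / c(NA, head(x, -1)) - 1)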
2008 Nov 19
1
more efficient small subsets from moderate vectors?
This creates a named vector of length nx, then repeatedly draws a
single sample from it.
lkup <- function(nx, m=10000L) {
tbl <- seq_len(nx)
names(tbl) <- as.character(tbl)
v <- sample(names(tbl), m, replace=TRUE)
system.time(for(k in v) tbl[k], gcFirst=TRUE)
}
There is an abrupt performance degradation at nx=1000
> lkup(1000)
user system elapsed
0.180 0.000 0.179
> lkup(1001)
user system elapsed
2.444 0.016 2.462
This is because of the heuristic at stringSubscript.c:424, which
switches from a 'naive'...
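The slowdown comes from doing the character lookups one at a time; if the keys are known up front, a single vectorized match() (which hashes the names once) handles all of them in one pass regardless of nx:

nx <- 1001L
tbl <- seq_len(nx)
names(tbl) <- as.character(tbl)
v <- sample(names(tbl), 10000L, replace = TRUE)

system.time(for (k in v) tbl[k],       gcFirst = TRUE)  # repeated single-name lookups
system.time(tbl[match(v, names(tbl))], gcFirst = TRUE)  # one hashed lookup for all keys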
2004 Nov 26
2
sorting a data.frame using a vector
Hi all,
I'm looking for an efficient solution (speed and memory) for the
following problem:
Given
- a data.frame x containing numbers of type double
with nrow(x)>ncol(x) and unique row labels, and
- a character vector y containing the labels in the desired sorted order.
Now, I'd like to sort the rows of the data.frame x w.r.t. the order of
labels in y.
example:
x <- data.frame(c(1:4),c(5:8))
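Matching the target labels against the row names gives the permutation directly, so the reorder is a single indexing step rather than an explicit sort. A sketch that finishes the example (the row labels and y are assumed, since the excerpt stops at the data.frame definition):

x <- data.frame(c(1:4), c(5:8))
rownames(x) <- c("a", "b", "c", "d")   # assumed unique row labels
y <- c("c", "a", "d", "b")             # assumed target order

x[match(y, rownames(x)), ]             # rows of x in the order given by y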