Displaying 20 results from an estimated 500 matches similar to: "bigmemory not really parallel"
2012 Feb 02
0
bigkmeans not parallel
I'm using bigkmeans in 'biganalytics' to cluster my 60,000 by 600,000 matrix.
I'm using an 8-core Linux VM.
I have registered the parallel backend with
> registerDoMC()
and I checked how many cores were registered with
> getDoParWorkers()
It returns 8, which is the number of cores I have on my machine.
And I ran the test below, whose results show improved speed due to
parallelism.
check
2012 Jan 18
1
kmeans clustering on large but sparse matrix
Hi,
I have a 60k*600k matrix, which exceeds the vector length limit of 2^31-1.
But it's rather sparse; only 0.02% of the entries have a value. So I saved it as a
MatrixMarket (mm) file, which is about 300M in size. I use readMM in the Matrix
package to read it in. If I do so, the data type becomes dgTMatrix from the
'Matrix' package instead of the common matrix type.
The problem is, if I run k-means only on part of
2011 Sep 29
1
efficient coding with foreach and bigmemory
I recently learned about the bigmemory and foreach packages and am trying
to use them to help me create a very large matrix. Without those
packages, I can create the type of matrix that I want with 10 columns and
5e6 rows. I would like to be able to scale up to 5e9 rows, or more, if
possible.
I have created a simplified example of what I'm trying to do, below. The
first part of the
2010 Aug 11
1
Bigmemory: Error Running Example
Hi,
I am trying to run the bigmemory example provided at
http://www.bigmemory.org/
The example runs on the "airline data" and generates a summary of the CSV
files:
library(bigmemory)
library(biganalytics)
x <- read.big.matrix("2005.csv", type="integer", header=TRUE,
backingfile="airline.bin",
descriptorfile="airline.desc",
2010 Dec 17
1
[Fwd: adding more columns in big.matrix object of bigmemory package]
Hi,
With reference to the mail below, I have large datasets, coming from various
different sources, which I can read into a filebacked big.matrix using the
bigmemory library. I want to merge them all into one 'big.matrix' object. (Later, I
want to run a regression using the 'biglm' library.)
I have been trying unsuccessfully to do this for quite some time now. Can you
please
2009 Jun 02
2
bigmemory - extracting submatrix from big.matrix object
I am using library(bigmemory) to handle large datasets, say 1 GB,
and am facing the following problems. Any hints would be helpful.
Problem 1:
I am using the "read.big.matrix" function to create a filebacked big matrix
of my data and get the following warning:
> x =
read.big.matrix("/home/utkarsh.s/data.csv",header=T,type="double",shared=T,backingfile
2013 Jul 26
1
variation in k-means results (Alfredo Alvarez)
Good day, I'm not sure whether I'm using the list correctly; this is my first
time. If I'm doing it wrong, please correct me.
Regarding your comment, Pedro, thank you very much. What I understand from your
suggestion of set.seed is that it fixes the results, but I'm not
sure whether another clustering would work better. That is, I'm interested in a
clustering method that generates the "best" clustering, and since the
2011 Feb 11
1
foreach with registerDoMC on R 2.12.0 OSX 10.6 --- errors and warnings
some hints for the search engines.
I just did
install.packages("foreach")
install.packages("doMC")
library(doMC)
registerDoMC()
library(foreach)
> foreach(i = 1:3) %dopar% sqrt(i)
The process has forked and you cannot use this CoreFoundation
functionality safely. You MUST exec().
Break on
2011 Oct 17
2
Foreach (doMC)
Hello,
I am trying to run a small example with foreach, but I am having some
problems. Here is the code:
library(doMC)
registerDoMC()
zappa = list()
frank = list()
foreach (i = 1:4) %dopar% {
  zappa[[i]] = kmeans (iris[-5],4)
  frank[[i]] = warnings()
}
The code runs without error. However, zappa and frank will be empty
lists.
If I use a regular for loop instead, the lists will be filled up
2011 Jul 02
5
%dopar% parallel processing experiment
dear R experts---
I am experimenting with multicore processing, so far with pretty
disappointing results. Here is my simple example:
A <- 100000
randvalues <- abs(rnorm(A))
minfn <- function( x, i ) { log(abs(x))+x^3+i/A+randvalues[i] }  ## an arbitrary function
ARGV <- commandArgs(trailingOnly=TRUE)
if (ARGV[1] == "do-onecore") {
  library(foreach)
  discard <-
2012 Feb 18
3
foreach %do% and %dopar%
Hi everyone,
I'm working on a script and trying to use foreach %dopar%, but without success,
so I managed to run the code with foreach %do%, and it looks like this:
The code is part of an MCMC model for project valuation, returning the most
important results (VPN, TIR, EVA, etc.) of the simulation.
foreach (simx = NsimT, .combine=cbind, .inorder=FALSE, .verbose=TRUE) %do% {
MCPVMPA = MCVAMPA[simx]
2011 Aug 10
4
Clustering Large Applications..sort of
Hello all,
I am using the clustering functions in R in order to work with large
masses of binary time series data, however the clustering functions do not
seem able to fit this size of practical problem. Library 'hclust' is good
(though it may be sub par for this size of problem, thus doubly poor for
this application) in that I do not want to make assumptions about the number
of
2015 Feb 09
2
R CMD check: Uses the superseded package: ‘doSNOW’
Dear list,
When I run an R CMD check --as-cran on my package (pROC) I get the
following note:
> Uses the superseded package: ‘doSNOW’
The fact that it uses the doSNOW package is correct as I have the
following example in an .Rd file:
> #ifdef windows
> if (require(doSNOW)) {
> registerDoSNOW(cl <- makeCluster(2, type = "SOCK"))
> ci(roc2,
2010 Sep 16
2
parallel computation with plyr 1.2.1
Hi,
I have been trying to use the new .parallel argument with the most recent
version of plyr [1] to speed up some tasks. I can run the example in the NEWS
file [1], and it seems to be working correctly. However, R will only use a
single core when I try to apply this same approach with ddply().
1. http://cran.r-project.org/web/packages/plyr/NEWS
Watching my CPUs I see that in both cases
2010 Dec 16
0
adding more columns in big.matrix object of bigmemory package
Hi all,
Is there any way I can add more columns to an existing filebacked big.matrix
object?
In general, I want a way to modify an existing big.matrix object, i.e., add
rows/columns, rename colnames, etc.
I tried the following:
> library(bigmemory)
> x =
read.big.matrix("test.csv",header=T,type="double",shared=T,backingfile="test
2010 Jan 10
0
problems with bigmemory
Hi all,
I am trying to read a large csv file (~11 Gb - ~900,000 columns, 3000
rows) using the read.big.matrix command from the bigmemory package. I
am using the following command:
x<-read.big.matrix('data.csv', sep=',', header=TRUE, type='char',
backingfile='data.bin', descriptorfile='data.desc')
When the command starts, everything seems to be fine,
2010 Apr 23
2
bigmemory package woes
I have pretty big data sizes, like matrices of 0.5 to 1.5 GB, so once I need to
juggle several of them I need a disk cache. I am trying to use the
bigmemory package but am running into problems that are hard to understand. I am
getting seg faults and the machine just hangs. By the way, I work on Red Hat
Linux, 64 bit R version 10.
The simplest problem is just saving matrices. When I do something like
2011 Jul 04
1
writeLines + foreach/doMC
Hi
I'm processing sequencing data, trying to collapse the locations of each
unique sequence and write the results to a file (as storing that in a table
would require at least 10 GB of memory),
so I wrote a function that, given a sequence id, provides the line that needs to
be stored
library(doMC) # load library
registerDoMC(12) # assign the Number of CPU
2015 Feb 10
1
R CMD check: Uses the superseded package: ‘doSNOW’
Oh, I completely missed that one.
It's very neat as it seems to work both on Windows and Unix.
Thanks!
Xavier
On 10/02/15 10:52, Martyn Plummer wrote:
> The CRAN package snow is superseded by the parallel package which is
> distributed with R since version 2.14.0. Here are the release notes
>
> \item There is a new package \pkg{parallel}.
>
> It incorporates (slightly
2012 Jan 12
1
parallel computation in plyr 1.7
Dear all,
I have a question regarding the possibility of parallel computation in plyr
version 1.7.
The help files of the following functions mention the argument '.parallel':
ddply, aaply, llply, daply, adply, dlply, alply, ldply, laply
However, the help files of the following functions do not mention this
argument: d_ply, a_ply, l_ply
Is it because parallel computation is not