similar to: bigmemory not really parallel

Displaying 20 results from an estimated 500 matches similar to: "bigmemory not really parallel"

2012 Feb 02
0
bigkmeans not parallel
I'm using bigkmeans in 'biganalytics' to cluster my 60,000 by 600,000 matrix on an 8-core Linux VM. I have registered a parallel backend with > registerDoMC() and checked how many cores were registered with > getDoParWorkers(). It returns 8, which is the number of cores on my machine. I then ran the test below, whose results show improved speed due to parallelism. check
2012 Jan 18
1
kmeans clustering on large but sparse matrix
Hi, I have a 60k*600k matrix, which exceeds the vector length limit of 2^31-1. But it's rather sparse: only 0.02% of the entries have values. So I saved it as a MatrixMarket (mm) file, which is about 300M in size. I use readMM in the Matrix package to read it in. If I do so, the data type becomes dgTMatrix from the 'Matrix' package instead of the common matrix type. The problem is, if I run k-means only on part of
2011 Sep 29
1
efficient coding with foreach and bigmemory
I recently learned about the bigmemory and foreach packages and am trying to use them to help me create a very large matrix. Without those packages, I can create the type of matrix that I want with 10 columns and 5e6 rows. I would like to be able to scale up to 5e9 rows, or more, if possible. I have created a simplified example of what I'm trying to do, below. The first part of the
2010 Aug 11
1
Bigmemory: Error Running Example
Hi, I am trying to run the bigmemory example provided on the http://www.bigmemory.org/ site. The example runs on the "airline data" and generates a summary of the csv files: library(bigmemory) library(biganalytics) x <- read.big.matrix("2005.csv", type="integer", header=TRUE, backingfile="airline.bin", descriptorfile="airline.desc",
2010 Dec 17
1
[Fwd: adding more columns in big.matrix object of bigmemory package]
Hi, With reference to the mail below, I have large datasets, coming from various different sources, which I can read into a filebacked big.matrix using library bigmemory. I want to merge them all into one 'big.matrix' object. (Later, I want to run a regression using library 'biglm'.) I have been trying unsuccessfully to do this for quite some time now. Can you please
2009 Jun 02
2
bigmemory - extracting submatrix from big.matrix object
I am using library(bigmemory) to handle large datasets, say 1 GB, and am facing the following problems. Any hints would be helpful. Problem 1: I am using the "read.big.matrix" function to create a filebacked big matrix of my data and get the following warning: > x = read.big.matrix("/home/utkarsh.s/data.csv",header=T,type="double",shared=T,backingfile
2013 Jul 26
1
variation in k-means results (Alfredo Alvarez)
Good day, I don't know whether I'm using the list correctly; this is my first time. Please correct me if I'm doing it wrong. Regarding your comment, Pedro, thank you very much. What I understand from your set.seed suggestion is that it fixes the results, but I'm not sure whether another clustering might work better. That is, I'm interested in a clustering method that produces the "best" clustering, and since the
2011 Feb 11
1
foreach with registerDoMC on R 2.12.0 OSX 10.6 --- errors and warnings
some hints for the search engines. I just did install.packages("foreach") install.packages("doMC") library(doMC) registerDoMC() library(foreach) > foreach(i = 1:3) %dopar% sqrt(i) The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec(). Break on
2011 Oct 17
2
Foreach (doMC)
Hello, I am trying to run a small example with foreach, but I am having some problems. Here is the code: library(doMC) registerDoMC() zappa = list() frank = list() foreach (i = 1:4) %dopar% { zappa[[i]] = kmeans (iris[-5],4) frank[[i]] = warnings() } The code runs without error. However, zappa and frank remain empty lists. If I use a regular for loop instead, the list will be filled up
2011 Jul 02
5
%dopar% parallel processing experiment
Dear R experts, I am experimenting with multicore processing, so far with pretty disappointing results. Here is my simple example: A <- 100000 randvalues <- abs(rnorm(A)) minfn <- function( x, i ) { log(abs(x))+x^3+i/A+randvalues[i] } ## an arbitrary function ARGV <- commandArgs(trailingOnly=TRUE) if (ARGV[1] == "do-onecore") { library(foreach) discard <-
2012 Feb 18
3
foreach %do% and %dopar%
Hi everyone, I'm working on a script trying to use foreach %dopar% but without success, so I managed to run the code with foreach %do%, and it looks like this: The code is part of an MCMC model for project valuation, returning the most important results (VPN, TIR, EVA, etc.) of the simulation. foreach (simx = NsimT, .combine=cbind, .inorder=FALSE, .verbose=TRUE) %do% { MCPVMPA = MCVAMPA[simx]
2011 Aug 10
4
Clustering Large Applications..sort of
Hello all, I am using the clustering functions in R to work with large masses of binary time series data. However, the clustering functions do not seem able to handle a practical problem of this size. Library 'hclust' is good (though it may be subpar for this size of problem, thus doubly poor for this application) in that I do not want to make assumptions about the number of
2015 Feb 09
2
R CMD check: Uses the superseded package: ‘doSNOW’
Dear list, When I run an R CMD check --as-cran on my package (pROC) I get the following note: > Uses the superseded package: ‘doSNOW’ The fact that it uses the doSNOW package is correct as I have the following example in an .Rd file: > #ifdef windows > if (require(doSNOW)) { > registerDoSNOW(cl <- makeCluster(2, type = "SOCK")) > ci(roc2,
2010 Sep 16
2
parallel computation with plyr 1.2.1
Hi, I have been trying to use the new .parallel argument with the most recent version of plyr [1] to speed up some tasks. I can run the example in the NEWS file [1], and it seems to be working correctly. However, R will only use a single core when I try to apply this same approach with ddply(). 1. http://cran.r-project.org/web/packages/plyr/NEWS Watching my CPUs I see that in both cases
2010 Dec 16
0
adding more columns in big.matrix object of bigmemory package
Hi all, Is there any way I can add more columns to an existing filebacked big.matrix object? In general, I want a way to modify an existing big.matrix object, i.e., add rows/columns, rename colnames, etc. I tried the following: > library(bigmemory) > x = read.big.matrix("test.csv",header=T,type="double",shared=T,backingfile="test
2010 Jan 10
0
problems with bigmemory
Hi all, I am trying to read a large csv file (~11 Gb - ~900,000 columns, 3000 rows) using the read.big.matrix command from the bigmemory package. I am using the following command: x<-read.big.matrix('data.csv', sep=',', header=TRUE, type='char', backingfile='data.bin', descriptorfile='data.desc') When the command starts, everything seems to be fine,
2010 Apr 23
2
bigmemory package woes
I have fairly large data, with matrices of 0.5 to 1.5 GB, so once I need to juggle several of them I need a disk cache. I am trying to use the bigmemory package but am running into problems that are hard to understand: I am getting seg faults, and the machine just hangs. By the way, I work on Red Hat Linux, 64 bit R version 10. The simplest problem is just saving matrices. When I do something like
2011 Jul 04
1
writeLines + foreach/doMC
Hi, I'm processing sequencing data, trying to collapse the locations of each unique sequence and write the results to a file (as storing that in a table would require at least 10GB of memory), so I wrote a function that, given a sequence id, provides the line to be stored: library(doMC) # load library registerDoMC(12) # assign the Number of CPU
2015 Feb 10
1
R CMD check: Uses the superseded package: ‘doSNOW’
Oh, I completely missed that one. It's very neat as it seems to work both on Windows and Unix. Thanks! Xavier On 10/02/15 10:52, Martyn Plummer wrote: > The CRAN package snow is superseded by the parallel package which is > distributed with R since version 2.14.0. Here are the release notes > > \item There is a new package \pkg{parallel}. > > It incorporates (slightly
2012 Jan 12
1
parallel computation in plyr 1.7
Dear all, I have a question regarding the possibility of parallel computation in plyr version 1.7. The help files of the following functions mention the argument '.parallel': ddply, aaply, llply, daply, adply, dlply, alply, ldply, laply However, the help files of the following functions do not mention this argument: ?d_ply, ?aply, ?lply Is it because parallel computation is not