thr3ads.net - similar to: "bigmemory not really parallel"

Displaying 20 results from an estimated 500 matches similar to: "bigmemory not really parallel"

2012 Feb 02

bigkmeans not parallel

I'm using bigkmeans in 'biganalytics' to cluster my 60,000 by 600,000 matrix. I'm using an 8-core Linux VM. I have registered the parallel backend with registerDoMC() and checked how many cores are registered with getDoParWorkers(). It returns 8, which is the number of cores I have on my machine. I then ran the test below, whose results show improved speed due to parallelism. check

2012 Jan 18

kmeans clustering on large but sparse matrix

Hi, I have a 60k*600k matrix, which exceeds the vector length limit of 2^31-1. But it's rather sparse: only 0.02% of the entries have a value. So I saved it as a MatrixMarket (mm) file, which is about 300M in size. I use readMM in the Matrix package to read it in. If I do so, the data type becomes dgTMatrix from the 'Matrix' package instead of the common matrix type. The problem is, if I run k-means only on part of

2011 Sep 29

efficient coding with foreach and bigmemory

I recently learned about the bigmemory and foreach packages and am trying to use them to help me create a very large matrix. Without those packages, I can create the type of matrix that I want with 10 columns and 5e6 rows. I would like to be able to scale up to 5e9 rows, or more, if possible. I have created a simplified example of what I'm trying to do, below. The first part of the

2010 Aug 11

Bigmemory: Error Running Example

Hi, I am trying to run the bigmemory example provided at http://www.bigmemory.org/. The example runs on the "airline data" and generates a summary of the csv files: library(bigmemory) library(biganalytics) x <- read.big.matrix("2005.csv", type="integer", header=TRUE, backingfile="airline.bin", descriptorfile="airline.desc",

2010 Dec 17

[Fwd: adding more columns in big.matrix object of bigmemory package]

Hi, With reference to the mail below, I have large datasets, coming from various sources, which I can read into a filebacked big.matrix using the bigmemory library. I want to merge them all into one 'big.matrix' object. (Later, I want to run a regression using the 'biglm' library.) I have been trying unsuccessfully to do this for quite some time now. Can you please

2009 Jun 02

bigmemory - extracting submatrix from big.matrix object

I am using library(bigmemory) to handle large datasets, say 1 GB, and am facing the following problems. Any hints would be helpful. Problem-1: I am using the "read.big.matrix" function to create a filebacked big matrix of my data and get the following warning: > x = read.big.matrix("/home/utkarsh.s/data.csv",header=T,type="double",shared=T,backingfile

2013 Jul 26

variation in k-means results (Alfredo Alvarez)

Good day, I'm not sure whether I'm using the list correctly; this is my first time. If I'm doing it wrong, please correct me. About your comment, Pedro, thank you very much. What I understand from your set.seed suggestion is that it fixes the results, but I'm not sure whether another clustering would work better. That is, I'm interested in a clustering method that generates the "best" clustering, and since the

2011 Feb 11

foreach with registerDoMC on R 2.12.0 OSX 10.6 --- errors and warnings

some hints for the search engines. I just did install.packages("foreach") install.packages("doMC") library(doMC) registerDoMC() library(foreach) > foreach(i = 1:3) %dopar% sqrt(i) The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec(). Break on

2011 Oct 17

Foreach (doMC)

Hello, I am trying to run a small example with foreach, but I am having some problems. Here is the code: library(doMC) registerDoMC() zappa = list() frank = list() foreach (i = 1:4) %dopar% { zappa[[i]] = kmeans (iris[-5],4) frank[[i]] = warnings() } The code runs without error. However, zappa and frank will be empty lists. If I use a regular for instead, the list will be filled up
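One likely explanation for the empty lists, offered as a hedged sketch rather than the answer given in the thread: with doMC each iteration runs in a forked worker process, so assignments to zappa and frank only change the worker's copy. foreach instead returns the value of each iteration, which can be captured directly. The worker count of 2 is an arbitrary assumption for illustration.

```r
library(doMC)      # also attaches foreach; doMC works on Unix-alikes only
registerDoMC(2)    # assumption: 2 workers, chosen only for illustration

# The value of the braces is returned per iteration and collected into a list,
# so nothing needs to be assigned to an outside variable from within the loop.
zappa <- foreach(i = 1:4) %dopar% {
  kmeans(iris[-5], 4)
}

length(zappa)      # one fitted kmeans object per iteration
```

Assigning the result of foreach() to zappa replaces the in-loop assignment zappa[[i]] = ... from the post.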

2011 Jul 02

%dopar% parallel processing experiment

Dear R experts--- I am experimenting with multicore processing, so far with pretty disappointing results. Here is my simple example: A <- 100000 randvalues <- abs(rnorm(A)) minfn <- function( x, i ) { log(abs(x))+x^3+i/A+randvalues[i] } ## an arbitrary function ARGV <- commandArgs(trailingOnly=TRUE) if (ARGV[1] == "do-onecore") { library(foreach) discard <-

2012 Feb 18

foreach %do% and %dopar%

Hi everyone, I'm working on a script and trying to use foreach %dopar%, but without success, so I managed to run the code with foreach %do% instead, and it looks like this: The code is part of an MCMC model for project valuation, returning the most important results (VPN, TIR, EVA, etc.) of the simulation. foreach (simx = NsimT, .combine=cbind, .inorder=FALSE, .verbose=TRUE) %do% { MCPVMPA = MCVAMPA[simx]

2011 Aug 10

Clustering Large Applications..sort of

Hello all, I am using the clustering functions in R to work with large masses of binary time series data; however, the clustering functions do not seem able to handle a practical problem of this size. The 'hclust' function is good (though it may be subpar for this size of problem, thus doubly poor for this application) in that I do not want to make assumptions about the number of

2015 Feb 09

R CMD check: Uses the superseded package: ‘doSNOW’

Dear list, When I run an R CMD check --as-cran on my package (pROC) I get the following note: > Uses the superseded package: ‘doSNOW’ The fact that it uses the doSNOW package is correct as I have the following example in an .Rd file: > #ifdef windows > if (require(doSNOW)) { > registerDoSNOW(cl <- makeCluster(2, type = "SOCK")) > ci(roc2,

2010 Sep 16

parallel computation with plyr 1.2.1

Hi, I have been trying to use the new .parallel argument with the most recent version of plyr [1] to speed up some tasks. I can run the example in the NEWS file [1], and it seems to be working correctly. However, R will only use a single core when I try to apply this same approach with ddply(). 1. http://cran.r-project.org/web/packages/plyr/NEWS Watching my CPUs I see that in both cases

2010 Dec 16

adding more columns in big.matrix object of bigmemory package

Hi all, Is there any way I can add more columns to an existing filebacked big.matrix object? In general, I want a way to modify an existing big.matrix object, i.e., add rows/columns, rename colnames, etc. I tried the following: > library(bigmemory) > x = read.big.matrix("test.csv",header=T,type="double",shared=T,backingfile="test

2010 Jan 10

problems with bigmemory

Hi all, I am trying to read a large csv file (~11 Gb - ~900,000 columns, 3000 rows) using the read.big.matrix command from the bigmemory package. I am using the following command: x<-read.big.matrix('data.csv', sep=',', header=TRUE, type='char', backingfile='data.bin', descriptorfile='data.desc') When the command starts, everything seems to be fine,

2010 Apr 23

bigmemory package woes

I have pretty big data sizes, like matrices of .5 to 1.5GB, so once I need to juggle several of them I am in need of a disk cache. I am trying to use the bigmemory package but am running into problems that are hard to understand. I am getting seg faults and the machine just hangs. I work, by the way, on Red Hat Linux, 64 bit R version 10. The simplest problem is just saving matrices. When I do something like

2011 Jul 04

writeLines + foreach/doMC

Hi, I'm processing sequencing data, trying to collapse the locations of each unique sequence and write the results to a file (as storing that in a table would require at least 10GB of memory), so I wrote a function that, given a sequence id, provides the needed line to be stored library(doMC) # load library registerDoMC(12) # assign the Number of CPU

2015 Feb 10

R CMD check: Uses the superseded package: ‘doSNOW’

Oh, I completely missed that one. It's very neat as it seems to work both on Windows and Unix. Thanks! Xavier On 10/02/15 10:52, Martyn Plummer wrote: > The CRAN package snow is superseded by the parallel package which is > distributed with R since version 2.14.0. Here are the release notes > > \item There is a new package \pkg{parallel}. > > It incorporates (slightly
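As a rough illustration of the point in the quoted release notes, the sketch below uses only the parallel package that ships with R. Which function the thread actually recommended is not visible in this excerpt, so makeCluster() and parLapply() are an assumption here, as are the two workers and the toy squaring task.

```r
library(parallel)  # distributed with R since 2.14.0, no CRAN install needed

# Socket cluster (the default type), which runs on both Windows and Unix.
cl <- makeCluster(2)   # assumption: 2 workers, chosen only for illustration

# Apply a toy function to each element on the workers.
squares <- parLapply(cl, 1:4, function(i) i^2)

stopCluster(cl)        # always release the worker processes
unlist(squares)
```

Because the cluster is created and stopped explicitly, the same lines should not need an #ifdef for the platform.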

2012 Jan 12

parallel computation in plyr 1.7

Dear all, I have a question regarding the possibility of parallel computation in plyr version 1.7. The help files of the following functions mention the argument '.parallel': ddply, aaply, llply, daply, adply, dlply, alply, ldply, laply. However, the help files of the following functions do not mention this argument: d_ply, a_ply, l_ply. Is it because parallel computation is not
