search for: parallelise

Displaying 20 results from an estimated 85 matches for "parallelise".

2013 Feb 07
0
[LLVMdev] Parallel Loop Metadata
Hi Nadav, On 02/07/2013 07:46 PM, Nadav Rotem wrote: > Pekka suggested that we add two kinds of metadata: llvm.loop.parallel (attached to each loop latch) and llvm.mem.parallel (attached to each memory instruction!). I think that the motivation for the first metadata is clear - it says that the loop is data-parallel. I can also see us adding additional metadata such as
2011 Oct 11
2
[LLVMdev] Speculative parallelisation in LLVM compiler infrastructure
Hi, I am involved in the task of achieving speculative parallelisation in LLVM. I have started my work by trying to see if a simple for loop can be parallelised in LLVM. The problem is I want to know how to check whether a program is automatically parallelised when compiled with LLVM, or, if I need to do it explicitly, how I can go about parallelising a for loop using the LLVM compiler infrastructure. How do I check for data dependency between iterations using LLVM? Thanks, Raj
2013 Feb 07
3
[LLVMdev] Parallel Loop Metadata
Hi, I am continuing the discussion about Parallel Loop Metadata from here: http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-February/059168.html and here: http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-February/058999.html Pekka suggested that we add two kinds of metadata: llvm.loop.parallel (attached to each loop latch) and llvm.mem.parallel (attached to each memory instruction!). I think
2006 Aug 30
3
Re: Buying more computer for GLM
...would be the most cost-effective way to speed this up? The obvious way would be to get a machine with a faster processor (3GHz plus) but I wonder whether it might instead be better to run a dual-processor machine or something like that; this looks at least like a problem R should be able to parallelise, though I don't know whether it does. Thanks for your help, George Russell
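A minimal sketch of spreading such repeated GLM fits across cores, using the socket-cluster API from snow that later became R's parallel package; the data frame d, the formula, and the list of row-index sets idx are illustrative names, not from the post:

  library(parallel)
  cl <- makeCluster(4)                      # one worker per core
  clusterExport(cl, "d")                    # ship the data frame to the workers
  fits <- parLapply(cl, idx, function(rows) {
    glm(y ~ x1 + x2, family = binomial, data = d[rows, ])
  })
  stopCluster(cl)

Each fit is independent, so the speed-up approaches the number of cores once the per-fit work is large enough to amortise the communication cost.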
2011 Mar 21
3
[LLVMdev] Contributing to Polly with GSOC 2011
...&b); if (b == 2) { for (i = 0; i < N; i += 2) { body; } } else { for (i = 0; i < N; i += b) { body; } } Now with the transformed code the for loop inside 'if' will be detected as a SCoP and can be parallelised. Since the value of N is 100 most of the time, the overall performance will be improved. Consider another scenario: for (i = 0; i < N; i++) { body; } Suppose using profiling we know that N is always very small, so there won't be much gain from parallelising it. So we have to t...
2005 Jun 07
1
R and MLE
I learned R & MLE in the last few days. It is great! I wrote up my explorations as http://www.mayin.org/ajayshah/KB/R/mle/mle.html I will be most happy if R gurus will look at this and comment on how it can be improved. I have a few specific questions: * Should one use optim() or should one use stats4::mle()? I felt that mle() wasn't adding much value compared with optim, and
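A minimal sketch contrasting the two interfaces on a toy problem (estimating a normal sample's mean and sd; all names are illustrative): stats4::mle() is a thin wrapper around optim(), but it returns an S4 object with coef(), vcov() and confint() methods, which is most of the value it adds:

  library(stats4)
  x   <- rnorm(200, mean = 3, sd = 2)
  nll <- function(mu, sigma) -sum(dnorm(x, mu, sigma, log = TRUE))

  # stats4::mle: the bound keeps sigma positive during the search
  fit.mle <- mle(nll, start = list(mu = 0, sigma = 1),
                 method = "L-BFGS-B", lower = c(-Inf, 1e-6))

  # the same fit via optim directly; par is a plain numeric vector
  fit.opt <- optim(c(mu = 0, sigma = 1),
                   function(p) nll(p[1], p[2]),
                   method = "L-BFGS-B", lower = c(-Inf, 1e-6))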
2011 Jun 12
1
snow package
Hi I am trying to parallelise some code using the snow package and the following lines: cl <- makeSOCKcluster(8) pfunc <- function (x) (if(x <= (-th)) 1 else 0) ###correlation coefficient clusterExport(cl,c("pfunc","th")) cor.c.f <- parApply(cl,tms,c(1,2),FUN=pfunc) The parApply results in the error message: > cor.c.f <- parApply(cl,tms,c(1,2),FUN=pfunc) Error
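The error message is cut off above, but a common snow pitfall fits this pattern: pfunc reads th from the workers' global environment, so the call fails if the clusterExport did not take effect on every node. Passing the threshold as an argument avoids the lookup entirely; a sketch using the post's own names (tms a numeric matrix, th a scalar):

  library(snow)
  cl <- makeSOCKcluster(8)
  pfunc <- function(x, th) if (x <= -th) 1 else 0   # threshold is now a parameter
  cor.c.f <- parApply(cl, tms, c(1, 2), FUN = pfunc, th = th)
  stopCluster(cl)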
2007 Mar 06
2
How to utilise dual cores and multi-processors on WinXP
Hello, I have a question that I was wondering if anyone had a fairly straightforward answer to: what is the quickest and easiest way to take advantage of the extra cores / processors that are now commonplace on modern machines? And how do I do that in Windows? I realise that this is a complex question that is not answered easily, so let me refine it some more. The type of scripts that I'm
2012 Sep 26
0
[LLVMdev] [PATCH / PROPOSAL] bitcode encoding that is ~15% smaller for large bitcode files...
...o a subset of the values), does this reduce the efficiency of a general purpose solution? Does it make more sense than just applying DEFLATE to the bitcode when it's written to the disk? The other advantage of separating the compression from the encoding, of course, is that it's easier to parallelise, as a fairly coarse-grained dataflow model can be used when streaming to and from compressed bitcode. David
2010 Sep 10
0
plyr: version 1.2
plyr is a set of tools for a common set of problems: you need to __split__ up a big data structure into homogeneous pieces, __apply__ a function to each piece and then __combine__ all the results back together. For example, you might want to: * fit the same model to each patient subset of a data frame * quickly calculate summary statistics for each group * perform group-wise transformations
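A minimal split-apply-combine sketch in plyr's terms, assuming a hypothetical data frame df with columns patient, x and y:

  library(plyr)

  # split df by patient, apply lm to each piece, combine into a list
  models <- dlply(df, "patient", function(d) lm(y ~ x, data = d))

  # group-wise summary statistics, combined back into a data frame
  stats <- ddply(df, "patient", summarise, mean.y = mean(y), n = length(y))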
2010 Mar 02
1
Output to sequentially numbered files... also, ideas for running R on Xgrid
Hello, I have some code to run on an XGrid cluster. Currently the code is written as a single, large job... this is no good for trying to run in parallel. To break it up I have basically taken out the highest level for-loop and am planning on batch-running many jobs, each one representing an instance of the removed loop. However, when it comes to output I am stuck. Previously the output was
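One common approach is to key each job's output file on a task index; a minimal sketch, where the environment variable TASK_ID and the function run.one.iteration are illustrative stand-ins for however Xgrid numbers its jobs and for the body of the removed loop:

  task <- as.integer(Sys.getenv("TASK_ID", "1"))        # per-job index from the scheduler
  res  <- run.one.iteration(task)                       # hypothetical per-task computation
  save(res, file = sprintf("output_%04d.RData", task))  # e.g. output_0042.RData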
2011 Jan 09
0
[LLVMdev] Proposal: Generic auto-vectorization and parallelization approach for LLVM and Polly
On 01/08/2011 07:34 PM, Renato Golin wrote: > On 9 January 2011 00:07, Tobias Grosser<grosser at fim.uni-passau.de> wrote: >> Matching the target vector width in our heuristics will obviously give the >> best performance. So to get optimal performance Polly needs to take target >> data into account. > > Indeed! And even if you lack target information, you
2002 Feb 15
2
ext3 fsck question
Hi, After our big ext3 file server crashes, I notice the fsck spends some time replaying the journals (about 5-10 mins for all volumes on the server in question). I guess it must do this should you want to mount the volumes as ext2. My question: is it (theoretically) possible to tell fsck to replay only half-finished transactions and to knock out incomplete ones from the journals, leaving the kernel
2017 Aug 21
4
RISC-V LLVM status update
....org/llvm/status> updated with status, test results etc. ## Next steps and getting involved The plan has always been to work from the MC-layer upwards towards reliable RV32I codegen. This then provides a stable 'core' of the backend where it's easy for further development work to be parallelised, and for others to make contributions. I think we're now at that point. I would really like to avoid setting up a new 'downstream', and to use this opportunity to pull in new people to upstream LLVM development. However collaboration is made rather difficult for now due to the large...
2011 Jan 09
2
[LLVMdev] Proposal: Generic auto-vectorization and parallelization approach for LLVM and Polly
On 9 January 2011 00:07, Tobias Grosser <grosser at fim.uni-passau.de> wrote: > Matching the target vector width in our heuristics will obviously give the > best performance. So to get optimal performance Polly needs to take target > data into account. Indeed! And even if you lack target information, you won't generate wrong code. ;) > Talking about OpenCL. The lowering
2011 Apr 09
1
How do I make this faster?
I was on vacation the last week and wrote some code to run a 500-day correlation between the Nasdaq tracking stock (QQQ) and 191 currency pairs for 500 days. The initial run took 9 hours(!) and I'd like to make it faster. So, I'm including my code below, in hopes that somebody will be able to figure out how to make it faster, either through parallelisation, or by making changes. I've
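Without seeing the code, the usual fix is to vectorise the inner window arithmetic and hand one currency pair to each worker; a sketch using snow (current at the time of the post), where qqq is the benchmark return series and pairs a list of the 191 currency series, both illustrative names:

  library(snow)

  # rolling 500-day correlation of one series against the benchmark
  roll.cor <- function(pair, bench, width = 500) {
    sapply(width:length(bench), function(i)
      cor(bench[(i - width + 1):i], pair[(i - width + 1):i]))
  }

  cl  <- makeSOCKcluster(8)
  out <- parLapply(cl, pairs, roll.cor, bench = qqq)
  stopCluster(cl)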
2023 Mar 14
1
[V2V PATCH v3 5/6] v2v, in-place: introduce --block-driver command line option
...re "make check" might be quite time consuming. (FYI I'm on holiday at the moment, back 1st April) 'make check' runs the test suite and as Laszlo said is reasonably fast (on my machine anyway!). Well, it should be around 5-15 mins. You can add -j4 or -j`nproc` or similar to parallelise the tests. 'make check-valgrind' runs the same tests but with valgrind. This is highly unlikely to affect this patch series which only touches OCaml code. 'make check-slow' runs an extra set of tests that as you might guess are quite slow. I wouldn't bother with this for a s...
2015 Aug 12
2
Proposal/patch: simple parallel LTO code generation
...control flow integrity [1], rather than to optimise the program using whole program visibility). Code generation is embarrassingly parallel in principle, as it can be partitioned at the function granularity level; however, there are practical issues that need to be solved before we can parallelise code generation for LTO. The main issue is that the backend currently makes no effort to be thread safe. This can be overcome by observing that it is unnecessary for the backend to be thread safe if we arrange for each instance of the backend to operate in a different LLVMContext. This is the appr...
1999 Mar 10
3
re: smp in Linux
A question to all you R-gurus: Can R (or S-plus, for that matter) make efficient use of multiple Intel processors running under Linux (within the same PC, not over a net)? With the release of the new 2.2 kernel, this would seem an interesting and cost-efficient way of boosting the computational power of Intel/Linux platforms when using R (or S-plus). Thanks for any wise words, Kenneth