search for: parallelise

Displaying 20 results from an estimated 85 matches for "parallelise".

2013 Feb 07
0
[LLVMdev] Parallel Loop Metadata
Hi Nadav, On 02/07/2013 07:46 PM, Nadav Rotem wrote: > Pekka suggested that we add two kinds of metadata: llvm.loop.parallel (attached to each loop latch) and llvm.mem.parallel (attached to each memory instruction!). I think that the motivation for the first metadata is clear - it says that the loop is data-parallel. I can also see us adding additional metadata such as
2011 Oct 11
2
[LLVMdev] Speculative parallelisation in LLVM compiler infrastructure
Hi, I am involved in the task of achieving speculative parallelisation in LLVM. I have started my work by trying to see if a simple for loop can be parallelised in LLVM. The problem is I want to know how to check whether a program is automatically parallelised when compiled with LLVM, or, if I need to do it explicitly, how I can go about parallelising a for loop using the LLVM compiler infrastructure. How do I check for data dependency between iterations using LLVM? Thanks, Raj
2013 Feb 07
3
[LLVMdev] Parallel Loop Metadata
Hi, I am continuing the discussion about Parallel Loop Metadata from here: http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-February/059168.html and here: http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-February/058999.html Pekka suggested that we add two kinds of metadata: llvm.loop.parallel (attached to each loop latch) and llvm.mem.parallel (attached to each memory instruction!). I think
2006 Aug 30
3
Re: Buying more computer for GLM
...would be the most cost-effective way to speed this up? The obvious way would be to get a machine with a faster processor (3GHz plus) but I wonder whether it might instead be better to run a dual-processor machine or something like that; this looks at least like a problem R should be able to parallelise, though I don't know whether it does. Thanks for your help, George Russell
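A minimal sketch of spreading such repeated GLM fits across cores, using the socket-cluster API from snow that later became R's parallel package; the data frame d, the formula, and the list of row-index sets idx are illustrative names, not from the post:

  library(parallel)
  cl <- makeCluster(4)                      # one worker per core
  clusterExport(cl, "d")                    # ship the data frame to the workers
  fits <- parLapply(cl, idx, function(rows) {
    glm(y ~ x1 + x2, family = binomial, data = d[rows, ])
  })
  stopCluster(cl)

Each fit is independent, so the speed-up approaches the number of cores once the per-fit work is large enough to amortise the communication cost.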
2011 Mar 21
3
[LLVMdev] Contributing to Polly with GSOC 2011
...&b); if (b == 2) { for (i = 0; i < N; i += 2) { body; } } else { for (i = 0; i < N; i += b) { body; } } Now with the transformed code the for loop inside 'if' will be detected as a SCoP and can be parallelised. Since the value of N is 100 most of the time, the overall performance will be improved. Consider another scenario: for (i = 0; i < N; i++) { body; } Suppose using profiling we know that N is always very small, so there won't be much gain from parallelising it. So we have to t...
2005 Jun 07
1
R and MLE
I learned R & MLE in the last few days. It is great! I wrote up my explorations as http://www.mayin.org/ajayshah/KB/R/mle/mle.html I will be most happy if R gurus will look at this and comment on how it can be improved. I have a few specific questions: * Should one use optim() or should one use stats4::mle()? I felt that mle() wasn't adding much value compared with optim, and
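A minimal sketch contrasting the two interfaces on a toy problem (estimating a normal sample's mean and sd; all names are illustrative): stats4::mle() is a thin wrapper around optim(), but it returns an S4 object with coef(), vcov() and confint() methods, which is most of the value it adds:

  library(stats4)
  x   <- rnorm(200, mean = 3, sd = 2)
  nll <- function(mu, sigma) -sum(dnorm(x, mu, sigma, log = TRUE))

  # stats4::mle: the bound keeps sigma positive during the search
  fit.mle <- mle(nll, start = list(mu = 0, sigma = 1),
                 method = "L-BFGS-B", lower = c(-Inf, 1e-6))

  # the same fit via optim directly; par is a plain numeric vector
  fit.opt <- optim(c(mu = 0, sigma = 1),
                   function(p) nll(p[1], p[2]),
                   method = "L-BFGS-B", lower = c(-Inf, 1e-6))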
2011 Jun 12
1
snow package
Hi I am trying to parallelise some code using the snow package and the following lines: cl <- makeSOCKcluster(8) pfunc <- function (x) (if(x <= (-th)) 1 else 0) ###correlation coefficient clusterExport(cl,c("pfunc","th")) cor.c.f <- parApply(cl,tms,c(1,2),FUN=pfunc) The parApply results in the error message: > cor.c.f <- parApply(cl,tms,c(1,2),FUN=pfunc) Error
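The error message is cut off above, but a common snow pitfall fits this pattern: pfunc reads th from the workers' global environment, so the call fails if the clusterExport did not take effect on every node. Passing the threshold as an argument avoids the lookup entirely; a sketch using the post's own names (tms a numeric matrix, th a scalar):

  library(snow)
  cl <- makeSOCKcluster(8)
  pfunc <- function(x, th) if (x <= -th) 1 else 0   # threshold is now a parameter
  cor.c.f <- parApply(cl, tms, c(1, 2), FUN = pfunc, th = th)
  stopCluster(cl)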
2007 Mar 06
2
How to utilise dual cores and multi-processors on WinXP
Hello, I have a question that I was wondering if anyone had a fairly straightforward answer to: what is the quickest and easiest way to take advantage of the extra cores / processors that are now commonplace on modern machines? And how do I do that in Windows? I realise that this is a complex question that is not answered easily, so let me refine it some more. The type of scripts that I'm
2012 Sep 26
0
[LLVMdev] [PATCH / PROPOSAL] bitcode encoding that is ~15% smaller for large bitcode files...
...o a subset of the values), does this reduce the efficiency of a general purpose solution? Does it make more sense than just applying DEFLATE to the bitcode when it's written to the disk? The other advantage of separating the compression from the encoding, of course, is that it's easier to parallelise, as a fairly coarse-grained dataflow model can be used when streaming to and from compressed bitcode. David
2010 Sep 10
0
plyr: version 1.2
plyr is a set of tools for a common set of problems: you need to __split__ up a big data structure into homogeneous pieces, __apply__ a function to each piece and then __combine__ all the results back together. For example, you might want to: * fit the same model to each patient subset of a data frame * quickly calculate summary statistics for each group * perform group-wise transformations
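A minimal split-apply-combine sketch in plyr's terms, assuming a hypothetical data frame df with columns patient, x and y:

  library(plyr)

  # split df by patient, apply lm to each piece, combine into a list
  models <- dlply(df, "patient", function(d) lm(y ~ x, data = d))

  # group-wise summary statistics, combined back into a data frame
  stats <- ddply(df, "patient", summarise, mean.y = mean(y), n = length(y))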
2010 Mar 02
1
Output to sequentially numbered files... also, ideas for running R on Xgrid
Hello, I have some code to run on an XGrid cluster. Currently the code is written as a single, large job... this is no good for trying to run in parallel. To break it up I have basically taken out the highest level for-loop and am planning on batch-running many jobs, each one representing an instance of the removed loop. However, when it comes to output I am stuck. Previously the output was
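One common approach is to key each job's output file on a task index; a minimal sketch, where the environment variable TASK_ID and the function run.one.iteration are illustrative stand-ins for however Xgrid numbers its jobs and for the body of the removed loop:

  task <- as.integer(Sys.getenv("TASK_ID", "1"))        # per-job index from the scheduler
  res  <- run.one.iteration(task)                       # hypothetical per-task computation
  save(res, file = sprintf("output_%04d.RData", task))  # e.g. output_0042.RData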
2011 Jan 09
0
[LLVMdev] Proposal: Generic auto-vectorization and parallelization approach for LLVM and Polly
On 01/08/2011 07:34 PM, Renato Golin wrote: > On 9 January 2011 00:07, Tobias Grosser<grosser at fim.uni-passau.de> wrote: >> Matching the target vector width in our heuristics will obviously give the >> best performance. So to get optimal performance Polly needs to take target >> data into account. > > Indeed! And even if you lack target information, you
2002 Feb 15
2
ext3 fsck question
Hi, After our big ext3 file server crashes, I notice the fsck spends some time replaying the journals (about 5-10 mins for all volumes on the server in question). I guess it must do this should you want to mount the volumes as ext2. My question: is it (theoretically) possible to tell fsck to replay only half-finished transactions and to knock out incomplete ones from the journals, leaving the kernel
2017 Aug 21
4
RISC-V LLVM status update
....org/llvm/status> updated with status, test results etc. ## Next steps and getting involved The plan has always been to work from the MC-layer upwards towards reliable RV32I codegen. This then provides a stable 'core' of the backend where it's easy for further development work to be parallelised, and for others to make contributions. I think we're now at that point. I would really like to avoid setting up a new 'downstream', and to use this opportunity to pull in new people to upstream LLVM development. However collaboration is made rather difficult for now due to the large...
2011 Jan 09
2
[LLVMdev] Proposal: Generic auto-vectorization and parallelization approach for LLVM and Polly
On 9 January 2011 00:07, Tobias Grosser <grosser at fim.uni-passau.de> wrote: > Matching the target vector width in our heuristics will obviously give the > best performance. So to get optimal performance Polly needs to take target > data into account. Indeed! And even if you lack target information, you won't generate wrong code. ;) > Talking about OpenCL. The lowering
2011 Apr 09
1
How do I make this faster?
I was on vacation the last week and wrote some code to run a 500-day correlation between the Nasdaq tracking stock (QQQ) and 191 currency pairs for 500 days. The initial run took 9 hours(!) and I'd like to make it faster. So, I'm including my code below, in hopes that somebody will be able to figure out how to make it faster, either through parallelisation, or by making changes. I've
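Without seeing the code, the usual fix is to vectorise the inner window arithmetic and hand one currency pair to each worker; a sketch using snow (current at the time of the post), where qqq is the benchmark return series and pairs a list of the 191 currency series, both illustrative names:

  library(snow)

  # rolling 500-day correlation of one series against the benchmark
  roll.cor <- function(pair, bench, width = 500) {
    sapply(width:length(bench), function(i)
      cor(bench[(i - width + 1):i], pair[(i - width + 1):i]))
  }

  cl  <- makeSOCKcluster(8)
  out <- parLapply(cl, pairs, roll.cor, bench = qqq)
  stopCluster(cl)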
2023 Mar 14
1
[V2V PATCH v3 5/6] v2v, in-place: introduce --block-driver command line option
...re "make check" might be quite time consuming. (FYI I'm on holiday at the moment, back 1st April) 'make check' runs the test suite and as Laszlo said is reasonably fast (on my machine anyway!). Well, it should be around 5-15 mins. You can add -j4 or -j`nproc` or similar to parallelise the tests. 'make check-valgrind' runs the same tests but with valgrind. This is highly unlikely to affect this patch series which only touches OCaml code. 'make check-slow' runs an extra set of tests that as you might guess are quite slow. I wouldn't bother with this for a s...
2015 Aug 12
2
Proposal/patch: simple parallel LTO code generation
...control flow integrity [1], rather than to optimise the program using whole program visibility). Code generation is embarrassingly parallel in principle, as it can be partitioned at the function granularity level; however, there are practical issues that need to be solved before we can parallelise code generation for LTO. The main issue is that the backend currently makes no effort to be thread safe. This can be overcome by observing that it is unnecessary for the backend to be thread safe if we arrange for each instance of the backend to operate in a different LLVMContext. This is the appr...
1999 Mar 10
3
re: smp in Linux
A question to all you R-gurus: Can R (or S-plus, for that matter) make efficient use of multiple Intel processors running under Linux (within the same PC, not over a net)? With the release of the new 2.2 kernel, this would seem an interesting and cost-efficient way of boosting the computational power of Intel/Linux platforms when using R (or S-plus). Thanks for any wise words, Kenneth