thr3ads.net - similar to: "[RFC] Introducing convergence control bundles and intrinsics"

[RFC] Introducing convergence control bundles and intrinsics

2020 Aug 17

Hi Hal,

On Mon, Aug 17, 2020 at 2:13 AM Hal Finkel <hfinkel at anl.gov> wrote:
> Thanks for sending this. What do you think that we should do with the
> existing convergent attribute?

My preference, which is implicitly expressed in the review, is to use `convergent` both for the new and the old thing. They are implicitly distinguished via the "convergencectrl" operand

[RFC] Introducing convergence control bundles and intrinsics

2020 Aug 17

On Mon, Aug 17, 2020 at 7:14 PM Hal Finkel <hfinkel at anl.gov> wrote:
> On 8/17/20 11:51 AM, Nicolai Hähnle wrote:
> > Hi Hal,
> >
> > On Mon, Aug 17, 2020 at 2:13 AM Hal Finkel <hfinkel at anl.gov> wrote:
> >> Thanks for sending this. What do you think that we should do with the
> >> existing convergent attribute?
> > My preference, which

_mm_lfence in both paths of an if/else are hoisted by SimplifyCFG potentially breaking use as a speculation barrier

2020 Aug 09

Hi Craig,

The review for the similar GPU problem is now up here: https://reviews.llvm.org/D85603 (+ some other patches on the Phabricator stack).

From a pragmatic perspective, the constraints added to program transforms there are sufficient for what you need. You'd produce IR such as:

  %token = call token @llvm.experimental.convergence.anchor()
  br i1 %c, label %then, label %else

_mm_lfence in both paths of an if/else are hoisted by SimplifyCFG potentially breaking use as a speculation barrier

2020 Jul 28

_mm_lfence was originally documented as a load fence. In light of speculative execution vulnerabilities, however, it has started being advertised as a way to prevent speculative execution. The current Intel Software Developer's Manual documents it as "Specifically, LFENCE does not execute until all prior instructions have completed locally, and no later instruction begins execution until LFENCE

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Feb 09

On Sat, Feb 9, 2019 at 4:44 PM Jan Sjodin <jan_sjodin at yahoo.com> wrote:
> > The reason I'm looking for solutions that can work without "scanning the
> > code" or "spooky action at a distance" is that we should have a solution
> > that's easily digestible by folks who are not aware of GPU execution models.
> >
> > The fallback

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Feb 01

On 31.01.19 15:59, Jan Sjodin wrote:
>> > Any transform that re-arranges control flow would potentially have to
>> > know about the properties of ballot(), and the rules with respect to
>> > the CFG (and maybe consider the target) to know where to insert the
>> > intrinsics.
>
>> But the same is true for basically any approach to handling this. In

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Jan 30

On Wed, Jan 30, 2019 at 4:20 PM Jan Sjodin <jan_sjodin at yahoo.com> wrote:
> > >
> > > > for (int i = 0; i < 2; i++) {
> > > >   foo = ballot(true); // ballot 1
> > > >
> > > >   if (threadID /* ID of the thread within a wavefront/warp */ % 2 == 0) continue;
> > > >
> > > >   bar = ballot(true); //

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Jan 31

On Wed, Jan 30, 2019 at 7:20 AM Jan Sjodin via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> > >
> > > > for (int i = 0; i < 2; i++) {
> > > >   foo = ballot(true); // ballot 1
> > > >
> > > >   if (threadID /* ID of the thread within a wavefront/warp */ % 2 == 0) continue;
> > > >
> > > >   bar =

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Jan 30

On Mon, Jan 28, 2019 at 9:09 PM Jan Sjodin <jan_sjodin at yahoo.com> wrote:
>
> > for (int i = 0; i < 2; i++) {
> >   foo = ballot(true); // ballot 1
> >
> >   if (threadID /* ID of the thread within a wavefront/warp */ % 2 == 0) continue;
> >
> >   bar = ballot(true); // ballot 2
> > }
> >
> > versus:
> >
> > int i =
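To make the quoted loop concrete, here is a minimal Python sketch of the ballot results it would produce under one particular execution model. Everything in it is an assumption made for illustration rather than something stated in the thread: `ballot(true)` is taken to return a bitmask of the threads that execute it together, all threads are assumed to reconverge at the loop header on every iteration, and the names `simulate_ballots` and `mask_of` are hypothetical. Whether that reconvergence can be relied on is precisely what the thread is debating.

```python
def mask_of(threads):
    """Build a bitmask with one bit set per participating thread ID."""
    mask = 0
    for t in threads:
        mask |= 1 << t
    return mask


def simulate_ballots(num_threads=4, iterations=2):
    """Return (iteration, label, mask) tuples for the quoted loop.

    Assumption (not from the thread): every thread reconverges at the
    loop header, so ballot 1 always sees the whole wavefront/warp.
    """
    results = []
    for i in range(iterations):
        # ballot 1: all threads are active at the top of the loop body.
        active = list(range(num_threads))
        results.append((i, "ballot 1", mask_of(active)))

        # Threads with an even ID take the `continue` and skip ballot 2.
        active = [t for t in active if t % 2 != 0]
        results.append((i, "ballot 2", mask_of(active)))
    return results


if __name__ == "__main__":
    for iteration, label, mask in simulate_ballots():
        print(iteration, label, format(mask, "04b"))
```

A different reconvergence model would group the threads differently at each ballot and so yield different masks, which is why a control-flow transform that looks neutral for a single thread can still change the program's result.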

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Jan 31

Strongly agree with Mehdi. I am also not really sure what the proposal is at this point, so it's hard to comment further.

> There are a number of questions that I have. Do we need better machine descriptions so that various resources can be considered? Do we need the capability to reason about the machine state for the cross-lane operations to enable more optimizations? Are intrinsics the

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Jan 28

On Fri, Jan 25, 2019 at 3:05 AM Jan Sjodin <jan_sjodin at yahoo.com> wrote:
>
> > for (...) {
> >   ballot();
> >   if (... /* non-uniform */) continue;
> > }
> >
> > into
> >
> > for (...) {
> >   do {
> >     ballot();
> >   } while (... /* non-uniform */);
> > }
>
> I'm not sure if I follow

[PATCH 0/5] btrfs: lz4/lz4hc compression

2012 Jun 23

WARNING: This is not compatible with the previous lz4 patchset. If you're using experimental compression that isn't in mainline kernels, be prepared to backup and restore or decompress before upgrading, and have backups in case it eats data (which appears not to be a problem any more, but has been during development). These patches add lz4 and lz4hc compression

[RFC] IR-level Region Annotations

2017 Jan 18

> On Jan 17, 2017, at 4:36 PM, Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>
> On 01/17/2017 05:36 PM, Wael Yehia via llvm-dev wrote:
>> Hi. Regarding the token approach, I've read some documentation (review D11861, EH in llvm, and Reid and David's presentation) but couldn't answer the following question.
>> Does the intrinsic or the

RFC: (Co-)Convergent functions and uniform function parameters

2016 Oct 24

> On Oct 24, 2016, at 4:15 PM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
>
> On 25.10.2016 01:11, Nicolai Hähnle wrote:
>> On 24.10.2016 21:54, Mehdi Amini wrote:
>>>> On Oct 24, 2016, at 12:38 PM, Nicolai Hähnle via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>> Some brain-storming on an issue with SPMD/SIMT backend

[RFC] IR-level Region Annotations

2017 Jan 17

Hi. Regarding the token approach, I've read some documentation (review D11861, EH in llvm, and Reid and David's presentation) but couldn't answer the following question. Does the intrinsic or the instruction returning a token type object act as a code motion barrier? In other words, does it prevent other operations from being reordered with it? If the answer is no, then does it mean the

constrOptim convergence

2004 Oct 05

Hello, I have a question about the R function constrOptim. From the R help, it says that the return values of "constrOptim" are the same as "optim". For the return value "convergence" of the function "optim", the values should be 0, 1, 10, 51 and 52. See http://www.maths.lth.se/help/R/.R/library/stats/html/optim.html When I use constrOptim, I get

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Jan 24

I don't see how this would fix the continue vs. nested loop problem I explained earlier. That is, how would this prevent turning:

  for (...) {
    ballot();
    if (... /* non-uniform */) continue;
  }

into

  for (...) {
    do {
      ballot();
    } while (... /* non-uniform */);
  }

and vice versa? Note that there's no duplication going on here, and the single-threaded flow of control is

Convergent series

2010 Jul 14

What are some reliable R functions that can compute the value of a convergent series?

David

--
David R. Bickel, PhD
Associate Professor
Ottawa Institute of Systems Biology
Biochem., Micro. and I. Department
Mathematics and Statistics Department
University of Ottawa
451 Smyth Road
Ottawa, Ontario K1H 8M5
http://www.statomics.com
Office Tel: (613) 562-5800 ext. 8670
Office Fax: (613) 562-5185

bug in '...' of constrOptim (PR#14071)

2009 Nov 18

Dear all, There appears to be a bug in how constrOptim handles ... arguments that are supposed to be passed to optim, according to the documentation. This means you can't get the hessian to be returned, for example (so this is a real problem, and not just a question of mistaken documentation). Looking at the code, it appears that a call to the user-defined f includes the ..., when the ...

convergence error code in mixed effects models

2007 Dec 13

Dear All, I want to analyse treatment effects with time series data: I measured e.g. leaf number (five replicate plants) in relation to two soil pH levels after 2, 4, 6 and 8 weeks. I used mixed effects models, but some analyses didn't work. It seems to me that this is a randomly occurring problem, since the same model sometimes works and sometimes does not. An example:

> names(test)
[1] "rep"
