thr3ads.net - search: "simt"

Displaying 20 results from an estimated 40 matches for "simt".

extracting the t-statistic: just the numbers, please

2004 Jul 29

...quite sure there is an easy answer to this, but I am unsure how to gather a bunch of t-statistics in an organized format. I am trying to generate a list of t-statistics for a randomization routine. If I try to collect a bunch of t-statistics from a run, this is what happens:

> M <- 10 ; simt <- NULL
> for(i in 1:M)
+ {
+ perm<-sample(site,replace=F)
+
+ permute<-cbind(perm, site, a, b, c)
+
+ m<- order(perm)
+
+ m1<-cbind(perm[m], site[m], a[m], b[m], c[m])
+
+ black<-c((m1[1:5,5]),(m1[11:15,5]))
+ #black
+
+ white<-c((m1[6:10,5]),(m1[16:20,5]))
+ #w...

[LLVMdev] RFC: Convergent attribute

2015 May 13

Below is a proposal for a new "convergent" intrinsic attribute and MachineInstr property, needed for correctly modeling many SPMD/SIMT programming models in LLVM. Comments and feedback welcome. —Owen In order to make LLVM more suitable for programming models variously called SPMD and SIMT, we would like to propose a new intrinsic and MachineInstr annotation called "convergent", which will be used to impose certain...

[LLVMdev] Re presenting SIMT programs in LLVM

2009 Oct 12

...ode page using LLVM as a backend. Most of the program transformations currently available in LLVM are equally applicable to PTX programs as well as single threaded programs since the PTX instruction set closely resembles LLVM. However, PTX explicitly deals with single-instruction multiple-thread (SIMT) programs where many hundreds or thousands of threads cooperatively execute a program. All threads begin executing the same program and then can take different paths depending on input data. During execution, threads may synchronize via barriers, votes, and atomic operations which are made visibl...

[LLVMdev] RFC: Convergent attribute

2015 May 14

...ould be very hard to address doesn't mean it isn't a limitation. :) > > Philip > > On 05/13/2015 01:17 PM, Owen Anderson wrote: >> Below is a proposal for a new "convergent" intrinsic attribute and MachineInstr property, needed for correctly modeling many SPMD/SIMT programming models in LLVM. Comments and feedback welcome. >> >> —Owen >> >> >> >> >> >> In order to make LLVM more suitable for programming models variously called SPMD >> and SIMT, we would like to propose a new intrinsic and MachineIns...

[LLVMdev] RFC: Convergent attribute

2015 Aug 14

...rgent. > > Jingyue > > On Wed, May 13, 2015 at 1:17 PM, Owen Anderson <resistor at mac.com> wrote: > Below is a proposal for a new "convergent" intrinsic attribute and MachineInstr property, needed for correctly modeling many SPMD/SIMT programming models in LLVM. Comments and feedback welcome. > > —Owen > > In order to make LLVM more suitable for programming models variously called SPMD > and SIMT, we would like to propose a new intrinsic and MachineInstr annotation > called "conv...

[LLVMdev] RFC: Convergent attribute

2015 Aug 14

...e semantics of convergent. > > Jingyue > > On Wed, May 13, 2015 at 1:17 PM, Owen Anderson <resistor at mac.com> wrote: > >> Below is a proposal for a new "convergent" intrinsic attribute and >> MachineInstr property, needed for correctly modeling many SPMD/SIMT >> programming models in LLVM. Comments and feedback welcome. >> >> —Owen >> >> >> >> >> >> In order to make LLVM more suitable for programming models variously >> called SPMD >> and SIMT, we would like to propose a new intrinsic an...

RFC: (Co-)Convergent functions and uniform function parameters

2016 Oct 24

Hi all, Some brain-storming on an issue with SPMD/SIMT backend support where I think some additional IR attributes would be useful. Sorry for the somewhat long mail; the short version of my current thinking is that I would like to have the following: 1) convergent: a call to a function with this attribute cannot be moved to have additional control...

RFC: (Co-)Convergent functions and uniform function parameters

2016 Oct 24

On 24.10.2016 21:54, Mehdi Amini wrote: >> On Oct 24, 2016, at 12:38 PM, Nicolai Hähnle via llvm-dev <llvm-dev at lists.llvm.org> wrote: >> Some brain-storming on an issue with SPMD/SIMT backend support where I think some additional IR attributes would be useful. Sorry for the somewhat long mail; the short version of my current thinking is that I would like to have the following: >> >> 1) convergent: a call to a function with this attribute cannot be moved to have addit...

[RFC] Refinement of convergent semantics

2015 Sep 22

Hi Jingyue, I consider it a very important element of the design of convergent that it does not require baseline LLVM to contain a definition of uniformity, which would itself pull in a definition of SIMT/SPMD, warps, threads, etc. The intention is that it should be a conservative (but hopefully not too conservative) approximation, and that implementations of specific GPU programming models (CUDA, OpenCL, individual GPU vendors, etc) may layer more permissive semantics on top of it in code that is...

[LLVMdev] Changes to the PTX calling conventions

2011 Dec 14

...ctions, and a new, common > calling convention for kernels? I think this might make sense. One major issue with OpenCL C (and I suppose CUDA) kernels some fail to see is that the functions are "directly callable" (just by choosing the correct calling convention) in general only for SIMT/SPMD-style machines (like NVIDIA and I suppose AMD's GPUs). For the MIMD (with possible SIMD/vector extensions) CPU-architectures you need to transform the kernel function to a "work group function" so it retains its parallel work item semantics whenever the kernel is to be called wi...

RFC: (Co-)Convergent functions and uniform function parameters

2016 Oct 26

...nction arguments in a way that forbids transformations of the form select(call ... , call ...) -> call (select ..., ...), ... But it would be nice to have a clear definition of _why_ those transformations must be forbidden. It's not clear how to do that without pulling in a full model of SIMT-style parallel execution, and admittedly I don't think we have a sane model for _that_ in the first place :-( Something that at least partially addresses the SIMT-style semantics: For every pair (initial state, function inputs) and every call site of the relevant function, keep a log of fun...

RFC: (Co-)Convergent functions and uniform function parameters

2016 Oct 24

...te: > > On 25.10.2016 01:11, Nicolai Hähnle wrote: >> On 24.10.2016 21:54, Mehdi Amini wrote: >>>> On Oct 24, 2016, at 12:38 PM, Nicolai Hähnle via llvm-dev >>>> <llvm-dev at lists.llvm.org> wrote: >>>> Some brain-storming on an issue with SPMD/SIMT backend support where >>>> I think some additional IR attributes would be useful. Sorry for the >>>> somewhat long mail; the short version of my current thinking is that >>>> I would like to have the following: >>>> >>>> 1) convergent: a...

[LLVMdev] Changes to the PTX calling conventions

2011 Dec 14

...'m fine with this. Any core LLVM devs have any issues with this? > > One major issue with OpenCL C (and I suppose CUDA) kernels some > fail to see is that the functions are "directly callable" > (just by choosing the correct calling convention) in general only for > SIMT/SPMD-style machines (like NVIDIA and I suppose AMD's GPUs). > > For the MIMD (with possible SIMD/vector extensions) CPU-architectures > you need to transform the kernel function to a "work group function" > so it retains its parallel work item semantics whenever the kernel...

[LLVMdev] Changes to the PTX calling conventions

2011 Dec 14

...e" kernel functions as they need the special treatment before they can be called (like a C function). BTW what about the other OpenCL data like required_wg_size which affect the possible "kernel treatment" of pocl and can be converted to some special instructions (I suppose) for the SIMT targets? Currently only the TCE target in Clang adds metadata for the required_wg_size kernel attribute (as we need it in "offline compilation") but IMHO that could be useful in general, as a default metadata (to enable its support in pocl for all targets, for example). -- Pekka

RFC: (Co-)Convergent functions and uniform function parameters

2016 Oct 31

...way that forbids transformations of the form select(call ... > , call ...) -> call (select ..., ...), ... > > But it would be nice to have a clear definition of _why_ those > transformations must be forbidden. It's not clear how to do that without > pulling in a full model of SIMT-style parallel execution, and admittedly I > don't think we have a sane model for _that_ in the first place :-( > > Something that at least partially addresses the SIMT-style semantics: For > every pair (initial state, function inputs) and every call site of the > relevant functi...

[LLVMdev] Proposal: pragma for branch divergence

2015 Jan 24

...hread 0 enabled and then bar() with the other 31 threads enabled. Therefore, the run time of the above code will be the run time of foo() + the run time of bar(). More details about branch divergence can be found in the CUDA C programming guide: http://docs.nvidia.com/cuda/cuda-c-programming-guide/#simt-architecture

How branch divergence affects compiler optimizations

Due to CUDA's different execution model, some optimizations in LLVM, such as jump threading, can be unfortunately harmful. The above figure illustrates...

[LLVMdev] Changes to the PTX calling conventions

2011 Dec 14

...as they need the > special treatment before they can be called (like a C function). > > BTW what about the other OpenCL data like required_wg_size which > affect the possible "kernel treatment" of pocl and can be converted to some > special instructions (I suppose) for the SIMT targets? Currently only the > TCE target in Clang adds metadata for the required_wg_size kernel > attribute (as we need it in "offline compilation") but IMHO that could be > useful in general, as a default metadata (to enable its support in pocl > for all targets, for example)...

[LLVMdev] [cfe-dev] Proposal: pragma for branch divergence

2015 Jan 24

...> bar() with the other 31 threads enabled. Therefore, the run time of the > above code will be the run time of foo() + the run time of bar(). More > details about branch divergence can be found in the CUDA C programming > guide: > http://docs.nvidia.com/cuda/cuda-c-programming-guide/#simt-architecture > > How branch divergence affects compiler optimizations > > Due to CUDA's different > execution model, some optimizations in LLVM, such as jump threading, can be > unfortunately harmful. The abov...

[LLVMdev] Changes to the PTX calling conventions

2011 Dec 13

On Tue, Dec 13, 2011 at 3:37 PM, Villmow, Micah <Micah.Villmow at amd.com> wrote: > From: Justin Holewinski [mailto:justin.holewinski at gmail.com] > Sent: Tuesday, December 13, 2011 10:50 AM > To: Villmow, Micah > Cc: LLVM Developers Mailing List > Subject: Re: [LLVMdev] Changes to the PTX calling conventions > > On

[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization

2013 Jan 25

...te such functions is to generate embarrassingly parallel "for-loops" (wiloops) that produce the multi-WI DLP execution. That is, the loop executes the code in the parallel regions for each work item in the work group. This step is needed to make the multi-WI kernel executable on non-SIMD/SIMT platforms (read: CPUs). On the "SPMD-tailored" processors (many GPUs) this step is not always necessary as they can input the single kernel instructions and do the "spreading" on the fly. We have a different method to generate the WG functions for such targets. > Moreover, O...
