search for: wavefront

Displaying 20 results from an estimated 36 matches for "wavefront".

2007 Jun 30
0
[idea] wavefront obj+mtl viewing inside cube plugin
...her than developing a separate plugin for each, allowing us to do what we do best (programming) and the artists to do what they do best (create stunning models). This can be accomplished if we write a model viewer that uses the inside of the cube as its output. A common and portable format is the Wavefront OBJ+MTL spec. A few days ago I came across an open source library for parsing these files, as well as an example OpenGL program that displays the textured model. If I base a new plugin on Dennis' gears, it should be trivial to mesh the two (no pun intended :P). I have already compiled and te...
2016 Dec 20
2
Assign different RegClasses to a virtual register based on 'uniform' attribute?
Hi, I am working on a new LLVM target for Intel GPU, which also has same kind of scalar/vector register classes used in AMDGPU target. Like for a i32 virtual register, it will be held in scalar register if its value is uniform across a wavefront/warp, otherwise it will be in a vector register. Does AMDGPU already done this? I read the code, but I didn't figure out how to do this. Anybody has idea on this? - Ruiling
2016 Dec 20
0
Assign different RegClasses to a virtual register based on 'uniform' attribute?
...:09AM +0800, Ruiling Song wrote: > Hi, > > I am working on a new LLVM target for Intel GPU, which also has same kind > of scalar/vector register classes used in AMDGPU target. Like for a i32 > virtual register, it will be held in scalar register if its value is > uniform across a wavefront/warp, otherwise it will be in a vector register. > Does AMDGPU already done this? I read the code, but I didn't figure out how > to do this. Anybody has idea on this? > In the AMDGPU backend we select everything we can to scalar instructions, and then after instruction selection, we...
2016 Dec 21
0
Assign different RegClasses to a virtual register based on 'uniform' attribute?
...> > > > > I am working on a new LLVM target for Intel GPU, which also has same kind > > > of scalar/vector register classes used in AMDGPU target. Like for a i32 > > > virtual register, it will be held in scalar register if its value is > > > uniform across a wavefront/warp, otherwise it will be in a vector register. > > > Does AMDGPU already done this? I read the code, but I didn't figure out how > > > to do this. Anybody has idea on this? > > > > > > > In the AMDGPU backend we select everything we can to scalar > >...
2016 Dec 21
3
Assign different RegClasses to a virtual register based on 'uniform' attribute?
...te: > > Hi, > > > > I am working on a new LLVM target for Intel GPU, which also has same kind > > of scalar/vector register classes used in AMDGPU target. Like for a i32 > > virtual register, it will be held in scalar register if its value is > > uniform across a wavefront/warp, otherwise it will be in a vector register. > > Does AMDGPU already done this? I read the code, but I didn't figure out how > > to do this. Anybody has idea on this? > > > > In the AMDGPU backend we select everything we can to scalar > instructions, and then afte...
2019 Jan 28
2
[RFC] Adding thread group semantics to LangRef (motivated by GPUs)
...p the example a little... in the second snippet, we're supposed to break out of the inner loop if the outer loop's exit condition is true. Here's a more concrete example: for (int i = 0; i < 2; i++) { foo = ballot(true); // ballot 1 if (threadID /* ID of the thread within a wavefront/warp */ % 2 == 0) continue; bar = ballot(true); // ballot 2 } versus: int i = 0; while (true) { do { if (i == 2) break; foo = ballot(true); // ballot 1 i++; } while (threadID % 2 == 0); if (i == 2) break; bar = ballot(true); // ballot 2 i++; } >...
2016 Dec 21
1
Assign different RegClasses to a virtual register based on 'uniform' attribute?
...> > I am working on a new LLVM target for Intel GPU, which also has same kind > > > > of scalar/vector register classes used in AMDGPU target. Like for a i32 > > > > virtual register, it will be held in scalar register if its value is > > > > uniform across a wavefront/warp, otherwise it will be in a vector register. > > > > Does AMDGPU already done this? I read the code, but I didn't figure out how > > > > to do this. Anybody has idea on this? > > > > > > > > > > In the AMDGPU backend we select everything...
2006 Oct 25
3
ISA sound cards and stuff
...rdware. I don't really have anything Plug and Play to test, and no ISA network cards anymore. Anyhow, if anybody does have, it would be nice if (s)he could test if all works OK. The Plug and Play is in kernel, no userland tools should be needed (hopefully). The only sound card missing is Wavefront. The source code for device driver needs some updating (there seem to be a patch floating around somewhere though, haven't tested it yet). The way it is now, it produces unresolved symbols if compiled. I see these packages mostly useful for folks with older laptops (ISA based sound cards...
2020 May 29
2
Dynamically determine the CostPerUse value in the register allocator.
...il), and found that having a dynamic register cost is important to achieve an optimal allocation. Precisely, it is important to limit the number of VGPRs allocated for a kernel/device-function to the smallest value since it will have a direct impact on the occupancy. The occupancy means the number of wavefronts that can be launched at runtime for a kernel program. Some initial thoughts on how to fix it: 1. Have a target interface (a switch) to enable/discard the CostPerUse value. 2. Get the register cost in the same way we define various calling conventions (*CallingConv.td). 3. Compute the Co...
2019 Jan 30
3
[RFC] Adding thread group semantics to LangRef (motivated by GPUs)
On Mon, Jan 28, 2019 at 9:09 PM Jan Sjodin <jan_sjodin at yahoo.com> wrote: > > > for (int i = 0; i < 2; i++) { > > foo = ballot(true); // ballot 1 > > > > if (threadID /* ID of the thread within a wavefront/warp */ % 2 == 0) continue; > > > > bar = ballot(true); // ballot 2 > > } > > > > versus: > > > > int i = 0; > > while (true) { > > do { > > if (i == 2) break; > > foo = ballot(true); // ballot 1 > >...
2019 Jan 31
2
[RFC] Adding thread group semantics to LangRef (motivated by GPUs)
..., Jan 30, 2019 at 7:20 AM Jan Sjodin via llvm-dev < llvm-dev at lists.llvm.org> wrote: > > > > > > > for (int i = 0; i < 2; i++) { > > > > foo = ballot(true); // ballot 1 > > > > > > > > if (threadID /* ID of the thread within a wavefront/warp */ % 2 > == 0) continue; > > > > > > > > bar = ballot(true); // ballot 2 > > > > } > > > > > > > > versus: > > > > > > > > int i = 0; > > > > while (true) { > > > > do { > >...
2020 May 30
2
Dynamically determine the CostPerUse value in the register allocator.
...g a dynamic register cost is important to achieve an > optimal allocation. > > Precisely, it is important to limit the number of VGPRs allocated for a > kernel/device-function to the smallest value since it will have a direct > impact on the occupancy. The occupancy means the number of wavefronts that > can be launched at runtime for a kernel program. > > > > Some initial thoughts on how to fix it: > > 1. Have a target interface (a switch) to enable/discard the CostPerUse > value. > 2. Get the register cost in the same way we define various calling >...
2019 Jan 31
3
[RFC] Adding thread group semantics to LangRef (motivated by GPUs)
...2019 at 7:20 AM Jan Sjodin via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > > > > > > > for (int i = 0; i < 2; i++) { > > > > foo = ballot(true); // ballot 1 > > > > > > > > if (threadID /* ID of the thread within a wavefront/warp */ % 2 > == 0) continue; > > > > > > > > bar = ballot(true); // ballot 2 > > > > } > > > > > > > > versus: > > > > > > > > int i = 0; > > > > while (true) { > > > > do { > >...
2019 Jan 30
2
[RFC] Adding thread group semantics to LangRef (motivated by GPUs)
On Wed, Jan 30, 2019 at 4:20 PM Jan Sjodin <jan_sjodin at yahoo.com> wrote: > > > > > > > for (int i = 0; i < 2; i++) { > > > > foo = ballot(true); // ballot 1 > > > > > > > > if (threadID /* ID of the thread within a wavefront/warp */ % 2 > == 0) continue; > > > > > > > > bar = ballot(true); // ballot 2 > > > > } > > > > > > > > versus: > > > > > > > > int i = 0; > > > > while (true) { > > > > do { > >...
2009 Oct 08
2
[LLVMdev] Instructions that cannot be duplicated
...lement gets a unique id between 0 and 127 - The condition is true if id > 32 - Predication is used on control flow What happens in the original code: The first half of the first hardware thread predicates computation on the first condition, second half executes bar and all threads in the second wavefront execute bar. Both hardware threads hit the barrier and wait for the other hardware thread to reach that point, then continue execution. What happens in the optimized code: first half of the first hardware thread predicates computation on the first condition, the second half executes bar and waits...
2019 Feb 01
2
[RFC] Adding thread group semantics to LangRef (motivated by GPUs)
...yahoo.com > <mailto:jan_sjodin at yahoo.com>> wrote: > > > > > > > > for (int i = 0; i < 2; i++) { > > > > foo = ballot(true); // ballot 1 > > > > > > > > if (threadID /* ID of the thread within a wavefront/warp */ % 2 == 0) continue; > > > > > > > > bar = ballot(true); // ballot 2 > > > > } > > > > > > > > versus: > > > > > > > > int i = 0; > > > > while (true) { >...
2009 Oct 08
0
[LLVMdev] Instructions that cannot be duplicated
...id between 0 and 127 > - The condition is true if id > 32 > - Predication is used on control flow > What happens in the original code: > The first half of the first hardware thread predicates computation on the first condition, second half executes bar and all threads in the second wavefront execute bar. Both hardware threads hit the barrier and wait for the other hardware thread to reach that point, then continue execution. > > What happens in the optimized code: > first half of the first hardware thread predicates computation on the first condition, the second half executes...
2019 Feb 09
1
[RFC] Adding thread group semantics to LangRef (motivated by GPUs)
...jan_sjodin at yahoo.com>> wrote: > > > > > > > > > > > > for (int i = 0; i < 2; i++) { > > > > > foo = ballot(true); // ballot 1 > > > > > > > > > > if (threadID /* ID of the thread within a wavefront/warp */ > % 2 == 0) continue; > > > > > > > > > > bar = ballot(true); // ballot 2 > > > > > } > > > > > > > > > > versus: > > > > > > > > > > int i = 0; > > &...
2009 Oct 08
3
[LLVMdev] Instructions that cannot be duplicated
...d 127 >> - The condition is true if id > 32 >> - Predication is used on control flow >> What happens in the original code: >> The first half of the first hardware thread predicates computation on the first condition, second half executes bar and all threads in the second wavefront execute bar. Both hardware threads hit the barrier and wait for the other hardware thread to reach that point, then continue execution. >> >> What happens in the optimized code: >> first half of the first hardware thread predicates computation on the first condition, the second ha...
2019 Feb 27
2
Dealing with illegal operand mappings in RegBankSelect
> On Feb 26, 2019, at 7:25 PM, Quentin Colombet <qcolombet at apple.com> wrote: > > > >> On Feb 26, 2019, at 4:18 PM, Matt Arsenault <arsenm2 at gmail.com <mailto:arsenm2 at gmail.com>> wrote: >> >> >> >>> On Feb 26, 2019, at 7:01 PM, Quentin Colombet <qcolombet at apple.com <mailto:qcolombet at apple.com>> wrote: