thr3ads.net - search: "wavefront"

Displaying 20 results from an estimated 36 matches for "wavefront".

[idea] wavefront obj+mtl viewing inside cube plugin

2007 Jun 30

...her than developing a separate plugin for each, allowing us to do what we do best (programming) and the artists to do what they do best (create stunning models). This can be accomplished if we write a model viewer that uses the inside of the cube as its output. A common and portable format is the Wavefront OBJ+MTL spec. A few days ago I came across an open source library for parsing these files, as well as an example OpenGL program that displays the textured model. If I base a new plugin on Dennis' gears, it should be trivial to mesh the two (no pun intended :P). I have already compiled and te...

Assign different RegClasses to a virtual register based on 'uniform' attribute?

2016 Dec 20

Hi, I am working on a new LLVM target for Intel GPU, which also has the same kind of scalar/vector register classes used in the AMDGPU target. For example, an i32 virtual register will be held in a scalar register if its value is uniform across a wavefront/warp; otherwise it will be in a vector register. Does AMDGPU already do this? I read the code, but I couldn't figure out how to do this. Does anybody have an idea on this? - Ruiling

Assign different RegClasses to a virtual register based on 'uniform' attribute?

2016 Dec 20

...:09AM +0800, Ruiling Song wrote: > Hi, > > I am working on a new LLVM target for Intel GPU, which also has the same kind > of scalar/vector register classes used in the AMDGPU target. For example, an i32 > virtual register will be held in a scalar register if its value is > uniform across a wavefront/warp; otherwise it will be in a vector register. > Does AMDGPU already do this? I read the code, but I couldn't figure out how > to do this. Does anybody have an idea on this? > In the AMDGPU backend we select everything we can to scalar instructions, and then after instruction selection, we...

Assign different RegClasses to a virtual register based on 'uniform' attribute?

2016 Dec 21

...> > > > > I am working on a new LLVM target for Intel GPU, which also has the same kind > > > of scalar/vector register classes used in the AMDGPU target. For example, an i32 > > > virtual register will be held in a scalar register if its value is > > > uniform across a wavefront/warp; otherwise it will be in a vector register. > > > Does AMDGPU already do this? I read the code, but I couldn't figure out how > > > to do this. Does anybody have an idea on this? > > > > > > > In the AMDGPU backend we select everything we can to scalar > >...

Assign different RegClasses to a virtual register based on 'uniform' attribute?

2016 Dec 21

...te: > > Hi, > > > > I am working on a new LLVM target for Intel GPU, which also has the same kind > > of scalar/vector register classes used in the AMDGPU target. For example, an i32 > > virtual register will be held in a scalar register if its value is > > uniform across a wavefront/warp; otherwise it will be in a vector register. > > Does AMDGPU already do this? I read the code, but I couldn't figure out how > > to do this. Does anybody have an idea on this? > > > > In the AMDGPU backend we select everything we can to scalar > instructions, and then afte...

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Jan 28

...p the example a little... in the second snippet, we're supposed to break out of the inner loop if the outer loop's exit condition is true. Here's a more concrete example:

    for (int i = 0; i < 2; i++) {
      foo = ballot(true); // ballot 1

      if (threadID /* ID of the thread within a wavefront/warp */ % 2 == 0) continue;

      bar = ballot(true); // ballot 2
    }

versus:

    int i = 0;
    while (true) {
      do {
        if (i == 2) break;
        foo = ballot(true); // ballot 1
        i++;
      } while (threadID % 2 == 0);

      if (i == 2) break;
      bar = ballot(true); // ballot 2
      i++;
    }

>...

Assign different RegClasses to a virtual register based on 'uniform' attribute?

2016 Dec 21

...> > I am working on a new LLVM target for Intel GPU, which also has the same kind > > > > of scalar/vector register classes used in the AMDGPU target. For example, an i32 > > > > virtual register will be held in a scalar register if its value is > > > > uniform across a wavefront/warp; otherwise it will be in a vector register. > > > > Does AMDGPU already do this? I read the code, but I couldn't figure out how > > > > to do this. Does anybody have an idea on this? > > > > > > > > > > In the AMDGPU backend we select everything...

ISA sound cards and stuff

2006 Oct 25

...rdware. I don't really have anything Plug and Play to test, and no ISA network cards anymore. Anyhow, if anybody does have some, it would be nice if (s)he could test whether everything works OK. Plug and Play support is in the kernel, so no userland tools should be needed (hopefully). The only sound card missing is Wavefront. The source code for the device driver needs some updating (there seems to be a patch floating around somewhere though, I haven't tested it yet). The way it is now, it produces unresolved symbols if compiled. I see these packages as mostly useful for folks with older laptops (ISA based sound cards...

Dynamically determine the CostPerUse value in the register allocator.

2020 May 29

...il), and found that having a dynamic register cost is important to achieve an optimal allocation. More precisely, it is important to limit the number of VGPRs allocated for a kernel/device function to the smallest possible value, since it has a direct impact on occupancy. Occupancy is the number of wavefronts that can be launched at runtime for a kernel program. Some initial thoughts on how to fix it: 1. Have a target interface (a switch) to enable/discard the CostPerUse value. 2. Get the register cost in the same way we define various calling conventions (*CallingConv.td). 3. Compute the Co...
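
As a rough illustration of the occupancy point in the excerpt above, the sketch below estimates how many wavefronts fit on one SIMD for a given per-thread VGPR count. The function name `max_waves_per_simd` and the default values (256 VGPRs per lane, a hardware cap of 10 wavefronts, an allocation granule of 4) are assumptions chosen for illustration, not figures from the thread; real limits vary by GPU generation.

```python
def max_waves_per_simd(vgprs_per_thread, total_vgprs=256, hw_wave_limit=10, granule=4):
    """Estimate how many wavefronts can be resident on one SIMD.

    All default values are illustrative assumptions, not taken from the thread.
    """
    if vgprs_per_thread <= 0:
        # A kernel that uses no VGPRs is limited only by the hardware cap.
        return hw_wave_limit
    # Registers are handed out in fixed-size granules, so round the request up.
    allocated = -(-vgprs_per_thread // granule) * granule
    # Occupancy is bounded by both the register file and the hardware cap.
    return min(hw_wave_limit, total_vgprs // allocated)


if __name__ == "__main__":
    for vgprs in (24, 25, 64, 128):
        print(vgprs, "VGPRs ->", max_waves_per_simd(vgprs), "wavefronts")
```

Under these assumed limits, a single extra VGPR (24 to 25) is enough to drop the estimate by one wavefront, which is why the allocator's register cost model matters for occupancy.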

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Jan 30

On Mon, Jan 28, 2019 at 9:09 PM Jan Sjodin <jan_sjodin at yahoo.com> wrote: > > > for (int i = 0; i < 2; i++) { > > foo = ballot(true); // ballot 1 > > > > if (threadID /* ID of the thread within a wavefront/warp */ % 2 == 0) continue; > > > > bar = ballot(true); // ballot 2 > > } > > > > versus: > > > > int i = 0; > > while (true) { > > do { > > if (i == 2) break; > > foo = ballot(true); // ballot 1 > >...

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Jan 31

..., Jan 30, 2019 at 7:20 AM Jan Sjodin via llvm-dev < llvm-dev at lists.llvm.org> wrote: > > > > > > > for (int i = 0; i < 2; i++) { > > > > foo = ballot(true); // ballot 1 > > > > > > > > if (threadID /* ID of the thread within a wavefront/warp */ % 2 > == 0) continue; > > > > > > > > bar = ballot(true); // ballot 2 > > > > } > > > > > > > > versus: > > > > > > > > int i = 0; > > > > while (true) { > > > > do { > >...

Dynamically determine the CostPerUse value in the register allocator.

2020 May 30

...g a dynamic register cost is important to achieve an > optimal allocation. > > More precisely, it is important to limit the number of VGPRs allocated for a > kernel/device function to the smallest possible value, since it has a direct > impact on occupancy. Occupancy is the number of wavefronts that > can be launched at runtime for a kernel program. > > > > Some initial thoughts on how to fix it: > > 1. Have a target interface (a switch) to enable/discard the CostPerUse > value. > 2. Get the register cost in the same way we define various calling >...

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Jan 31

...2019 at 7:20 AM Jan Sjodin via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > > > > > > > for (int i = 0; i < 2; i++) { > > > > foo = ballot(true); // ballot 1 > > > > > > > > if (threadID /* ID of the thread within a wavefront/warp */ % 2 > == 0) continue; > > > > > > > > bar = ballot(true); // ballot 2 > > > > } > > > > > > > > versus: > > > > > > > > int i = 0; > > > > while (true) { > > > > do { > >...

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Jan 30

On Wed, Jan 30, 2019 at 4:20 PM Jan Sjodin <jan_sjodin at yahoo.com> wrote: > > > > > > > for (int i = 0; i < 2; i++) { > > > > foo = ballot(true); // ballot 1 > > > > > > > > if (threadID /* ID of the thread within a wavefront/warp */ % 2 > == 0) continue; > > > > > > > > bar = ballot(true); // ballot 2 > > > > } > > > > > > > > versus: > > > > > > > > int i = 0; > > > > while (true) { > > > > do { > >...

[LLVMdev] Instructions that cannot be duplicated

2009 Oct 08

...lement gets a unique id between 0 and 127 - The condition is true if id > 32 - Predication is used on control flow What happens in the original code: The first half of the first hardware thread predicates computation on the first condition, second half executes bar and all threads in the second wavefront execute bar. Both hardware threads hit the barrier and wait for the other hardware thread to reach that point, then continue execution. What happens in the optimized code: first half of the first hardware thread predicates computation on the first condition, the second half executes bar and waits...

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Feb 01

...yahoo.com > <mailto:jan_sjodin at yahoo.com>> wrote: > > > > > > > > for (int i = 0; i < 2; i++) { > > > > foo = ballot(true); // ballot 1 > > > > > > > > if (threadID /* ID of the thread within a wavefront/warp */ % 2 == 0) continue; > > > > > > > > bar = ballot(true); // ballot 2 > > > > } > > > > > > > > versus: > > > > > > > > int i = 0; > > > > while (true) { >...

[LLVMdev] Instructions that cannot be duplicated

2009 Oct 08

...id between 0 and 127 > - The condition is true if id > 32 > - Predication is used on control flow > What happens in the original code: > The first half of the first hardware thread predicates computation on the first condition, second half executes bar and all threads in the second wavefront execute bar. Both hardware threads hit the barrier and wait for the other hardware thread to reach that point, then continue execution. > > What happens in the optimized code: > first half of the first hardware thread predicates computation on the first condition, the second half executes...

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Feb 09

...jan_sjodin at yahoo.com>> wrote: > > > > > > > > > > > > for (int i = 0; i < 2; i++) { > > > > > foo = ballot(true); // ballot 1 > > > > > > > > > > if (threadID /* ID of the thread within a wavefront/warp */ > % 2 == 0) continue; > > > > > > > > > > bar = ballot(true); // ballot 2 > > > > > } > > > > > > > > > > versus: > > > > > > > > > > int i = 0; > > &...

[LLVMdev] Instructions that cannot be duplicated

2009 Oct 08

...d 127 >> - The condition is true if id > 32 >> - Predication is used on control flow >> What happens in the original code: >> The first half of the first hardware thread predicates computation on the first condition, second half executes bar and all threads in the second wavefront execute bar. Both hardware threads hit the barrier and wait for the other hardware thread to reach that point, then continue execution. >> >> What happens in the optimized code: >> first half of the first hardware thread predicates computation on the first condition, the second ha...

Dealing with illegal operand mappings in RegBankSelect

2019 Feb 27

> On Feb 26, 2019, at 7:25 PM, Quentin Colombet <qcolombet at apple.com> wrote: > > > >> On Feb 26, 2019, at 4:18 PM, Matt Arsenault <arsenm2 at gmail.com> wrote: >> >> >> >>> On Feb 26, 2019, at 7:01 PM, Quentin Colombet <qcolombet at apple.com> wrote:
