thr3ads.net - similar to: "Is this undefined behavior optimization legal?"

Displaying 20 results from an estimated 10000 matches similar to: "Is this undefined behavior optimization legal?"

Implementing cross-thread reduction in the AMDGPU backend

2017 Jun 14

Implementing cross-thread reduction in the AMDGPU backend

On 06/13/2017 07:33 PM, Matt Arsenault wrote: > >> On Jun 12, 2017, at 17:23, Tom Stellard <tstellar at redhat.com <mailto:tstellar at redhat.com>> wrote: >> >> On 06/12/2017 08:03 PM, Connor Abbott wrote: >>> On Mon, Jun 12, 2017 at 4:56 PM, Tom Stellard <tstellar at redhat.com <mailto:tstellar at redhat.com>> wrote: >>>> On

Implementing cross-thread reduction in the AMDGPU backend

2017 Jun 15

Implementing cross-thread reduction in the AMDGPU backend

On 06/14/2017 05:05 PM, Connor Abbott wrote: > On Tue, Jun 13, 2017 at 6:13 PM, Tom Stellard <tstellar at redhat.com> wrote: >> On 06/13/2017 07:33 PM, Matt Arsenault wrote: >>> >>>> On Jun 12, 2017, at 17:23, Tom Stellard <tstellar at redhat.com <mailto:tstellar at redhat.com>> wrote: >>>> >>>> On 06/12/2017 08:03 PM, Connor

Implementing cross-thread reduction in the AMDGPU backend

2017 Jun 15

Implementing cross-thread reduction in the AMDGPU backend

I'm wondering about the focus on bound_cntl. Any cleared bit in the row_mask or bank_mask will also disable updating the result. Brian -----Original Message----- From: Connor Abbott [mailto:cwabbott0 at gmail.com] Sent: Wednesday, June 14, 2017 6:13 PM To: tstellar at redhat.com Cc: Matt Arsenault; llvm-dev at lists.llvm.org; Kolton, Sam; Sumner, Brian; Pykhtin, Valery Subject: Re:

Implementing cross-thread reduction in the AMDGPU backend

2017 Jun 12

Implementing cross-thread reduction in the AMDGPU backend

Hi all, I've been looking into how to implement the more advanced Shader Model 6 reduction operations in radv (and obviously most of the work would be useful for radeonsi too). They're explained in the spec for GL_AMD_shader_ballot at https://www.khronos.org/registry/OpenGL/extensions/AMD/AMD_shader_ballot.txt, but I'll summarize them here. There are two types of operations:

Implementing cross-thread reduction in the AMDGPU backend

2017 Jun 13

Implementing cross-thread reduction in the AMDGPU backend

On 06/12/2017 08:03 PM, Connor Abbott wrote: > On Mon, Jun 12, 2017 at 4:56 PM, Tom Stellard <tstellar at redhat.com> wrote: >> On 06/12/2017 07:15 PM, Tom Stellard via llvm-dev wrote: >>> cc some people who have worked on this. >>> >>> On 06/12/2017 05:58 PM, Connor Abbott via llvm-dev wrote: >>>> Hi all, >>>> >>>>

Implementing cross-thread reduction in the AMDGPU backend

2017 Jun 14

Implementing cross-thread reduction in the AMDGPU backend

Sorry about the formatting... Anyway, I think there may be a misinterpretation of bound_cntl. My understanding is that: 0 => if the source is invalid or disabled, do not write a result 1 => if the source is invalid or disabled, use a 0 instead So the problematic case is where bound_cntl is 0, not when it is 1. -----Original Message----- From: Tom Stellard [mailto:tstellar at redhat.com]

Implementing cross-thread reduction in the AMDGPU backend

2017 Jun 12

Implementing cross-thread reduction in the AMDGPU backend

On 06/12/2017 07:15 PM, Tom Stellard via llvm-dev wrote: > cc some people who have worked on this. > > On 06/12/2017 05:58 PM, Connor Abbott via llvm-dev wrote: >> Hi all, >> >> I've been looking into how to implement the more advanced Shader Model >> 6 reduction operations in radv (and obviously most of the work would >> be useful for radeonsi too).

[arm, aarch64] Alignment checking in interleaved access pass

2016 Sep 19

[arm, aarch64] Alignment checking in interleaved access pass

Hi, As a follow up to Patch D23646 <https://reviews.llvm.org/D23646>, I'm trying to figure out if there should be an alignment check and what the correct approach is. Some background: For stores, the pass turns: %i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11> store <12 x i32> %i.vec, <12 x i32>* %ptr

RFC: Strong GC References in LLVM

2016 Jun 24

RFC: Strong GC References in LLVM

This is a proposal to add strong GC reference types to LLVM. We have some local (downstream) patches that are needed to prevent LLVM's optimizer from making transforms that are problematic in the presence of a precise relocating GC. Adding a notion of a strong GC reference to LLVM will let us upstream these patches in a principled manner, and will act as a measure to avoid new problematic

[SCEV][ScalarEvolution] SE limitation impacting LV

2017 Nov 14

[SCEV][ScalarEvolution] SE limitation impacting LV

Hi! I would appreciate some feedback from someone with experience in SCEV/SE. D39346 tries to fix an issue in LV (PR34965) that exposes a limitation in SCEV/SE. The best solution to the LV issue might not be a fix at SCEV/SE level but we may want to report/address SCEV/SE limitation as well. For the snippet below, LV expects SE to return a SCEVAddRecExpr for %21. However, SE returns ((4 * (zext

[AMDGPU] AMDGPUAsmParser fails to parse several instructions

2015 Oct 24

[AMDGPU] AMDGPUAsmParser fails to parse several instructions

Thanks you. I'm new to LLVM backend, so the help is much appreciated. On Sat, Oct 24, 2015 at 2:12 AM, Matt Arsenault <arsenm2 at gmail.com> wrote: > > > On Oct 23, 2015, at 3:36 AM, 李弘宇 via llvm-dev <llvm-dev at lists.llvm.org> > wrote: > > > The first line has the following error message: > > > > sop1-playground.s:1:15: error: invalid immediate:

[LLVMdev] DataLayout missing in isDereferenceablePointer()

2015 Feb 09

[LLVMdev] DataLayout missing in isDereferenceablePointer()

Eric Christopher wrote: > How are you trying to call it? Do you have a DataLayout? In test/Analysis/ValueTracking/memory-dereferenceable.ll, just change byval to dereferenceable(8), and %dparam won't match (see lib/IR/Value.cpp:521 for the logic that is supposed to fire). How do I get it to pass? I tried introducing a target-triple and target-datalayout, but it didn't help.

[LLVMdev] [llvm] r195903 - AArch64: Fix a bug about disassembling post-index load single element to 4 vectors

2013 Nov 28

[LLVMdev] [llvm] r195903 - AArch64: Fix a bug about disassembling post-index load single element to 4 vectors

I"m getting build errors I think from one of your patches O tjoml. You need to have a build area that builds with clang and does warnings as errors to avoid these issues on putback. here is my configure step for example: /home/rkotler/llvm_trunk/configure --enable-werror --prefix=/home/rkotler/ll vm/install CC=/home/rkotler/llvm_3_2/install/bin/clang CXX=/home/rkotler/llvm_3_

[LLVMdev] [llvm] r195903 - AArch64: Fix a bug about disassembling post-index load single element to 4 vectors

2013 Nov 28

[LLVMdev] [llvm] r195903 - AArch64: Fix a bug about disassembling post-index load single element to 4 vectors

I'm still seeing this problem. On 11/28/2013 09:37 AM, NAKAMURA Takumi wrote: > It is r195843 and fixed in r195905, FYI. > > 2013/11/29 Reed Kotler <rkotler at mips.com>: >> I"m getting build errors I think from one of your patches O tjoml. >> >> You need to have a build area that builds with clang and does warnings as >> errors to avoid these

[SCEV][ScalarEvolution] SE limitation impacting LV

2017 Nov 22

[SCEV][ScalarEvolution] SE limitation impacting LV

Thanks for the feedback, Sanjoy. > SCEV is fairly conservative around PHI nodes that aren't recurrences and aren't obviously equivalent to a min-max branch-phi idiom. Is that the limitation you're running into here? Yes, that's exactly the problem. The problematic PHI nodes (%bc.resume.val and %bc.resume.val1) aren't either recurrences or related to min-max idioms. I

Testing for arguments in a function

2011 Sep 26

Testing for arguments in a function

I don't understand how this function can subset by i when i is missing.... ## My function: myfun = function(vec, i){ ret = vec[i] ret } ## My data: i = 10 vec = 1:100 ## Expected input and behavior: myfun(vec, i) ## Missing an argument, but error is not caught! ## How is subsetting even possible here??? myfun(vec) Is there a way to check for missing function arguments, *and*

[LLVMdev] Elsa and LLVM and LLVM submissions

2007 Dec 17

[LLVMdev] Elsa and LLVM and LLVM submissions

Devang Patel wrote: > On Dec 15, 2007, at 12:15 PM, Richard Pennington wrote: > >> I got the current version of LLVM via svn yesterday and modified my >> code to >> use the LLVMFoldingBuilder. Very nice! >> >> My question is this: I noticed that the folding builder doesn't fold >> some >> operations, e.g. casts. Is there some reason why? If

[LLVMdev] How should I update LiveIntervals after removing a use of a register?

2014 Apr 04

[LLVMdev] How should I update LiveIntervals after removing a use of a register?

Hi, I am working on a simple copy propagation pass for the R600 backend that propagates immediates rather than registers. For example, I want to transform: ... %vreg1 = V_MOV_B32 1 %vreg2 = V_ADD_I32 %vreg1, %vreg0 ... into: %vreg1 = V_MOV_B32 1 ; <- Only delete this if it is dead %vreg2 = V_ADD_I32 1, %vreg0 For best results, I am trying to run this pass after the TwoAddressInstruction

[LLVMdev] Weird problems with cos (was Re: [PATCH v3 2/3] R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO)

2014 Oct 03

[LLVMdev] Weird problems with cos (was Re: [PATCH v3 2/3] R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO)

Hi Tom, Matt, I'm running into strange issues with the cos test (piglit generated_tests/cl/builtin/math/builtin-float-cos-1.0.generated.c) I have been seeing random failures (incorrect results) for some time and tried to investigate. the weird part is that the failures are not 100% reproducible, sometimes the tests pass, or partly pass (it's usually float8 and float16 subtests that

[LLVMdev] [llvm] r195903 - AArch64: Fix a bug about disassembling post-index load single element to 4 vectors

2013 Nov 28

[LLVMdev] [llvm] r195903 - AArch64: Fix a bug about disassembling post-index load single element to 4 vectors

It is r195843 and fixed in r195905, FYI. 2013/11/29 Reed Kotler <rkotler at mips.com>: > I"m getting build errors I think from one of your patches O tjoml. > > You need to have a build area that builds with clang and does warnings as > errors to avoid these issues on putback. > > here is my configure step for example: > /home/rkotler/llvm_trunk/configure

similar to: Is this undefined behavior optimization legal?