thr3ads.net - similar to: "(When) Do function calls read/latch/freeze their parameters?"

Displaying 20 results from an estimated 4000 matches similar to: "(When) Do function calls read/latch/freeze their parameters?"

RFC: alive.llvm.org?

2020 Jun 17

Hi folks, I've been running a Compiler Explorer instance with Alive2 on a machine in my office, but availability has been poor due to random factors and of course recently it hasn't been easy or convenient to go in and fix things when the machine gets wedged. Nuno and I would like to ask the LLVM community if it's OK to point alive.llvm.org at a cloud machine that I've set up

RFC: alive.llvm.org?

2020 Jun 17

No concerns from me. I use Alive2 all the time, and it would be fantastic to have it available online reliably. If we can get Alive1 up there too, that would be even better. I still use that to try to prove things where it's not obvious how to express the relationships in pure LLVM IR: https://rise4fun.com/Alive/NDu On Wed, Jun 17, 2020 at 4:05 PM Chris Lattner via llvm-dev < llvm-dev at

RFC: alive.llvm.org?

2020 Jun 18

+1 to alive2.llvm.org
On Thu, Jun 18, 2020 at 8:11 AM John Regehr via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> > If we can get Alive1 up there too, that would be even better. I still
> > use that to try to prove things where it's not obvious how to express
> > the relationships in pure LLVM IR:
> > https://rise4fun.com/Alive/NDu
>
> I don't

LLVM-IR store-load propagation

2020 Jun 19

Hello everyone, This week I was looking into the following example (https://godbolt.org/z/uhgQcq) where two constants are written to a local array and an input argument, masked and shifted, is used to select between them. The possible values for the CC variable are 0 and 1, so I'm expecting that at the maximum level of optimizations the two constants are actually propagated, resulting in the

[RFC] Integer Intrinsics for abs, in unsigned/signed min/max

2020 Jun 15

Hello all. This is a proposal to introduce 5 new integer intrinsics:
* absolute value
* signed min
* signed max
* unsigned min
* unsigned max
This is motivated by the fact that we keep working around not having these intrinsics, and that constantly leads us into having more workarounds, and causes infinite combine loops. Here's a (likely incomplete!) list of motivational bugs: infinite

Is it valid to dereference a pointer that have undef bits in its offset?

2020 Sep 22

Thank you for the info; it seems making it raise UB is problematic. Would clarifying it in LangRef be good? I can update the patch to contain the information instead. Another concern is then, how can we efficiently encode an assumption that a pointer variable in IR does not have undef bits? Certainly, in the front-end language, most pointers won't have undef bits, and it would be great

Condition code in DAGCombiner::visitFADDForFMACombine?

2018 Aug 22

On 22.08.2018 17:52, Ryan Taylor wrote:
> This is probably going to have an effect on other backends and break llvm-lit
> for them?
Very likely, yes. Can you take a look at how big the fallout is? This might give us a hint about what other frontends might expect, and who needs to be involved in the discussion (if one is needed).
Cheers, Nicolai
>
> On Wed, Aug 22, 2018 at 11:41 AM

Why does FPBinOp(X, undef) -> NaN?

2020 Feb 07

On Fri, Feb 7, 2020 at 12:29 PM Nuno Lopes <nunoplopes at sapo.pt> wrote:
>
> It's not correct (output of Alive2):
>
> define half @fn(half %a) {
>   %b = fadd half %a, undef
>   ret half %b
> }
> =>
> define half @fn(half %a) {
>   ret half undef
> }
> Transformation doesn't verify!
> ERROR: Value mismatch
>
> Example:
> half %a

What can the optimizer assume about the memory a global function pointer points to?

2020 Apr 16

A function declaration declares a function pointer to the memory where the machine code will be at runtime. Besides providing the ability to call the function, that pointer can also be used, after bitcasting it, to modify the machine code implementing the function. What does the optimizer assume about the memory containing the machine code? The following is an example where Alive2 assumes

[RESEND/PATCH] nv50/ir: Handle OP_CVT when folding constant expressions

2015 Jan 09

Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, {S16/32})->F32
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 109 +++++++++++++++++++++
 1 file changed, 109 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp

funnel shift, select, and poison

2019 Feb 25

There's a question about the behavior of funnel shift [1] + select and poison here that reminds me of previous discussions about select and poison [2]: https://github.com/AliveToolkit/alive2/pull/32#discussion_r257528880
Example:
define i8 @fshl_zero_shift_guard(i8 %x, i8 %y, i8 %sh) {
  %c = icmp eq i8 %sh, 0
  %f = fshl i8 %x, i8 %y, i8 %sh
  %s = select i1 %c, i8 %x, i8 %f ; shift amount is 0

[PATCH v4] nv50/ir: Handle OP_CVT when folding constant expressions

2014 Jul 05

Folding for conversions: F32/64->(U16/32, S16/32) and (U16/32, S16/32)->F32
No piglit regressions observed on nv50 and nvc0!
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
V2: fix usage of wrong variable
V3: enable F64 support
V4:
- disable F64 support again
- handle saturate flag: clamp to min/max if needed

[PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions

2015 Jan 10

Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, {S16/32})->F32
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
V2: beat me, whip me, split out F64
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 81 ++++++++++++++++++++++
 1 file changed, 81 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp

[PATCH 1/11] ARM: tegra: add function to control the GPU rail clamp

2014 Dec 24

On Tuesday, 23.12.2014, at 18:39 +0800, Vince Hsu wrote:
> The Tegra124 and later Tegra SoCs have a separate rail gating register
> to enable/disable the clamp. The original function
> tegra_powergate_remove_clamping() is not sufficient for the enable
> function. So add a new function which is dedicated to the GPU rail
> gating. Also don't refer to the powergate ID since the

Re: [PATCH nbdkit 6/8] data, memory: Implement extents.

2019 Mar 23

On 3/20/19 5:11 PM, Richard W.M. Jones wrote:
> These plugins are both based on the same sparse array structure which
> supports a simple implementation of extents.
> ---
> +int
> +sparse_array_extents (struct sparse_array *sa,
> +                      uint32_t count, uint64_t offset,
> +                      struct nbdkit_extents *extents)
> +{
> +  uint32_t n, type;

Branch is not optimized because of right shift

2020 Apr 05

Hi everyone, In a Twitch chat someone redirected me to an example that is not optimized: https://godbolt.org/z/BL-4jL I included the original source code and this is after -O2. We both thought that the -8 branch could be optimized out. I added a nuw in the subtraction and it actually does it. Any thoughts on why that doesn't happen already? Best, Stefanos Baziotis

Why does FPBinOp(X, undef) -> NaN?

2020 Feb 07

I came across this comment in SelectionDAG.cpp:
  case ISD::FADD:
  case ISD::FSUB:
  case ISD::FMUL:
  case ISD::FDIV:
  case ISD::FREM:
    // If both operands are undef, the result is undef. If 1 operand is undef,
    // the result is NaN. This should match the behavior of the IR optimizer.
That isn't intuitive to me. I would have expected a binary FP operation with one undef operand to

[PATCH 1/11] ARM: tegra: add function to control the GPU rail clamp

2014 Dec 25

On Thursday, 25.12.2014, at 10:28 +0800, Vince Hsu wrote:
> On 12/24/2014 09:16 PM, Lucas Stach wrote:
> > On Tuesday, 23.12.2014, at 18:39 +0800, Vince Hsu wrote:
> >> The Tegra124 and later Tegra SoCs have a separate rail gating register
> >> to enable/disable the clamp. The original function
> >> tegra_powergate_remove_clamping() is not sufficient for

[PATCH 1/11] ARM: tegra: add function to control the GPU rail clamp

2015 Jan 05

On Thu, Dec 25, 2014 at 10:28:08AM +0800, Vince Hsu wrote:
> On 12/24/2014 09:16 PM, Lucas Stach wrote:
> >On Tuesday, 23.12.2014, at 18:39 +0800, Vince Hsu wrote:
> >>The Tegra124 and later Tegra SoCs have a separate rail gating register
> >>to enable/disable the clamp. The original function
> >>tegra_powergate_remove_clamping() is not sufficient for the

[PATCH -next 2/3] drm/amdgpu: use clamp() in amdgpu_vm_adjust_size()

2024 Aug 30

On 30.08.24 at 03:22, Li Zetao wrote:
> When it needs to get a value within a certain interval, using clamp()
> makes the code easier to understand than min(max()).
>
> Signed-off-by: Li Zetao <lizetao1 at huawei.com>
This patch and #1 are a nice cleanup and Reviewed-by: Christian König <christian.koenig at amd.com>
But as Alex also pointed out patch #3 is for Nouveau
