Displaying 20 results from an estimated 4000 matches similar to: "Pointer comparison folding"
2015 Jan 15
4
[LLVMdev] confusion w.r.t. scalar evolution and nuw
I've been doing some digging in this area (scev, wrapping arithmetic),
learning as much as I can, and have reached a point where I'm fairly
confused about the semantics of nuw in scalar evolution expressions.
Consider the following program:
define void @foo(i32 %begin) {
entry:
  br label %loop
loop:
  %idx = phi i32 [ %begin, %entry ], [ %idx.dec, %loop ]
  %idx.dec = sub nuw i32
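A self-contained loop of the same shape (a sketch, not necessarily the original test case), counting down with a 'sub nuw' decrement:

define void @foo_sketch(i32 %begin) {
entry:
  br label %loop
loop:
  ; assumes %begin > 0, so the claim that the decrement never wraps holds
  %idx = phi i32 [ %begin, %entry ], [ %idx.dec, %loop ]
  %idx.dec = sub nuw i32 %idx, 1
  %done = icmp eq i32 %idx.dec, 0
  br i1 %done, label %exit, label %loop
exit:
  ret void
}

The confusion described above is about what SCEV may legitimately infer from the nuw flag on such a decrement.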
2015 Jan 15
2
[LLVMdev] confusion w.r.t. scalar evolution and nuw
> We are permitted to turn 'sub nsw i32 %x, 1' into 'add nsw i32 %x, -1'
nsw also has the same problem:
sub nsw int_min, int_min is 0, but
add nsw int_min, (-int_min) is poison: negating int_min wraps back to int_min, so the add overflows.
-- Sanjoy
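The same counterexample written out as IR (a sketch; -2147483648 is INT_MIN for i32):

define i32 @sub_form(i32 %x) {
  ; for %x = INT_MIN this is INT_MIN - INT_MIN = 0, with no signed overflow
  %r = sub nsw i32 %x, -2147483648
  ret i32 %r
}

define i32 @add_form(i32 %x) {
  ; -(-2147483648) wraps back to -2147483648, so for %x = INT_MIN this is
  ; INT_MIN + INT_MIN, which overflows and therefore yields poison
  %r = add nsw i32 %x, -2147483648
  ret i32 %r
}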
2014 May 13
2
[LLVMdev] Missed optimization opportunity in 3-way integer comparison case
While looking at what LLVM emits for this testcase, I noticed that
there is one redundant operation in the resulting assembly. The second 'cmp'
operation there is essentially identical to the first one, just with the
order of its arguments reversed, so it is not needed.
This testcase is a simple integer comparison routine, similar to what
qsort would take to sort an integer array.
I think
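A comparator of that shape (a sketch; the full original test case is not shown here):

define i32 @cmp3(i32 %a, i32 %b) {
entry:
  ; returns -1 if %a < %b, 1 if %a > %b, 0 if equal
  %lt = icmp slt i32 %a, %b
  %gt = icmp sgt i32 %a, %b
  %pos = select i1 %gt, i32 1, i32 0
  %res = select i1 %lt, i32 -1, i32 %pos
  ret i32 %res
}

The complaint above is that the backend compares %a and %b twice, once with the operands swapped, where a single compare whose flags are reused would do.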
2015 Jun 09
2
[LLVMdev] Constant folding inttoptr i32 0 to null pointer?
Hello,
It seems that ConstantFoldCastInstruction in ConstantFold.cpp folds an inttoptr instruction with 0 as its operand to a null pointer. That makes sense for a C-style frontend, as the C99 spec (6.3.2.3) states:
"An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant."
On the other hand, some architectures
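A minimal form of the fold in question (a sketch, using a non-default address space to make the concern concrete):

define i32 addrspace(1)* @zero_addr() {
  ; the fold discussed above turns this into 'ret i32 addrspace(1)* null',
  ; even though address 0 may be a perfectly valid location in addrspace(1)
  %p = inttoptr i32 0 to i32 addrspace(1)*
  ret i32 addrspace(1)* %p
}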
2015 Jun 09
4
[LLVMdev] Constant folding inttoptr i32 0 to null pointer?
Thanks David,
It turns out that the address space I was using was not 0, and yet the pointer was still constant-folded to null.
Here is the sequence:
Unoptimized code:
define i32 @foo() #0 {
entry:
  %address.addr.i = alloca i32, align 4
  %value.i = alloca i32, align 4
  store i32 0, i32* %address.addr.i, align 4
  %0 = load i32* %address.addr.i, align 4
  %1 = inttoptr i32 %0 to i32 addrspace(1)*
2015 Jun 09
2
[LLVMdev] Constant folding inttoptr i32 0 to null pointer?
On Tue, Jun 9, 2015 at 12:32 PM, David Majnemer <david.majnemer at gmail.com>
wrote:
> 'load volatile i32 addrspace(1)* null' seems fine to me. However, it
> looks like instcombine will turn:
> define i32 @foo() {
> entry:
> %std_ld.i = load volatile i32, i32 addrspace(1)* null
> ret i32 %std_ld.i
> }
>
> into:
> define i32 @foo() {
> entry:
2010 Mar 28
2
[LLVMdev] Which floating-point comparison?
I notice llvm provides both ordered and unordered variants of
floating-point comparison. Which of these is the right one to use by
default? I suppose the two criteria would be, in order of importance:
1. Which is more efficient (more directly maps to typical hardware)?
2. Which is more familiar (more like the way C and Fortran do it)?
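The difference, written out in IR (a sketch): ordered predicates are false if either operand is NaN, unordered predicates are true; C's '<' has the ordered semantics.

define i1 @less_ordered(float %a, float %b) {
  ; false when %a or %b is NaN
  %c = fcmp olt float %a, %b
  ret i1 %c
}

define i1 @less_unordered(float %a, float %b) {
  ; true when %a or %b is NaN
  %c = fcmp ult float %a, %b
  ret i1 %c
}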
2010 Mar 28
0
[LLVMdev] Which floating-point comparison?
On Sun, Mar 28, 2010 at 7:45 AM, Russell Wallace
<russell.wallace at gmail.com> wrote:
> I notice llvm provides both ordered and unordered variants of
> floating-point comparison. Which of these is the right one to use by
> default? I suppose the two criteria would be, in order of importance:
>
> 1. Which is more efficient (more directly maps to typical hardware)?
You can
2015 Oct 19
1
R 3.2.2 - make check and install package hang
Below is the output. Thanks for the help.
> Sys.getenv()
BASH_FUNC_module() () { eval
`/cm/local/apps/environment-modules/3.2.10/Modules/$MODULE_VERSION/bin/modulecmd
bash $*` }
COLUMNS 152
CPATH /cm/shared/apps/uge/8.2.1/include
CVS_RSH ssh
DISPLAY localhost:10.0
EDITOR
2015 Oct 17
3
R 3.2.2 - make check and install package hang
Hello Everyone,
After trying several ways to compile R 3.2.2 without luck, I'm reaching out for help.
The 'make check' hangs for some reason, and when
trying to install a package it cannot list the download sites (see below).
What could be the problem?
./configure --enable-R-shlib --enable-BLAS-shlib
hostname = test
uname -m = x86_64
uname -r = 2.6.32-573.7.1.el6.x86_64
uname -s =
2011 Apr 01
1
[LLVMdev] signed/unsigned integers ?
> there is no such information. You can still consider every type to have values
> in, say, T = [-2^31; 2^31-1]. Probably you are trying to deduce an interval of
> possible values for each register. You will need to allow intervals to wrap
> around the end of T since (eg) the basic "add" operator in LLVM uses modulo
> arithmetic, i.e. if you add 1 to 2^31-1 you get
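A small illustration of that point (a sketch): a plain 'add' carries no signedness and simply wraps modulo 2^32, so an interval analysis has to allow intervals that wrap around the end of T.

define i32 @wraps() {
  ; 2^31 - 1 plus 1 wraps to -2^31; without an nsw flag this is not poison
  %r = add i32 2147483647, 1
  ret i32 %r
}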
2013 Apr 23
2
[LLVMdev] 'loop invariant code motion' and 'Reassociate Expression'
Hi,
I am investigating a performance degradation between llvm-3.1 and llvm-3.2
(Note: current top-of-tree shows a similar degradation)
One issue I see is the following:
- 'loop invariant code motion' seems to depend on the result of the 'reassociate expression' pass:
In the samples below I observe the following behavior:
Both start with the same expression:
%add = add
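The dependence can be illustrated with a sketch (hypothetical, not the poster's actual test case): LICM can only hoist the loop-invariant part of the sum after Reassociate has grouped the two invariant operands together.

define void @licm_sketch(i32 %a, i32 %b, i32 %n, i32* %out) {
entry:
  br label %loop
loop:
  %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
  ; written as (%a + %i) + %b; reassociating to (%a + %b) + %i exposes
  ; the invariant subexpression %a + %b, which LICM can then hoist
  %t1 = add i32 %a, %i
  %sum = add i32 %t1, %b
  %p = getelementptr inbounds i32, i32* %out, i32 %i
  store i32 %sum, i32* %p
  %i.next = add nuw nsw i32 %i, 1
  %cmp = icmp slt i32 %i.next, %n
  br i1 %cmp, label %loop, label %exit
exit:
  ret void
}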
2015 May 06
1
Intel NUC haswell-ULT
I have one of those new little NUCs and installed CentOS 7.1 on it.
lspci shows
00:00.0 Host bridge: Intel Corporation Haswell-ULT DRAM Controller (rev 09)
00:02.0 VGA compatible controller: Intel Corporation Haswell-ULT Integrated
Graphics Controller (rev 09)
00:03.0 Audio device: Intel Corporation Haswell-ULT HD Audio Controller
(rev 09)
00:14.0 USB controller: Intel Corporation 8 Series
2015 May 06
2
[LLVMdev] [LoopVectorizer] Missed vectorization opportunities caused by sext/zext operations
For
void test0(unsigned short a, unsigned short *in, unsigned short *out) {
  for (unsigned short w = 1; w < a - 1; w++) // this will never overflow
    out[w] = in[w+7] * 2;
}
I think it will be sufficient to add a couple of new cases to
ScalarEvolution::HowManyLessThans --
zext(A) ult zext(B) == A ult B
sext(A) slt sext(B) == A slt B
Currently it bails out if it sees a non-add
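The first of those equivalences, written out as IR (a sketch): the widened compare can be done in the narrow type because zero-extension preserves unsigned order.

define i1 @wide(i16 %a, i16 %b) {
  %a32 = zext i16 %a to i32
  %b32 = zext i16 %b to i32
  %c = icmp ult i32 %a32, %b32
  ret i1 %c
}

define i1 @narrow(i16 %a, i16 %b) {
  ; computes exactly the same result as @wide for all inputs
  %c = icmp ult i16 %a, %b
  ret i1 %c
}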
2013 Apr 25
2
[LLVMdev] 'loop invariant code motion' and 'Reassociate Expression'
It's an interesting problem.
The best stuff I've seen published is by Cooper, Eckhart, & Kennedy, in
PACT '08.
Cooper gives a nice intro in one of his lectures:
http://www.cs.rice.edu/~keith/512/2012/Lectures/26ReassocII-1up.pdf
I can't tell, quickly, what's going on in Reassociate;
as usual, the documentation resolutely avoids giving any credit for the
ideas.
Why is that?
2018 Jan 17
3
always allow canonicalizing to 8- and 16-bit ops?
Example:
define i8 @narrow_add(i8 %x, i8 %y) {
  %x32 = zext i8 %x to i32
  %y32 = zext i8 %y to i32
  %add = add nsw i32 %x32, %y32
  %tr = trunc i32 %add to i8
  ret i8 %tr
}
With no data-layout or with an x86 target where 8-bit integer is in the
data-layout, we reduce to:
$ ./opt -instcombine narrowadd.ll -S
define i8 @narrow_add(i8 %x, i8 %y) {
  %add = add i8 %x, %y
  ret i8 %add
}
But on
2012 Nov 26
2
[LLVMdev] RFC: change BoundsChecking.cpp to use address-based tests
I am investigating changing BoundsChecking to use address-based rather
than size- & offset-based tests.
To explain, here is a short code sample cribbed from one of the tests:
%mem = tail call i8* @calloc(i64 1, i64 %elements)
%memobj = bitcast i8* %mem to i64*
%ptr = getelementptr inbounds i64* %memobj, i64 %index
%4 = load i64* %ptr, align 8
Currently, the IR for bounds checking
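A sketch of what an address-based test for that load could look like (hypothetical; this is not the actual BoundsChecking output, and the current size/offset form is truncated above):

define i64 @checked_load(i8* %mem, i64 %size, i64* %ptr) {
entry:
  ; trap if %ptr is below the object or %ptr + 8 runs past its end
  %begin    = ptrtoint i8* %mem to i64
  %addr     = ptrtoint i64* %ptr to i64
  %end      = add i64 %begin, %size
  %addr.end = add i64 %addr, 8
  %below    = icmp ult i64 %addr, %begin
  %above    = icmp ugt i64 %addr.end, %end
  %oob      = or i1 %below, %above
  br i1 %oob, label %trap, label %ok
trap:
  call void @llvm.trap()
  unreachable
ok:
  %v = load i64, i64* %ptr, align 8
  ret i64 %v
}

declare void @llvm.trap()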
2013 Apr 23
0
[LLVMdev] 'loop invariant code motion' and 'Reassociate Expression'
As far as I can understand from the code, Reassociate tries to achieve
this result via its "ranking" mechanism.
If it does not, it is not hard to achieve this result: just restructure
the expression so that
the earlier-defined sub-expression is permuted earlier in the
resulting expression.
e.g.
outer-loop1
x=
outer-loop2
y =
2015 Jan 13
2
[LLVMdev] Floating-point range checks
After writing a simple FPRange, I've hit a stumbling block. I don't know what LLVM code should be extended to use it. I was initially thinking of extending LazyValueInfo, but it appears to be used for passes that don’t address the case that I need for Julia. I’m now wondering if I’m better off extending SimplifyFCmpInst to handle the few cases in question instead of trying to be more
2018 Jan 22
2
always allow canonicalizing to 8- and 16-bit ops?
Thanks for the perf testing. I assume that DAG legalization is equipped to
handle these cases fairly well, or someone would've complained by now...
FWIW (and at least some of this can be blamed on me), instcombine already
does the narrowing transforms without checking shouldChangeType() for
binops like and/or/xor/udiv. The justification was that narrower ops are
always better for
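For comparison, the same pattern with one of those bitwise ops (a sketch, not taken from the thread) is narrowed even when the data layout says nothing about i8:

define i8 @narrow_and(i8 %x, i8 %y) {
  %x32 = zext i8 %x to i32
  %y32 = zext i8 %y to i32
  %a = and i32 %x32, %y32
  %tr = trunc i32 %a to i8
  ret i8 %tr
}

; expected to reduce under -instcombine to:
;   %a = and i8 %x, %y
;   ret i8 %a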