Displaying 20 results from an estimated 4000 matches similar to: "Pointer comparison folding"
2015 Jan 15
4
[LLVMdev] confusion w.r.t. scalar evolution and nuw
I've been doing some digging in this area (scev, wrapping arithmetic),
learning as much as I can, and have reached a point where I'm fairly
confused about the semantics of nuw in scalar evolution expressions.
Consider the following program:
define void @foo(i32 %begin) {
entry:
  br label %loop
loop:
  %idx = phi i32 [ %begin, %entry ], [ %idx.dec, %loop ]
  %idx.dec = sub nuw i32
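A self-contained loop of the same shape (a sketch, not necessarily the original test case), counting down with a 'sub nuw' decrement:

define void @foo_sketch(i32 %begin) {
entry:
  br label %loop
loop:
  ; assumes %begin > 0, so the claim that the decrement never wraps holds
  %idx = phi i32 [ %begin, %entry ], [ %idx.dec, %loop ]
  %idx.dec = sub nuw i32 %idx, 1
  %done = icmp eq i32 %idx.dec, 0
  br i1 %done, label %exit, label %loop
exit:
  ret void
}

The confusion described above is about what SCEV may legitimately infer from the nuw flag on such a decrement.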
2015 Jan 15
2
[LLVMdev] confusion w.r.t. scalar evolution and nuw
> We are permitted to turn 'sub nsw i32 %x, 1' into 'add nsw i32 %x, -1'
nsw also has the same problem:
sub nsw int_min, int_min is 0, but
add nsw int_min, (-int_min) is poison: negating int_min wraps back to int_min, so the add overflows.
-- Sanjoy
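The same counterexample written out as IR (a sketch; -2147483648 is INT_MIN for i32):

define i32 @sub_form(i32 %x) {
  ; for %x = INT_MIN this is INT_MIN - INT_MIN = 0, with no signed overflow
  %r = sub nsw i32 %x, -2147483648
  ret i32 %r
}

define i32 @add_form(i32 %x) {
  ; -(-2147483648) wraps back to -2147483648, so for %x = INT_MIN this is
  ; INT_MIN + INT_MIN, which overflows and therefore yields poison
  %r = add nsw i32 %x, -2147483648
  ret i32 %r
}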
2014 May 13
2
[LLVMdev] Missed optimization opportunity in 3-way integer comparison case
While looking at what LLVM emits for this testcase, I noticed that
there is one redundant operation in the resulting assembly. The second 'cmp'
operation there is essentially identical to the first one, just with the
order of its arguments reversed, so it is not needed.
This testcase is a simple integer comparison routine, similar to what
qsort would take to sort an integer array.
I think
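A comparator of that shape (a sketch; the full original test case is not shown here):

define i32 @cmp3(i32 %a, i32 %b) {
entry:
  ; returns -1 if %a < %b, 1 if %a > %b, 0 if equal
  %lt = icmp slt i32 %a, %b
  %gt = icmp sgt i32 %a, %b
  %pos = select i1 %gt, i32 1, i32 0
  %res = select i1 %lt, i32 -1, i32 %pos
  ret i32 %res
}

The complaint above is that the backend compares %a and %b twice, once with the operands swapped, where a single compare whose flags are reused would do.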
2015 Jun 09
2
[LLVMdev] Constant folding inttoptr i32 0 to null pointer?
Hello,
It seems that ConstantFoldCastInstruction in ConstantFold.cpp folds an inttoptr instruction with 0 as its operand to a null pointer. That makes sense for a C-style frontend, as the C99 spec (6.3.2.3) states:
"An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant."
On the other hand, some architectures
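A minimal form of the fold in question (a sketch, using a non-default address space to make the concern concrete):

define i32 addrspace(1)* @zero_addr() {
  ; the fold discussed above turns this into 'ret i32 addrspace(1)* null',
  ; even though address 0 may be a perfectly valid location in addrspace(1)
  %p = inttoptr i32 0 to i32 addrspace(1)*
  ret i32 addrspace(1)* %p
}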
2015 Jun 09
4
[LLVMdev] Constant folding inttoptr i32 0 to null pointer?
Thanks David,
It turns out that the address space I was using was not 0, and yet the pointer was still constant-folded to null.
Here is the sequence:
Unoptimized code:
define i32 @foo() #0 {
entry:
  %address.addr.i = alloca i32, align 4
  %value.i = alloca i32, align 4
  store i32 0, i32* %address.addr.i, align 4
  %0 = load i32* %address.addr.i, align 4
  %1 = inttoptr i32 %0 to i32 addrspace(1)*
2015 Jun 09
2
[LLVMdev] Constant folding inttoptr i32 0 to null pointer?
On Tue, Jun 9, 2015 at 12:32 PM, David Majnemer <david.majnemer at gmail.com>
wrote:
> 'load volatile i32 addrspace(1)* null' seems fine to me. However, it
> looks like instcombine will turn:
> define i32 @foo() {
> entry:
> %std_ld.i = load volatile i32, i32 addrspace(1)* null
> ret i32 %std_ld.i
> }
>
> into:
> define i32 @foo() {
> entry:
2010 Mar 28
2
[LLVMdev] Which floating-point comparison?
I notice llvm provides both ordered and unordered variants of
floating-point comparison. Which of these is the right one to use by
default? I suppose the two criteria would be, in order of importance:
1. Which is more efficient (more directly maps to typical hardware)?
2. Which is more familiar (more like the way C and Fortran do it)?
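The difference, written out in IR (a sketch): ordered predicates are false if either operand is NaN, unordered predicates are true; C's '<' has the ordered semantics.

define i1 @less_ordered(float %a, float %b) {
  ; false when %a or %b is NaN
  %c = fcmp olt float %a, %b
  ret i1 %c
}

define i1 @less_unordered(float %a, float %b) {
  ; true when %a or %b is NaN
  %c = fcmp ult float %a, %b
  ret i1 %c
}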
2010 Mar 28
0
[LLVMdev] Which floating-point comparison?
On Sun, Mar 28, 2010 at 7:45 AM, Russell Wallace
<russell.wallace at gmail.com> wrote:
> I notice llvm provides both ordered and unordered variants of
> floating-point comparison. Which of these is the right one to use by
> default? I suppose the two criteria would be, in order of importance:
>
> 1. Which is more efficient (more directly maps to typical hardware)?
You can
2015 Oct 19
1
R 3.2.2 - make check and install package hang
Below is the output. Thanks for the help.
> Sys.getenv()
BASH_FUNC_module() () { eval
`/cm/local/apps/environment-modules/3.2.10/Modules/$MODULE_VERSION/bin/modulecmd
bash $*` }
COLUMNS 152
CPATH /cm/shared/apps/uge/8.2.1/include
CVS_RSH ssh
DISPLAY localhost:10.0
EDITOR
2015 Oct 17
3
R 3.2.2 - make check and install package hang
Hello Everyone,
After trying several ways to compile R 3.2.2 without luck, I'm reaching out for help.
The 'make check' hangs for some reason, and when
trying to install a package it cannot list the download sites (see below).
What could be the problem?
./configure --enable-R-shlib --enable-BLAS-shlib
hostname = test
uname -m = x86_64
uname -r = 2.6.32-573.7.1.el6.x86_64
uname -s =
2011 Apr 01
1
[LLVMdev] signed/unsigned integers ?
> there is no such information. You can still consider every type to have values
> in, say, T = [-2^31; 2^31-1]. Probably you are trying to deduce an interval of
> possible values for each register. You will need to allow intervals to wrap
> around the end of T since (eg) the basic "add" operator in LLVM uses modulo
> arithmetic, i.e. if you add 1 to 2^31-1 you get
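A small illustration of that point (a sketch): a plain 'add' carries no signedness and simply wraps modulo 2^32, so an interval analysis has to allow intervals that wrap around the end of T.

define i32 @wraps() {
  ; 2^31 - 1 plus 1 wraps to -2^31; without an nsw flag this is not poison
  %r = add i32 2147483647, 1
  ret i32 %r
}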
2013 Apr 23
2
[LLVMdev] 'loop invariant code motion' and 'Reassociate Expression'
Hi,
I am investigating a performance degradation between llvm-3.1 and llvm-3.2
(Note: current top-of-tree shows a similar degradation)
One issue I see is the following:
- 'loop invariant code motion' seems to depend on the result of the 'reassociate expression' pass:
In the samples below I observe the following behavior:
Both start with the same expression:
%add = add
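The dependence can be illustrated with a sketch (hypothetical, not the poster's actual test case): LICM can only hoist the loop-invariant part of the sum after Reassociate has grouped the two invariant operands together.

define void @licm_sketch(i32 %a, i32 %b, i32 %n, i32* %out) {
entry:
  br label %loop
loop:
  %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
  ; written as (%a + %i) + %b; reassociating to (%a + %b) + %i exposes
  ; the invariant subexpression %a + %b, which LICM can then hoist
  %t1 = add i32 %a, %i
  %sum = add i32 %t1, %b
  %p = getelementptr inbounds i32, i32* %out, i32 %i
  store i32 %sum, i32* %p
  %i.next = add nuw nsw i32 %i, 1
  %cmp = icmp slt i32 %i.next, %n
  br i1 %cmp, label %loop, label %exit
exit:
  ret void
}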
2015 May 06
1
Intel NUC haswell-ULT
I have one of those new little NUCs and installed CentOS 7.1 on it.
lspci shows
00:00.0 Host bridge: Intel Corporation Haswell-ULT DRAM Controller (rev 09)
00:02.0 VGA compatible controller: Intel Corporation Haswell-ULT Integrated
Graphics Controller (rev 09)
00:03.0 Audio device: Intel Corporation Haswell-ULT HD Audio Controller
(rev 09)
00:14.0 USB controller: Intel Corporation 8 Series
2015 May 06
2
[LLVMdev] [LoopVectorizer] Missed vectorization opportunities caused by sext/zext operations
For
void test0(unsigned short a, unsigned short *in, unsigned short *out) {
  for (unsigned short w = 1; w < a - 1; w++) // this will never overflow
    out[w] = in[w+7] * 2;
}
I think it will be sufficient to add a couple of new cases to
ScalarEvolution::HowManyLessThans --
zext(A) ult zext(B) == A ult B
sext(A) slt sext(B) == A slt B
Currently it bails out if it sees a non-add
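The first of those equivalences, written out as IR (a sketch): the widened compare can be done in the narrow type because zero-extension preserves unsigned order.

define i1 @wide(i16 %a, i16 %b) {
  %a32 = zext i16 %a to i32
  %b32 = zext i16 %b to i32
  %c = icmp ult i32 %a32, %b32
  ret i1 %c
}

define i1 @narrow(i16 %a, i16 %b) {
  ; computes exactly the same result as @wide for all inputs
  %c = icmp ult i16 %a, %b
  ret i1 %c
}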
2013 Apr 25
2
[LLVMdev] 'loop invariant code motion' and 'Reassociate Expression'
It's an interesting problem.
The best stuff I've seen published is by Cooper, Eckhart, & Kennedy, in
PACT '08.
Cooper gives a nice intro in one of his lectures:
http://www.cs.rice.edu/~keith/512/2012/Lectures/26ReassocII-1up.pdf
I can't tell, quickly, what's going on in Reassociate;
as usual, the documentation resolutely avoids giving any credit for the
ideas.
Why is that?
2018 Jan 17
3
always allow canonicalizing to 8- and 16-bit ops?
Example:
define i8 @narrow_add(i8 %x, i8 %y) {
  %x32 = zext i8 %x to i32
  %y32 = zext i8 %y to i32
  %add = add nsw i32 %x32, %y32
  %tr = trunc i32 %add to i8
  ret i8 %tr
}
With no data-layout or with an x86 target where 8-bit integer is in the
data-layout, we reduce to:
$ ./opt -instcombine narrowadd.ll -S
define i8 @narrow_add(i8 %x, i8 %y) {
  %add = add i8 %x, %y
  ret i8 %add
}
But on
2012 Nov 26
2
[LLVMdev] RFC: change BoundsChecking.cpp to use address-based tests
I am investigating changing BoundsChecking to use address-based rather
than size- & offset-based tests.
To explain, here is a short code sample cribbed from one of the tests:
%mem = tail call i8* @calloc(i64 1, i64 %elements)
%memobj = bitcast i8* %mem to i64*
%ptr = getelementptr inbounds i64* %memobj, i64 %index
%4 = load i64* %ptr, align 8
Currently, the IR for bounds checking
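A sketch of what an address-based test for that load could look like (hypothetical; this is not the actual BoundsChecking output, and the current size/offset form is truncated above):

define i64 @checked_load(i8* %mem, i64 %size, i64* %ptr) {
entry:
  ; trap if %ptr is below the object or %ptr + 8 runs past its end
  %begin    = ptrtoint i8* %mem to i64
  %addr     = ptrtoint i64* %ptr to i64
  %end      = add i64 %begin, %size
  %addr.end = add i64 %addr, 8
  %below    = icmp ult i64 %addr, %begin
  %above    = icmp ugt i64 %addr.end, %end
  %oob      = or i1 %below, %above
  br i1 %oob, label %trap, label %ok
trap:
  call void @llvm.trap()
  unreachable
ok:
  %v = load i64, i64* %ptr, align 8
  ret i64 %v
}

declare void @llvm.trap()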
2013 Apr 23
0
[LLVMdev] 'loop invariant code motion' and 'Reassociate Expression'
As far as I can understand from the code, Reassociate tries to achieve
this result via its "ranking" mechanism.
If it does not, it is not hard to achieve this result: just restructure
the expression so that
the earlier-defined sub-expression is permuted earlier in the
resulting expression.
e.g.
outer-loop1
x=
outer-loop2
y =
2015 Jan 13
2
[LLVMdev] Floating-point range checks
After writing a simple FPRange, I've hit a stumbling block. I don't know what LLVM code should be extended to use it. I was initially thinking of extending LazyValueInfo, but it appears to be used for passes that don’t address the case that I need for Julia. I’m now wondering if I’m better off extending SimplifyFCmpInst to handle the few cases in question instead of trying to be more
2018 Jan 22
2
always allow canonicalizing to 8- and 16-bit ops?
Thanks for the perf testing. I assume that DAG legalization is equipped to
handle these cases fairly well, or someone would've complained by now...
FWIW (and at least some of this can be blamed on me), instcombine already
does the narrowing transforms without checking shouldChangeType() for
binops like and/or/xor/udiv. The justification was that narrower ops are
always better for
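For comparison, the same pattern with one of those bitwise ops (a sketch, not taken from the thread) is narrowed even when the data layout says nothing about i8:

define i8 @narrow_and(i8 %x, i8 %y) {
  %x32 = zext i8 %x to i32
  %y32 = zext i8 %y to i32
  %a = and i32 %x32, %y32
  %tr = trunc i32 %a to i8
  ret i8 %tr
}

; expected to reduce under -instcombine to:
;   %a = and i8 %x, %y
;   ret i8 %a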