thr3ads.net - llvm dev - [llvm-dev] Reducing the number of ptrtoint/inttoptrs that are generated by LLVM [Jan 2019]

Juneyoung Lee via llvm-dev

2019-Jan-14 11:23 UTC

[llvm-dev] Reducing the number of ptrtoint/inttoptrs that are generated by LLVM

Hello all,

This is a proposal for reducing the number of ptrtoint/inttoptr casts that are
not written by programmers but are instead generated by LLVM passes.
Currently the majority of ptrtoint/inttoptr casts are generated by LLVM:
when compiling SPEC 2017 with LLVM r348082 (Dec 2 2018) at -O3,
the output IR contains 22,771 inttoptr instructions. However, when
compiling it at -O0, there are only 1,048 inttoptrs, meaning that 95.4%
of them are generated by LLVM passes.

The trend is similar for the ptrtoint instruction. When compiling SPEC 2017
at -O0, there are 23,208 ptrtoint instructions, and 22,016 of them (94.8%)
are generated by the Clang frontend to represent pointer subtraction.
They aren't effectively optimized out, because there are even more ptrtoints
(31,721) after -O3.
This is bad for performance because the existence of a ptrtoint makes analyses
return conservative results, as a pointer can escape through the cast.
A memory access through a pointer that came from an inttoptr is assumed
to possibly access anywhere, so it may block
store-to-load forwarding, merging of two identical loads, etc.

I believe this can be addressed by applying two patches: the first one
represents pointer subtraction with a dedicated intrinsic function,
llvm.psub, and the second one disables the following InstCombine transformation:

    %q = load i8*, i8** %p1
    store i8* %q, i8** %p2
  =>
    %1 = bitcast i8** %p1 to i64*
    %q1 = load i64, i64* %1, align 8
    %2 = bitcast i8** %p2 to i64*
    store i64 %q1, i64* %2, align 8

This transformation can introduce inttoptrs later on if further loads follow
(https://godbolt.org/z/wsZ3II). Both are discussed in
https://bugs.llvm.org/show_bug.cgi?id=39846 as well.
After llvm.psub is used and this transformation is disabled, the number of
inttoptrs decreases from 22,771 to 1,565 (6.9% of the original count), and the
number of ptrtoints decreases from 31,721 to 7,772 (24.5%).
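
As a quick sanity check of the percentages quoted above, the following sketch recomputes them from the raw instruction counts. The counts are the ones reported in this message; the constant names and the helper `share` are only illustrative.

```python
# Instruction counts reported in this message (SPEC 2017, LLVM r348082).
INTTOPTR_O0 = 1_048
INTTOPTR_O3 = 22_771
INTTOPTR_O3_PATCHED = 1_565   # with llvm.psub and the InstCombine change
PTRTOINT_O3 = 31_721
PTRTOINT_O3_PATCHED = 7_772


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Share of the -O3 inttoptrs that exceeds the count already present at -O0.
generated_by_passes = share(INTTOPTR_O3 - INTTOPTR_O0, INTTOPTR_O3)
# Share of the casts that remains once both patches are applied.
inttoptr_remaining = share(INTTOPTR_O3_PATCHED, INTTOPTR_O3)
ptrtoint_remaining = share(PTRTOINT_O3_PATCHED, PTRTOINT_O3)

print(generated_by_passes, inttoptr_remaining, ptrtoint_remaining)
```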

I'll introduce the llvm.psub patch first.


--- Adding llvm.psub ---

By defining a pointer subtraction intrinsic, we can gain performance
because it has more undefined behavior than simply subtracting two
ptrtoints.

Patch https://reviews.llvm.org/D56598 adds the llvm.psub(p1, p2) intrinsic
function, which subtracts two pointers and returns the difference. Its
semantics are as follows.
If p1 and p2 point to different objects, and neither of them is based on a
pointer cast from an integer, `llvm.psub(p1, p2)` returns poison. For
example,

%p = alloca
%q = alloca
%i = llvm.psub(%p, %q) ; %i is poison

This allows aggressive escape analysis on pointers. Given i = llvm.psub(p1,
p2), if neither p1 nor p2 is based on a pointer cast from an integer,
the llvm.psub call does not make p1 or p2 escape
(https://reviews.llvm.org/D56601).

If either p1 or p2 is based on a pointer cast from an integer, or p1 and p2
point to the same object, it returns the result of the subtraction (in bytes);
for example,

%p = alloca
%q = inttoptr %x
%i = llvm.psub(%p, %q) ; %i is equivalent to (ptrtoint %p) - %x

`null` is regarded as a pointer casted from an integer because
it is equivalent to `inttoptr 0`.
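
The rules above can be modelled with a small Python sketch. It only illustrates the semantics described in this message and is not the implementation in D56598: the `Ptr` class, its `obj` provenance field and the `POISON` sentinel are assumptions made up for the example, and "based on" is simplified to a single tag, with `obj=None` standing for a pointer that is based on an integer cast (including `null`).

```python
from dataclasses import dataclass
from typing import Optional, Union


class _Poison:
    """Sentinel standing in for LLVM's poison value (illustrative only)."""

    def __repr__(self) -> str:
        return "poison"


POISON = _Poison()


@dataclass(frozen=True)
class Ptr:
    addr: int            # numeric address
    obj: Optional[str]   # allocation the pointer is based on; None = cast from an integer


NULL = Ptr(addr=0, obj=None)  # treated like `inttoptr 0`


def psub(p1: Ptr, p2: Ptr) -> Union[int, _Poison]:
    """Model of llvm.psub(p1, p2) as described in the text."""
    based_on_int = p1.obj is None or p2.obj is None
    same_object = p1.obj is not None and p1.obj == p2.obj
    if based_on_int or same_object:
        return p1.addr - p2.addr  # plain byte difference
    return POISON                 # different objects, no integer provenance


p = Ptr(addr=0x1000, obj="alloca_p")
q = Ptr(addr=0x2000, obj="alloca_q")
print(psub(p, q))                                    # two different allocas
print(psub(Ptr(addr=0x1008, obj="alloca_p"), p))     # same object
print(psub(p, Ptr(addr=0x20, obj=None)))             # one side cast from an integer
print(psub(p, NULL))                                 # the `p - NULL` pattern
```

In this model only the first call yields the poison sentinel; the other three return an ordinary byte difference.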

Adding llvm.psub allows LLVM to eliminate a significant portion of ptrtoints
and to reduce the number of inttoptrs. After llvm.psub is used, when SPECrate
2017 is compiled with -O3, the number of inttoptrs decreases to ~13,500 (59% of
the original count) and the number of ptrtoints decreases to ~14,300 (45%).

To see the performance change, I ran SPECrate 2017 (number of threads = 1) with
three versions of LLVM: r313797 (Sep 21, 2017), LLVM 6.0
official, and r348082 (Dec 2, 2018).
Running r313797 shows that 505.mcf_r has a consistent 2.0% speedup across three
different machines (i3-6100, i5-6600, i7-7700). For LLVM 6.0 and
r348082, there is neither a consistent speedup nor a slowdown, and the average
speedup is near 0. I believe there is still room for improvement because
there are passes which are not aware of llvm.psub.

Thank you for reading this, and any comment is welcome.

Best Regards,
Juneyoung Lee

Chandler Carruth via llvm-dev

2019-Jan-14 17:36 UTC

While I'm very interested in the end result here, I have some questions
that don't seem well answered yet around pointer subtraction...

First and foremost - how do you address correctness issues here? Because
the subtraction `A - B` can escape/capture more things. Specifically, if
one of `A` or `B` is escaped/captured, the subtraction can be used to
escape or capture the other pointer. So *some* of the conservative
treatment is necessary. What is the plan to update all the analyses to
remain correct? What correctness testing have you done?
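
To make the concern concrete: if the address of `A` is known and the difference `A - B` is available as an integer, the address of `B` follows by plain arithmetic. A minimal sketch, using made-up addresses and a hypothetical helper name, and assuming 64-bit wrap-around:

```python
MASK64 = (1 << 64) - 1  # assumption: 64-bit pointers


def recover_other(a_addr: int, diff: int) -> int:
    """Given A's address and diff = A - B, reconstruct B's address."""
    return (a_addr - diff) & MASK64


a_addr = 0x7FFF_0000_1000   # address of the already-escaped pointer A (made up)
b_addr = 0x7FFF_0000_0F00   # address of B, which was never escaped directly
diff = (a_addr - b_addr) & MASK64  # the integer result of `A - B`

# Anyone holding A and the difference can compute B.
assert recover_other(a_addr, diff) == b_addr
```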

Second - an intrinsic seems a poor fit here given the significance of this
operation. We have an instruction that covers most pointer arithmetic
(`getelementptr`), and I can imagine growing pointer subtraction, but it
seems like it should be an instruction if we're going to have it. Based on
the above, we will need to use it very often in analysis.


Regarding the instcombine, it should be very easy to keep loads and stores
of pointers as pointer typed in instcombine. Likely just a missing case in
the code I added/touched there.


Juneyoung Lee via llvm-dev

2019-Jan-14 20:55 UTC

Hello Chandler,

> First and foremost - how do you address correctness issues here? Because
> the subtraction `A - B` can escape/capture more things. Specifically, if
> one of `A` or `B` is escaped/captured, the subtraction can be used to
> escape or capture the other pointer. So *some* of the conservative
> treatment is necessary. What is the plan to update all the analyses to
> remain correct? What correctness testing have you done?

The correctness of psub is guaranteed by the specification of pointer
subtraction in C/C++.
When two pointers are subtracted, both shall point to elements of the same
array object, or one past the last element of the array object (C11 6.5.6,
paragraph 9).
So, if the two pointers p and q point to different objects, we can define
llvm.psub(p,q) as poison.
Besides conforming to the C specification, the correctness of llvm.psub has
also been tested with SPEC CPU2017 and the LLVM Nightly Tests.

But it is true that pointer subtraction is sometimes used to get the distance
between two objects.
The most common case is something like 'p - NULL', and this pattern
exists in SPEC CPU2017, for example in spec_qsort.c in mcf_r.
Our suggestion is to define 'p - q' to correctly return the distance
between p and q if either p or q is based on inttoptr(i). This naturally
explains 'p - NULL', because NULL is equivalent to inttoptr(0).

Regarding analysis - what I've observed is that the analysis was done after
the pointer subtraction had been optimized into another form.
For example, given '(p - q) == 0', this is transformed into
'p == q', and then some analysis is done.
The good thing is that these transformations can simply be applied to llvm.psub
as well, which will re-enable the analysis.
Also, we're adding a new operation here, so existing analyses wouldn't become
incorrect; they just wouldn't fire.
Fortunately, the performance impact of switching to llvm.psub wasn't big.
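
The rewrite mentioned above, from '(p - q) == 0' to 'p == q', is sound for wrapping fixed-width integers. The sketch below checks this exhaustively for an 8-bit width, chosen only to keep the loop small; it is a plain integer model and says nothing about poison or pointer provenance.

```python
BITS = 8                 # small width so the exhaustive check is cheap
MOD = 1 << BITS


def sub_is_zero(p: int, q: int) -> bool:
    """Evaluate `(p - q) == 0` with wrap-around subtraction."""
    return (p - q) % MOD == 0


def rewrite_holds() -> bool:
    """Check that `(p - q) == 0` agrees with `p == q` for every pair of values."""
    return all(
        sub_is_zero(p, q) == (p == q)
        for p in range(MOD)
        for q in range(MOD)
    )


print(rewrite_holds())
```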

> Second - an intrinsic seems a poor fit here given the significance of
> this operation. We have an instruction that covers most pointer arithmetic
> (`getelementptr`), and I can imagine growing pointer subtraction, but it
> seems like it should be an instruction if we're going to have it. Based on
> the above, we will need to use it very often in analysis.

I also think that defining psub as an instruction is fine. :)

> Regarding the instcombine, it should be very easy to keep loads and stores
> of pointers as pointer typed in instcombine. Likely just a missing case in
> the code I added/touched there.

That's really good. :) I found that combineLoadToOperationType in
InstCombineLoadStoreAlloca was responsible for the transformation.
I can upload a patch for that if that's OK.

Best Regards,
Juneyoung Lee

-- 

Juneyoung Lee
Software Foundation Lab, Seoul National University

Kaylor, Andrew via llvm-dev

2019-Jan-14 21:58 UTC

My reply below seems to have accidentally dropped the mailing list, so I’m
resending.

From: Kaylor, Andrew
Sent: Monday, January 14, 2019 1:40 PM
To: Juneyoung Lee <juneyoung.lee at sf.snu.ac.kr>; Chandler Carruth <chandlerc at gmail.com>
Cc: Zhengyang Liu <liuz at cs.utah.edu>; Ralf Jung <jung at mpi-sws.org>; John Regehr <regehr at cs.utah.edu>; Nuno Lopes <nlopes at microsoft.com>
Subject: RE: [llvm-dev] Reducing the number of ptrtoint/inttoptrs that are generated by LLVM

We recently came across a related issue where the InstCombine change that
introduces ptrtoint can block mem2reg promotions. We did indeed track it back to
a change Chandler made that specifically introduced this as a
canonicalization. The change was r226781. Here’s the commit message:

---

[canonicalize] Teach InstCombine to canonicalize loads which are only
ever stored to always use a legal integer type if one is available.

Regardless of whether this particular type is good or bad, it ensures we
don't get weird differences in generated code (and resulting
performance) from "equivalent" patterns that happen to end up using
a slightly different type.

After some discussion on llvmdev it seems everyone generally likes this
canonicalization. However, there may be some parts of LLVM that handle
it poorly and need to be fixed. I have at least verified that this
doesn't impede GVN and instcombine's store-to-load forwarding powers in
any obvious cases. Subtle cases are exactly what we need to flush out if
they remain.

Also note that this IR pattern should already be hitting LLVM from Clang
at least because it is exactly the IR which would be produced if you
used memcpy to copy a pointer or floating point between memory instead
of a variable.

---

Based on the third paragraph there, we have been exploring the possibility of
teaching mem2reg to handle the ptrtoint cast rather than removing the
canonicalization. Personally, I would be happy to see pointers loaded and stored
directly without the canonicalization if that doesn’t break other optimizations.

I don’t know if the mem2reg fix we’re working on would take care of Juneyoung’s
store-to-load forwarding issues.

-Andy


Chandler Carruth via llvm-dev

2019-Jan-14 22:48 UTC

FWIW, my commit message wasn't really intended to necessarily apply to
ptrtoint / inttoptr.

I don't think I did any careful study there. I was looking at bitcast
differences, and mostly between floating point types or
first-class-aggregate types and an integer.

Given that pointers have a completely different approach to casting in the
first place, I could easily imagine (but really haven't checked) that they
should be handled differently here. I don't think either side is trivially
true, and it would need some investigation to understand the tradeoff
between canonicalizing here vs. handling the casts in mem2reg and company.

An example: what happens if we memcpy a pointer type? Or a struct
containing a pointer type? Do frontends in the wild end up losing
the "pointerness" anyways? If so, maybe the optimization passes need to
cope regardless of what instcombine does, and so instcombine might as well
keep canonicalizing to integers. This was the case (but with floating point
types) that motivated the original line of work.

On Mon, Jan 14, 2019 at 1:58 PM Kaylor, Andrew <andrew.kaylor at
intel.com>
wrote:
> My reply below seems to have accidentally dropped the mailing list, so I’m
> resending.
>
>
>
> *From:* Kaylor, Andrew
> *Sent:* Monday, January 14, 2019 1:40 PM
> *To:* Juneyoung Lee <juneyoung.lee at sf.snu.ac.kr>; Chandler Carruth
<
> chandlerc at gmail.com>
> *Cc:* Zhengyang Liu <liuz at cs.utah.edu>; Ralf Jung <jung at
mpi-sws.org>;
> John Regehr <regehr at cs.utah.edu>; Nuno Lopes <nlopes at
microsoft.com>
> *Subject:* RE: [llvm-dev] Reducing the number of ptrtoint/inttoptrs that
> are generated by LLVM
>
>
>
> We recently came across a related issue where the InstCombine change that
> introduces ptrtoint can block mem2reg promotions. We did indeed track it
> back to a change Chandler made that was specifically introducing this as a
> canonicalization. The change was r226781. Here’s the commit message:
>
> ---
>
> [canonicalize] Teach InstCombine to canonicalize loads which are only
> ever stored to always use a legal integer type if one is available.
>
> Regardless of whether this particular type is good or bad, it ensures we
> don't get weird differences in generated code (and resulting
> performance) from "equivalent" patterns that happen to end up using
> a slightly different type.
>
> After some discussion on llvmdev it seems everyone generally likes this
> canonicalization. However, there may be some parts of LLVM that handle
> it poorly and need to be fixed. I have at least verified that this
> doesn't impede GVN and instcombine's store-to-load forwarding powers in
> any obvious cases. Subtle cases are exactly what we need to flush out if
> they remain.
>
> Also note that this IR pattern should already be hitting LLVM from Clang
> at least because it is exactly the IR which would be produced if you
> used memcpy to copy a pointer or floating point between memory instead
> of a variable.
>
> ---
>
> Based on the third paragraph there, we have been exploring the possibility
> of teaching mem2reg to handle the ptrtoint cast rather than removing the
> canonicalization. Personally, I would be happy to see pointers loaded and
> stored directly without the canonicalization if that doesn’t break other
> optimizations.
>
>
>
> I don’t know if the mem2reg fix we’re working on would take care of
> Juneyoung’s store-to-load forwarding issues.
>
>
>
> -Andy
>
>
>
> *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org> *On Behalf Of* Juneyoung
> Lee via llvm-dev
> *Sent:* Monday, January 14, 2019 12:56 PM
> *To:* Chandler Carruth <chandlerc at gmail.com>
> *Cc:* llvm-dev <llvm-dev at lists.llvm.org>; Zhengyang Liu <liuz at cs.utah.edu>;
> Ralf Jung <jung at mpi-sws.org>; John Regehr <regehr at cs.utah.edu>; Nuno
> Lopes <nlopes at microsoft.com>
> *Subject:* Re: [llvm-dev] Reducing the number of ptrtoint/inttoptrs that
> are generated by LLVM
>
> Hello Chandler,
>
>
>
> > First and foremost - how do you address correctness issues here? Because
> > the subtraction `A - B` can escape/capture more things. Specifically, if
> > one of `A` or `B` is escaped/captured, the subtraction can be used to
> > escape or capture the other pointer. So *some* of the conservative
> > treatment is necessary. What is the plan to update all the analyses to
> > remain correct? What correctness testing have you done?
>
>
>
> The correctness of psub follows from the C/C++ specification of pointer
> subtraction.
>
> When two pointers are subtracted, both shall point to elements of the same
> array object, or one past the last element of the array object (C standard,
> 6.5.6 paragraph 9).
>
> So, if the two pointers p and q point to different objects, we can define
> llvm.psub(p,q) as poison.
>
> Beyond conforming to the C specification, llvm.psub has also been tested
> with SPEC CPU2017 and the LLVM Nightly Tests.
>
>
>
> But it is true that pointer subtraction is sometimes used to get the
> distance between two objects.
>
> The most common case is something like 'p - NULL', and this pattern
> exists in SPEC CPU2017, for example in spec_qsort.c in mcf_r.
>
> Our suggestion is to define 'p - q' to correctly return the distance between
> p and q if either p or q is based on inttoptr(i). This naturally explains
> 'p - NULL' because NULL is equivalent to inttoptr(0).
>
>
>
> Regarding analysis - what I've observed is that analysis was done after
> pointer subtraction was optimized into another form.
>
> For example, if '(p - q) == 0' was given, this is transformed into
> 'p == q', and then some analysis was done.
>
> The good thing is that these transformations can simply be applied to
> llvm.psub as well, which will re-enable the analysis.
>
> Also, we're adding a new operation here, so existing analyses wouldn't
> become incorrect; they just wouldn't fire.
>
> Fortunately, the performance impact of switching to llvm.psub wasn't big.
>
>
>
> > Second - an intrinsic seems a poor fit here given the significance of
> > this operation. We have an instruction that covers most pointer arithmetic
> > (`getelementptr`), and I can imagine growing pointer subtraction, but it
> > seems like it should be an instruction if we're going to have it. Based on
> > the above, we will need to use it very often in analysis.
>
>
>
> I also think that defining psub as an instruction is fine. :)
>
> > Regarding the instcombine, it should be very easy to keep loads and
> > stores of pointers as pointer typed in instcombine. Likely just a missing
> > case in the code I added/touched there.
>
> That's really good. :) I found that combineLoadToOperationType in
> InstCombineLoadStoreAlloca is responsible for the transformation.
>
> I can upload a patch for that if that's OK.
>
>
>
> Best Regards,
>
> Juneyoung Lee
>
>
>

Mehdi AMINI via llvm-dev

2019-Jan-14 23:59 UTC


On Mon, Jan 14, 2019 at 9:36 AM Chandler Carruth via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> While I'm very interested in the end result here, I have some questions
> that don't seem well answered yet around pointer subtraction...
>
> First and foremost - how do you address correctness issues here? Because
> the subtraction `A - B` can escape/capture more things. Specifically, if
> one of `A` or `B` is escaped/captured, the subtraction can be used to
> escape or capture the other pointer.
>
Isn't escaping supposed to work at the "address ranges" level and not at
the pointer value?
I mean that if `A` or `B` is escaped/captured, then any pointer that is
associated with the same memory range should be considered "escaped", and
thus the subtraction does not seem to leak anything more to me.

-- 
Mehdi



> So *some* of the conservative treatment is necessary. What is the plan to
> update all the analyses to remain correct? What correctness testing have
> you done?
>
> Second - an intrinsic seems a poor fit here given the significance of this
> operation. We have an instruction that covers most pointer arithmetic
> (`getelementptr`), and I can imagine growing pointer subtraction, but it
> seems like it should be an instruction if we're going to have it. Based on
> the above, we will need to use it very often in analysis.
>
>
> Regarding the instcombine, it should be very easy to keep loads and stores
> of pointers as pointer typed in instcombine. Likely just a missing case in
> the code I added/touched there.
>
> On Mon, Jan 14, 2019 at 3:23 AM Juneyoung Lee via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hello all,
>>
>> This is a proposal for reducing # of ptrtoint/inttoptr casts which are not
>> written by programmers but rather generated by LLVM passes.
>> Currently the majority of ptrtoint/inttoptr casts are generated by LLVM;
>> when compiling SPEC 2017 with LLVM r348082 (Dec 2 2018) with -O3,
>> the output IR contains 22,771 inttoptr instructions. However, when
>> compiling it with -O0, there are only 1048 inttoptrs, meaning that 95.4%
>> of them are generated by LLVM passes.
>>
>> This trend is similar in ptrtoint instruction as well. When compiling
>> SPEC 2017
>> with -O0, there are 23,208 ptrtoint instructions, but among them 22,016
>> (94.8%)
>> are generated by Clang frontend to represent pointer subtraction.
>> They aren't effectively optimized out because there are even more
>> ptrtoints (31,721) after -O3.
>> This is bad for performance because existence of ptrtoint makes analysis
>> return conservative result as a pointer can be escaped through the cast.
>> Memory accesses to a pointer came from inttoptr is assumed
>> to possibly access anywhere, therefore it may block
>> store-to-load forwarding, merging two same loads, etc.
>>
>> I believe this can be addressed by applying two patches - first one is
>> representing pointer subtraction with a dedicated intrinsic function,
>> llvm.psub, and second one is disabling InstCombine transformation
>>
>>     %q = load i8*, i8** %p1
>>     store i8* %q, i8** %p2
>> =>
>>   %1 = bitcast i8** %p1 to i64*
>>   %q1 = load i64, i64* %1, align 8
>>   %2 = bitcast i8** %p2 to i64*
>>   store i64 %q1, i64* %2, align 8
>>
>> This transformation can introduce inttoptrs later if loads are followed (
>> https://godbolt.org/z/wsZ3II ). Both are discussed in
>> https://bugs.llvm.org/show_bug.cgi?id=39846 as well.
>> After llvm.psub is used & this transformation is disabled, # of inttoptrs
>> decreases from 22,771 to 1,565 (6.9%), and # of ptrtoints decreases from
>> 31,721 to 7,772 (24.5%).
>>
>> I'll introduce llvm.psub patch first.
>>
>>
>> --- Adding llvm.psub ---
>>
>> By defining pointer subtraction intrinsic, we can get performance gain
>> because it gives more undefined behavior than just subtracting two
>> ptrtoints.
>>
>> Patch https://reviews.llvm.org/D56598 adds llvm.psub(p1,p2) intrinsic
>> function, which subtracts two pointers and returns the difference. Its
>> semantic is as follows.
>> If p1 and p2 point to different objects, and neither of them is based on
>> a pointer casted from an integer, `llvm.psub(p1, p2)` returns poison. For
>> example,
>>
>> %p = alloca
>> %q = alloca
>> %i = llvm.psub(p, q) ; %i is poison
>>
>> This allows aggressive escape analysis on pointers. Given i =
>> llvm.psub(p1, p2), if neither of p1 and p2 is based on a pointer casted
>> from an integer, the llvm.psub call does not make p1 or p2 escape. (
>> https://reviews.llvm.org/D56601 )
>>
>> If either p1 or p2 is based on a pointer casted from integer, or p1 and
>> p2 point to a same object, it returns the result of subtraction (in bytes);
>> for example,
>>
>> %p = alloca
>> %q = inttoptr %x
>> %i = llvm.psub(p, q) ; %i is equivalent to (ptrtoint %p) - %x
>>
>> `null` is regarded as a pointer casted from an integer because
>> it is equivalent to `inttoptr 0`.
>>
>> Adding llvm.psub allows LLVM to utilize significant portion of ptrtoints
>> & reduce a portion of inttoptrs. After llvm.psub is used, when SPECrate
>> 2017 is compiled with -O3, # of inttoptr decreases to ~13,500 (59%) and #
>> of ptrtoint decreases to ~14,300 (45%).
>>
>> To see the performance change, I ran SPECrate 2017 (thread # = 1) with
>> three versions of LLVM, which are r313797 (Sep 21, 2017), LLVM 6.0
>> official, and r348082 (Dec 2, 2018).
>> Running r313797 shows that 505.mcf_r has consistent 2.0% speedup over 3
>> different machines (which are i3-6100, i5-6600, i7-7700). For LLVM 6.0 and
>> r348082, there's neither consistent speedup nor slowdown, but the average
>> speedup is near 0. I believe there's still a room of improvement because
>> there are passes which are not aware of llvm.psub.
>>
>> Thank you for reading this, and any comment is welcome.
>>
>> Best Regards,
>> Juneyoung Lee

Sanjoy Das via llvm-dev

2019-Jan-18 07:49 UTC


On Mon, Jan 14, 2019 at 3:23 AM Juneyoung Lee via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> Patch https://reviews.llvm.org/D56598 adds llvm.psub(p1,p2) intrinsic
> function, which subtracts two pointers and returns the difference. Its
> semantic is as follows.
>
> If p1 and p2 point to different objects, and neither of them is based on a
> pointer casted from an integer, `llvm.psub(p1, p2)` returns poison. For
> example,

Are you proposing landing this in conjunction with some of the other
stuff discussed in the twin allocation paper?  Otherwise isn't this
problematic with propagateEquality of pointers?

a = malloc()
free(a)
b = malloc() // Assume b == a numerically
if (a == b) {
  print(psub(b, b)) // prints 0
}

=>

a = malloc()
free(a)
b = malloc() // Assume b == a numerically
if (a == b) {
  print(psub(a, b)) // prints poison
}

Though I admit propagateEquality of pointers has many other problems like this.
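
The example above can be replayed in a toy Python model. This is a sketch
under stated assumptions, not LLVM's implementation: `Ptr`, `POISON` and this
`psub` are hypothetical stand-ins that encode only the rule quoted from the
proposal (a defined result if either pointer is cast from an integer or both
point into the same object, poison otherwise).

```python
from dataclasses import dataclass
from typing import Optional, Union

POISON = "poison"  # sentinel standing in for LLVM's poison value


@dataclass(frozen=True)
class Ptr:
    addr: int
    obj: Optional[int]  # allocation id; None means "cast from an integer"


def psub(p: Ptr, q: Ptr) -> Union[int, str]:
    # Defined if either side came from an integer or both share an object.
    if p.obj is None or q.obj is None or p.obj == q.obj:
        return p.addr - q.addr
    return POISON


a = Ptr(addr=0x1000, obj=1)  # first malloc, later freed
b = Ptr(addr=0x1000, obj=2)  # second malloc reusing the same address

before = psub(b, b)  # what the original program computes
after = psub(a, b)   # after b is replaced by a because the addresses match
```

In this model `a` and `b` compare equal by address, yet `before` is 0 while
`after` is the poison sentinel, which is the mismatch the example points at.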


-- Sanjoy

Juneyoung Lee via llvm-dev

2019-Jan-18 07:57 UTC

head link

[llvm-dev] Reducing the number of ptrtoint/inttoptrs that are generated by LLVM

Hello Sanjoy,

Yep, combining it with propagateEquality of pointers may raise a problem. :\

However, I believe propagateEquality should be fixed properly, and adding
psub also suggests a solution for that. :)

It is sound to replace a pointer with another if their difference (psub) is 0:

a = malloc()
free(a)
b = malloc() // Assume b == a numerically
// a and b are pointing to different objects, so the comparison becomes poison
if ((psub inbounds a b) == 0) {
  use(a)
}

=>

a = malloc()
free(a)
b = malloc() // Assume b == a numerically
if ((psub inbounds a b) == 0) {
  use(b)
}
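
As a rough sanity check of this argument, the following Python sketch
enumerates a small set of pointers and collects every pair for which a
psub-based guard is definitely true. It is a toy model with several
assumptions: `Ptr`, `POISON`, `psub` and `guard_holds` are hypothetical
names, integer-derived pointers are left out, the exact meaning of
`psub inbounds` is not spelled out in this thread, and a poison guard is
simply treated as "not known to be true".

```python
from dataclasses import dataclass
from itertools import product

POISON = "poison"  # sentinel standing in for LLVM's poison value


@dataclass(frozen=True)
class Ptr:
    addr: int
    obj: int  # allocation id; integer-derived pointers are left out here


def psub(p: Ptr, q: Ptr):
    # Different objects give poison, the same object gives the byte distance.
    return p.addr - q.addr if p.obj == q.obj else POISON


def guard_holds(p: Ptr, q: Ptr) -> bool:
    # POISON never equals 0, so a poison guard is not counted as true.
    return psub(p, q) == 0


pointers = [Ptr(addr, obj) for addr, obj in product(range(4), range(3))]
substitutable = [
    (p, q) for p, q in product(pointers, repeat=2) if guard_holds(p, q)
]
```

Within this model every pair in `substitutable` consists of two identical
pointers (same address and same object), so replacing one with the other
cannot change behavior; pairs that only share an address are filtered out.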

Juneyoung Lee

On Fri, Jan 18, 2019 at 7:50 AM Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
> On Mon, Jan 14, 2019 at 3:23 AM Juneyoung Lee via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> > Patch https://reviews.llvm.org/D56598 adds llvm.psub(p1,p2) intrinsic
> > function, which subtracts two pointers and returns the difference. Its
> > semantic is as follows.
> >
> > If p1 and p2 point to different objects, and neither of them is based on
> > a pointer casted from an integer, `llvm.psub(p1, p2)` returns poison. For
> > example,
>
> Are you proposing landing this in conjunction with some of the other
> stuff discussed in the twin allocation paper?  Otherwise isn't this
> problematic with propagateEquality of pointers?
>
> a = malloc()
> free(a)
> b = malloc() // Assume b == a numerically
> if (a == b) {
>   print(psub(b, b)) // prints 0
> }
>
> =>
>
> a = malloc()
> free(a)
> b = malloc() // Assume b == a numerically
> if (a == b) {
>   print(psub(a, b)) // prints poison
> }
>
> Though I admit propagateEquality of pointers has many other problems like
> this.
>
>
> -- Sanjoy
>

-- 

Juneyoung Lee
Software Foundation Lab, Seoul National University