thr3ads.net - llvm dev - [LLVMdev] Contribute a new precise pointer analysis to LLVM [Oct 2013]

If this information is useful, please help other people find it:
Share via:

lian li

2013-Oct-18 04:51 UTC

[LLVMdev] Contribute a new precise pointer analysis to LLVM

Hi Hal,

Thanks for your interest.

We tested with the following existing compiler optimizations in LLVM
with SPECINT2006 benchmarks:
-dse (dead store elimination),
-gvn (global value numbering),
-licm (loop invariant code motion),
-bb-vectorize (basic block vectorization),
-memcpyopt (memcpy optimization),
-sink (code sinking),
-loop-idom (recognize loop idioms),
-argpromotion (argument promotion).

As far as I know, this list includes all optmizations in LLVM that
requires alias information. With our pointer analysis, for the three
optimizations "-dse, -gvn, -licm", we have observed significant
improvements in terms of the number of optimizations being conducted.
Take licm as an example, with our analysis, the number of instructions
being hoisted out of loops more than doubled for some benchmarks. We
have not seen big changes in terms of runtime performance though.

We measure compile time by first linking all modules of the same
benchmark together, then running the analysis against the linked
module. It finishes in less than 1 minute over applications such as
httpd and sendmail, both with more than 100,000 LOC. For some
benchmarks, it is time-consuming: the analysis time for GCC is close
to 45 minutes.  But if you run the analysis over individual files (as
in your normal compilation process), there is no observable overhead
in terms of compile time.

Regards,
Lian

On Fri, Oct 18, 2013 at 12:15 PM, Hal Finkel <hfinkel at anl.gov>
wrote:> Hi Lian,
>
> I am certainly interested in seeing this; do you have performance numbers
(compile time)? Also, can you share more information about the promising
optimization results you mentioned?
>
> Thanks,
> Hal
>
> ----- Original Message -----
>> Hi All,
>>
>> This is Lian Li  from Oracle Labs in Brisbane Australia.
>>
>> We have developed a precise and highly efficient pointer analysis
>> framework on top of LLVM,  The approach is flow, context, and field
>> sensitive, details are described in the two papers below:
>>
>> "Boosting the performance of flow-sensitive points-to analysis
using
>> value flow" (in ESEC-FSE 2011), and
>> "Precise and scalable context-sensitive pointer analysis via value
>> flow graph" (in ISMM 2013).
>>
>> The analysis was initially developed for the purpose of bug checking,
>> and is now extended to support compiler optimizations. We have tested
>> it with existing compiler optimizations in LLVM, and have seen
>> promising results.
>>
>> We are now considering to make this analysis available to the LLVM
>> community, and contribute resources for future maintenance needs,
>> provided that there is enough interest. We think that a new precise
>> pointer analysis in LLVM can enable more new optimization and
>> analysis
>> techniques to be developed in LLVM.
>>
>> Any people interested in seeing a new precise pointer analysis in
>> LLVM?
>>
>> Regards,
>> Lian Li
>>
>>
>>
>>
>>
>>
>>
>> --
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory

-- 
// Copyright @ 2011 Authorized by ** LIAN LI **

Daniel Berlin

2013-Oct-18 06:39 UTC

head link

[LLVMdev] Contribute a new precise pointer analysis to LLVM

I notice you guys formulate your CFL reachability problem as a
balanced parentheses problem.
What algorithm do you use to solve it?

Are you aware of recent work that comes up with linear time and n log
n time algorithms to solve this class of problems:
http://www.cse.cuhk.edu.hk/lyu/_media/paper/pldi2013.pdf

In particular, the time bound from the paper:

"However, if we need the precise pointer information for all
variables, CFL-based
pointer analysis is generally more expensive as it has O(L^3 N^3)
complexity [29], where L is the size of the grammar and N is the
size of the graph."

should now been reduced to  O(n + m log m) time, assuming you are
using a standard balanced parentheses formulation.




On Thu, Oct 17, 2013 at 9:51 PM, lian li <lianli at gmail.com>
wrote:> Hi Hal,
>
> Thanks for your interest.
>
> We tested with the following existing compiler optimizations in LLVM
> with SPECINT2006 benchmarks:
> -dse (dead store elimination),
> -gvn (global value numbering),
> -licm (loop invariant code motion),
> -bb-vectorize (basic block vectorization),
> -memcpyopt (memcpy optimization),
> -sink (code sinking),
> -loop-idom (recognize loop idioms),
> -argpromotion (argument promotion).
>
> As far as I know, this list includes all optmizations in LLVM that
> requires alias information. With our pointer analysis, for the three
> optimizations "-dse, -gvn, -licm", we have observed significant
> improvements in terms of the number of optimizations being conducted.
> Take licm as an example, with our analysis, the number of instructions
> being hoisted out of loops more than doubled for some benchmarks. We
> have not seen big changes in terms of runtime performance though.
>
> We measure compile time by first linking all modules of the same
> benchmark together, then running the analysis against the linked
> module. It finishes in less than 1 minute over applications such as
> httpd and sendmail, both with more than 100,000 LOC. For some
> benchmarks, it is time-consuming: the analysis time for GCC is close
> to 45 minutes.  But if you run the analysis over individual files (as
> in your normal compilation process), there is no observable overhead
> in terms of compile time.
>
> Regards,
> Lian
>
>
>
> On Fri, Oct 18, 2013 at 12:15 PM, Hal Finkel <hfinkel at anl.gov>
wrote:
>> Hi Lian,
>>
>> I am certainly interested in seeing this; do you have performance
numbers (compile time)? Also, can you share more information about the promising
optimization results you mentioned?
>>
>> Thanks,
>> Hal
>>
>> ----- Original Message -----
>>> Hi All,
>>>
>>> This is Lian Li  from Oracle Labs in Brisbane Australia.
>>>
>>> We have developed a precise and highly efficient pointer analysis
>>> framework on top of LLVM,  The approach is flow, context, and field
>>> sensitive, details are described in the two papers below:
>>>
>>> "Boosting the performance of flow-sensitive points-to analysis
using
>>> value flow" (in ESEC-FSE 2011), and
>>> "Precise and scalable context-sensitive pointer analysis via
value
>>> flow graph" (in ISMM 2013).
>>>
>>> The analysis was initially developed for the purpose of bug
checking,
>>> and is now extended to support compiler optimizations. We have
tested
>>> it with existing compiler optimizations in LLVM, and have seen
>>> promising results.
>>>
>>> We are now considering to make this analysis available to the LLVM
>>> community, and contribute resources for future maintenance needs,
>>> provided that there is enough interest. We think that a new precise
>>> pointer analysis in LLVM can enable more new optimization and
>>> analysis
>>> techniques to be developed in LLVM.
>>>
>>> Any people interested in seeing a new precise pointer analysis in
>>> LLVM?
>>>
>>> Regards,
>>> Lian Li
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>
>>
>> --
>> Hal Finkel
>> Assistant Computational Scientist
>> Leadership Computing Facility
>> Argonne National Laboratory
>
>
>
> --
> // Copyright @ 2011 Authorized by ** LIAN LI **
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

lian li

2013-Oct-18 14:27 UTC

head link

[LLVMdev] Contribute a new precise pointer analysis to LLVM

Hi Daniel,

I want to clarify that our analysis is not based on CFL-reachability.
We apply CFL-reachability to matching context information where the
exist from a function to a call-site must match
the entry from the corresponding call-site. The problem is a simple
balanced parentheses problem in CFL-reachability, and it can be
computed
efficiently.

The paper you mentioned is a very nice paper that proposed an
efficient algorithm to solve Dyck-CFL-reachability problem in
bidirectional graphs. However, precise pointer analysis cannot be
formulated as a Dyck-CFL-reachability problem. The alias analysis
presented in that paper is an over approximation. Hence the time bound
from our paper still applies.

Regards,
Lian





On Fri, Oct 18, 2013 at 4:39 PM, Daniel Berlin <dberlin at dberlin.org>
wrote:> I notice you guys formulate your CFL reachability problem as a
> balanced parentheses problem.
> What algorithm do you use to solve it?
>
> Are you aware of recent work that comes up with linear time and n log
> n time algorithms to solve this class of problems:
> http://www.cse.cuhk.edu.hk/lyu/_media/paper/pldi2013.pdf
>
> In particular, the time bound from the paper:
>
> "However, if we need the precise pointer information for all
> variables, CFL-based
> pointer analysis is generally more expensive as it has O(L^3 N^3)
> complexity [29], where L is the size of the grammar and N is the
> size of the graph."
>
> should now been reduced to  O(n + m log m) time, assuming you are
> using a standard balanced parentheses formulation.
>
>
>
>
> On Thu, Oct 17, 2013 at 9:51 PM, lian li <lianli at gmail.com> wrote:
>> Hi Hal,
>>
>> Thanks for your interest.
>>
>> We tested with the following existing compiler optimizations in LLVM
>> with SPECINT2006 benchmarks:
>> -dse (dead store elimination),
>> -gvn (global value numbering),
>> -licm (loop invariant code motion),
>> -bb-vectorize (basic block vectorization),
>> -memcpyopt (memcpy optimization),
>> -sink (code sinking),
>> -loop-idom (recognize loop idioms),
>> -argpromotion (argument promotion).
>>
>> As far as I know, this list includes all optmizations in LLVM that
>> requires alias information. With our pointer analysis, for the three
>> optimizations "-dse, -gvn, -licm", we have observed
significant
>> improvements in terms of the number of optimizations being conducted.
>> Take licm as an example, with our analysis, the number of instructions
>> being hoisted out of loops more than doubled for some benchmarks. We
>> have not seen big changes in terms of runtime performance though.
>>
>> We measure compile time by first linking all modules of the same
>> benchmark together, then running the analysis against the linked
>> module. It finishes in less than 1 minute over applications such as
>> httpd and sendmail, both with more than 100,000 LOC. For some
>> benchmarks, it is time-consuming: the analysis time for GCC is close
>> to 45 minutes.  But if you run the analysis over individual files (as
>> in your normal compilation process), there is no observable overhead
>> in terms of compile time.
>>
>> Regards,
>> Lian
>>
>>
>>
>> On Fri, Oct 18, 2013 at 12:15 PM, Hal Finkel <hfinkel at anl.gov>
wrote:
>>> Hi Lian,
>>>
>>> I am certainly interested in seeing this; do you have performance
numbers (compile time)? Also, can you share more information about the promising
optimization results you mentioned?
>>>
>>> Thanks,
>>> Hal
>>>
>>> ----- Original Message -----
>>>> Hi All,
>>>>
>>>> This is Lian Li  from Oracle Labs in Brisbane Australia.
>>>>
>>>> We have developed a precise and highly efficient pointer
analysis
>>>> framework on top of LLVM,  The approach is flow, context, and
field
>>>> sensitive, details are described in the two papers below:
>>>>
>>>> "Boosting the performance of flow-sensitive points-to
analysis using
>>>> value flow" (in ESEC-FSE 2011), and
>>>> "Precise and scalable context-sensitive pointer analysis
via value
>>>> flow graph" (in ISMM 2013).
>>>>
>>>> The analysis was initially developed for the purpose of bug
checking,
>>>> and is now extended to support compiler optimizations. We have
tested
>>>> it with existing compiler optimizations in LLVM, and have seen
>>>> promising results.
>>>>
>>>> We are now considering to make this analysis available to the
LLVM
>>>> community, and contribute resources for future maintenance
needs,
>>>> provided that there is enough interest. We think that a new
precise
>>>> pointer analysis in LLVM can enable more new optimization and
>>>> analysis
>>>> techniques to be developed in LLVM.
>>>>
>>>> Any people interested in seeing a new precise pointer analysis
in
>>>> LLVM?
>>>>
>>>> Regards,
>>>> Lian Li
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>
>>>
>>> --
>>> Hal Finkel
>>> Assistant Computational Scientist
>>> Leadership Computing Facility
>>> Argonne National Laboratory
>>
>>
>>
>> --
>> // Copyright @ 2011 Authorized by ** LIAN LI **
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


-- 
// Copyright @ 2011 Authorized by ** LIAN LI **

Apparently Analagous Threads

Search for more apparently analagous threads

llvm dev - Oct 2013 - [LLVMdev] Contribute a new precise pointer analysis to LLVM

[LLVMdev] Contribute a new precise pointer analysis to LLVM

[LLVMdev] Contribute a new precise pointer analysis to LLVM

[LLVMdev] Contribute a new precise pointer analysis to LLVM

Apparently Analagous Threads