Thank you very much for all your advices!
I'll revise the proposal according to them.
To George,
As mentioned in the former emails of this thread, I intend to prepare two
proposals for the AA project listed in the idea list and the bloat
detection project proposed by myself respectively and at most one of them
will be accepted by GSoC.
Personally, I do prefer the second project since I'm more familiar with
that field and the technique (static instrumentation) it uses.
According to the timeline of GSoC
<https://www.google-melange.com/gsoc/document/show/gsoc_program/google/gsoc2015/help_page>,
the accepted proposals will only be announced at 27 April. Is it too late
since I do not know which project will be selected or even none of them
will be accepted until then?
Otherwise, will the mentors know the decision earlier (e.g., at 15 April
after the slot is allocated to organizations)?
To John,
The proposal for bloat detection is also available now at
http://james0zan.github.io/resource/GSoC15-Proposal-BloatDetection.pdf (not
completely finished yet).
Some preliminary evaluation results on overhead and detecting ability based
on a simple prototype are given.
(Actually I came up with this idea during my visitation in Columbia U and
the prototype is also implemented at those days, but the project is paused
until recent days due to my internship in Google and some other works.)
P.S. The tex template is downloaded at
http://www.latextemplates.com/template/large-colored-title-article.
Once again, thank you for your time!
On 16 March 2015 at 02:58, George Burgess IV <george.burgess.iv at
gmail.com>
wrote:
> CFLAA already has some basic interprocedural analysis built in (see:
> tryInterproceduralAnalysis; it basically boils down to CFLAA grabbing the
> StratifiedSets for each function call and unifying sets based off of that
> instead of unifying everything ever). The only real changes I had in mind
> for it were:
>
> - Adding context sensitivity (which kind of requires adding context
> sensitivity to CFLAA first)
> - Making it less terrible for {mutually,indirectly,} recursive functions
> (While we're building the sets for some function A(), the sets for A()
> aren't accessible, so any call to A() from a function called by A()
looks
> opaque).
>
> If you want to take a stab at making IPA better/adding context
> sensitivity, you're more than welcome to do so. I'm happy to work
on other
> things in the meantime :)
>
> George
>
> On Sun, Mar 15, 2015 at 8:50 AM, Mingxing Zhang <james0zan at
gmail.com>
> wrote:
>
>> Hello Daniel,
>>
>> Thank you for your comments and sorry for my mistakes, I'll revise
them.
>> And I'll for sure read the paper you mentioned and survey the
recent
>> researches before deciding the implementation technique.
>>
>> To George:
>> May I know the exact plan of your attempt for making cfl-aa
>> interprocedural?
>> I do think that this is the most valuable part of my proposal, but that
>> makes no sense to do it twice.
>>
>> Maybe I can work on the porting of the flow-sensitive method proposed
by
>> Prof. Ben Hardekopf at CGO11
>>
<http://www.cs.ucsb.edu/~benh/research/papers/hardekopf11flow.pdf>.
>> It is declared in his homepage that the published source code "is
written
>> for a pre-release version of LLVM 2.5 and does not work in current
versions
>> of LLVM"
>>
>> Thanks!
>>
>>
>>
>> On 15 March 2015 at 08:31, Daniel Berlin <dberlin at dberlin.org>
wrote:
>>
>>> A few notes:
>>> 1. "But these standard LLVM AA passes either take a large
amount of time
>>> (Anderson Analysis at cubic time and large memory
requirements)"
>>>
>>> Neither of these is correct. Andersen's is not cubic in
practice, or
>>> large memory in practice, when implemented properly. GCC uses it
by
>>> default as the points-to implementation, and it's never even
close to the
>>> top of the profile.
>>>
>>> It takes about 1 second to do a million lines of code.
>>> And one can do better (gcc's impl is no longer state of the
art).
>>>
>>> 2. The approach to field sensitivity you mention is not going to
work
>>> very well, given how much casting occurs (everything is casted). I
would
>>> suggest using the approach in
http://dl.acm.org/citation.cfm?id=996835
>>>
>>> 3. George, cc'd, is planning on implementing both context
sensitive and
>>> context-insensitive interprocedural analysis in cfl-aa the next
month or
>>> two.
>>>
>>> 4. Using a BDD cloning approach for CFL-AA doesn't make much
sense, the
>>> whole point of CFL is not having to generate explicit points-to
sets if you
>>> don't want to. Plus, followup papers and researchers have
*never* been able
>>> to reproduce the scalability of Whaley's work.
>>>
>>> Not to mention it's done on Java. Java is a language where
doing things
>>> like field-sensitivity always increase precision, which is not true
for C.
>>>
>>> If you really want to attempt this, I would suggest using one of
the
>>> demand driven context-sensitive approaches that will be easy to fit
in CFL.
>>>
>>> On Sat, Mar 14, 2015 at 5:57 AM Mingxing Zhang <james0zan at
gmail.com>
>>> wrote:
>>>
>>>> Hello John,
>>>>
>>>> I've finished the first version of my proposal on enhancing
alias
>>>> analysis.
>>>> The proposal can be downloaded at
>>>> http://james0zan.github.io/resource/GSoC15-Proposal-AA.pdf.
>>>> I hope I've successfully justified the necessity and
benefits of this
>>>> project.
>>>> If possible, please find some time to review it and give me
some more
>>>> feedbacks.
>>>>
>>>> Thank you very much!
>>>>
>>>> P.S. I'm working on the other proposal, a couple of days is
needed.
>>>>
>>>>
>>>>
>>>> On 8 March 2015 at 21:42, Mingxing Zhang <james0zan at
gmail.com> wrote:
>>>>
>>>>> Got it.
>>>>> I'll try to find the applications for field-sensitivity
(and
>>>>> interprocedural, etc) AA analysis.
>>>>> And I'll do some preliminary evaluation on the
tracing/slicing part
>>>>> for the bloat detection.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> On 8 March 2015 at 21:34, John Criswell <jtcriswel at
gmail.com> wrote:
>>>>>
>>>>>> On 3/8/15 8:56 AM, Mingxing Zhang wrote:
>>>>>>
>>>>>> Hello John,
>>>>>>
>>>>>> According to the FAQ, I can submit two proposals
although at most one
>>>>>> of them can be accepted.
>>>>>> Thus I will prepare a proposal for each of the two
projects.
>>>>>>
>>>>>>
>>>>>> Correct. Only one proposal will be accepted.
>>>>>>
>>>>>>
>>>>>> And, after reading the code of cfl-aa and several
related papers,
>>>>>> I've listed four milestones for the AA project:
>>>>>>
>>>>>> 1) In order to use the fast algorithm described in
PLDI'13 [1],
>>>>>> cfl-aa makes a simplification on the CFL defined in
POPL'08 [2], which will
>>>>>> lead to a reduction on precision (I've confirmed
this observation with the
>>>>>> author).
>>>>>> Thus a quantitative measurement on how much is the
reduction is
>>>>>> needed.
>>>>>>
>>>>>> 2) In cfl-aa, different fields of a same struct and the
whole array
>>>>>> are represented by a single node.
>>>>>> This is the reason of the problem 2, 4 listed in
>>>>>> http://llvm.org/OpenProjects.html.
>>>>>> We should split these large nodes.
>>>>>>
>>>>>>
>>>>>> I think the real question is whether the loss of
precision matters,
>>>>>> and if so, to which uses of alias analysis. SAFECode,
for example, wants
>>>>>> field information to determine type safety (so that it
can optimize away
>>>>>> type-safe loads and stores), so field sensitivity
matters. Perhaps field
>>>>>> sensitivity doesn't matter for other applications
(e.g., optimization).
>>>>>> There's no point in improving precision if it
doesn't help the analyses
>>>>>> that people care about most.
>>>>>>
>>>>>> As part of your project, I think you should state the
uses of alias
>>>>>> analysis/points-to analysis that you're aiming to
improve and understand
>>>>>> whether your proposed improvements will help that use.
I would also
>>>>>> recommend picking a use that matters to a significant
portion of the LLVM
>>>>>> community.
>>>>>>
>>>>>>
>>>>>> 3) Handling special global variables, such as errno.
>>>>>>
>>>>>> 4) It seems that the current version of cfl-aa is an
intraprocedural
>>>>>> analysis.
>>>>>> If the time is enough, I think we may extend it to an
interprocedural
>>>>>> analysis.
>>>>>> The algorithm described in [3] can be applied to
scaling it.
>>>>>>
>>>>>> As for the bloat-detection project, the final result
should be a tool
>>>>>> that is verified by known bugs and a set of newly
detected bugs.
>>>>>>
>>>>>>
>>>>>> For the bloat detection tool, I would like to be
convinced that
>>>>>> dynamic tracing will be, or can be, sufficiently
efficient to be
>>>>>> practical. I hate to ask, but I think you need to run
an experiment with
>>>>>> Giri to show that dynamic slicing is going to be
practical for the
>>>>>> executions that you expect to analyze. Either that, or
you need to explain
>>>>>> how you can use something more efficient than dynamic
slicing (note that
>>>>>> dynamic slicing and dynamic tracing are not the same,
so be sure you're
>>>>>> correctly stating which one you need).
>>>>>>
>>>>>>
>>>>>> Do you have any suggestions on these objectives?
>>>>>>
>>>>>>
>>>>>> In your proposal, be sure to include a set of
milestones and how long
>>>>>> you think you will need to achieve those milestones. I
may have said that
>>>>>> before, but it's worth repeating.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> John Criswell
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>>
>>>>>> [1] Fast Algorithms for Dyck-CFL-Reachability with
Applications to
>>>>>> Alias Analysis. PLDI'13
>>>>>> [2] Demand-Driven Alias Analysis for C. POPL'08
>>>>>> [3] Demand-Driven Context-Sensitive Alias Analysis for
Java. ISSTA'11
>>>>>>
>>>>>>
>>>>>> On 5 March 2015 at 09:58, Mingxing Zhang <james0zan
at gmail.com> wrote:
>>>>>>
>>>>>>> Wow, that is cool!
>>>>>>> I'll check about it.
>>>>>>>
>>>>>>> Thank you!
>>>>>>>
>>>>>>> On 4 March 2015 at 21:57, John Criswell
<jtcriswel at gmail.com> wrote:
>>>>>>>
>>>>>>>> On 3/4/15 2:18 AM, Mingxing Zhang wrote:
>>>>>>>>
>>>>>>>> Hello John,
>>>>>>>>
>>>>>>>> Thank you for your advices and congratulations~
>>>>>>>>
>>>>>>>> I'll read the code of cfl-aa and Giri first
and make the decision
>>>>>>>> of which project to pursue.
>>>>>>>> The choice will be reported to this thread once
I made the
>>>>>>>> determination (hopefully within this week).
>>>>>>>>
>>>>>>>>
>>>>>>>> You should check for yourself, but I don't
think anything prevents
>>>>>>>> you from submitting two proposals. If you have
time to write two strong
>>>>>>>> proposals, I see no problem with that.
>>>>>>>>
>>>>>>>> Just make sure that any proposal you write is
strong: it provides a
>>>>>>>> concrete explanation of what you want to do,
some justification for why it
>>>>>>>> would benefit the community (short or long
term), and why you're the person
>>>>>>>> qualified to do it. Proposals should also
include a set of milestones and
>>>>>>>> expected dates for completing those milestones.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> John Criswell
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>>
>>>>>>>> On 3 March 2015 at 23:12, John Criswell
<jtcriswel at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Dear Mingxing,
>>>>>>>>>
>>>>>>>>> I think both projects are interesting and
useful.
>>>>>>>>>
>>>>>>>>> Points-to analysis is something that is
needed by research users
>>>>>>>>> of LLVM, but to the best of my knowledge,
no solid implementation currently
>>>>>>>>> exists (although the cfl-aa work being done
at Google may provide us with
>>>>>>>>> something; you should check into it before
writing a proposal). My
>>>>>>>>> interest is in a points-to analysis that is
robust and is useful to both
>>>>>>>>> research and industry users of LLVM. A
points-to analysis proposal must
>>>>>>>>> indicate how it will help both of these
subsets of the LLVM community, and
>>>>>>>>> it must argue why current efforts do not
meet the requirements of both
>>>>>>>>> subsets of the community.
>>>>>>>>>
>>>>>>>>> The runtime bloat tool also looks
interesting, and your approach
>>>>>>>>> (at least to me) is interesting. One
question in my mind, though, is
>>>>>>>>> whether dynamic slicing is going to work
well. Swarup Sahoo and I built a
>>>>>>>>> dynamic slicer for LLVM named Giri, and we
found the tracing required for
>>>>>>>>> dynamic slicing to be slow. For our
purposes, the overhead was okay as we
>>>>>>>>> only needed to record execution until a
crash (which happened quickly). In
>>>>>>>>> your bloat tool, the program will probably
run for awhile, creating a long
>>>>>>>>> trace record. You should take a look at
the Giri code, use it to trace
>>>>>>>>> some programs, and see if the overheads are
going to be tolerable. If they
>>>>>>>>> are not, then your first task would be to
optimize Giri for your bloat tool.
>>>>>>>>>
>>>>>>>>> You should also be more specific about
which LLVM instructions
>>>>>>>>> will be traced. For example, I
wouldn't record the outputs of every LLVM
>>>>>>>>> instruction; I might only record the
outputs of loads and stores or the end
>>>>>>>>> of a def-use chain.
>>>>>>>>>
>>>>>>>>> I'd be interested in mentoring either
project.
>>>>>>>>>
>>>>>>>>> BTW, it looks like your FSE paper won an
award. Congrats.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> John Criswell
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 3/3/15 2:30 AM, Mingxing Zhang wrote:
>>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> As a Ph.D. student majored in Software
Reliability, I have used
>>>>>>>>> LLVM in many of my projects, such as the
Anticipating Invariant (
>>>>>>>>> http://james0zan.github.io/AI.html) and
some other undergoing
>>>>>>>>> ones.
>>>>>>>>> Thus, it would be a great pleasure for me
if I could take this
>>>>>>>>> opportunity to contribute to this awesome
project.
>>>>>>>>>
>>>>>>>>> After reading the idea list
(http://llvm.org/OpenProjects.html),
>>>>>>>>> I was most interested in the idea of
improving the "Pointer and Alias
>>>>>>>>> Analysis" passes.
>>>>>>>>> Could you please give me some more tips or
advices on how to get
>>>>>>>>> started on working on the application?
>>>>>>>>>
>>>>>>>>> Simultaneously, I also have another idea
about using LLVM to
>>>>>>>>> detect runtime bloat, just like the
ThreadSanitizer tool for data races.
>>>>>>>>> If there is anyone here who would like to
mentor this project,
>>>>>>>>> could you please find some time to review
the more detailed
>>>>>>>>> proposal on gist
>>>>>>>>>
<https://gist.github.com/james0zan/d03737c60b10d0d11d34> and give
>>>>>>>>> me some feedbacks?
>>>>>>>>>
>>>>>>>>> P.S.
>>>>>>>>> I do prefer the bloat detection tool, but
I'm not sure about
>>>>>>>>> whether it is suitable for GSoC.
>>>>>>>>> Thus I will apply for the Alias Analysis
one if it is not
>>>>>>>>> suitable.
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Mingxing Zhang
>>>>>>>>>
>>>>>>>>> Tel.: +86-10-62797143
>>>>>>>>> Web: http://james0zan.github.io/
>>>>>>>>> Addr: Room 3-122, FIT Building, Tsinghua
University, Beijing
>>>>>>>>> 100084, China
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
_______________________________________________
>>>>>>>>> LLVM Developers mailing listLLVMdev at
cs.uiuc.edu
http://llvm.cs.uiuc.eduhttp://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>>>
>>>>>>>>> /
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> John Criswell
>>>>>>>>> Assistant Professor
>>>>>>>>> Department of Computer Science, University
of Rochesterhttp://www.cs.rochester.edu/u/criswell
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Mingxing Zhang
>>>>>>>>
>>>>>>>> Tel.: +86-10-62797143
>>>>>>>> Web: http://james0zan.github.io/
>>>>>>>> Addr: Room 3-122, FIT Building, Tsinghua
University, Beijing
>>>>>>>> 100084, China
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> John Criswell
>>>>>>>> Assistant Professor
>>>>>>>> Department of Computer Science, University of
Rochesterhttp://www.cs.rochester.edu/u/criswell
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Mingxing Zhang
>>>>>>>
>>>>>>> Tel.: +86-10-62797143
>>>>>>> Web: http://james0zan.github.io/
>>>>>>> Addr: Room 3-122, FIT Building, Tsinghua
University, Beijing
>>>>>>> 100084, China
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Mingxing Zhang
>>>>>>
>>>>>> Tel.: +86-10-62797143
>>>>>> Web: http://james0zan.github.io/
>>>>>> Addr: Room 3-122, FIT Building, Tsinghua University,
Beijing
>>>>>> 100084, China
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> John Criswell
>>>>>> Assistant Professor
>>>>>> Department of Computer Science, University of
Rochesterhttp://www.cs.rochester.edu/u/criswell
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Mingxing Zhang
>>>>>
>>>>> Tel.: +86-10-62797143
>>>>> Web: http://james0zan.github.io/
>>>>> Addr: Room 3-122, FIT Building, Tsinghua University,
Beijing 100084,
>>>>> China
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Mingxing Zhang
>>>>
>>>> Tel.: +86-10-62797143
>>>> Web: http://james0zan.github.io/
>>>> Addr: Room 3-122, FIT Building, Tsinghua University, Beijing
100084,
>>>> China
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>
>>>
>>
>>
>> --
>> Mingxing Zhang
>>
>> Tel.: +86-10-62797143
>> Web: http://james0zan.github.io/
>> Addr: Room 3-122, FIT Building, Tsinghua University, Beijing 100084,
China
>>
>
>
--
Mingxing Zhang
Tel.: +86-10-62797143
Web: http://james0zan.github.io/
Addr: Room 3-122, FIT Building, Tsinghua University, Beijing 100084, China
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150316/8e2a3e67/attachment.html>