thr3ads.net - llvm dev - [llvm-dev] [GSOC] "Project: Improve inter-procedural analyses and optimisations" [Mar 2020]

Stefanos Baziotis via llvm-dev

2020-Mar-16 01:53 UTC

[llvm-dev] [GSOC] "Project: Improve inter-procedural analyses and optimisations"

Hi Fahad,

> I tried to do this for the NoUnwind attribute

Hmm, I don't have experience with this attribute, but it seems like a good
starting point since it doesn't do much. First of all, be sure that you run
with:

    opt -passes=attributor -attributor-disable=false

This uses the new pass manager, which is another discussion.

Now, to the point: if you open nounwind.ll, it has a bunch of test cases,
and I don't think it's a good idea to run the Attributor on all of them at
first. So, break it into individual tests.

First of all, note that the Attributor follows an optimistic path to
attribute deduction. That is, you always start by assuming that an
attribute is valid until you have info that it isn't. Looking at TEST 1, it
should be relatively obvious that it doesn't contain anything that breaks
the initial assumption; that's why you see no change in state between calls
to updateImpl().

But now we have to dig deeper: what can break our initial assumption (which
is that a function does not throw) in the case of NoUnwind? You see in the
topmost updateImpl() that there are a bunch of opcodes. Those are
instructions (inside our function) that can potentially break our initial
assumption (e.g. a call to a function that we either know throws or that is
at least not guaranteed not to throw). In updateImpl(), using
checkForAllInstructions(), we loop through all those instructions and call
the predicate CheckForNoUnwind for each one. Note a couple of things:
- TEST 1 has no such instructions whatsoever, so if you put a print inside
  the predicate, you'll see nothing.
- You can see that the predicate returns a boolean value. If it's true, we
  continue to the next instruction, otherwise we stop. If we make it
  through all of them, checkForAllInstructions() returns true, otherwise
  false. The predicate checks whether any instruction breaks our assumption
  (we'll see shortly how). If one does, we immediately indicate a
  pessimistic fixpoint.

What about TEST 2 and 3? Those two functions have a call inside, but if you
put a print inside the predicate, you'll again see nothing. The reason for
that brings us to an important part of the Attributor: "deadness" (and the
related attribute) is very important. checkForAllInstructions() will only
go through instructions that are considered live. The two calls are not
considered because another part of the Attributor (which is off topic right
now) has (somewhat) deduced that they go into an endless recursion. If you
run the Attributor on these two functions, you'll see another important
point: the function bodies contain only an `unreachable` instruction. That
is, the Attributor not only deduces (and provides) info through attributes,
it also transforms code. In this case it changed the function bodies to
unreachable.

Finally, I think it's interesting to look at TEST 4. You see that it calls
a function that we don't know doesn't throw. This should break our
assumption. And it does.
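
Before looking inside TEST 4, here is a rough, self-contained C++ sketch of
the walk described so far. It is a toy model, not the Attributor's code:
ToyInst, checkForAllLive and deduceNoUnwindToy are made-up names, and the
vector is assumed to hold only the instructions whose opcodes we care
about. It tries to show the three behaviors above: nothing to visit (TEST
1), dead calls being skipped (TEST 2 and 3), and a live call that may throw
ending the walk (TEST 4).

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for an IR instruction (hypothetical, NOT llvm::Instruction).
struct ToyInst {
  std::string Text;  // printable form, handy for a debug print
  bool MayThrow;     // what a mayThrow()-like query would answer
  bool Live;         // false if a liveness analysis proved it dead
};

// Rough analogue of checkForAllInstructions(): visit live instructions
// only and stop at the first one the predicate rejects.
bool checkForAllLive(const std::vector<ToyInst> &Insts,
                     const std::function<bool(const ToyInst &)> &Pred) {
  for (const ToyInst &I : Insts) {
    if (!I.Live)
      continue;      // dead code is never shown to the predicate
    if (!Pred(I))
      return false;  // assumption broken, no need to look further
  }
  return true;
}

// Optimistic deduction: the function counts as nounwind unless some live
// instruction says otherwise. Visited counts predicate invocations, i.e.
// how many lines a print inside the predicate would produce.
bool deduceNoUnwindToy(const std::vector<ToyInst> &Insts, int &Visited) {
  Visited = 0;
  auto Pred = [&Visited](const ToyInst &I) {
    ++Visited;
    return !I.MayThrow;
  };
  return checkForAllLive(Insts, Pred);
}
```

With an empty list, or with only dead calls, the predicate is never
invoked, which matches seeing no prints. A single live call that may throw
makes the walk return false on its first visit.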
Inside the topmost updateImpl() and inside the predicate, if you put a
print (e.g. dbgs() << I << "\n"), you'll see the call. We ask I.mayThrow().
Note that mayThrow() returns false only when we're somehow sure that the
instruction does not throw. Here the instruction is a call, so it would
return false only if we were sure that the called function does not throw.
But we're not, so it returns true and we move forward. What happens then is
a little bit weird, but basically AANoUnwind asks AANoUnwindCallSite for
info, because this instruction is a call site. Attributes ask one another,
and this is very important in the Attributor because one attribute's
information is useful to another. We do it with getAAFor. Without getting
into too much detail, the other attribute says that it's not assumed
nounwind, so we indicate a pessimistic fixpoint (in reality the order of
calls between the attributes is reversed, but again, that's off topic).

I hope that gave you a better understanding!

> I know how in [1] Johannes explained the use of MaxObjSize and
> Dereferenceable in the AliasAnalysis. But I would be happy if I could
> come up with some even better example.

As you said, that'll probably take some time, but that's OK. There are
opportunities everywhere.
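
One such opportunity, sketched at the C++ level first. This is only my
guess at a source-level analogue of the IR case that follows (the function
names are made up, and the linked IR may look different): a loop that reads
through a pointer on every iteration, next to a hand-hoisted version of it.

```cpp
#include <cassert>

// Loop as written: *P is read only if the body runs, so N == 0 together
// with a null P is perfectly fine.
int sumInLoop(const int *P, int N) {
  int Sum = 0;
  for (int I = 0; I < N; ++I)
    Sum += *P;  // loop-invariant load
  return Sum;
}

// Hand-hoisted variant: the load runs once, up front, whether or not the
// loop body would ever execute. Calling this with a null P is undefined
// behavior even for N == 0, so the rewrite is only valid when P is known
// to be dereferenceable.
int sumHoisted(const int *P, int N) {
  const int V = *P;  // hoisted load, executes unconditionally
  int Sum = 0;
  for (int I = 0; I < N; ++I)
    Sum += V;
  return Sum;
}
```

sumInLoop(nullptr, 0) is a valid call and returns 0, while
sumHoisted(nullptr, 0) is not valid at all. For any pointer that can really
be dereferenced, the two agree.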
For example, consider this: https://godbolt.org/z/HFWo_J

It's a for loop that does a load from %p inside. The load seems to be
invariant, so we could move it out of the loop. -licm in the command-line
arguments means it invokes the Loop-Invariant Code Motion pass, which does
such things. But it doesn't move the load out. The reason is the case where
%n == 0 and %p == null. In the initial code, we would never get into the
loop and we would not have a trap. With this transformation we would have
one, and thus we would have changed the initial behavior. So, the
transformation is not done. However, if you put a dereferenceable(4)
attribute on %p, it will be done, because now you know you can certainly
dereference it. So, attribute info is useful in ways we may not consider :)

> I can start with TODO at line no. 2605  // TODO: Return the number of
> reachable queries.

I'm not familiar with it. Currently it doesn't seem to do anything, but I
may be missing something. I suppose what it asks is to track how many
queries to this attribute have been made by outside users, which should be
easy.

> Since the code is very large right now, I thought to refer to some of the
> very initial patches of attributor

Maybe, I don't know. But I assume things will have changed since then and
you may get lost. I'd start by drawing diagrams of how different parts of
the code interact with each other (e.g. where does the Attributor start?
what does it call then? how are attributes created? etc.). When starting
out, these things might not be important to tackle. But they helped me.

> How should I indicate to the community that I have started working
> towards this issue (should I comment on the issue page on github?)? I can
> try to work on AAReachability TODO after solving this issue.

You can write it in the GitHub comments. I don't think you can, or need to,
do anything else.

Kind regards,
Stefanos Baziotis
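
As a footnote to the NoUnwind walk-through above, the way attributes "ask
one another" until nothing changes can be modeled in a few lines of C++.
This is a toy under my own assumptions, not how the Attributor is
implemented: ToyFunc and runToyFixpoint are made-up names, every function
starts out optimistically assumed nounwind, and a function loses that state
when it contains a throwing instruction or calls something that is not (or
no longer) assumed nounwind.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy function summary (hypothetical, not an LLVM type).
struct ToyFunc {
  bool HasThrowingInst;              // contains an instruction that throws
  std::vector<std::string> Callees;  // names of the functions it calls
};

// Returns, per function, whether nounwind survived the iteration.
std::map<std::string, bool>
runToyFixpoint(const std::map<std::string, ToyFunc> &Module) {
  std::map<std::string, bool> Assumed;
  for (const auto &Entry : Module)
    Assumed[Entry.first] = true;  // optimistic initial state

  bool Changed = true;
  while (Changed) {  // one round is loosely one sweep of updates
    Changed = false;
    for (const auto &Entry : Module) {
      if (!Assumed[Entry.first])
        continue;  // already pessimistic, nothing left to lose
      bool StillOk = !Entry.second.HasThrowingInst;
      for (const std::string &Callee : Entry.second.Callees) {
        // "Ask" the callee's state; unknown (external) callees may throw.
        auto It = Assumed.find(Callee);
        if (It == Assumed.end() || !It->second)
          StillOk = false;
      }
      if (!StillOk) {
        Assumed[Entry.first] = false;
        Changed = true;
      }
    }
  }
  return Assumed;
}
```

States only ever move from true to false, so the loop has to stop. A
self-recursive function with nothing throwing inside keeps its optimistic
state, while a caller of an unknown external function loses it.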

On Mon, 16 Mar 2020 at 12:12 AM, Fahad Nayyar <fahad17049 at iiitd.ac.in>
wrote:
> Dear Stefan and Stefanos,
>
> Thanks for your suggestions!
>
> > I'd suggest that you try to run the Attributor and follow a specific
> > attribute's updates and see what it tries to deduce. That is, see its
> > updateImpl(). With a couple of prints you can get a good idea of what it
> > does and what info it gets from other attributes (and when it stops).
>
> I tried to do this for the NoUnwind attribute. I printed getState(),
> isAssumedNoUnwind(), and isKnownNoUnwind() in the updateImpl method of
> the classes AANoUnwind and AANoUnwindCallSite. I ran the tests in
> nounwind.ll (/llvm/test/Transforms/Attributor/nounwind.ll). I used this
> command to run the test: “opt -attributor -attributor-disable=false
> nounwind.ll -S &> nounwind_out.ll”. But after seeing the output I was not
> able to understand how the attribute is changing for the tests. Its
> status was almost constant every time updateImpl was called. Please tell
> me what other things I should try to print to better observe how the
> NoUnwind attribute changes over the iterations of the fixpoint analysis.
> Also, please verify whether I am using the correct command to run the
> tests.
>
> > Also, probably this will be a very interesting panel discussion for
> > you: https://www.youtube.com/watch?v=cC2cspQgSxM
>
> Thanks for suggesting this! I watched the video and now I understand the
> pros and cons of inlining. But I still think that it would take me a while
> before I can come up with a very good example demonstrating the use of
> the attributes in some IPO pass. I know how in [1] Johannes explained the
> use of MaxObjSize and Dereferenceable in the AliasAnalysis. But I would
> be happy if I could come up with some even better example.
>
> > You are somewhat right. However, H2S is not about 'use-after-free' bug
> > detection, but rather its prevention. We already do this, see example.
> > <https://godbolt.org/z/HgrC7H>
>
> Thanks for sharing the example. Just for clarification, was this example
> demonstrating the point that we can automatically correct use-after-free
> bugs using attributes? If yes, then I didn’t understand how, and which
> attribute helped in this correction. Also, is it not wrong to change the
> IR as in this example? Replacing
>     %1 = tail call noalias i8* @malloc(i64 4)
>     tail call void @no_sync_func(i8* %1)
> with
>     %1 = alloca i8, i64 4
> solved the use-after-free bug, but doesn’t it also change the semantics
> of the program?
>
> > In the meantime you could look at some TODOs in the Attributor itself
> > and try those you see fit.
>
> I looked up some of the TODOs. I found AAReachability a very interesting
> attribute. I can start with the TODO at line no. 2605  // TODO: Return
> the number of reachable queries. I can work towards this TODO. But I
> first want your advice on whether it looks doable for me. I can see that
> the implementation of the AAReachability attribute is not complete yet. I
> can try to learn more about it from D70233
> <https://reviews.llvm.org/D70233> and D71617
> <https://reviews.llvm.org/D71617>.
>
> I am trying to get more familiar with Attributor’s code. Since the code is
> very large right now, I thought to refer to some of the very initial
> patches of attributor (D59918 <https://reviews.llvm.org/D59918>, D60012
> <https://reviews.llvm.org/D60012>, D63379
> <https://reviews.llvm.org/D63379>). I believe that by looking at these
> three I can get a better idea of the framework as a whole. Please suggest
> if this is a good idea or not. Also please suggest any other way by which
> I can improve my understanding of the code.
>
> I can see that Johannes has put up some issues for GSOC aspirants. I
> think that [2] <https://github.com/llvm/llvm-project/issues/179>
> ([Attributor] Cleanup and upstream `Attribute::MaxObjectSize`) will be a
> very good issue for me. It seems doable and I can get familiar with the
> whole process of writing a patch for an issue. How should I indicate to
> the community that I have started working towards this issue (should I
> comment on the issue page on github?)? I can try to work on the
> AAReachability TODO after solving this issue.
>
> Thanks and Regards
>
>
> References
>
> [1] https://youtu.be/HVvvCSSLiTw
> [2] https://github.com/llvm/llvm-project/issues/179
>
>
>
>
> On Sat, Mar 14, 2020 at 4:12 PM Stefan Stipanovic
> <stefomeister at gmail.com> wrote:
>
>> Hi Fahad,
>>
>>
>>> > Improve dynamic memory related capabilities of Attributor. For
>>> > example Improve HeapToStackConversions. Maybe such deductions can help
>>> > safety (dis)provers. For example, can we improve the use-after-free
>>> > bug detection using some attributes?
>>>
>>> Stefan should know more about H2S. Regarding the use-after-free, I
>>> don't think there are currently any plans for it directly, but there
>>> can be, I assume.
>>
>>
>> You are somewhat right. However, H2S is not about 'use-after-free' bug
>> detection, but rather its prevention. We already do this, see example
>> <https://godbolt.org/z/HgrC7H>.
>>
>>> In the rest of this post I'll try to help you familiarize yourself with
>>> the Attributor and maybe answer your questions.
>>> Johannes can then give you specific things to do to get started.
>>
>>
>> In the meantime you could look at some TODOs in the Attributor itself and
>> try those you see fit.
>>
>> If you have any questions, don't hesitate to ask.
>>
>> -stefan
>>
>> On Fri, Mar 13, 2020 at 10:14 PM Stefanos Baziotis <
>> stefanos.baziotis at gmail.com> wrote:
>>
>>> Hi Fahad,
>>>
>>> We're all happy to see you being interested in LLVM! More so in the
>>> Attributor! I'm a relatively new contributor so I think I can help.
>>> Please note that the Attributor, apart from Johannes (who is CC'd), has
>>> at least another 2 great contributors, Hideto and Stefan (who I also
>>> CC'd). They were among the initial creators.
>>>
>>> In the rest of this post I'll try to help you familiarize yourself with
>>> the Attributor and maybe answer your questions.
>>> Johannes can then give you specific things to do to get started.
>>>
>>> Starting off, understanding the theory of data-flow analysis can help.
>>> I'd say don't get too hung up on it, you just have to understand the
>>> idea of fix-point analyses.
>>>
>>> I don't know how much you know about the Attributor, so I'll defer a too
>>> long (or too beginner) description because you might already know a lot
>>> of things. You can of course ask any specific questions you want.
>>> A summary is:
>>> The Attributor tries to deduce attributes at different points of an LLVM
>>> IR program (you can see that in the video).
>>> The deduction of these attributes is inter-connected, which is the whole
>>> point of the Attributor. The attributes "ask" one another for
>>> information. For example, one attribute tries to see if a load loads
>>> from a null pointer. But the pointer operand might be non-constant (like
>>> %v in LLVM IR). Well, another attribute, whose job is to do value
>>> simplification (i.e. constant folding / propagation etc.), might have
>>> folded that (%v) into the constant null. So, the former can ask it.
>>> These connections give the power and the complexity.
>>>
>>> The attributes have a state that changes. When the state stops
>>> changing, it has reached a fixpoint, at which point its deduction
>>> stops. From the initialization of the attribute until a fixpoint is
>>> reached, the state changes in updates (called updateImpl() in the source
>>> code). This is where attributes try to deduce new things, ask one
>>> another and eventually try to reach a fixpoint.
>>>
>>> Finally, a fixpoint can be enforced, because if for some reason we never
>>> stopped changing, it would run forever. Note however that attributes
>>> should be programmed in a way that a fixpoint can be reached (this is
>>> where theory might help a little).
>>>
>>> I'd suggest that you try to run the Attributor and follow a specific
>>> attribute's updates and see what it tries to deduce.
>>> That is, see its updateImpl(). With a couple of prints you can get a
>>> good idea of what it does and what info it gets from other attributes
>>> (and when it stops). You can of course ask us if you're interested in a
>>> specific one, if there's something you don't understand etc.
>>>
>>> Now, to (try to) answer your questions; hopefully other people can
>>> help too.
>>>
>>> > How Attributor can help for standard inter-procedural and
>>> > intra-procedural analysis passes of LLVM. I’ve seen the tutorial [4].
>>> > I would like to discuss ways of improving other optimization passes
>>> > similarly (or some examples which have already been implemented).
>>>
>>> The Attributor AFAIK is self-contained. It's not in "production" yet and
>>> so it's not connected with other passes. At this point, LLVM is focused
>>> on heavy inlining, which, while very useful, loses a lot of the
>>> interprocedural information.
>>> Note that there are other transforms that do Inter-Procedural
>>> Optimization
>>> (https://github.com/llvm/llvm-project/tree/master/llvm/lib/Transforms/IPO)
>>> but they don't follow the idea of the Attributor.
>>> But they might follow a fix-point analysis.
>>>
>>> > Improve dynamic memory related capabilities of Attributor. For
>>> > example Improve HeapToStackConversions. Maybe such deductions can help
>>> > safety (dis)provers. For example, can we improve the use-after-free
>>> > bug detection using some attributes?
>>>
>>> Stefan should know more about H2S. Regarding the use-after-free, I
>>> don't think there are currently any plans for it directly, but there
>>> can be, I assume.
>>>
>>> > Improve Liveness related capabilities of Attributor. Again I want to
>>> > consider whether some attribute deduction can help liveness
>>> > (dis)provers. For example NoReturn, WillReturn can be improved. I am
>>> > sure these 2 attributes do not cover all the cases as it is an
>>> > undecidable problem. But I was wondering whether there is room for
>>> > improvement in their deduction mechanism.
>>>
>>> Liveness is certainly something that we're currently trying to improve
>>> and I don't think we'll ever stop. Most of the attributes interact with
>>> the deadness attribute (AAIsDead), both for asking it info and for
>>> providing it info (e.g. the undefined-behavior attribute hopefully will
>>> at some point be able to tell AAIsDead that a block is dead because it
>>> contains UB).
>>>
>>> > Is there any attribute that tells whether a function has side-effects
>>> > (does it always give the same output for the same input? Or does it
>>> > affect some global variable directly or indirectly?)?
>>>
>>> No AFAIK, although you might be interested in this:
>>> https://reviews.llvm.org/D74691#1887983
>>>
>>> I hope this was helpful! Don't hesitate to ask any questions.
>>>
>>> Kind regards,
>>> Stefanos Baziotis
>>>
>>> On Fri, 13 Mar 2020 at 10:25 PM, Fahad Nayyar via llvm-dev
>>> <llvm-dev at lists.llvm.org> wrote:
>>>
>>>> Hi all,
>>>>
>>>> My name is Fahad Nayyar. I am an undergraduate student from India.
>>>>
>>>> I am interested in participating in GSOC under the project “Improve
>>>> inter-procedural analyses and optimizations”.
>>>>
>>>> I have been using LLVM for the past 8 months. I have written various
>>>> intra-procedural analyses in LLVM as FunctionPasses for my course
>>>> projects and research projects. But I’ve not contributed to the LLVM
>>>> community yet. I am very excited to contribute to LLVM!
>>>>
>>>> I am not too familiar with the inter-procedural analysis infrastructure
>>>> of LLVM. I have written small toy inter-procedural dataflow analyses
>>>> (like taint analysis, reaching definitions, etc.) for Java programs
>>>> using the Soot tool [5]. I am familiar with the theory of
>>>> inter-procedural analysis (I’ve read some chapters of [1], [2] and [3]
>>>> for this).
>>>>
>>>> I am trying to understand LLVM’s Attributor framework. I am
>>>> interested in these 3 aspects:
>>>>
>>>>    1. How Attributor can help for standard inter-procedural and
>>>>       intra-procedural analysis passes of LLVM. I’ve seen the tutorial
>>>>       [4]. I would like to discuss ways of improving other optimization
>>>>       passes similarly (or some examples which have already been
>>>>       implemented).
>>>>    2. Improve dynamic memory related capabilities of Attributor. For
>>>>       example Improve HeapToStackConversions. Maybe such deductions can
>>>>       help safety (dis)provers. For example, can we improve the
>>>>       use-after-free bug detection using some attributes?
>>>>    3. Improve Liveness related capabilities of Attributor. Again I want
>>>>       to consider whether some attribute deduction can help liveness
>>>>       (dis)provers. For example NoReturn, WillReturn can be improved. I
>>>>       am sure these 2 attributes do not cover all the cases as it is an
>>>>       undecidable problem. But I was wondering whether there is room
>>>>       for improvement in their deduction mechanism.
>>>>    4. Can we optimize the attribute deduction algorithm to reduce
>>>>       compile time?
>>>>    5. Is there any attribute that tells whether a function has
>>>>       side-effects (does it always give the same output for the same
>>>>       input? Or does it affect some global variable directly or
>>>>       indirectly?)?
>>>>
>>>>
>>>> It would be great if Johannes can provide me some TODOs before
>>>> submitting my proposal. Also please tell some specific IPO improvement
>>>> goals which you have in mind for this project. I would be most
>>>> interested in memory-related attributes, liveness deductions from
>>>> attributes and measurable better IPO using attribute deduction.
>>>>
>>>> Thanks and Regards.
>>>>
>>>> References:
>>>>
>>>> [1] Principles of Program Analysis.
>>>> <https://www.springer.com/gp/book/9783540654100>
>>>>
>>>> [2] Data Flow Analysis: Theory and Practice.
>>>> <https://dl.acm.org/doi/book/10.5555/1592955>
>>>>
>>>> [3] Static Program Analysis. <https://cs.au.dk/~amoeller/spa/spa.pdf>
>>>>
>>>> [4] 2019 LLVM Developers’ Meeting: J. Doerfert “The Attributor: A
>>>> Versatile Inter-procedural Fixpoint.."
>>>> <https://www.youtube.com/watch?v=HVvvCSSLiTw>
>>>>
>>>> [5] Soot - A Java optimization framework
>>>> <https://github.com/Sable/soot>

Fahad Nayyar via llvm-dev

2020-Mar-16 03:51 UTC

[llvm-dev] [GSOC] "Project: Improve inter-procedural analyses and optimisations"

Dear Stefanos,

Thanks for such a detailed explanation!
I'll have to study your mail and try out some things before I can ask
specific questions for further discussion.

But I want to discuss this point right away:

> I'd start by doing diagrams about how different parts of the code
> interact with each other (e.g. where does the Attributor start? what does
> it call then? how are attributes created? etc.) When starting out, these
> things might not be important to tackle. But they helped me.

I totally agree with your point about drawing diagrams of the different
parts of the code. I tried this but did not succeed. It would be very
helpful if you could tell me what the entry point of the Attributor is
(which method is called first?).
Also, is there any way to debug the opt command in a gdb-like fashion (i.e.
setting breakpoints and stepping through instructions one by one)?
This would help me a lot in the initial code reading period.

Thanks and Regards
Fahad Nayyar




On Mon, Mar 16, 2020 at 7:23 AM Stefanos Baziotis <
stefanos.baziotis at gmail.com> wrote:
> Hi Farad,
>
> > I tried to do this for the NoUnwind attribute Hmm, I don't have
> experience with this attribute but it seems like a good starting point
> since it doesn't do much. First of all, be sure that you run with: opt
> -passes=attributor -attributor-disable=false This uses the new pass
> manager which is another discussion. Now, to the point: If you open
> nounwind.ll, it has a bunch of test cases and I don't think it's a
good
> idea to run Attributor in all of them at first. So, break it into
> individual tests. First of all, note that the Attributor follows an
> optimistic path to attribute deduction. That is, you always start assuming
> that an attribute is valid until you have info that it isn't. Seeing
TEST 1
> should be relatively obvious that it doesn't have something that breaks
the
> initial assumption, that's why you see no change in state between calls
to
> updateImpl(). But now we have to dig deeper: What can break our initial
> assumption (which is that a function does not throw) in the case of
> NoUnwind? You see in the topmost updateImpl() that they're a bunch
Opcodes.
> Those are instructions (inside our function) that can potentially break our
> initial assumption (e.g. call to a function that either we know it throws
> or at least it is not guaranteed that it doesn't). In the updateImpl(),
> using checkForAllInstructions, we loop through all those instructions and
> call for each one the predicate CheckForNoUnwind. Note a couple of things:
> - TEST 1 has no such instructions whatsoever so if you put a print inside
> the predicate, you'll see nothing. - You can see that the predicate
returns
> a boolean value. If it's true, we continue to the next instruction,
> otherwise we stop. If we make it through all of them,
> checkForAllInstructions() returns true. Otherwise false. The predicate
> checks if any instruction breaks our assumption (we'll see shortly
how). If
> it does, we immediately indicate pessimistic fixpoint. What about TEST 2
> and 3 ? Those 2 functions have a call inside but if you put a print inside
> the predicate, you'll see again nothing. The reason for that brings us
to
> an important part of the Attributor and that is that "deadness"
(and the
> relative attribute) is very important. checkForAllInstructions() will only
> go through instructions that are considered live. The 2 calls are not
> considered because another part of the Attributor (which is out of topic
> right now) has (somewhat) deduced that they go in an endless recursion. If
> you run the Attributor with these 2 functions you'll see another
important
> point. Specifically that the function bodies have only an `unreachable`
> instruction inside. That is, the attributor not only deduces (and provides)
> info through attributes but it also transforms code. In this case it
> changed the function bodies to unreachable. Finally, I think it's
> interesting to see TEST 4: You see that it calls a function that we
don't
> know it doesn't throw. This should break our assumption. And it does.
> Inside the topmost updateImpl() and inside the predicate, if you put a
> print, you'll see the call (e.g. dbgs() << I <<
"\n"). We ask I.mayThrow().
> Note that mayThrow() will return false in the case that we're somehow
sure
> that the instruction does not throw. In this case the instruction is a
> call, so it will return true if we're sure that the called function
does
> not throw. But we're not, so we move forward. What then happens is a
little
> bit weird but it basically AANoUnwind asks AANoUnwindCallSite for info
> because this instruction is a call site. Attributes ask one another and
> this is very important in the Attributor because one's information is
> useful in another. We do it with getAAFor. Without getting into too much
> info, the other attribute says that it's not assumed unwind so we
indicate
> pessimistic fixpoint (the reality is that the order of calls between the
> attributes is reverse, but again, out of topic). I hope that gave you a
> better understanding! > I know how in [1] Johaanes explained the use of
> MaxObjSize and Dereferenceable in the AliasAnalysis. But I would be happy
> if I could come up with some even better example. As you said, that'll
> probably take some time but that's ok There are opportunities
everywhere.
> For example, consider this: https://godbolt.org/z/HFWo_J It's a for
loop
> that does a load inside from %p. The load seems to be invariant, we could
> move it out of the loop. -licm in the cmd arguments means it invokes the
> Loop-Invariant Code Motion pass, which does such things. But it doesn't
> move the load out. The reason for that is that consider the case where %n
> == 0 and %p == null. In the initial code, we would never get into the loop
> and we would not have a trap. While, with this transformation, we will have
> and thus, we just changed the initial behavior. So, the transformation is
> not done. However, if you put dereferenceable(4) attribute in %p, it will
> be done. Because now you now you can certainly dereference. So, attribute
> info is useful in ways we may not consider :) > I can start with TODO at
> line no. 2605  // TODO: Return the number of reachable queries. I'm not
> familiar with it. Currently it doesn't seem to do anything but I may
miss
> something. I suppose what it asks is to track how many queries to this
> attribute have been done by outside users which should be easy. > Since
> the code is very large right now, I thought to refer to some of the very
> initial patches of attributor Maybe, I don't know. But I assume things
> will have changed from then and you may get lost. I'd start by doing
> diagrams about how different parts of the code interact with each other
> (e.g. where does the Attributor start? what does it call then? how are
> attributes created? etc.) When starting out, these things might not be
> important to tackle. But they helped me. > How should I indicate to the
> community that I have started working towards this issue (should I comment
> on the issue page on github?)? I can try to work on AAReachability TODO
after
> solving this issue.
> You can write it in the Github comments. I don't think you can / need
to
> do something else. Kind regards, Stefanos Baziotis
>
> Στις Δευ, 16 Μαρ 2020 στις 12:12 π.μ., ο/η Fahad Nayyar <
> fahad17049 at iiitd.ac.in> έγραψε:
>
>> Dear Stefan and Stefanos,
>>
>> Thanks for your suggestions!
>>
>> > I'd suggest that you try to run the Attributor and follow a
specific
>> attribute's updates and see what it tries to deduce. That is, see
its
>> updateImpl(). With a couple of prints you can get a good idea of what
it
>> does and what info it gets from other attributes (and when it stops).
>>
>> I tried to do this for the NoUnwind attribute. I printed getState(), i
>> sAssumedNoUnwind(), isKnownNoUnwind() in updateImpl method of classes
>> AANoUnwind and AANoUnwindCallSite. I run the tests in nounwind.ll
(/llvm/test/Transforms/Attributor/nounwind.ll).
>> I used this command to run the test: “opt -attributor
>> -attributor-disable=false nounwind.ll -S &> nounwind_out.ll”.
But After
>> seeing the output I was not able to understand how the attribute is
>> changing for the tests. Its status was almost constant every time
updateimp
>> was called. Please tell me what other things should I try to print to
>> better observe how the NoUnwind attribute is changing over the
>> iterations of fix point analysis. Also please verify whether I am using
the
>> correct command to run the tests.
>>
>> > Also, probably this will be a very interesting panel discussion
for
>> you: https://www.youtube.com/watch?v=cC2cspQgSxM
>>
>> Thanks for suggesting this! I watched the video and now I understand
the
>> pros and cons of inlining. But I still think that It would take me a
while
>> before I can come up with a very good example demonstrating the use of
of
>> of the Attribues in some IPO pass. I know how in [1] Johaanes explained
>> the use of MaxObjSize and Dereferenceable in the AliasAnalysis. But I
>> would be happy if I could come up with some even better example.
>>
>> > You are somewhat right. However, H2S is not about
'use-after-free' bug
>> detection, but rather its prevention. We already do this, see example.
>> <https://godbolt.org/z/HgrC7H>
>>
>> Thanks for sharing the example. Just for clarification, was this
example
>> demonstrating the point that we can automatically correct
use-after-free
>> bugs using attributes? If yes, then I didn’t understand how and which
>> attribute helped in this correction? Also is it not wrong to change the
IR
>> as in this example? Replacing %1 = tail call noalias i8* @malloc(i64 4)
>> ;  tail call void @no_sync_func(i8* %1) with  %1 = alloca i8, i64 4
solved
>> the use-after-free bug, but doesn’t it also change the semantic of the
>> program?
>>
>> > In the meantime you could look at some TODOs in the Attributor
itself
>> and try those you see fit.
>>
>> I looked up some of the TODOs. I found AAReachability a very
interesting
>> attribute. I can start with TODO at line no. 2605  // TODO: Return the
>> number of reachable queries. I can work towards this TODO. But I first
>> want your advice on whether it looks doable for me. I can see that
>> implementation of AAReachability attribute is not complete yet. I can
>> try to learn more about it from D70233
<https://reviews.llvm.org/D70233>
>> and D71617 <https://reviews.llvm.org/D71617>.
>>
>> I am trying to get more familiar with the Attributor’s code. Since the code
>> is very large right now, I thought to refer to some of the very initial
>> patches of the Attributor (D59918 <https://reviews.llvm.org/D59918>,
>> D60012 <https://reviews.llvm.org/D60012>, D63379
>> <https://reviews.llvm.org/D63379>). I believe that by looking at these
>> three I can get a better idea of the framework as a whole. Please suggest
>> if this is a good idea or not. Also please suggest any other way by which I
>> can improve my understanding of the code.
>>
>> I can see that Johannes has put up some issues for GSoC aspirants. I
>> think that [2] <https://github.com/llvm/llvm-project/issues/179>
>> ([Attributor] Cleanup and upstream `Attribute::MaxObjectSize`) will be a
>> very good issue for me. It seems doable and I can get familiar with the
>> whole process of writing a patch for an issue. How should I indicate to the
>> community that I have started working towards this issue (should I comment
>> on the issue page on GitHub?)? I can try to work on the AAReachability TODO
>> after solving this issue.
>>
>> Thanks and Regards
>>
>>
>> References
>>
>> [1] https://youtu.be/HVvvCSSLiTw
>> [2] https://github.com/llvm/llvm-project/issues/179
>>
>>
>>
>>
>> On Sat, Mar 14, 2020 at 4:12 PM Stefan Stipanovic
>> <stefomeister at gmail.com> wrote:
>>
>>> Hi Fahad,
>>>
>>>
>>>> > Improve dynamic memory related capabilities of Attributor. For
>>>> example Improve HeapToStackConversions. Maybe such deductions can help
>>>> safety (dis)provers. For example, can we improve the use-after-free
>>>> bug detection using some attributes?
>>>> Stefan should know more about H2S. Regarding the use-after-free, I
>>>> don't think there are currently any plans for it directly, but there
>>>> can be, I assume.
>>>
>>>
>>> You are somewhat right. However, H2S is not about 'use-after-free' bug
>>> detection, but rather its prevention. We already do this, see example
>>> <https://godbolt.org/z/HgrC7H>.
>>>
>>>> In the rest of this post I'll try to help you familiarize yourself with
>>>> the Attributor and maybe answer your questions.
>>>> Johannes can then give you specific things to do to get started.
>>>
>>>
>>> In the meantime you could look at some TODOs in the Attributor itself
>>> and try those you see fit.
>>>
>>> If you have any questions, don't hesitate to ask.
>>>
>>> -stefan
>>>
>>> On Fri, Mar 13, 2020 at 10:14 PM Stefanos Baziotis <
>>> stefanos.baziotis at gmail.com> wrote:
>>>
>>>> Hi Fahad,
>>>>
>>>> We're all happy to see you being interested in LLVM! More so in the
>>>> Attributor! I'm a relatively new contributor so I think I can help.
>>>> Please note that the Attributor, apart from Johannes (who is CC'd), has
>>>> at least another 2 great contributors, Hideto and Stefan (who I also
>>>> CC'd). They were among the initial creators.
>>>>
>>>> In the rest of this post I'll try to help you familiarize yourself with
>>>> the Attributor and maybe answer your questions.
>>>> Johannes can then give you specific things to do to get started.
>>>>
>>>> Starting off, understanding the theory of data-flow analysis can help.
>>>> I'd say don't get too hung up on it; you just have to understand the
>>>> idea of fix-point analyses.
>>>>
>>>> I don't know how much you know about the Attributor, so I'll defer a too
>>>> long (or too beginner) description because you might already know
>>>> a lot of things. You can of course ask any specific questions you want.
>>>> A summary is:
>>>> The Attributor tries to deduce attributes in different points of an
>>>> LLVM IR program (you can see that in the video).
>>>> The deduction of these attributes is inter-connected, which is the
>>>> whole point of the Attributor. The attributes "ask" one another for
>>>> information. For example, one attribute tries to see if a load loads
>>>> from a null pointer.
>>>> But the pointer operand might be non-constant (like %v in LLVM IR).
>>>> Well, another attribute, whose job is to do value simplification
>>>> (i.e. constant folding / propagation etc.) might have folded that (%v)
>>>> into the constant null. So, the former can ask the latter.
>>>> These connections give the power and the complexity.
>>>>
>>>> The attributes have a state that changes. When the state stops
>>>> changing, it has reached a fixpoint, at which point the deduction of it
>>>> stops. From the initialization of the attribute until a fixpoint is
>>>> reached, the state changes in updates (called updateImpl() in the
>>>> source code). This is where attributes try to deduce new things, ask
>>>> one another and eventually try to reach a fixpoint.
>>>>
>>>> Finally, a fixpoint can be enforced, because if the state for some
>>>> reason never stopped changing, the deduction would run forever.
>>>> Note however that attributes should be programmed in a way that a
>>>> fixpoint can be reached (this is where theory might help a little).
>>>>
>>>> I'd suggest that you try to run the Attributor and follow a specific
>>>> attribute's updates and see what it tries to deduce.
>>>> That is, see its updateImpl(). With a couple of prints you can get a
>>>> good idea of what it does and what info it gets from other attributes
>>>> (and when it stops). You can of course ask us if you're interested in a
>>>> specific one, if there's something you don't understand etc.
>>>>
>>>> Now, to (try to) answer your questions and hopefully other people can
>>>> help.
>>>> > How Attributor can help for standard inter-procedural and
>>>> intra-procedural analysis passes of LLVM. I’ve seen the tutorial [4].
>>>> I would like to discuss ways of improving other optimization passes
>>>> similarly (or some examples which have already been implemented).
>>>>
>>>> The Attributor AFAIK is self-contained. It's not in "production" yet
>>>> and so it's not connected with other passes. At this point, LLVM is
>>>> focused on heavy inlining, with which, while very useful, you lose a lot
>>>> of the interprocedural information.
>>>> Note that there are other transforms that do Inter-Procedural
>>>> Optimization
>>>> (https://github.com/llvm/llvm-project/tree/master/llvm/lib/Transforms/IPO)
>>>> but they don't follow the idea of the Attributor.
>>>> But they might follow a fix-point analysis.
>>>>
>>>> > Improve dynamic memory related capabilities of Attributor. For
>>>> example Improve HeapToStackConversions. Maybe such deductions can help
>>>> safety (dis)provers. For example, can we improve the use-after-free
>>>> bug detection using some attributes?
>>>> Stefan should know more about H2S. Regarding the use-after-free, I
>>>> don't think there are currently any plans for it directly, but there
>>>> can be, I assume.
>>>>
>>>> > Improve Liveness related capabilities of Attributor. Again I want to
>>>> consider whether some attribute deduction can help liveness (dis)provers.
>>>> For example NoReturn, WillReturn can be improved. I am sure these 2
>>>> attributes do not cover all the cases as it is an undecidable problem.
>>>> But I was wondering whether there is room for improvement in their
>>>> deduction mechanism.
>>>>
>>>> Liveness is certainly something that we're currently trying to improve
>>>> and I don't think we'll ever stop. Most of the attributes interact with
>>>> the deadness attribute (AAIsDead) both for asking it info and providing
>>>> it info (i.e. the undefined-behavior attribute hopefully will at some
>>>> point be able to tell AAIsDead that a block is dead because it contains
>>>> UB).
>>>>
>>>> > Is there any attribute that tells whether a function has side-effects
>>>> (does it always give the same output for the same input? Or does it
>>>> affect some global variable directly or indirectly?)?
>>>>
>>>> No, AFAIK, although you might be interested in this:
>>>> https://reviews.llvm.org/D74691#1887983
>>>>
>>>> I hope this was helpful! Don't hesitate to ask any questions.
>>>>
>>>> Kind regards,
>>>> Stefanos Baziotis
>>>>
>>>> On Fri, Mar 13, 2020 at 10:25 PM, Fahad Nayyar via llvm-dev
>>>> <llvm-dev at lists.llvm.org> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> My name is Fahad Nayyar. I am an undergraduate student from India.
>>>>>
>>>>> I am interested to participate in GSOC under the project “Improve
>>>>> inter-procedural analyses and optimizations”.
>>>>>
>>>>> I have been using LLVM for the past 8 months. I have written various
>>>>> intra-procedural analyses in LLVM as FunctionPass for my course
>>>>> projects and research projects. But I’ve not contributed to the LLVM
>>>>> community yet. I am very excited to contribute to LLVM!
>>>>>
>>>>> I am not too familiar with the inter-procedural analysis
>>>>> infrastructure of LLVM. I have written small toy inter-procedural
>>>>> dataflow analyses (like taint analysis, reaching definitions, etc.) for
>>>>> Java programs using the Soot tool [5]. I am familiar with the theory of
>>>>> inter-procedural analysis (I’ve read some chapters of [1], [2] and
>>>>> [3] for this).
>>>>>
>>>>> I am trying to understand LLVM’s Attributor framework. I am
>>>>> interested in these 5 aspects:
>>>>>
>>>>>    1. How Attributor can help for standard inter-procedural and
>>>>>    intra-procedural analysis passes of LLVM. I’ve seen the tutorial
>>>>>    [4]. I would like to discuss ways of improving other optimization
>>>>>    passes similarly (or some examples which have already been
>>>>>    implemented).
>>>>>
>>>>>    2. Improve dynamic memory related capabilities of Attributor. For
>>>>>    example Improve HeapToStackConversions. Maybe such deductions can
>>>>>    help safety (dis)provers. For example, can we improve the
>>>>>    use-after-free bug detection using some attributes?
>>>>>
>>>>>    3. Improve Liveness related capabilities of Attributor. Again I want
>>>>>    to consider whether some attribute deduction can help liveness
>>>>>    (dis)provers. For example NoReturn, WillReturn can be improved. I
>>>>>    am sure these 2 attributes do not cover all the cases as it is an
>>>>>    undecidable problem. But I was wondering whether there is room for
>>>>>    improvement in their deduction mechanism.
>>>>>
>>>>>    4. Can we optimize the attribute deduction algorithm to reduce
>>>>>    compile time?
>>>>>
>>>>>    5. Is there any attribute that tells whether a function has
>>>>>    side-effects (does it always give the same output for the same
>>>>>    input? Or does it affect some global variable directly or
>>>>>    indirectly?)?
>>>>>
>>>>>
>>>>> It would be great if Johannes can provide me some TODOs before
>>>>> submitting my proposal. Also please tell some specific IPO improvement
>>>>> goals which you have in mind for this project. I would be most
>>>>> interested in memory-related attributes, liveness deductions from
>>>>> attributes and measurable better IPO using attribute deduction.
>>>>>
>>>>> Thanks and Regards.
>>>>>
>>>>> References:
>>>>>
>>>>> [1] Principles of Program Analysis.
>>>>> <https://www.springer.com/gp/book/9783540654100>
>>>>>
>>>>> [2] Data Flow Analysis: Theory and Practice.
>>>>> <https://dl.acm.org/doi/book/10.5555/1592955>
>>>>>
>>>>> [3] Static Program Analysis.
>>>>> <https://cs.au.dk/~amoeller/spa/spa.pdf>
>>>>>
>>>>> [4] 2019 LLVM Developers’ Meeting: J. Doerfert “The Attributor: A
>>>>> Versatile Inter-procedural Fixpoint.."
>>>>> <https://www.youtube.com/watch?v=HVvvCSSLiTw>
>>>>>
>>>>> [5] Soot - A Java optimization framework
>>>>> <https://github.com/Sable/soot>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> llvm-dev at lists.llvm.org
>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>

Stefanos Baziotis via llvm-dev

2020-Mar-16 14:11 UTC

[llvm-dev] [GSOC] "Project: Improve inter-procedural analyses and optimisations"

> Thanks for such a detailed explanation!
> I'll have to study your mail and try out some things before I can ask
> specific questions for further discussion.

No problem, follow-up with questions if any.

> I totally agree with your point of drawing diagrams about different parts
> of the code. I tried this but was not able to succeed. It would be very
> helpful if you can tell me what is the entry point of Attributor (which
> method is called the very first time?)?

It's runAttributorOnFunctions. I'm not sure it's good to start seeing the
whole picture right now, but in any case, you can follow that if at any
point you want.

> Also, is there any way we can debug opt command in gdb like fashion (ie.
> setting breakpoints, stepping through instructions one by one?)?

It depends on how you built LLVM. If you built it with
-DCMAKE_BUILD_TYPE=Release, you can't. However note that if you built it in
Release with asserts on, i.e. -DCMAKE_BUILD_TYPE=Release
-DLLVM_ENABLE_ASSERTIONS=On, you'll have the -debug option in opt, which can
be useful at times. To be able to have debug info, you have to build it with
RelWithDebInfo or Debug (in CMAKE_BUILD_TYPE, check:
https://llvm.org/docs/CMake.html). Although in the former, even if you
enable assertions, for some reason you don't have -debug last time I
checked. Note that in both of these settings, the binary size (and the
compilation time) grow a lot (i.e. several tens of gigabytes in Debug IIRC).

Best,
Stefanos Baziotis
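
The build options discussed in this reply could be put together roughly as follows. This is only a sketch under several assumptions that are not from the thread: the commands are run from the root of an llvm-project checkout, CMake 3.13 or newer and Ninja are installed, and the build directory names (build-release-asserts, build-debug) are placeholders. The opt flags, the test file name and the function name are the ones mentioned in the thread.

```shell
# Release build with assertions: no debug info for gdb,
# but opt accepts the -debug option.
cmake -G Ninja -S llvm -B build-release-asserts \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_ENABLE_ASSERTIONS=On
ninja -C build-release-asserts opt

# Debug build: much larger and slower to build,
# but gdb can set breakpoints and single-step.
cmake -G Ninja -S llvm -B build-debug \
  -DCMAKE_BUILD_TYPE=Debug
ninja -C build-debug opt

# Start opt under gdb on one test file, then break at the
# Attributor entry point named above:
#   (gdb) break runAttributorOnFunctions
#   (gdb) run
gdb --args build-debug/bin/opt -passes=attributor \
  -attributor-disable=false -S nounwind.ll
```

Keeping the two configurations in separate build directories avoids reconfiguring one tree back and forth, at the cost of the extra disk space mentioned above.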


On Mon, Mar 16, 2020 at 5:52 AM, Fahad Nayyar
<fahad17049 at iiitd.ac.in> wrote:
> Dear Stefanos,
>
> Thanks for such a detailed explanation!
> I'll have to study your mail and try out some things before I can ask
> specific questions for further discussion.
>
> But I want to discuss this point right away:
>
> *> I'd start by doing diagrams about how different parts of the code
> interact with each other (e.g. where does the Attributor start? what does
> it call then? how are attributes created? etc.) When starting out, these
> things might not be important to tackle. But they helped me.*
>
> I totally agree with your point of drawing diagrams about different parts
> of the code. I tried this but was not able to succeed. It would be very
> helpful if you can tell me what is the entry point of Attributor (which
> method is called the very first time?)?
> Also, is there any way we can debug opt command in gdb like fashion (ie.
> setting breakpoints, stepping through instructions one by one?)?
> This would help me a lot in the initial code reading period.
>
> Thanks and Regards
> Fahad Nayyar
>
>
>
>
> On Mon, Mar 16, 2020 at 7:23 AM Stefanos Baziotis <
> stefanos.baziotis at gmail.com> wrote:
>
>> Hi Farad,
>>
>> > I tried to do this for the NoUnwind attribute Hmm, I don't
have
>> experience with this attribute but it seems like a good starting point
>> since it doesn't do much. First of all, be sure that you run with:
opt
>> -passes=attributor -attributor-disable=false This uses the new pass
>> manager which is another discussion. Now, to the point: If you open
>> nounwind.ll, it has a bunch of test cases and I don't think
it's a good
>> idea to run Attributor in all of them at first. So, break it into
>> individual tests. First of all, note that the Attributor follows an
>> optimistic path to attribute deduction. That is, you always start
assuming
>> that an attribute is valid until you have info that it isn't.
Seeing TEST 1
>> should be relatively obvious that it doesn't have something that
breaks the
>> initial assumption, that's why you see no change in state between
calls to
>> updateImpl(). But now we have to dig deeper: What can break our initial
>> assumption (which is that a function does not throw) in the case of
>> NoUnwind? You see in the topmost updateImpl() that they're a bunch
Opcodes.
>> Those are instructions (inside our function) that can potentially break
our
>> initial assumption (e.g. call to a function that either we know it
throws
>> or at least it is not guaranteed that it doesn't). In the
updateImpl(),
>> using checkForAllInstructions, we loop through all those instructions
and
>> call for each one the predicate CheckForNoUnwind. Note a couple of
things:
>> - TEST 1 has no such instructions whatsoever so if you put a print
inside
>> the predicate, you'll see nothing. - You can see that the predicate
returns
>> a boolean value. If it's true, we continue to the next instruction,
>> otherwise we stop. If we make it through all of them,
>> checkForAllInstructions() returns true. Otherwise false. The predicate
>> checks if any instruction breaks our assumption (we'll see shortly
how). If
>> it does, we immediately indicate pessimistic fixpoint. What about TEST
2
>> and 3 ? Those 2 functions have a call inside but if you put a print
inside
>> the predicate, you'll see again nothing. The reason for that brings
us to
>> an important part of the Attributor and that is that
"deadness" (and the
>> relative attribute) is very important. checkForAllInstructions() will
only
>> go through instructions that are considered live. The 2 calls are not
>> considered because another part of the Attributor (which is out of
topic
>> right now) has (somewhat) deduced that they go in an endless recursion.
If
>> you run the Attributor with these 2 functions you'll see another
important
>> point. Specifically that the function bodies have only an `unreachable`
>> instruction inside. That is, the attributor not only deduces (and
provides)
>> info through attributes but it also transforms code. In this case it
>> changed the function bodies to unreachable. Finally, I think it's
>> interesting to see TEST 4: You see that it calls a function that we
don't
>> know it doesn't throw. This should break our assumption. And it
does.
>> Inside the topmost updateImpl() and inside the predicate, if you put a
>> print, you'll see the call (e.g. dbgs() << I <<
"\n"). We ask I.mayThrow().
>> Note that mayThrow() will return false in the case that we're
somehow sure
>> that the instruction does not throw. In this case the instruction is a
>> call, so it will return true if we're sure that the called function
does
>> not throw. But we're not, so we move forward. What then happens is
a little
>> bit weird but it basically AANoUnwind asks AANoUnwindCallSite for info
>> because this instruction is a call site. Attributes ask one another and
>> this is very important in the Attributor because one's information
is
>> useful in another. We do it with getAAFor. Without getting into too
much
>> info, the other attribute says that it's not assumed unwind so we
indicate
>> pessimistic fixpoint (the reality is that the order of calls between
the
>> attributes is reverse, but again, out of topic). I hope that gave you a
>> better understanding! > I know how in [1] Johaanes explained the use
of
>> MaxObjSize and Dereferenceable in the AliasAnalysis. But I would be
>> happy if I could come up with some even better example. As you said,
>> that'll probably take some time but that's ok There are
opportunities
>> everywhere. For example, consider this: https://godbolt.org/z/HFWo_J
>> It's a for loop that does a load inside from %p. The load seems to
be
>> invariant, we could move it out of the loop. -licm in the cmd arguments
>> means it invokes the Loop-Invariant Code Motion pass, which does such
>> things. But it doesn't move the load out. The reason for that is
that
>> consider the case where %n == 0 and %p == null. In the initial code, we
>> would never get into the loop and we would not have a trap. While, with
>> this transformation, we will have and thus, we just changed the initial
>> behavior. So, the transformation is not done. However, if you put
>> dereferenceable(4) attribute in %p, it will be done. Because now you
now
>> you can certainly dereference. So, attribute info is useful in ways we
may
>> not consider :) > I can start with TODO at line no. 2605  // TODO:
>> Return the number of reachable queries. I'm not familiar with it.
>> Currently it doesn't seem to do anything but I may miss something.
I
>> suppose what it asks is to track how many queries to this attribute
have
>> been done by outside users which should be easy. > Since the code is
>> very large right now, I thought to refer to some of the very initial
>> patches of attributor Maybe, I don't know. But I assume things will
have
>> changed from then and you may get lost. I'd start by doing diagrams
about
>> how different parts of the code interact with each other (e.g. where
does
>> the Attributor start? what does it call then? how are attributes
created?
>> etc.) When starting out, these things might not be important to tackle.
But
>> they helped me. > How should I indicate to the community that I have
>> started working towards this issue (should I comment on the issue page
on
>> github?)? I can try to work on AAReachability TODO after solving this
>> issue.
>> You can write it in the Github comments. I don't think you can /
need to
>> do something else. Kind regards, Stefanos Baziotis
>>
>> Στις Δευ, 16 Μαρ 2020 στις 12:12 π.μ., ο/η Fahad Nayyar <
>> fahad17049 at iiitd.ac.in> έγραψε:
>>
>>> Dear Stefan and Stefanos,
>>>
>>> Thanks for your suggestions!
>>>
>>> > I'd suggest that you try to run the Attributor and follow
a specific
>>> attribute's updates and see what it tries to deduce. That is,
see its
>>> updateImpl(). With a couple of prints you can get a good idea of
what it
>>> does and what info it gets from other attributes (and when it
stops).
>>>
>>> I tried to do this for the NoUnwind attribute. I printed
getState(), i
>>> sAssumedNoUnwind(), isKnownNoUnwind() in updateImpl method of
classes
>>> AANoUnwind and AANoUnwindCallSite. I run the tests in nounwind.ll
(/llvm/test/Transforms/Attributor/nounwind.ll).
>>> I used this command to run the test: “opt -attributor
>>> -attributor-disable=false nounwind.ll -S &>
nounwind_out.ll”. But After
>>> seeing the output I was not able to understand how the attribute is
>>> changing for the tests. Its status was almost constant every time
updateimp
>>> was called. Please tell me what other things should I try to print
to
>>> better observe how the NoUnwind attribute is changing over the
>>> iterations of fix point analysis. Also please verify whether I am
using the
>>> correct command to run the tests.
>>>
>>> > Also, probably this will be a very interesting panel
discussion for
>>> you: https://www.youtube.com/watch?v=cC2cspQgSxM
>>>
>>> Thanks for suggesting this! I watched the video and now I
understand the
>>> pros and cons of inlining. But I still think that It would take me
a while
>>> before I can come up with a very good example demonstrating the use
of of
>>> of the Attribues in some IPO pass. I know how in [1] Johaanes
explained
>>> the use of MaxObjSize and Dereferenceable in the AliasAnalysis. But
I
>>> would be happy if I could come up with some even better example.
>>>
>>> > You are somewhat right. However, H2S is not about
'use-after-free' bug
>>> detection, but rather its prevention. We already do this, see
example.
>>> <https://godbolt.org/z/HgrC7H>
>>>
>>> Thanks for sharing the example. Just for clarification, was this
example
>>> demonstrating the point that we can automatically correct
use-after-free
>>> bugs using attributes? If yes, then I didn’t understand how and
which
>>> attribute helped in this correction? Also is it not wrong to change
the IR
>>> as in this example? Replacing %1 = tail call noalias i8*
@malloc(i64 4)
>>> ;  tail call void @no_sync_func(i8* %1) with  %1 = alloca i8, i64 4
solved
>>> the use-after-free bug, but doesn’t it also change the semantic of
the
>>> program?
>>>
>>> > In the meantime you could look at some TODOs in the Attributor
itself
>>> and try those you see fit.
>>>
>>> I looked up some of the TODOs. I found AAReachability a very
>>> interesting attribute. I can start with TODO at line no. 2605  //
TODO:
>>> Return the number of reachable queries. I can work towards this
TODO.
>>> But I first want your advice on whether it looks doable for me. I
can see
>>> that implementation of AAReachability attribute is not complete
yet. I
>>> can try to learn more about it from D70233
>>> <https://reviews.llvm.org/D70233>and D71617
>>> <https://reviews.llvm.org/D71617>.
>>>
>>> I am trying to get more familiar with Attributor’s code. Since the
code
>>> is very large right now, I thought to refer to some of the very
initial
>>> patches of attributor (D59918,
<https://reviews.llvm.org/D59918> D60012
>>> <https://reviews.llvm.org/D60012>, D63379
>>> <https://reviews.llvm.org/D63379>). I believe that by looking
at these
>>> three I can get a better idea of the framework as a whole. Please
suggest
>>> if this is a good idea or not. Also please suggest any other way by
which I
>>> can improve my understanding of the code.
>>>
>>> I can see that Johanned have put up some issues for GSOC aspirants.
I
>>> think that [2]
<https://github.com/llvm/llvm-project/issues/179> ([Attributor]
>>> Cleanup and upstream `Attribute::MaxObjectSize`) will be a very
good
>>> issue for me, It seems doable and I can get familiar with the whole
process
>>> of writing a patch for an issue. How should I indicate to the
community
>>> that I have started working towards this issue (should I comment on
the
>>> issue page on github?)? I can try to work on AAReachability TODO
after
>>> solving this issue.
>>>
>>> Thanks and Regards
>>>
>>>
>>> References
>>>
>>> [1] https://youtu.be/HVvvCSSLiTw
>>> [2] https://github.com/llvm/llvm-project/issues/179
>>>
>>>
>>>
>>>
>>> On Sat, Mar 14, 2020 at 4:12 PM Stefan Stipanovic <
>>> stefomeister at gmail.com> wrote:
>>>
>>>> Hi Fahad,
>>>>
>>>>
>>>>> > Improve dynamic memory related capabilities of
Attributor. For
>>>>> example Improve HeapToStackConversions. Maybe such
deductions can
>>>>> help safety (dis)provers. For example, can we improve the
use-after-free
>>>>> bug detection using some attributes?
>>>>> Stefan should know more about H2S. Regarding the
use-after-free, I
>>>>> don't think there's currently any plans for it
directly, but they can be I
>>>>> assume.
>>>>
>>>>
>>>> You are somewhat right. However, H2S is not about
'use-after-free' bug
>>>> detection, but rather its prevention. We already do this, see
example
>>>> <https://godbolt.org/z/HgrC7H>.
>>>>
>>>> In the rest of this post I'll try to help you familiarize
yourself with
>>>>> the Attributor and maybe answer your questions.
>>>>> Johannes can then give you specific things to do to get
started.
>>>>
>>>>
>>>> In the meantime you could look at some TODOs in the Attributor
itself
>>>> and try those you see fit.
>>>>
>>>> If you have any questions, don't hesitate to ask.
>>>>
>>>> -stefan
>>>>
>>>> On Fri, Mar 13, 2020 at 10:14 PM Stefanos Baziotis <
>>>> stefanos.baziotis at gmail.com> wrote:
>>>>
>>>>> Hi Fahad,
>>>>>
>>>>> We're all happy to see you being interested in LLVM!
More so in the
>>>>> Attributor! I'm a relatively new contributor so I
>>>>> think I can help. Please note that the Attributor, apart
from Johannes
>>>>> (who CC'd), has at least another 2 great
>>>>> contributors, Hideto and Stefan (who I also CC'd). They
were among the
>>>>> initial creators.
>>>>>
>>>>> In the rest of this post I'll try to help you
familiarize yourself
>>>>> with the Attributor and maybe answer your questions.
>>>>> Johannes can then give you specific things to do to get
started.
>>>>>
>>>>> Starting off, understanding the theory of data-flow
analysis can help.
>>>>> I'd say don't get too hang up on it, you just
>>>>> have to understand the idea of fix-point analyses.
>>>>>
>>>>> I don't how much you know about the Attributor, so
I'll defer a too
>>>>> long (or too beginner) description because you might
already know
>>>>> a lot of things. You can of course any specific questions
you want:
>>>>> A summary is:
>>>>> The Attributor tries to deduce attributes in different
points of an
>>>>> LLVM IR program (you can see that in the video).
>>>>> The deduction of these attributes is inter-connected, which
is the
>>>>> whole point of the Attributor. The attributes
>>>>> "ask" one another for information. For example,
one attribute tries to
>>>>> see if a load loads from null pointer.
>>>>> But the pointer operand might be non-constant (like %v in
LLVM IR).
>>>>> Well, another attribute, whose job is to do value
simplification
>>>>> (i.e. constant folding / propagation etc.) might have
folded that (%v)
>>>>> into the constant null. So, the former can ask him.
>>>>> These connections give the power and the complexity.
>>>>>
>>>>> The attributes have a state, that changes. When the state
stops
>>>>> changing, it has reached a fixpoint, at which point
>>>>> the deduction of it stops. From the initialization of the
attribute
>>>>> until a fixpoint is reached, the state changes
>>>>> in updates (called updateImpl() in the source code). This
is where
>>>>> attributes try to deduce new things, ask one another
>>>>> and eventually try to reach a fixpoint.
>>>>>
>>>>> Finally, a fixpoint can be enforced. Because if we for some
reason
>>>>> never stop changing, it would run forever.
>>>>> Note however that attributes should be programmed in a way
that
>>>>> fixpoint should be able to be reached
>>>>> (This is where theory might help a little).
>>>>>
>>>>> I'd suggest that you try to run the Attributor and
follow a specific
>>>>> attribute's updates and see what it tries to deduce.
>>>>> That is, see its updateImpl(). With a couple of prints you
can get a
>>>>> good idea of what it does and what info it
>>>>> gets from other attributes (and when it stops). You can of
course ask
>>>>> us if you're interested in a specific one, if
>>>>> there's something you don't understand etc.
>>>>>
>>>>> Now, to (try to) answer your questions and hopefully other
people can
>>>>> help.
>>>>> > How Attributor can help for standard inter-procedural
and
>>>>> intra-procedural analysis passes of LLVm. I’ve seen the
tutorial [4].
>>>>> I would like to discuss ways of improving other
optimization passes
>>>>> similarly (or some examples which have already been
implemented).
>>>>>
>>>>> The Attributor AFAIK is self-contained. It's not in
"production" yet
>>>>> and so it's not connected with other passes. At this
point, LLVM is focused
>>>>> on heavy inlining, which while very useful, you'll lose
a lot of the
>>>>> interprocedural information.
>>>>> Note that there are other transforms that do
Inter-Procedural
>>>>> Optimization (
>>>>>
https://github.com/llvm/llvm-project/tree/master/llvm/lib/Transforms/IPO)
>>>>> but they don't follow the idea of the Attributor.
>>>>> But they might follow a fix-point analysis.
>>>>>
>>>>> > Improve dynamic memory related capabilities of
Attributor. For
>>>>> example Improve HeapToStackConversions. Maybe such
deductions can
>>>>> help safety (dis)provers. For example, can we improve the
use-after-free
>>>>> bug detection using some attributes?
>>>>> Stefan should know more about H2S. Regarding the
use-after-free, I
>>>>> don't think there's currently any plans for it
directly, but they can be I
>>>>> assume.
>>>>>
>>>>> > Improve Liveness related capabilities of Attributor.
Again I want
>>>>> to consider whether some attribute deduction can help
liveness
>>>>> (dis)provers. For example NoReturn, WillReturn can be
improved. I am
>>>>> sure these 2 attributes do not cover all the cases as it is
an undecidable
>>>>> problem. But I was wondering whether there is room for
improvement in their
>>>>> deduction mechanism. Liveness is certainly something that
we're currently
>>>>> trying to improve and I don't think we'll ever
stop. Most of the attributes
>>>>> interact with the deadness attribute (AAIsDead) both for
asking it info and
>>>>> providing it info (i.e. the undefined-behavior attribute
hopefully will at
>>>>> some point be able to tell AAIsDead that a block is dead
because it
>>>>> contains UB). > Is there any attribute that tells
whether a function
>>>>> has side-effects (does it always gives the same output for
the same
>>>>> input? Or does it affect some global variable directly or
indirectly?)? No
>>>>> AFAIK, although you might be interested in this:
>>>>> https://reviews.llvm.org/D74691#1887983
>>>>>
>>>>> I hope this was helpful! Don't hesitate to ask any questions.
>>>>>
>>>>> Kind regards,
>>>>> Stefanos Baziotis
>>>>>
>>>>> On Fri, 13 Mar 2020 at 10:25 PM, Fahad Nayyar via llvm-dev
>>>>> <llvm-dev at lists.llvm.org> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> My name is Fahad Nayyar. I am an undergraduate student from India.
>>>>>>
>>>>>> I am interested in participating in GSOC under the project “Improve
>>>>>> inter-procedural analyses and optimizations”.
>>>>>>
>>>>>> I have been using LLVM for the past 8 months. I have written various
>>>>>> intra-procedural analyses in LLVM as FunctionPasses for my course
>>>>>> projects and research projects. But I’ve not contributed to the LLVM
>>>>>> community yet. I am very excited to contribute to LLVM!
>>>>>>
>>>>>> I am not too familiar with the inter-procedural analysis
>>>>>> infrastructure of LLVM. I have written small toy inter-procedural
>>>>>> dataflow analyses (like taint analysis, reaching definitions, etc.)
>>>>>> for Java programs using the Soot tool [5]. I am familiar with the
>>>>>> theory of inter-procedural analysis (I’ve read some chapters of [1],
>>>>>> [2] and [3] for this).
>>>>>>
>>>>>> I am trying to understand LLVM’s Attributor framework. I am
>>>>>> interested in these aspects:
>>>>>>
>>>>>>    1. How Attributor can help for standard inter-procedural and
>>>>>>       intra-procedural analysis passes of LLVM. I’ve seen the tutorial
>>>>>>       [4]. I would like to discuss ways of improving other
>>>>>>       optimization passes similarly (or some examples which have
>>>>>>       already been implemented).
>>>>>>    2. Improve dynamic memory related capabilities of Attributor. For
>>>>>>       example Improve HeapToStackConversions. Maybe such deductions
>>>>>>       can help safety (dis)provers. For example, can we improve the
>>>>>>       use-after-free bug detection using some attributes?
>>>>>>    3. Improve Liveness related capabilities of Attributor. Again I
>>>>>>       want to consider whether some attribute deduction can help
>>>>>>       liveness (dis)provers. For example NoReturn, WillReturn can be
>>>>>>       improved. I am sure these 2 attributes do not cover all the
>>>>>>       cases as it is an undecidable problem. But I was wondering
>>>>>>       whether there is room for improvement in their deduction
>>>>>>       mechanism.
>>>>>>    4. Can we optimize the attribute deduction algorithm to reduce
>>>>>>       compile time?
>>>>>>    5. Is there any attribute that tells whether a function has
>>>>>>       side-effects (does it always give the same output for the same
>>>>>>       input? Or does it affect some global variable directly or
>>>>>>       indirectly?)?
>>>>>>
>>>>>>
>>>>>> It would be great if Johannes could give me some TODOs before I
>>>>>> submit my proposal. Also, please tell me about some specific IPO
>>>>>> improvement goals which you have in mind for this project. I would be
>>>>>> most interested in memory-related attributes, liveness deductions
>>>>>> from attributes and measurably better IPO using attribute deduction.
>>>>>>
>>>>>> Thanks and Regards.
>>>>>>
>>>>>> References:
>>>>>>
>>>>>> [1] Principles of Program Analysis.
>>>>>> <https://www.springer.com/gp/book/9783540654100>
>>>>>>
>>>>>> [2] Data Flow Analysis: Theory and Practice.
>>>>>> <https://dl.acm.org/doi/book/10.5555/1592955>
>>>>>>
>>>>>> [3] Static Program Analysis.
>>>>>> <https://cs.au.dk/~amoeller/spa/spa.pdf>
>>>>>>
>>>>>> [4] 2019 LLVM Developers’ Meeting: J. Doerfert “The Attributor: A
>>>>>> Versatile Inter-procedural Fixpoint.."
>>>>>> <https://www.youtube.com/watch?v=HVvvCSSLiTw>
>>>>>>
>>>>>> [5] Soot - A Java optimization framework
>>>>>> <https://github.com/Sable/soot>
>>>>>>
>>>>>>
>>>>>>

Johannes Doerfert via llvm-dev

2020-Mar-18 23:22 UTC


[llvm-dev] [GSOC] "Project: Improve inter-procedural analyses and optimisations"

1) Apologies for being late to the discussion.
2) I will respond to multiple mails in this thread.

On 03/16, Fahad Nayyar wrote:
> Dear Stefanos,
> 
> Thanks for such a detailed explanation!
> I'll have to study your mail and try out some things before I can ask
> specific questions for further discussion.
> 
> But I want to discuss this point right away:
> 
> *> I'd start by doing diagrams about how different parts of the code
> interact with each other (e.g. where does the Attributor start? what does
> it call then? how are attributes created? etc.) When starting out, these
> things might not be important to tackle. But they helped me.*
> 
> I totally agree with your point of drawing diagrams about different parts
> of the code. I tried this but was not able to succeed. It would be very
> helpful if you can tell me what is the entry point of Attributor (which
> method is called the very first time?)?
As Stefanos noted, runAttributorOnFunctions is the "entry point". From
there we go to Attributor::run, which then calls
`AbstractAttribute::update` repeatedly until a fixpoint is found or a
timeout is reached.
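
To make the update-until-fixpoint loop concrete, here is a minimal,
self-contained C++ sketch. It is a toy model under stated assumptions, not
the Attributor's code: `State`, `UpdateFn`, `RunResult`, `runToFixpoint` and
`runChainExample` are hypothetical names, and the iteration cap merely stands
in for the timeout mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Toy stand-ins, not LLVM types: every name below is hypothetical.
using State = std::vector<bool>;               // State[i]: attribute i is still assumed
using UpdateFn = std::function<bool(State &)>; // returns true if it changed the state

struct RunResult {
  std::size_t Iterations; // update rounds executed
  bool ReachedFixpoint;   // false if the iteration cap stopped the loop
};

// Re-run every update until a full round changes nothing or the cap is hit.
RunResult runToFixpoint(State &S, const std::vector<UpdateFn> &Updates,
                        std::size_t MaxIterations) {
  assert(MaxIterations > 0 && "need at least one round");
  for (std::size_t Round = 1; Round <= MaxIterations; ++Round) {
    bool AnyChanged = false;
    for (const UpdateFn &Update : Updates)
      if (Update(S))
        AnyChanged = true;
    if (!AnyChanged)
      return {Round, true};
  }
  return {MaxIterations, false};
}

// Attribute 0 depends on 1, which depends on 2. Only attribute 2 sees the
// fact that breaks the optimistic assumption, so the change needs several
// rounds to spread.
RunResult runChainExample(State &S, std::size_t MaxIterations) {
  S.assign(3, true); // optimistic start: everything is assumed to hold
  auto DependsOn = [](std::size_t Self, std::size_t Other) {
    return [=](State &St) {
      if (St[Self] && !St[Other]) {
        St[Self] = false;
        return true;
      }
      return false;
    };
  };
  std::vector<UpdateFn> Updates = {DependsOn(0, 1), DependsOn(1, 2),
                                   [](State &St) {
                                     if (St[2]) {
                                       St[2] = false;
                                       return true;
                                     }
                                     return false;
                                   }};
  return runToFixpoint(S, Updates, MaxIterations);
}
```

With a cap of 10, `runChainExample` stops after four rounds with all three
entries false: three rounds each weaken one attribute and the fourth confirms
that nothing changes any more. With a cap of 2 the loop gives up while
attribute 0 is still optimistically assumed, which is why a real driver needs
a safe fallback for that case.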
> Also, is there any way we can debug opt command in gdb like fashion (ie.
> setting breakpoints, stepping through instructions one by one?)?
> This would help me a lot in the initial code reading period.
I recommend running opt with -debug-only=attributor and looking at the
output for a short code example. Not all abstract attributes print much
about their internal deduction process, but you at least see the state
before and after an update.
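
The "state before and after an update" idea can be imitated in a few lines
if you want to experiment outside of LLVM first. This is a toy sketch, not
LLVM's debug machinery: `ToyState`, `describe` and `tracedUpdate` are
hypothetical names, and the known/assumed pair only loosely mirrors the
isKnown.../isAssumed... queries mentioned earlier in this thread.

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <string>

// Hypothetical toy state, loosely modelled on a "known"/"assumed" pair.
struct ToyState {
  bool Known = false;  // proven to hold
  bool Assumed = true; // optimistically assumed until disproven
};

// Renders the state as a short string for the trace.
std::string describe(const ToyState &S) {
  return std::string("known=") + (S.Known ? "1" : "0") +
         " assumed=" + (S.Assumed ? "1" : "0");
}

// Runs one update and appends a "before -> after" line to the log.
void tracedUpdate(const std::string &Name, ToyState &S,
                  const std::function<void(ToyState &)> &Update,
                  std::ostringstream &Log) {
  assert(Update && "update callback must be set");
  const std::string Before = describe(S);
  Update(S);
  Log << Name << ": " << Before << " -> " << describe(S) << "\n";
}
```

Calling `tracedUpdate` with an update that clears `Assumed` appends one line
showing the state on both sides of the arrow. An update that leaves the line
identical on both sides is the "no change in state" case discussed earlier.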

Cheers,
  Johannes
-- 

Johannes Doerfert
Researcher

Argonne National Laboratory
Lemont, IL 60439, USA

jdoerfert at anl.gov