Matthias Braun via llvm-dev
2017-Mar-15 19:49 UTC
[llvm-dev] [GSoC 2017] Project proposal queries
> On Mar 15, 2017, at 11:42 AM, Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> On Mar 15, 2017, at 10:10 AM, Radhika via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> Hi,
>>
>> My name is Radhika Ghosal and I'd like to participate in this year's GSoC in LLVM. I have been using LLVM for a research project for some time now, and would love to contribute to it!
>>
>> Below are the projects I'm interested in and a few questions; please correct me if I am misunderstanding something:
>>
>> - Code Compaction: [1]
>> Is it necessary to implement code compaction solely using interprocedural link-time analyses, or may this project also include compile-time optimizations for size (in the style of gcc's `-Os` flag)?
>>
>> I ask because relying solely on LTO may restrict us to the Gold linker or lld, both of which are still at the development stage for embedded targets such as ARM (for lld) and AVR (not supported by Gold; lld support may come in the future), while compile-time optimizations for size may yield more immediate results.
>
> It is rare to have link-time-specific optimizations in LLVM; usually LTO is just a way to expose a larger body of code to the optimizer and thus get better results.
> The link to the article on the open-projects page seems dead, so I don't know exactly what it was about. This also seems to be quite an old "open project", so it is not clear whether it is still of high interest. For instance, these days Jessica is working on an outliner (http://llvm.org/devmtg/2016-11/#talk21) to improve code size aggressively; I don't know whether there are other aspects people are interested in looking into.
>
>> - Encode Analysis Results in MachineInstr IR: [2]
>> Does the analysis information have to be encoded as metadata in the LLVM IR instructions (since the BasicBlock corresponding to the MachineBasicBlock is always available), or must something else be done? I am somewhat unsure of what is required, and would like to know more about the project.
>
> I don't know about this one, but hopefully someone else will chime in (CC Matthias randomly ;)).

I don't know either. We gotta find the author of that proposal.

- Matthias
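For context on the `-Os` question quoted above: no LTO is needed to optimize for size in LLVM today. Clang lowers `-Os`/`-Oz` to the `optsize`/`minsize` function attributes in the IR, and individual passes consult those attributes to bias their heuristics. The skeleton below is only a minimal sketch of that pattern; `SizeAwareSketchPass` and its registration string are invented names for illustration, not an existing pass.

#include "llvm/IR/Attributes.h"
#include "llvm/IR/Function.h"
#include "llvm/Pass.h"

using namespace llvm;

namespace {
// Hypothetical pass skeleton: not an existing LLVM pass, just an
// illustration of how size-oriented optimization is gated today.
struct SizeAwareSketchPass : public FunctionPass {
  static char ID;
  SizeAwareSketchPass() : FunctionPass(ID) {}

  bool runOnFunction(Function &F) override {
    // clang -Os tags functions with `optsize`; -Oz additionally adds `minsize`.
    bool OptForSize = F.hasFnAttribute(Attribute::OptimizeForSize);
    bool MinSize = F.hasFnAttribute(Attribute::MinSize);
    if (!OptForSize && !MinSize)
      return false; // leave functions compiled for speed alone

    // ... size-oriented rewrites would go here ...
    return false; // this sketch makes no changes
  }
};
} // end anonymous namespace

char SizeAwareSketchPass::ID = 0;
static RegisterPass<SizeAwareSketchPass>
    X("size-aware-sketch", "Illustrative size-aware pass skeleton");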
John Criswell via llvm-dev
2017-Mar-15 19:59 UTC
[llvm-dev] [GSoC 2017] Project proposal queries
On 3/15/17 3:49 PM, Matthias Braun via llvm-dev wrote:
>>> - Encode Analysis Results in MachineInstr IR: [2]
>>> Does the analysis information have to be encoded as metadata in the LLVM IR instructions (since the BasicBlock corresponding to the MachineBasicBlock is always available), or must something else be done? I am somewhat unsure of what is required, and would like to know more about the project.
>>
>> I don't know about this one, but hopefully someone else will chime in (CC Matthias randomly ;)).
>
> I don't know either. We gotta find the author of that proposal.

That was me. :)

For some applications, it is useful to be able to use analysis results from LLVM IR passes reliably when analyzing or transforming code at the MachineInstr IR level. For example, in the CRAFTED project, we would like to know the targets of an indirect function call. A points-to/call graph analysis (e.g., DSA) can provide the results, but we only have ad-hoc techniques for matching the MachineInstr call instruction back to the CallInst for which it was generated. Similarly, we may want to have a shape graph for the memory objects allocated by the program. Most of the objects are allocated by instructions visible at the LLVM IR level, but again, it can be difficult to map a MachineInstr back to the LLVM IR instruction for which it was generated. Therefore, it would be nice to attach this information to MachineInstrs as they are generated (perhaps as an annotation).

My interest in this is in using (abusing?) LLVM's MachineInstr IR for analyzing the susceptibility of machine code to code-reuse attacks. For our purposes, using source-level (e.g., LLVM IR-level) information is fine, but we need to analyze individual machine instructions, as that is the granularity at which attacks operate.

I think there may have been another person who was interested in this feature as well, though I do not recall who, or what she or he wanted it for.

Regards,

John Criswell

-- 
John Criswell
Assistant Professor
Department of Computer Science, University of Rochester
http://www.cs.rochester.edu/u/criswell
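The "ad-hoc techniques" mentioned above can be made concrete with a small sketch. The function below is purely illustrative (it is not code from CRAFTED or DSA, and `guessSourceCall` is an invented name): it tries to recover the CallInst behind a call MachineInstr by pairing callees within the parent IR basic block, which shows both what is possible today and why first-class MachineInstr annotations would be more reliable.

#include "llvm/CodeGen/MachineBasicBlock.h"
#include "llvm/CodeGen/MachineInstr.h"
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/GlobalValue.h"
#include "llvm/IR/Instructions.h"
#include "llvm/Support/Casting.h"

using namespace llvm;

// Best-effort, heuristic mapping from a call MachineInstr back to an IR
// CallInst. Invented for illustration only; it breaks down for indirect
// calls, inlined code, tail calls, and duplicated or merged blocks.
static const CallInst *guessSourceCall(const MachineInstr &MI) {
  if (!MI.isCall())
    return nullptr;

  // Direct calls normally carry the callee as a global operand.
  const GlobalValue *Callee = nullptr;
  for (const MachineOperand &MO : MI.operands())
    if (MO.isGlobal())
      Callee = MO.getGlobal();

  // A MachineBasicBlock remembers (when available) the IR block it came from.
  const BasicBlock *BB = MI.getParent()->getBasicBlock();
  if (!BB || !Callee)
    return nullptr;

  // Heuristic: the first CallInst in that IR block with the same callee.
  for (const Instruction &I : *BB)
    if (const auto *CI = dyn_cast<CallInst>(&I))
      if (CI->getCalledFunction() == Callee)
        return CI;
  return nullptr;
}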
Radhika via llvm-dev
2017-Mar-16 18:06 UTC
[llvm-dev] [GSoC 2017] Project proposal queries
Thank you very much for your replies.

I tinker frequently with embedded systems and was hoping I could contribute to an `-Os` flag for LLVM, or alternatively to improving code generation in the experimental AVR backend, if there is a willing mentor.

As for encoding analysis results in MachineInstr IR, it's a feature I too sorely need in the project I'm working on. Would it be more helpful to implement a metadata equivalent for MachineInstr so that analysis information can be passed down from IR to MI, or to attach metadata ad hoc as a MachineOperand on the MachineInstr (see the sketch after this message)? Would this be a suitable project for GSoC? (I suspect this is not a very complex project on its own and more may be required; I may be wrong.)

For the 'Improve code generation testing' project, I'd like to suggest another improvement: adding the capability to develop code generation passes out-of-source, as is already possible for `opt` and IR passes, but for loading into `llc` (this is unrelated to MIR, but related to testing code generation passes in general). I suspect this would be helpful for people developing machine-specific passes/tools (such as for optimizations in embedded systems). Your thoughts please?

Lastly, I will be working at a university research lab (which also involves hacking on LLVM) over the summer. Would it still be alright for me to apply for GSoC? If not, would it still be possible for me to informally contribute to one of the above projects and ask the potential mentor questions?

Sincerely,
Radhika

On Thu, Mar 16, 2017 at 1:29 AM, John Criswell <jtcriswel at gmail.com> wrote:
> Therefore, it would be nice to attach this information to MachineInstrs as they are generated (perhaps as an annotation).
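As a follow-up to the metadata-as-MachineOperand idea above: MachineOperand already has a metadata operand kind, so the raw mechanics are available today. The helpers below are a minimal sketch under the assumption that the MachineInstr is already inserted into a MachineFunction; they say nothing about the harder design questions (verifier and target assumptions about operand counts, keeping the metadata alive and meaningful through later passes), which is presumably where the real project work lies.

#include "llvm/CodeGen/MachineInstr.h"
#include "llvm/CodeGen/MachineOperand.h"
#include "llvm/IR/Metadata.h"

using namespace llvm;

// Attach an analysis result expressed as an MDNode (e.g., an encoded set of
// possible call targets computed at the IR level) to an existing MachineInstr.
// Assumes MI has already been inserted into a MachineFunction.
static void attachAnalysisNode(MachineInstr &MI, const MDNode *Info) {
  MI.addOperand(MachineOperand::CreateMetadata(Info));
}

// Retrieve the first metadata operand, if any, in a later MachineInstr-level pass.
static const MDNode *getAnalysisNode(const MachineInstr &MI) {
  for (const MachineOperand &MO : MI.operands())
    if (MO.isMetadata())
      return MO.getMetadata();
  return nullptr;
}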