Radhika via llvm-dev
2017-Mar-16 18:06 UTC
[llvm-dev] [GSoC 2017] Project proposal queries
Thank you very much for your replies.

I tinker frequently with embedded systems and was hoping I could contribute to an `-Os` flag for LLVM or, alternatively, to improving code generation in the experimental AVR backend if there is a willing mentor.

As for encoding analysis results in MachineInstr IR, it's a feature I too sorely need in the project I'm working on. Would it be helpful to implement a metadata-equivalent for MachineInstr so that analysis information can be passed down from IR to MI, or attaching metadata as a MachineOperand onto the MachineInstr ad-hoc? Would this be a suitable project for GSoC? (I suspect this is not very complex a project and more may be required; I may be wrong)

For the 'Improve code generation testing' project, I'd like to suggest another improvement: adding the capability to develop code generation passes out-of-source, like that for `opt` and IR passes, but for loading into `llc` (this is unrelated to MIR, but related to testing code generation passes in general). I suspect this would be helpful for people developing machine-specific passes/tools (such as for optimizations in embedded systems). Your thoughts please?

Lastly, I will be working at a university research lab (which also involves hacking on LLVM) over the summer. Would it still be alright for me to apply for GSoC? If not, would it still be possible for me to informally contribute to one of the above projects and ask the potential mentor questions?

Sincerely,
Radhika

On Thu, Mar 16, 2017 at 1:29 AM, John Criswell <jtcriswel at gmail.com> wrote:
> On 3/15/17 3:49 PM, Matthias Braun via llvm-dev wrote:
> > On Mar 15, 2017, at 11:42 AM, Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> > > On Mar 15, 2017, at 10:10 AM, Radhika via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> > > >
> > > > Hi,
> > > >
> > > > My name is Radhika Ghosal and I'd like to participate in this year's GSoC in LLVM. I have been using LLVM for a research project for some time now, and would love to contribute to it!
> > > >
> > > > Below are the projects I'm interested in and a few questions; please correct me if I am understanding something incorrectly:
> > > >
> > > > - Code Compaction: [1]
> > > > Is it necessary to implement code compaction solely using interprocedural link-time analyses, or this project may include compile-time optimizations for size as well? (as in the style of the `-Os` flag in gcc)
> > > >
> > > > I ask because solely using LTO for doing so may restrict us to using the Gold linker or lld, both of which are still at development stage for embedded targets like ARM (for lld) and AVR (not included in Gold, may have lld support in the future), while compile-time optimizations for size may yield more immediate results.
> > >
> > > It is rare to have link-time specific optimizations in LLVM, usually LTO is just a way to expose a larger body of code to the optimizer and thus getting better results.
> > > The link to the article on the open-project page seems dead so I don’t know exactly what it was about. It seems also that this is quite an old “open project” so it is not clear if it is still of high interest. For instance these days Jessica is working on an outliner ( http://llvm.org/devmtg/2016-11/#talk21 ) to improve code-size aggressively, I don’t know if there are other aspect people are interested to look into.
> > >
> > > > - Encode Analysis Results in MachineInstr IR: [2]
> > > > Does the analysis information have to be encoded within metadata in the LLVM IR instructions (since the BasicBlock corresponding to the MachineBasicBlock is always available), or something else must be done? I am somewhat unsure of what is required, and would like to know more about the project.
> > >
> > > I don’t know about this one, but hopefully someone else will chime in (CC Matthias randomly ;)).
> >
> > I don't know either. We gotta find the author of that proposal.
>
> That was me. :)
>
> For some applications, it is useful to be able to use analysis results from LLVM IR passes reliably when analyzing or transforming code at the MachineInstr IR level. For example, in the CRAFTED project, we would like to know the targets of an indirect function call. A points-to/call graph analysis (e.g., DSA) can provide the results, but we only have ad-hoc techniques for matching the MachineInstr call instruction back to the CallInst for which it was generated. Similarly, we may want to have a shape graph for the memory objects allocated by the program. Most of the objects are allocated by instructions visible at the LLVM IR level, but again, it can be difficult to map a MachineInstr back to the LLVM IR instruction for which it was generated. Therefore, it would be nice to attach this information to MachineInstr's as they are generated (perhaps as an annotation).
>
> My interest in this is for using (abusing?) LLVM's MachineInstr IR for analyzing the susceptibility of machine code to code reuse attacks. For our purpose, using source-level (e.g., LLVM IR-level) information is fine, but we need to analyze individual machine instructions as that is the granularity at which attacks operate.
>
> I think there may have been another person that had interest in this feature as well, though I do not recall who and for what she or he wanted it.
>
> Regards,
>
> John Criswell
>
> > - Matthias
>
> --
> John Criswell
> Assistant Professor
> Department of Computer Science, University of Rochester
> http://www.cs.rochester.edu/u/criswell
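One way to picture the annotation idea in the quoted proposal is a side table that is filled in at the moment machine instructions are created, so that no after-the-fact matching is needed. The following is a minimal, self-contained C++ sketch under stated assumptions: `IRCall`, `MachineCall`, `CallTargetAnnotations`, and `lowerCalls` are hypothetical stand-ins invented for illustration. They are not LLVM classes or APIs, and nothing in the thread says the feature would be built this way.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT LLVM's classes.
struct IRCall {                                // plays the role of an IR-level indirect call
  int id;
  std::vector<std::string> possibleTargets;    // result of some points-to analysis
};

struct MachineCall {                           // plays the role of the lowered machine call
  int id;
};

// Side table filled in during lowering: machine call id -> analysis result.
class CallTargetAnnotations {
public:
  void annotate(const MachineCall &mc, const IRCall &call) {
    assert(table_.count(mc.id) == 0 && "each machine call is annotated once");
    table_[mc.id] = call.possibleTargets;
  }

  // Returns nullptr when no annotation exists (or survived) for this instruction.
  const std::vector<std::string> *lookup(const MachineCall &mc) const {
    auto it = table_.find(mc.id);
    return it == table_.end() ? nullptr : &it->second;
  }

  // A transformation that rewrites or deletes the instruction must drop its entry.
  void invalidate(const MachineCall &mc) { table_.erase(mc.id); }

private:
  std::map<int, std::vector<std::string>> table_;
};

// Toy "instruction selection": one machine call per IR call, annotated on creation.
std::vector<MachineCall> lowerCalls(const std::vector<IRCall> &calls,
                                    CallTargetAnnotations &notes) {
  std::vector<MachineCall> out;
  int nextId = 100;                            // arbitrary numbering for the toy machine level
  for (const IRCall &c : calls) {
    MachineCall mc{nextId++};
    notes.annotate(mc, c);
    out.push_back(mc);
  }
  return out;
}
```

A machine-level analysis would call `lookup` on each call instruction and treat a null result as "no information". The `invalidate` method hints at the difficult part: every later transformation that rewrites or removes an instruction has to keep the table in sync.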
Mehdi Amini via llvm-dev
2017-Mar-16 18:16 UTC
[llvm-dev] [GSoC 2017] Project proposal queries
> On Mar 16, 2017, at 11:06 AM, Radhika <radhikaghosal at gmail.com> wrote:
>
> Thank you very much for your replies. I tinker frequently with embedded systems and was hoping I could contribute to an `-Os` flag for LLVM or alternately, improving code generation in the experimental AVR backend if there is a willing mentor.

`-Os` already exists in LLVM. The way clang implements it is by adding a function attribute that passes and codegen are supposed to look at and honor. If there is work, it can be about which heuristics / transformations in specific parts of LLVM need improvement to better honor this.

> As for encoding analysis results in MachineInstr IR, it's a feature I too sorely need in the project I'm working on. Would it be helpful to implement a metadata-equivalent for MachineInstr so that analysis information can be passed down from IR to MI, or attaching metadata as a MachineOperand onto the MachineInstr ad-hoc?

Metadata in LLVM is supposed to be able to be ignored and dropped. Encoding analysis results in metadata leads to some questions about invalidation etc.

> Would this be a suitable project for GSoC? (I suspect this is not very complex a project and more may be required; I may be wrong)

I think it is more complex than you believe, to the point that I’d be surprised if it can be completed in a GSoC timeframe (which does not mean a GSoC can’t be purposed to explore it and set up initial infrastructure). Think about serialization, invalidation, etc.

> For the 'Improve code generation testing' project, I'd like to suggest another improvement; adding the capability to develop code generation passes out-of-source, like that for `opt` and IR passes, but for loading into `llc` (this is unrelated to MIR, but related to testing code generation passes in general). I suspect this would be helpful for people developing machine-specific passes/tools (such as for optimizations in embedded systems). Your thoughts please?

I’m not totally convinced there is a huge benefit to this? Is it such a pain to re-build llc?

> Lastly, I will be working at a university research lab (which also involves hacking on LLVM) over the summer. Would it still be alright for me to apply for GSoC?

GSoC is supposed to be a full-time commitment.

> If not, would it still be possible for me to informally contribute to one of the above projects and ask the potential mentor questions?

Sure, LLVM is an open-source project and as a community we welcome anyone to be involved and we’re trying to be supportive. GSoC is just a way to get paid to work on interesting stuff, but anyone can work on the same interesting stuff the same way outside of GSoC, and “mentors” should be ready to help.

Best,
-- Mehdi
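To make the point about `-Os` concrete: the flag is not a separate pipeline but a per-function attribute (spelled `optsize` in LLVM IR) that individual heuristics may consult. The sketch below is a toy model of that pattern in self-contained C++. `ToyFunction`, `inlineThresholdFor`, `shouldInline`, and both threshold constants are hypothetical names and values chosen for illustration; they are not LLVM's classes or its real inlining constants.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical model of a function carrying attributes; not LLVM's Function class.
struct ToyFunction {
  std::string name;
  std::set<std::string> attributes;            // e.g. {"optsize"} when built with -Os

  bool hasAttr(const std::string &attr) const { return attributes.count(attr) != 0; }
};

// Illustrative numbers only; these are assumptions, not LLVM's actual thresholds.
constexpr int kDefaultInlineThreshold = 200;
constexpr int kOptSizeInlineThreshold = 50;

// A heuristic "honors" the attribute by choosing a stricter budget for size-optimized callers.
int inlineThresholdFor(const ToyFunction &caller) {
  return caller.hasAttr("optsize") ? kOptSizeInlineThreshold : kDefaultInlineThreshold;
}

// Inline only when the estimated cost of the callee fits within the caller's budget.
bool shouldInline(const ToyFunction &caller, int calleeCost) {
  assert(calleeCost >= 0 && "the toy cost model is assumed to be non-negative");
  return calleeCost < inlineThresholdFor(caller);
}
```

In this model, a callee with cost 120 is inlined into an ordinary caller but rejected for a caller marked `optsize`. Improving `-Os` results for a target then amounts to finding the heuristics and codegen decisions that do not yet check the attribute, or that check it with poorly tuned budgets.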
Radhika via llvm-dev
2017-Mar-17 14:08 UTC
[llvm-dev] [GSoC 2017] Project proposal queries
> On Thu, Mar 16, 2017 at 11:46 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>
> `-Os` already exists in LLVM. The way clang implements it is by adding a function attribute that passes and codegen are supposed to look at and honor.
>
> If there is work, it can be about which heuristics / transformations in specific parts of LLVM need improvement to better honor this.

Got it, so if `-Os` fails to give significant gains (for instance, in the experimental AVR backend), it's because codegen for the arch needs improvement.

> Metadata in LLVM is supposed to be able to be ignored and dropped. Encoding analysis results in metadata leads to some questions about invalidation etc.
>
> I think it is more complex than you believe, to the point that I’d be surprised if it can be completed in a GSoC timeframe (which does not mean a GSoC can’t be purposed to explore it and set up initial infrastructure).
> Think about serialization, invalidation, etc.

I did not consider this! What would be the best way to implement this in your opinion? I should have put in more thought... (would it be a good idea to add an annotation intrinsic function visible to the code generator?)

> I’m not totally convinced there is a huge benefit to this? Is it such a pain to re-build llc?

There isn't any huge benefit, nor is it a pain to re-build llc; it only makes distributing passes/tools easier if they can be built against an existing LLVM source installation. Considering the front-end and the middle-end both support this, it may not be a bad idea for the back-end to do so as well (please correct me if I'm wrong!).

> GSoC is supposed to be a full-time commitment.
>
> Sure, LLVM is an open-source project and as a community we welcome anyone to be involved and we’re trying to be supportive. GSoC is just a way to get paid to work on interesting stuff, but anyone can work on the same interesting stuff the same way outside of GSoC, and “mentors” should be ready to help.

I was hoping to put in the requisite 30+ hours outside lab work and start early, but that's probably not a great idea; I'll apply next year (I'm a second-year undergrad) and meanwhile contribute outside of GSoC.

Thank you very much for your help (and for bearing with some possibly stupid questions and ideas)!

Sincerely,
Radhika
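For readers unfamiliar with how out-of-source passes reach a tool such as `opt`: the usual pattern is that the separately built library registers its passes by name when it is loaded, and the driver then looks them up by name. The sketch below models only that registration pattern in self-contained C++. `ToyPass`, `passRegistry`, `PassRegistrar`, `runPipeline`, and the pass name `toy-shrink` are hypothetical and invented for illustration; this is not LLVM's registry API, and the actual dynamic loading step is left out.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy pass: takes a code size and returns the size after the pass.
using ToyPass = std::function<int(int)>;

// Function-local static so the registry exists before any registrar uses it.
std::map<std::string, ToyPass> &passRegistry() {
  static std::map<std::string, ToyPass> registry;
  return registry;
}

// Constructing a registrar at namespace scope runs during static initialization,
// which is also what happens when a shared library is loaded into a process.
struct PassRegistrar {
  PassRegistrar(const std::string &name, ToyPass pass) {
    passRegistry()[name] = std::move(pass);
  }
};

// In a real plugin this line would live in the separately built library.
static PassRegistrar shrinkRegistrar("toy-shrink",
                                     [](int size) { return size - size / 10; });

// The driver knows passes only by name and never references the plugin's code directly.
int runPipeline(const std::vector<std::string> &names, int size) {
  for (const std::string &name : names) {
    auto it = passRegistry().find(name);
    assert(it != passRegistry().end() && "unknown pass name");
    size = it->second(size);
  }
  return size;
}
```

Because the driver depends only on the registry, a pass can be compiled and distributed separately as long as the host tool loads the library and exposes a way to name the pass on its command line, which is the capability being requested for `llc` in this thread.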