Hi all, I am trying to build a global control flow graph at machine code level. Essentially, I need the handles to the MachineFunction's corresponding to every call site inside a MachineFunction in order to get the handles to MachineBasicBlock's with return statements inside the callee. Currently, the codegen module processes one MachineFunction at a time and hence I can't find a way to gather information about other MachineFunction's and their MachineBasicBlock's. It seems like I may have to modify the way MachineFunction is instantiated in MachineFunctionAnalysis. Instead of doing it per Function, it may have to be done for the entire Module by instantiating MachineFunction objects for every Function inside the Module. This might require major changes to the PassManager framework as well. Is there some work in this direction and code that someone can share? Or an alternative solution? -- Regards, Abhishek Rhisheekesan CIDSE, Arizona State University -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120729/995cb5e7/attachment.html>
Florian Brandner
2012-Jul-30 09:06 UTC
[LLVMdev] global control flow graph at machine code level
Hi Abhishek, On Sunday, July 29, 2012 18:32:11 AbhishekR wrote:> It seems like I may have to modify the way MachineFunction is instantiated in MachineFunctionAnalysis. Instead of doing it per Function, it may have to be done for the entire Module by instantiating MachineFunction objects for every Function inside the Module. This might require major changes to the PassManager framework as well. Is there some work in this direction and code that someone can share? Or an alternative solution?yes, the MachineFunctionAnalysis creates the MachineFunctions. unfortunately a MachineFunction is destroyed along the MachineFunctionAnalysis that created it. this happens for instance when you schedule a module pass (where you could operate on the global control-flow) somewhere during code generation. A workaround could be to modify the MachineFunctionAnalysis such that it stores the MachineFunction in a look-up table inside the MachineModuleInfo instead of destroying it before your module pass is run. once your module pass is finished, a new MachineFunctionAnalysis is scheduled by the pass manager. Now, instead of creating a new function, you could check the look-up table of the MachineModuleInfo to get the original MachineFunction. it does not appear to be a very complicated change, but there might be some dirty details that could make this approach hard to implement, e.g., information stored with the MachineFunctionAnalysis itself. you could move this information to a new class and see the MachineFunctionAnalysis as a wrapper to this class. a nice property of this solution is that code generation still proceeds on a per-function-basis, unless you explicitly insert a module pass. another problem is, that there is little (or no) support to construct the global control-flow from the machine code. for instance, the call graph is based on LLVM-IR. depending on the target architecture, you might have call sites in the machine code that were not visible in the LLVM-IR, e.g., when you implement floating point operations using library calls. so there might be some extra work needed to get this infrastructure too. I have not yet worked on this yet, but I plan to implement something like this at some point sooner or later. if you find another (better) solution let me know. best, Florian -- Florian Brandner Embedded Systems Engineering Group Department of Informatics and Mathematical Modeling DTU Informatics Technical University of Denmark Richard Petersens Plads Building 322, room 206 2800 Lyngby Denmark phone: +45 45255223 web: http://www.imm.dtu.dk/~flbr/ email: flbr at imm.dtu.dk
Abhishek Rhisheekesan
2012-Aug-01 16:25 UTC
[LLVMdev] global control flow graph at machine code level
Florian Brandner <flbr <at> imm.dtu.dk> writes:> > Hi Abhishek, > On Sunday, July 29, 2012 18:32:11 AbhishekR wrote: > > It seems like I may have to modify the way MachineFunction is instantiatedin MachineFunctionAnalysis.> Instead of doing it per Function, it may have to be done for the entire Moduleby instantiating> MachineFunction objects for every Function inside the Module. This mightrequire major changes to the> PassManager framework as well. Is there some work in this direction and codethat someone can share? Or an> alternative solution? > > yes, the MachineFunctionAnalysis creates the MachineFunctions. unfortunately > a MachineFunction is destroyed along the MachineFunctionAnalysis that created > it. this happens for instance when you schedule a module pass (where you could > operate on the global control-flow) somewhere during code generation. > > A workaround could be to modify the MachineFunctionAnalysis such that it > stores the MachineFunction in a look-up table inside the MachineModuleInfo > instead of destroying it before your module pass is run. once your module > pass is finished, a new MachineFunctionAnalysis is scheduled by the pass > manager. Now, instead of creating a new function, you could check the look-up > table of the MachineModuleInfo to get the original MachineFunction. > > it does not appear to be a very complicated change, but there might be some > dirty details that could make this approach hard to implement, e.g.,information> stored with the MachineFunctionAnalysis itself. you could move thisinformation> to a new class and see the MachineFunctionAnalysis as a wrapper to this class. > > a nice property of this solution is that code generation still proceeds on a > per-function-basis, unless you explicitly insert a module pass. > > another problem is, that there is little (or no) support to construct the > global control-flow from the machine code. for instance, the call graph is > based on LLVM-IR. depending on the target architecture, you might have call > sites in the machine code that were not visible in the LLVM-IR, e.g., when you > implement floating point operations using library calls. so there might besome> extra work needed to get this infrastructure too. > > I have not yet worked on this yet, but I plan to implement something like this > at some point sooner or later. if you find another (better) solution let me > know.I implemented a workaround solution for now but I think eventually I might go with your solution. My solution generated the global control flow graph (GCFG) in 2 passes (llc needs to be run twice). In the first pass, I printed the necessary information to a file, like the Machine Function (MF) Name, it's ID number, the IDs of MBBs inside the MF, the Call Sites inside the MF in the order in which they are called from the respective MBBs, the predecessors and successors of MBBs and whether a MBB has a return statement. In the second pass which should be immediately after the 1st pass in the CodeGen flow (otherwise MFs and MBBs can get updated by the CodeGen flow in between), I read back the file. This again is done in 2 passes. The first pass parses the file and populates the skeleton of GCFG data structure with the MF Nodes and MBB nodes inside those MF nodes (note that each Call Site in an MBB splits the MBB into two). Here the nodes are identified by IDs and a DenseMap is maintained to establish the MF name to ID correlation. The second pass parses the file again and attaches the predecessor, and successor links to the MBB nodes.> > best, > Florian >
Florian Brandner
2012-Dec-20 13:14 UTC
[LLVMdev] global control flow graph at machine code level
Hi Abhishek, I finally got some time to try the approach I proposed in July in our compiler (see https://github.com/t-crest/patmos-llvm). I extended the MachineModuleInfo and MachineFunctionAnalysis machinery to support module passes at the codegen-level. So far, it is tested using a simple pass that constructs the call graph at the machine-level. It appears to work, i.e., the generated .s files are identical and all our tests run correctly (i.e., MiBench). I tried to avoid touching the LLVM core-infrastructure as much as possible, so this patch is intended as a starting point for those who would like to do something similar (or for a discussion on how to integrate this properly into mainline). The attached patch contains the LLVM-specific bits only. Best, Florian -- Florian Brandner Embedded Systems Engineering Group Department of Informatics and Mathematical Modeling DTU Informatics Technical University of Denmark Richard Petersens Plads Building 322, room 206 2800 Lyngby Denmark phone: +45 45255223 web: http://www.imm.dtu.dk/~flbr/ email: flbr at imm.dtu.dk On Monday, July 30, 2012 11:06:24 Florian Brandner wrote:> Hi Abhishek, > On Sunday, July 29, 2012 18:32:11 AbhishekR wrote: > > It seems like I may have to modify the way MachineFunction is instantiated in MachineFunctionAnalysis. Instead of doing it per Function, it may have to be done for the entire Module by instantiating MachineFunction objects for every Function inside the Module. This might require major changes to the PassManager framework as well. Is there some work in this direction and code that someone can share? Or an alternative solution? > > yes, the MachineFunctionAnalysis creates the MachineFunctions. unfortunately > a MachineFunction is destroyed along the MachineFunctionAnalysis that created > it. this happens for instance when you schedule a module pass (where you could > operate on the global control-flow) somewhere during code generation. > > A workaround could be to modify the MachineFunctionAnalysis such that it > stores the MachineFunction in a look-up table inside the MachineModuleInfo > instead of destroying it before your module pass is run. once your module > pass is finished, a new MachineFunctionAnalysis is scheduled by the pass > manager. Now, instead of creating a new function, you could check the look-up > table of the MachineModuleInfo to get the original MachineFunction. > > it does not appear to be a very complicated change, but there might be some > dirty details that could make this approach hard to implement, e.g., information > stored with the MachineFunctionAnalysis itself. you could move this information > to a new class and see the MachineFunctionAnalysis as a wrapper to this class. > > a nice property of this solution is that code generation still proceeds on a > per-function-basis, unless you explicitly insert a module pass. > > another problem is, that there is little (or no) support to construct the > global control-flow from the machine code. for instance, the call graph is > based on LLVM-IR. depending on the target architecture, you might have call > sites in the machine code that were not visible in the LLVM-IR, e.g., when you > implement floating point operations using library calls. so there might be some > extra work needed to get this infrastructure too. > > I have not yet worked on this yet, but I plan to implement something like this > at some point sooner or later. if you find another (better) solution let me > know. > > best, > Florian > >-------------- next part -------------- A non-text attachment was scrubbed... Name: codegen-modulepasses.diff Type: text/x-patch Size: 4411 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20121220/ee55021a/attachment.bin>
Apparently Analagous Threads
- [LLVMdev] global control flow graph at machine code level
- [LLVMdev] Clang's approach to anonymous struct pointer parameters
- [LLVMdev] Clang's approach to anonymous struct pointer parameters
- [LLVMdev] RFC: Pass Manager Redux
- [LLVMdev] Extracting libmachine from libcodegen (bug 1121)