Hi, I was wondering how exactly instruction selection works in LLVM. As I'm not aware of any document describing it, I'll ask here :) So, the tablegen files that each backend implements describe the mapping between selection DAG nodes and assembly instructions (and optionally? their binary encoding). The selection DAG (one per basic block) is produced "by hand" directly from the LLVM IR. This pass is independent of the target architecture. Therefore, implementing a new backend is "just" about specifying patterns that convert selection DAG nodes to the target's assembly. Is my understanding correct? Also, is there any document describing the selection DAG nodes that one needs to match against? And what's the algorithm behind this selection DAG? I'm also wondering what would be the possible improvement of the approach presented last Friday at POPL (http://portal.acm.org/citation.cfm?doid=1706299.1706346 or http://www.eecs.tufts.edu/~dias/gentileset.pdf) for LLVM. Any insight? The approach seems similar to what gcc does or did at least (IR->RTL->ASM), so I'm not entirely sure there would be something to gain here for LLVM. It would be nice to generate the selection dag from LLVM IR automatically, though. Thanks, Nuno
On Jan 25, 2010, at 2:18 PM, Nuno Lopes wrote:> Hi, > > I was wondering how exactly instruction selection works in LLVM. As I'm not > aware of any document describing it, I'll ask here :) > > So, the tablegen files that each backend implements describe the mapping > between selection DAG nodes and assembly instructions (and optionally? their > binary encoding).Yes. And yes, the binary encoding is optional. Obviously, features like JIT compilation require it though.> The selection DAG (one per basic block) is produced "by > hand" directly from the LLVM IR.Yes.> This pass is independent of the target > architecture.Not entirely. There are several virtual functions which target-specific code overrides for specifying things like how arguments, return values, and calls are lowered.> Therefore, implementing a new backend is "just" about > specifying patterns that convert selection DAG nodes to the target's > assembly.That's just a part of it. There's also the description of the target's register sets, ABI information, assembler information, target-specific lowering, and a variety of other things. Take a look at an actual target.> > Is my understanding correct? Also, is there any document describing the > selection DAG nodes that one needs to match against?The "how to write a backend" documents are relevant. As are existing targets. The comments on the opcodes in include/CodeGen/SelectionDAGNodes.h are fairly descriptive as well.> And what's the > algorithm behind this selection DAG?The underlying algorithm is a fairly simple bottom-up tiling.> > I'm also wondering what would be the possible improvement of the approach > presented last Friday at POPL > (http://portal.acm.org/citation.cfm?doid=1706299.1706346 or > http://www.eecs.tufts.edu/~dias/gentileset.pdf) for LLVM. Any insight? The > approach seems similar to what gcc does or did at least (IR->RTL->ASM), so > I'm not entirely sure there would be something to gain here for LLVM. It > would be nice to generate the selection dag from LLVM IR automatically, > though.What are you looking to do? Dan
Many thanks for your reply, Dan!>> The "how to write a backend" documents are relevant. As are existing >> targets. The comments on the opcodes in >> include/CodeGen/SelectionDAGNodes.h are fairly descriptive as well.Ah, I see. There's actually much more documentation than I though :) Thanks for the pointers.>> I'm also wondering what would be the possible improvement of the approach >> presented last Friday at POPL >> (http://portal.acm.org/citation.cfm?doid=1706299.1706346 or >> http://www.eecs.tufts.edu/~dias/gentileset.pdf) for LLVM. Any insight? >> The >> approach seems similar to what gcc does or did at least (IR->RTL->ASM), >> so >> I'm not entirely sure there would be something to gain here for LLVM. It >> would be nice to generate the selection dag from LLVM IR automatically, >> though. > > What are you looking to do?For now, I'm just trying to understand what's the main contribution of this paper towards simplifying the retargeting of a compiler. Don't get me wrong; I do not want to bash the paper; I just feel that something is escaping me. The approach proposed seems to be fairly similar to what gcc and LLVM do. What makes me wondering is why their algorithm needs heavy reasoning to do semantic equivalence checking, while gcc & LLVM only need simple pattern matching. That's why I've been scratching my head the whole day :) Do you have any insight that can enlighten me, please? Thank you, Nuno