In the spirit of the (long-term) intent to migrate away from the SelectionDAG framework, it is desirable to implement legalization passes as discrete passes. Attached is a patch which implements the beginning of a new type legalization pass, to help motivate discussion.

Is LLVM IR the right level for this? The main alternative approach that's been discussed is to do FastISel to a target-independent opcode set on MachineInstrs, and then do legalization and ultimately the last phase of instruction selection proper after that. The most obvious advantage of using LLVM IR for legalization is that it's (currently) more developer-friendly. The most obvious advantage of using MachineInstrs is that they would make it easier to do low-level manipulations. Also, doing legalization on MachineInstrs would mean avoiding having LLVM-IR-level optimization passes which lower the IR, which has historically been a design goal of LLVM.

The attached pass operates on LLVM IR, and it's been educational to develop it this way, but I'm ok with rewriting it in MachineInstrs if that's the consensus. Given that the code I wrote operates on LLVM IR, it raises the following interesting issues.

Because the patch works at the IR level, it expands adds on illegal integer types into llvm.uadd_with_overflow intrinsics, for example. The intrinsics available in LLVM IR today aren't as expressive as the ISD operator set in SelectionDAG, so the generated code is quite a bit more verbose in some cases. Should we instead add new intrinsics, for add and for a bunch of other things? People I've talked to so far were hesitant to add new intrinsics unless the things they express are really prohibitive to do in other ways.
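For illustration only (a rough sketch, not code taken verbatim from the patch): on a 32-bit target where i64 is illegal, once a 64-bit add's operands have been split into i32 halves, the expansion looks roughly like this:

    ; %r = add i64 %a, %b, after %a and %b have been split into
    ; i32 halves %a.lo/%a.hi and %b.lo/%b.hi:
    %lo = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 %a.lo, i32 %b.lo)
    %r.lo = extractvalue { i32, i1 } %lo, 0
    %carry = extractvalue { i32, i1 } %lo, 1
    %carry.zext = zext i1 %carry to i32
    %hi.sum = add i32 %a.hi, %b.hi
    %r.hi = add i32 %hi.sum, %carry.zext

That is already several instructions for something SelectionDAG expresses with an ISD::ADDC/ISD::ADDE pair, which is the verbosity concern mentioned above.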
How should we legalize function argument and return types? Because of LLVM IR rules, one can't just change the signature of a function without changing all its call sites, so as a consequence the code I wrote is a ModulePass. This is unfortunate, since it's a goal that most of the codegen passes be FunctionPasses. Modifying the Function types may also be incompatible with the ABI coordination dance between front-ends and backends on some targets. One alternative, which is also implemented, is to leave function signatures alone and simply insert conversions to and from legal types. In this case, instruction selection would need to know how to handle illegal types in these special circumstances, but presumably it would be easy enough to special-case them. However, if this pass is followed by an optimization pass analogous to DAGCombine, it may be tricky to keep the optimization pass from creating patterns which codegen isn't prepared to handle. Another alternative, which is not implemented yet, is to have the legalization pass create new Functions, make the original Functions simply call the legalized functions, and then have a late pass clean everything up.

We may already need some amount of special-casing for things like bitfield loads and stores. To implement the C++ memory model, some bitfield loads and stores actually need to load and store a precise number of bits, even if that number of bits doesn't correspond to a legal integer (register) size on the target machine. This isn't implemented yet, but I expect this will be handled by leaving those loads and stores alone, and simply putting the burden on subsequent passes to lower them properly. An alternative is to add new truncating-store and sign/zero-extending load intrinsics.

Another complication due to using LLVM IR is the interaction with DebugInfo. If AllocaInsts for illegal types are expanded, or if values for llvm.dbg.value intrinsics are expanded, there's currently no way to describe this (DWARF can describe it, but LLVM IR can't currently). I assume this could be fixed by extending LLVM IR's DebugInfo intrinsics, but I haven't investigated it yet.

Dan

[Attachment: legalize-integers.patch, 49922 bytes]
On Apr 24, 2013, at 5:01 PM, Dan Gohman <dan433584 at gmail.com> wrote:

> In the spirit of the (long-term) intent to migrate away from the SelectionDAG framework, it is desirable to implement legalization passes as discrete passes. Attached is a patch which implements the beginning of a new type legalization pass, to help motivate discussion.

This is a great discussion to have.

> Is LLVM IR the right level for this?

IMO, no, definitely not.

> The main alternative approach that's been discussed is to do FastISel to a target-independent opcode set on MachineInstrs, and then do legalization and ultimately the last phase of instruction selection proper after that. The most obvious advantage of using LLVM IR for legalization is that it's (currently) more developer-friendly. The most obvious advantage of using MachineInstrs is that they would make it easier to do low-level manipulations. Also, doing legalization on MachineInstrs would mean avoiding having LLVM-IR-level optimization passes which lower the IR, which has historically been a design goal of LLVM.

I think that you (in the rest of your email) identify a number of specific problems with using LLVM IR for legalization. These are a lot of specific issues caused by the fact that LLVM IR is intentionally not trying to model machine issues. I'm sure you *could* try to make this work by introducing a bunch of new intrinsics into LLVM IR which would model the union of the SelectionDAG ISD nodes along with the target-specific X86ISD nodes. However, at this point, you have only modeled the operations and haven't modeled the proper type system.

LLVM IR is just not the right level for this. You seem to think it is better than MachineInstrs because of developer friendliness, but it isn't clear to me that LLVM IR with the additions you're talking about would actually be friendly anymore :-)

Personally, I think that the right representation for legalization is MachineInstrs supplemented with a type system that allows MVTs as well as register classes. If you are seriously interested in pushing forward on this, we should probably discuss it in person, or over beer at the next social or something.

-Chris
> Is LLVM IR the right level for this? The main alternative approach that's been discussed is to do FastISel to a target-independent opcode set on MachineInstrs, and then do legalization and ultimately the last phase of instruction selection proper after that.
> [...]

Hi Dan,

Thank you for working on this. You mentioned that the two alternatives for replacing SelectionDAG are to munch on it from the top (by legalizing the IR) or from the bottom (by removing the scheduler, ISel, and finally the legalization and lowering). You also mentioned some of the disadvantages of the approach that you are proposing, and I agree with you: this approach has many disadvantages. I think that the end goal should be legalization at the MI level.

The LLVM IR is nice and compact, but it is not verbose enough to allow lowering. You mentioned ext/load and trunc/store, but this problem is much worse for vectors. For example, we lower shuffle vector to the following ISD nodes: broadcast, insert_subvector, extract_subvector, concat_vectors, permute, blend, extract/insert_element, and a few others. Representing all of these nodes in IR would be inefficient and inconvenient. Every optimization that handles these intrinsics would need to set up std::vectors, etc., and I think that the compile time for this will not be great either.
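To make the verbosity concern concrete (this is purely an illustration, not a proposal): a blend that is a single shufflevector today, spelled out element by element in plain IR the way a node-by-node expansion would have to, turns into a long chain, and dedicated per-node intrinsics would be similarly verbose to build and match:

    ; today, one instruction:
    %blend = shufflevector <4 x float> %a, <4 x float> %b, <4 x i32> <i32 0, i32 5, i32 2, i32 7>

    ; spelled out element by element:
    %e0 = extractelement <4 x float> %a, i32 0
    %e1 = extractelement <4 x float> %b, i32 1
    %e2 = extractelement <4 x float> %a, i32 2
    %e3 = extractelement <4 x float> %b, i32 3
    %t0 = insertelement <4 x float> undef, float %e0, i32 0
    %t1 = insertelement <4 x float> %t0, float %e1, i32 1
    %t2 = insertelement <4 x float> %t1, float %e2, i32 2
    %blend.expanded = insertelement <4 x float> %t2, float %e3, i32 3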
But I don't think that this is the biggest problem. How do you plan to handle constants? Do you lower them to global variables and loads? How would you implement FastISel? Do you plan on having two instruction selectors (like we have today), or do you plan to lower IR to intrinsics and select that? I think that one of the goals of getting rid of SelectionDAG is to have one instruction selector.

Thanks,
Nadav
One question: "In the spirit of the (long-term) intent to migrate away from the SelectionDAG framework" ... is this meant in general, or just with respect to legalization?

On 04/24/2013 05:01 PM, Dan Gohman wrote:
> In the spirit of the (long-term) intent to migrate away from the SelectionDAG framework, it is desirable to implement legalization passes as discrete passes. Attached is a patch which implements the beginning of a new type legalization pass, to help motivate discussion.
> [...]
On Apr 24, 2013, at 5:53 PM, Reed Kotler <rkotler at mips.com> wrote:

> One question: "In the spirit of the (long-term) intent to migrate away from the SelectionDAG framework" ... is this meant in general, or just with respect to legalization?

Everything. This includes all of the custom lowering code for all of the targets, all of DAGCombine, and maybe all of the patterns in the TD files.
On 04/24/2013 05:26 PM, Chris Lattner wrote:
> [...]
> LLVM IR is just not the right level for this. You seem to think it is better than MachineInstrs because of developer friendliness, but it isn't clear to me that LLVM IR with the additions you're talking about would actually be friendly anymore :-)
>
> Personally, I think that the right representation for legalization is MachineInstrs supplemented with a type system that allows MVTs as well as register classes. If you are seriously interested in pushing forward on this, we should probably discuss it in person, or over beer at the next social or something.
>
> -Chris

I would really push towards doing this in LLVM IR as the next step. It's possible that what you are proposing is the right "long term" solution, but I think it's not a good evolutionary approach; it's more revolutionary. I've already thought of many things that could be very clearly and easily done in IR that are done in very convoluted ways in SelectionDAG. This kind of migration could take place right now, and as we thin out the SelectionDAG portion of things to where it is almost nonexistent, making a jump to just eliminate it and replace it would be more practical. Something like soft float, for example, is nearly trivial to do in IR.
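As a sketch of what I mean (and assuming, as soft-float ABIs typically do, that the runtime routine takes and returns the raw i32 bit pattern; the declaration shown here is an assumption for illustration, not something from the patch), an IR-level soft-float pass could rewrite an fadd into a libcall roughly like this:

    declare i32 @__addsf3(i32, i32)

    ; before:
    ;   %sum = fadd float %x, %y
    ; after:
    %x.bits = bitcast float %x to i32
    %y.bits = bitcast float %y to i32
    %sum.bits = call i32 @__addsf3(i32 %x.bits, i32 %y.bits)
    %sum = bitcast i32 %sum.bits to float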
At the risk of appearing stupid, I can say that I've really struggled to understand SelectionDAG and all its facets and interaction with TableGen patterns, and this after having done a whole port from scratch already by myself. Part of it is the lack of documentation, but also there are too many illogical things (to me) and special cases and hacks surrounding SelectionDAG and TableGen. On the other hand, I recently started to write some IR-level passes, and it was nearly trivial for me to understand how to use the IR and transform it. All the classes are more or less very clean, logical, and regular. I was writing transformation passes on the first day with no issues.

I think that LLVM IR could be extended to allow all of legalization, and many other parts of lowering, to take place at that level, i.e. lowering to an IR which has additional lower-level operations.

Reed
Hi Dan,

Others have weighed in on the merits of IR vs. MI legalization; I thought I'd chip in on a different area:

+ /// Legal roughly means there's a physical register class on the target
+ /// machine for a type, and there's a reasonable set of instructions
+ /// which operate on registers of this class and interpret their contents
+ /// as instances of the type. For convenience, Legal is also used for
+ /// types which are not legalized by this pass (vectors, floats, etc.)
+ Legal,

I don't think this is the right definition of a legal type. I know that that's how SelectionDAG currently defines it, and I think that definition is behind a lot of the difficulty in retargeting LLVM to something that doesn't look like the intersection of X86 and ARM.

I think the correct answer (credit to Chris for this description) is that a legal type is one that (more or less) corresponds to a set of physical registers, and which the target is capable of loading, storing, and copying (possibly also inserting/extracting elements, for vector types).

--Owen

On Apr 24, 2013, at 5:01 PM, Dan Gohman <dan433584 at gmail.com> wrote:
> In the spirit of the (long-term) intent to migrate away from the SelectionDAG framework, it is desirable to implement legalization passes as discrete passes. Attached is a patch which implements the beginning of a new type legalization pass, to help motivate discussion.
> [...]
Hi Dan,

On 25 Apr 2013, at 01:01, Dan Gohman <dan433584 at gmail.com> wrote:

> The main alternative approach that's been discussed is to do FastISel to a target-independent opcode set on MachineInstrs, and then do legalization and ultimately the last phase of instruction selection proper after that. The most obvious advantage of using LLVM IR for legalization is that it's (currently) more developer-friendly. The most obvious advantage of using MachineInstrs is that they would make it easier to do low-level manipulations. Also, doing legalization on MachineInstrs would mean avoiding having LLVM-IR-level optimization passes which lower the IR, which has historically been a design goal of LLVM.

The approach taken in WHIRL, which has a lot of advantages, is exactly to lower the IR. It seems strange that in the back end we have Machine* classes that correspond very closely to IR equivalents, but which don't share any code and often have subtly different interfaces. The approach taken in WHIRL is to progressively replace machine-independent bits of the IR with machine-dependent ones, with abstract instructions being replaced by machine instructions, abstract registers by machine registers, and so on.

I would be interested to know the rationale behind the design choice to avoid this, as it seems the obvious way of designing a compiler. The down side would be that you couldn't take any random pass that expected target-independent IR and run it, but you never actually want to do this once you've handed off to the codegen infrastructure anyway.

David
On Thu, Apr 25, 2013 at 4:50 AM, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote:

> The approach taken in WHIRL, which has a lot of advantages, is exactly to lower the IR. It seems strange that in the back end we have Machine* classes that correspond very closely to IR equivalents, but which don't share any code and often have subtly different interfaces. The approach taken in WHIRL is to progressively replace machine-independent bits of the IR with machine-dependent ones, with abstract instructions being replaced by machine instructions, abstract registers by machine registers, and so on.

Couldn't we first lower LLVM IR to a (mostly) target-independent sequence of MachineInstrs, and then progressively lower those? That seems to be very close to what you describe, and makes a great deal of sense to me. The MachineInstrs could model operations and types that are not legal for the current target, and passes could lower those until everything is legal and all opcodes are target-specific. Targets can still lower function arguments/returns as they please. And most of the infrastructure for this is already present (no new IR).

> I would be interested to know the rationale behind the design choice to avoid this, as it seems the obvious way of designing a compiler. The down side would be that you couldn't take any random pass that expected target-independent IR and run it, but you never actually want to do this once you've handed off to the codegen infrastructure anyway.

--
Thanks,
Justin Holewinski
On Wed, Apr 24, 2013 at 11:33 PM, Owen Anderson <resistor at mac.com> wrote:

> Hi Dan,

Hi Owen,

> Others have weighed in on the merits of IR vs. MI legalization; I thought I'd chip in on a different area:
>
> + /// Legal roughly means there's a physical register class on the target
> + /// machine for a type, and there's a reasonable set of instructions
> + /// which operate on registers of this class and interpret their contents
> + /// as instances of the type. For convenience, Legal is also used for
> + /// types which are not legalized by this pass (vectors, floats, etc.)
> + Legal,
>
> I don't think this is the right definition of a legal type. I know that that's how SelectionDAG currently defines it, and I think that definition is behind a lot of the difficulty in retargeting LLVM to something that doesn't look like the intersection of X86 and ARM.

Do you have a particular target in mind that we could discuss? Not all variances from the intersection of x86 and ARM are of the same nature; it's hard to talk in full generality here.

> I think the correct answer (credit to Chris for this description) is that a legal type is one that (more or less) corresponds to a set of physical registers, and which the target is capable of loading, storing, and copying (possibly also inserting/extracting elements, for vector types).

If the target doesn't actually have a copy for a register class which will be register-allocated, it will just need to pretend it has one at this level and lower it later somehow, otherwise a lot of other stuff won't work. I don't see why load and store are special at this level though. Or insert/extract element?

Dan
On Wed, Apr 24, 2013 at 5:26 PM, Chris Lattner <clattner at apple.com> wrote:

> I think that you (in the rest of your email) identify a number of specific problems with using LLVM IR for legalization. These are a lot of specific issues caused by the fact that LLVM IR is intentionally not trying to model machine issues. I'm sure you *could* try to make this work by introducing a bunch of new intrinsics into LLVM IR which would model the union of the SelectionDAG ISD nodes along with the target-specific X86ISD nodes. However, at this point, you have only modeled the operations and haven't modeled the proper type system.

I don't wish to argue about this, and am fine following your suggestion. However, I would like to understand your reasons better.

I don't think the type system is really the issue. The only thing SelectionDAG's type system has which LLVM IR's lacks which is useful here is "untyped", and that's a special-purpose thing that we can probably handle in other ways.

You and others are right that there could be a fair number of new intrinsics, especially considering all the X86ISD ones and all the rest. Is this a significant concern for you? Targets already have large numbers of target-specific intrinsics; would adding a relatively moderate number of new intrinsics really be a problem? There's also the problem of keeping callers and callees consistent, and it's indeed quite a dickens, but it need not be a show-stopper.

> LLVM IR is just not the right level for this. You seem to think it is better than MachineInstrs because of developer friendliness, but it isn't clear to me that LLVM IR with the additions you're talking about would actually be friendly anymore :-)

As I see it, people working in codegen are going to have to deal with lots of codegeny instructions regardless of whether we call them instructions or intrinsics. Is it really better one way or the other?

> Personally, I think that the right representation for legalization is MachineInstrs supplemented with a type system that allows MVTs as well as register classes. If you are seriously interested in pushing forward on this, we should probably discuss it in person, or over beer at the next social or something.

Ok.

Dan
On Apr 25, 2013, at 1:50 AM, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote:

> The approach taken in WHIRL, which has a lot of advantages, is exactly to lower the IR. It seems strange that in the back end we have Machine* classes that correspond very closely to IR equivalents, but which don't share any code and often have subtly different interfaces. The approach taken in WHIRL is to progressively replace machine-independent bits of the IR with machine-dependent ones, with abstract instructions being replaced by machine instructions, abstract registers by machine registers, and so on.
>
> I would be interested to know the rationale behind the design choice to avoid this, as it seems the obvious way of designing a compiler. The down side would be that you couldn't take any random pass that expected target-independent IR and run it, but you never actually want to do this once you've handed off to the codegen infrastructure anyway.

There definitely are strong advantages to using one data structure to represent multiple levels of IR: you have less code in the compiler, more shared concepts, etc. I have seen and worked with several compilers that tried to do this. Even GCC does this (in the opposite direction) with "tree-ssa", which repurposes some front-end data structures for their mid-level IR.

While there are advantages, it also means that you get fewer invariants, and that the data structures are a worse fit for each level. To give you one simple example: LLVM IR is simplified greatly based on the assumption that it is always in SSA and that each instruction produces one result value, and exceptions to that rule (like some intrinsics) can easily be modeled with extractvalue operations. This doesn't work for MachineInstrs, which have the following additional complexity:

- Not everything is in SSA; you have to model physical registers, even very early.
- Lots of things return N values, and extract-value doesn't work.

I consider it unacceptable to project complexity from MachineInstrs into LLVM IR. There are wins, but there are also unacceptably high costs. Some of those include:

- LLVM IR is our stable IR format; MachineInstr is not. The latter *needs* to evolve rapidly, where the former has settled down (mostly).
- The reasons people like to work with LLVM IR are often directly because of the simplifications we get from having a simple model. Jeopardizing stability in the IR and making LLVM IR worse to work with is not acceptable to me.

-Chris