On 19/08/20 at 22:37, David Greene wrote:
> Lorenzo Casalino via llvm-dev <llvm-dev at lists.llvm.org> writes:
>
>>>> I was imagining a per-instruction data structure collecting metadata
>>>> info related to that specific instruction, instead of having several
>>>> metadata info directly embedded in each instruction.
>>> Interesting. At the IR level metadata isn't necessarily unique, though
>>> it can be made so. If multiple pieces of information were amalgamated
>>> into one structure that might reduce the ability to share the in-memory
>>> representation, which has a cost.
>>>
>> Uhm... could I ask you to elaborate a bit more on the "limitation on
>> in-memory representation sharing"? It is not clear to me how this
>> would cause a problem.
> I just mean that at the IR level, if you have a metadata node with, say,
> a string "foo bar" and another one with "foo" and put one on an
> instruction and the other on another instruction, they won't share an
> in-memory representation, whereas if you had separate nodes with "foo"
> and "bar" and put both on a single instruction and just "foo" on another
> instruction the "foo" metadata would be shared.

But isn't that an implementation aspect? I mean, you can have metadata
nodes whose members are pointers; if two nodes have to share the same
member instance, they can share the same pointer. After all, when two
instructions refer to structurally equivalent Constant objects
(https://llvm.org/doxygen/classllvm_1_1Constant.html#details), they
actually share a pointer to the same Constant object.

> Pre-RA it's relatively easy as long as we're still in SSA. The
> intrinsic would simply take the instruction it should annotate as an
> operand. After SSA it obviously becomes more difficult. I don't have a
> lot of good answers for that right now. The live range for the value
> defined by the annotated instruction and used by the intrinsic would
> contain both instructions so maybe that could be used to connect them.
>
> If the annotated instruction doesn't have an output value (like a store
> on machine architectures) you would use the chain output in SelectionDAG
> but there's no analogue in the MachineInstr representation.

Using intrinsics as wrappers for the instructions to be annotated is a
really nice idea! However, it would require teaching almost all passes
of the codegen pipeline to skip them (which, for instance, is already
done for the llvm.dbg.* intrinsics).

Nonetheless, although I like the idea, without a strategy to track
output-less MachineInstrs it won't go very far :(

Furthermore, after register allocation there is a non-negligible effort
to properly annotate instructions which share the same output register...

Concerning the usage of live ranges to tie the annotated instruction to
its intrinsic, I have some doubts:

  1. After register allocation, since metadata intrinsics are skipped
     (otherwise, they would be involved in the register allocation
     process, increasing the register pressure), the instruction stream
     would contain both virtual and physical registers, which I am not
     sure is totally OK.

  2. Is liveness information still available after register
     allocation? Assuming a positive answer, live intervals may be
     split during register allocation, making the connection between
     intrinsic and annotated instruction really difficult.

An enumeration of the MachineInstrs that is preserved through the
codegen passes would allow the creation of a 1:1 map between intrinsic
and annotated instruction; unfortunately, there seems to be no such
enumeration in LLVM (maybe SlotIndexes could be used in a creative
way).

Sorry for the long delay!

--
Lorenzo

> -David
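
To make the sharing argument from the "foo" / "foo bar" exchange above concrete, here is a minimal sketch of pointer sharing through uniquing. It is not LLVM code: `MetaString`, `MetaContext`, `Inst` and the two `annotate*` helpers are hypothetical names invented for illustration, and the pool is a plain `std::map`. The point is only that instructions hold pointers, so two instructions carrying the same fine-grained node share one object, while an amalgamated "foo bar" node cannot be shared with a plain "foo" node.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a uniqued metadata string (not LLVM's classes).
struct MetaString {
    std::string text;
};

// Hypothetical context that interns strings: equal text yields the same node.
class MetaContext {
    std::map<std::string, std::unique_ptr<MetaString>> pool_;

public:
    const MetaString *get(const std::string &text) {
        std::unique_ptr<MetaString> &slot = pool_[text];
        if (!slot)
            slot.reset(new MetaString{text});  // first request creates the node
        return slot.get();                     // later requests reuse it
    }
    std::size_t size() const { return pool_.size(); }
};

// Hypothetical instruction that stores only pointers to shared metadata.
struct Inst {
    std::vector<const MetaString *> metadata;
};

// Fine-grained nodes: "foo" and "bar" on a, just "foo" on b.
// The "foo" node is the same object on both instructions.
void annotateFineGrained(MetaContext &ctx, Inst &a, Inst &b) {
    a.metadata = {ctx.get("foo"), ctx.get("bar")};
    b.metadata = {ctx.get("foo")};
}

// Amalgamated node: "foo bar" on a, "foo" on b.
// The two nodes differ, so nothing is shared between a and b.
void annotateAmalgamated(MetaContext &ctx, Inst &a, Inst &b) {
    a.metadata = {ctx.get("foo bar")};
    b.metadata = {ctx.get("foo")};
}
```

After `annotateFineGrained`, `a.metadata[0]` and `b.metadata[0]` compare equal as pointers; after `annotateAmalgamated` they do not. Whether a per-instruction structure loses this sharing therefore depends on whether it stores its members by value or by pointer to uniqued nodes.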
Lorenzo Casalino via llvm-dev <llvm-dev at lists.llvm.org> writes:

>> If the annotated instruction doesn't have an output value (like a store
>> on machine architectures) you would use the chain output in SelectionDAG
>> but there's no analogue in the MachineInstr representation.
>
> Using intrinsics as wrappers for the instructions to be annotated is a
> really nice idea! However, it would require teaching almost all passes
> of the codegen pipeline to skip them (which, for instance, is already
> done for the llvm.dbg.* intrinsics).

It's not free, certainly.

> Nonetheless, although I like the idea, without a strategy to track
> output-less MachineInstrs it won't go very far :(

Agreed. There are probably ways to hack it in, but true metadata would
be much better.

> Furthermore, after register allocation there is a non-negligible effort
> to properly annotate instructions which share the same output register...
>
> Concerning the usage of live ranges to tie the annotated instruction to
> its intrinsic, I have some doubts:
>
>   1. After register allocation, since metadata intrinsics are skipped
>      (otherwise, they would be involved in the register allocation
>      process, increasing the register pressure), the instruction stream
>      would contain both virtual and physical registers, which I am not
>      sure is totally OK.

They would have to participate in register allocation. I think the only
downside would be an intrinsic that artificially extends the live range
of a value by using it past its true dead point, either because the use
really is the "last" one or because it fills a "hole" in the live range
that otherwise would exist (for example a use in one of the if-then-else
branches that would otherwise not exist).

If the intrinsics really shadow "real" instructions then it should be
possible to place them such that this is not an issue; for example, you
could place them immediately before the "real" instruction.

It's possible they could introduce extra spills and reloads, in that if
a value is spilled it would be reloaded before the intrinsic. If the
intrinsic were placed immediately before the "real" instruction then the
reload would very likely be re-used for the "real" instruction so this
is probably not an issue in practice.

>   2. Is liveness information still available after register
>      allocation? Assuming a positive answer, live intervals may be
>      split during register allocation, making the connection between
>      intrinsic and annotated instruction really difficult.

Intervals are available post-RA. They still contain information about
defs, so it is *possible* to track things back, though the information
tends to degrade.

> An enumeration of the MachineInstrs that is preserved through the
> codegen passes would allow the creation of a 1:1 map between intrinsic
> and annotated instruction; unfortunately, there seems to be no such
> enumeration in LLVM (maybe SlotIndexes could be used in a creative
> way).

Yeah, SlotIndexes are what is used in the live ranges.

> Sorry for the long delay!

No problem. It's good to hash these things out and identify areas of
weakness that metadata could fill.

                  -David
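
The live-range concern in the message above can be illustrated with a toy model. This is a sketch under strong assumptions, not LLVM's liveness analysis: a single basic block, integer positions standing in for slot indexes, and a live range that simply runs from the def to the last use (so "holes" and control flow are not modelled). `LiveRange`, `computeRange` and `extensionFromAnnotation` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified live range inside one basic block: [def, lastUse].
struct LiveRange {
    int def;
    int lastUse;
};

// Positions are hypothetical instruction indices, loosely analogous to
// slot indexes. The range ends at the latest use (or at the def if unused).
LiveRange computeRange(int def, const std::vector<int> &uses) {
    int last = def;
    for (int u : uses)
        last = std::max(last, u);
    return LiveRange{def, last};
}

// Number of positions by which an annotation intrinsic that uses the value
// at `annotationPos` lengthens the live range compared to the real uses alone.
int extensionFromAnnotation(int def, const std::vector<int> &realUses,
                            int annotationPos) {
    LiveRange base = computeRange(def, realUses);
    std::vector<int> all = realUses;
    all.push_back(annotationPos);
    LiveRange withAnnotation = computeRange(def, all);
    return withAnnotation.lastUse - base.lastUse;
}
```

With a def at 0 and real uses at 4 and 10, an annotation use at position 9 (just before the last real use) gives an extension of 0, while an annotation use at position 14 stretches the range by 4 positions. That is the "use past the true dead point" case described above.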
On 31/08/20 at 14:10, David Greene wrote:
> Lorenzo Casalino via llvm-dev <llvm-dev at lists.llvm.org> writes:
>
>> Furthermore, after register allocation there is a non-negligible effort
>> to properly annotate instructions which share the same output register...
>>
>> Concerning the usage of live ranges to tie the annotated instruction to
>> its intrinsic, I have some doubts:
>>
>>   1. After register allocation, since metadata intrinsics are skipped
>>      (otherwise, they would be involved in the register allocation
>>      process, increasing the register pressure), the instruction stream
>>      would contain both virtual and physical registers, which I am not
>>      sure is totally OK.
> They would have to participate in register allocation.

Should they? I mean: register allocation "simply" creates a map
(VirtReg -> PhysReg), and the actual register rewriting takes place in
a subsequent machine pass. So we could avoid their participation in
register allocation, reducing register pressure and spill/reload work.
As a downside, we would have intrinsics with virtual registers as
outputs, but that is not a problem, since they do not perform any real
computation.

> I think the only
> downside would be an intrinsic that artificially extends the live range
> of a value by using it past its true dead point, either because the use
> really is the "last" one or because it fills a "hole" in the live range
> that otherwise would exist (for example a use in one of the if-then-else
> branches that would otherwise not exist).
>
> If the intrinsics really shadow "real" instructions then it should be
> possible to place them such that this is not an issue; for example, you
> could place them immediately before the "real" instruction.

I do not think this would be possible: before register allocation the
code is in SSA form, thus the annotated instruction *must* precede the
intrinsic annotating it.

An alternative is to place the annotating intrinsic before the
instruction that ends the specific live range (not necessarily as its
immediate predecessor).

Just to point out a problem to cope with: instruction scheduling must
be aware of this particular positioning of annotation intrinsics.

> It's possible they could introduce extra spills and reloads, in that if
> a value is spilled it would be reloaded before the intrinsic. If the
> intrinsic were placed immediately before the "real" instruction then the
> reload would very likely be re-used for the "real" instruction so this
> is probably not an issue in practice.

Yes, I agree.

Kind regards,

--
Lorenzo
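
The "allocation produces a map, rewriting happens later" idea from the reply above can be sketched as follows. This is a toy model, not LLVM's machine IR: `Reg`, `ToyInst`, the `isAnnotation` flag and `rewriteRegisters` are hypothetical, and registers are plain integers where negative values are assumed to mean "virtual". The sketch shows a rewriting step that applies the VirtReg -> PhysReg map to real instructions and leaves annotation pseudo-instructions untouched, so they keep their virtual registers.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical register encoding: negative = virtual, non-negative = physical.
using Reg = int;
inline bool isVirtual(Reg r) { return r < 0; }

// Hypothetical machine instruction with a marker for annotation pseudos.
struct ToyInst {
    std::string opcode;
    std::vector<Reg> operands;
    bool isAnnotation;
};

// Applies the allocator's VirtReg -> PhysReg map to every real instruction.
// Annotation pseudo-instructions are skipped and keep their virtual registers.
void rewriteRegisters(std::vector<ToyInst> &code,
                      const std::map<Reg, Reg> &virtToPhys) {
    for (ToyInst &inst : code) {
        if (inst.isAnnotation)
            continue;
        for (Reg &r : inst.operands) {
            if (!isVirtual(r))
                continue;  // already a physical register
            auto it = virtToPhys.find(r);
            if (it != virtToPhys.end())
                r = it->second;
        }
    }
}
```

In this model the instruction stream ends up mixing physical registers (on real instructions) with virtual ones (on annotations), which is exactly the situation questioned in point 1 of the quoted list; any later pass that assumes no virtual registers remain would have to skip the annotation pseudos as well.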