Gus Smith via llvm-dev
2018-Mar-19 02:39 UTC
[llvm-dev] Generating a custom opcode from an LLVM intrinsic
Craig, thanks for the quick response. That helps a lot. I had no clue they
were buried in there, though I guess I should have looked harder -- the hex
should have given me a clue, perhaps!

For the sake of my own edification (and not taking up too much of your
time) I will try to generate it myself. I've found the definition of the
"I" class at line 358 of llvm/lib/Target/X86/X86InstrFormats.td, which
helps a lot.

Let's assume I want to produce opcode 0x16 (which I'm using because it
doesn't seem to be implemented in gem5 otherwise, and would simply produce
a warning). Then my guess is that I should use something like:

    def CACHEADD : I<0x16, FORMAT, (outs), (ins),
                     ASM, [(int_cache_add)]>, PD;

where FORMAT comes from
http://legup.eecg.utoronto.ca/doxygen/namespacellvm_1_1X86II.html
and ASM = ???
and I deleted IIC_SSE_PREFETCH (because I'm not sure what this flag
indicates, but I assume it's not needed).
I'm not sure what that PD is or if it should stay.

Looking for input on this! Clearly it's not correct as-is, but I feel like
I'm at least understanding parts of it. Thanks!

For posterity, this page helped a lot, and probably should have been read
first: https://llvm.org/docs/TableGen/index.html
In smaller part, this one helped too, but read the above page first:
https://llvm.org/docs/TableGen/LangRef.html

On Sun, Mar 18, 2018 at 7:43 PM, Craig Topper <craig.topper at gmail.com> wrote:

> Here's a couple examples for mapping an intrinsic to an X86 instruction
> from X86InstrInfo.td. If you look for int_x86_* in any X86Instr*.td you can
> find others.
>
>     let Predicates = [HasCLFLUSHOPT], SchedRW = [WriteLoad] in
>     def CLFLUSHOPT : I<0xAE, MRM7m, (outs), (ins i8mem:$src),
>                        "clflushopt\t$src", [(int_x86_clflushopt addr:$src)],
>                        IIC_SSE_PREFETCH>, PD;
>
>     let Predicates = [HasCLWB], SchedRW = [WriteLoad] in
>     def CLWB : I<0xAE, MRM6m, (outs), (ins i8mem:$src), "clwb\t$src",
>                  [(int_x86_clwb addr:$src)], IIC_SSE_PREFETCH>, PD;
>
> The encoding information for the binary output is buried in these
> definitions too. If you tell me what opcode you've chosen I can tell you
> what the right things are to get the binary output.
>
> ~Craig
>
> On Sun, Mar 18, 2018 at 3:22 PM, Gus Smith via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
>> Hello all. LLVM newbie here. If anything seems glaringly wrong with my
>> use of LLVM, that's probably why.
>>
>> Here's what I'm trying to do. I have modified the gem5 simulator to
>> accept a "new" x86 instruction. I've done this by just reserving the
>> opcode in gem5's ISA specification, just as all other instructions are
>> specified.
>>
>> I'm trying to get an LLVM backend to generate this opcode during code
>> generation. My current plan is:
>>
>> 1. During an LLVM pass, I'll detect a series of instructions which
>>    can be replaced with this new instruction. (The new instruction is a
>>    "cache compute" instruction -- in my passes, I replace a series of
>>    loads, operations, and stores with this single instruction.) This
>>    step is complete.
>> 2. I replace the series of instructions with an intrinsic. I have
>>    added an intrinsic using the instructions here
>>    <https://llvm.org/docs/ExtendingLLVM.html#adding-a-new-intrinsic-function>.
>>    This step is complete.
>> 3. During code generation, the intrinsic should be converted to this
>>    reserved opcode. This is where I'm stuck.
>>
>> I'm stuck on step 3. I have two main questions that should unblock me:
>>
>> Question 1: where is the code that maps from intrinsics to instructions?
>> The link above states:
>>
>> "Add support to the .td file for the target(s) of your choice in
>> lib/Target/*/*.td. This is usually a matter of adding a pattern to the
>> .td file that matches the intrinsic, though it may obviously require
>> adding the instructions you want to generate as well. There are lots of
>> examples in the PowerPC and X86 backend to follow."
>>
>> However, looking through these examples isn't illuminating anything for
>> me. Any more documentation or high-level explanation on this subject
>> would be really helpful. I have read something about "lowering" of
>> intrinsics; not sure if that's relevant.
>>
>> Question 2: will I be able to generate this opcode directly from the
>> intrinsic, or will I have to add the opcode as an LLVM IR instruction and
>> specify how it gets compiled? I can imagine two options:
>>
>> option 1: I can define a "translation" from intrinsic straight to an x86
>> opcode.
>>
>> option 2: I can define a "translation" (perhaps in a .td file? I think
>> that's what they're used for) which translates my intrinsic into a new
>> instruction, and then I can define another translation which will map the
>> new instruction to my opcode during code gen. If this is the case, I'm
>> not sure there's any point to having an intrinsic; I should just add a
>> new instruction instead.
>>
>> Hoping someone can help! As you can tell, I'm a little lost...the
>> documentation for LLVM is great, but it's a little above my level right
>> now :)
>>
>> Gus Smith, PSU
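To make the "encoding information buried in these definitions" a little more concrete, here is a minimal Python sketch of the bytes that the quoted CLFLUSHOPT and CLWB definitions correspond to. It is only an illustration under stated assumptions: the memory operand is a bare base register such as (%rax), with no displacement, SIB byte, or REX prefix, and the helper names `modrm` and `encode_mrm_mem` are made up for this example (they are not LLVM functions). The `digit` argument plays the role of the 7 in MRM7m or the 6 in MRM6m.

```python
# Register numbers as they appear in the 3-bit ModRM r/m field.
# rsp (4) and rbp (5) are left out on purpose: with mod=00 those r/m
# values mean "SIB byte follows" and "disp32/RIP-relative" instead.
REG_NUM = {"rax": 0, "rcx": 1, "rdx": 2, "rbx": 3, "rsi": 6, "rdi": 7}


def modrm(mod: int, reg: int, rm: int) -> int:
    """Pack the ModRM fields into one byte: mod (2 bits), reg (3), rm (3)."""
    assert 0 <= mod < 4 and 0 <= reg < 8 and 0 <= rm < 8
    return (mod << 6) | (reg << 3) | rm


def encode_mrm_mem(opcode: int, digit: int, base: str) -> bytes:
    """Encode '66 0F <opcode> /digit' with a plain [base] memory operand.

    0x66 and 0x0F model what the PD class contributes; `digit` goes into
    the ModRM reg field, which is what MRM<digit>m selects.
    """
    return bytes([0x66, 0x0F, opcode, modrm(0b00, digit, REG_NUM[base])])


print(encode_mrm_mem(0xAE, 7, "rax").hex(" "))  # clflushopt (%rax): 66 0f ae 38
print(encode_mrm_mem(0xAE, 6, "rax").hex(" "))  # clwb (%rax):       66 0f ae 30
```

Both instructions share opcode 0xAE and the PD prefix; only the ModRM reg field (7 versus 6) tells them apart, which is why the format argument matters as much as the opcode.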
Craig Topper via llvm-dev
2018-Mar-19 03:30 UTC
[llvm-dev] Generating a custom opcode from an LLVM intrinsic
ASM is the text output you want printed in a textual listing of the
assembly. The curly braces you see in some text strings like
"adcx{l}\t{$src, $dst|$dst, $src}" are there to provide different operand
orders for AT&T syntax vs Intel syntax. Anything after $ matches the name
in the outs/ins part of the instruction.

IIC_SSE_PREFETCH is part of the scheduler system to provide
latency/throughput information about the instruction.

PD indicates the instruction should be on the 0x0f two byte opcode map
with a 0x66 prefix.

Most common other values in place of PD:

TB   - 0x0f opcode map, no prefix (0x66, 0xf2, 0xf3); if one of those
       prefixes is present, the disassembler should ignore it.
PS   - 0x0f opcode map, no prefix, but if the disassembler sees a prefix it
       should not decode to this instruction. Should be used when there is
       another instruction with the same opcode that uses a prefix.
PD   - 0x0f opcode map with 0x66 prefix
XS   - 0x0f opcode map with 0xf3 prefix
XD   - 0x0f opcode map with 0xf2 prefix
T8   - 0x0f 0x38 opcode map with no prefix
T8PS - 0x0f 0x38 opcode map version of PS from above
T8PD - 0x0f 0x38 opcode map version of PD from above
T8XS - 0x0f 0x38 opcode map version of XS from above
T8XD - 0x0f 0x38 opcode map version of XD from above
TA   - 0x0f 0x3a opcode map with no prefix
TAPS - 0x0f 0x3a opcode map version of PS from above
TAPD - 0x0f 0x3a opcode map version of PD from above
TAXS - 0x0f 0x3a opcode map version of XS from above
TAXD - 0x0f 0x3a opcode map version of XD from above

~Craig
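The list of prefix classes in Craig's message can be read as a lookup table from class name to leading bytes. The Python sketch below is one way to write that table down; it is an illustration only, and the names `PREFIX_CLASSES` and `leading_bytes` are invented for this example. It assumes no REX/VEX prefix and stops at the opcode byte (no ModRM or operands). It also does not model the one difference between TB and PS (or T8 and T8PS, TA and TAPS), since that difference only affects how the disassembler treats a stray prefix, not the bytes that get emitted.

```python
# class name -> (mandatory prefix byte or None, opcode-map escape bytes)
PREFIX_CLASSES = {
    "TB":   (None, [0x0F]),
    "PS":   (None, [0x0F]),
    "PD":   (0x66, [0x0F]),
    "XS":   (0xF3, [0x0F]),
    "XD":   (0xF2, [0x0F]),
    "T8":   (None, [0x0F, 0x38]),
    "T8PS": (None, [0x0F, 0x38]),
    "T8PD": (0x66, [0x0F, 0x38]),
    "T8XS": (0xF3, [0x0F, 0x38]),
    "T8XD": (0xF2, [0x0F, 0x38]),
    "TA":   (None, [0x0F, 0x3A]),
    "TAPS": (None, [0x0F, 0x3A]),
    "TAPD": (0x66, [0x0F, 0x3A]),
    "TAXS": (0xF3, [0x0F, 0x3A]),
    "TAXD": (0xF2, [0x0F, 0x3A]),
}


def leading_bytes(cls: str, opcode: int) -> bytes:
    """Return prefix + escape bytes + opcode for one of the classes above."""
    prefix, escape = PREFIX_CLASSES[cls]
    head = [] if prefix is None else [prefix]
    return bytes(head + escape + [opcode])


print(leading_bytes("PD", 0xAE).hex(" "))    # 66 0f ae
print(leading_bytes("T8XD", 0xF0).hex(" "))  # f2 0f 38 f0
```

An instruction defined with none of these classes stays on the one-byte opcode map, so its opcode byte is emitted without any 0x0f escape.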
Gus Smith via llvm-dev
2018-Mar-20 15:27 UTC
[llvm-dev] Generating a custom opcode from an LLVM intrinsic
Great info -- all of this has been incredibly useful. Do you have any links
to the documentation for this, or does it just come from your experiential
knowledge?

FYI, I achieved what I set out to achieve when I wrote this email. I'm
moving on to a more complex goal now, but the original question was
answered completely, in my opinion. This was the key line:

    def CACHEOP : I<0x06, RawFrm, (outs), (ins), "cache_op", [(int_cache_op)]>;

I added this definition to llvm/lib/Target/X86/X86InstrInfo.td. I also had
to comment out an instruction (PUSHES) which overlapped the 0x06 opcode.
This was OK in my case (as far as I know) because PUSHES isn't implemented
in gem5.

Thanks again!
Gus

--
Gus Smith
Penn State University
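The overlap with PUSHES mentioned above comes from the fact that the CACHEOP definition uses no prefix class, so it sits on the one-byte opcode map, where 0x06 is already the legacy PUSH ES instruction. The Python sketch below shows the idea as a simple duplicate check. It is a toy model with made-up names (`DEFS`, `find_overlaps`) and is not how llvm-tblgen itself detects conflicts; it also ignores the ModRM /digit and CPU mode, which real definitions can use to share an opcode byte.

```python
from collections import defaultdict

# Each entry: (name, opcode map, mandatory prefix, opcode byte).
# "one-byte" means no 0x0f escape. The first two rows come from the
# message above; CLWB is included only as a non-conflicting contrast.
DEFS = [
    ("PUSHES",  "one-byte", None, 0x06),
    ("CACHEOP", "one-byte", None, 0x06),
    ("CLWB",    "0x0f",     0x66, 0xAE),
]


def find_overlaps(defs):
    """Group definitions by (map, prefix, opcode); keep groups of 2+."""
    by_encoding = defaultdict(list)
    for name, opmap, prefix, opcode in defs:
        by_encoding[(opmap, prefix, opcode)].append(name)
    return {key: names for key, names in by_encoding.items() if len(names) > 1}


for (opmap, prefix, opcode), names in find_overlaps(DEFS).items():
    print(f"{opmap} map, opcode {opcode:#04x}: {', '.join(names)}")
# prints: one-byte map, opcode 0x06: PUSHES, CACHEOP
```

Picking an opcode and prefix combination that no existing definition uses would avoid having to comment anything out, at the cost of a longer encoding.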
Gus Smith via llvm-dev
2018-Mar-20 15:28 UTC
[llvm-dev] Generating a custom opcode from an LLVM intrinsic
Great info -- all of this has been incredibly useful. Do you have any links to the documentation from this, or does it just come from your experiential knowledge? FYI, I achieved what I set out to achieve when I wrote this email. I'm moving on to a more complex goal now, but the original question was answered completely, in my opinion. This was the key line: def CACHEOP : I<0x06, RawFrm, (outs), (ins), "cache_op", [(int_cache_op)]>; I added this definition to llvm/lib/Target/X86/X86InstrInfo.td. I also had to comment out an instruction (PUSHES) which overlapped the 0x06 opcode. This was OK in my case (as far as I know) because PUSHES isn't implemented in gem5. Thanks again! Gus On Sun, Mar 18, 2018 at 11:30 PM, Craig Topper <craig.topper at gmail.com> wrote:> ASM is the text output you want printed in a textual listing of the > assembly. The curly braces you see in some text strings like > "adcx{l}\t{$src, $dst|$dst, $src}" are there to provide different operand > orders for at&t syntax vs intel syntax. Anything after $ matches the name > in the outs/in part of the instruction. > > IIC_SSE_PREFETCH is part of the scheduler system to provide > latency/throughput information about the instruction. > > PD indicates the instruction should be on the 0x0f two byte opcode map > with a 0x66 prefix. > > Most common other values in place of PD > TB - 0x0f opcode map no prefix(0x66, 0xf2, 0xf3) and use of one of those > prefixes should be ignored by the disassembler. > PS - 0x0f opcode map no prefix, but if the disassembler sees a prefix it > should not decode to this instruction. 
Should be used when there is another > instruction with the same opcode that uses a prefix > PD - 0x0f opcode map with 0x66 prefix > XS - 0x0f opcode map with 0xf3 prefix > XD - 0x0f opcode map with 0xf2 prefix > T8 - 0x0f 0x38 opcode map with no prefix > T8PS - 0x0f 0x38 opcode map version of PS from above > T8PD - 0x0f 0x38 opcode map version of PD from above > T8XS - 0x0f 0x38 opcode version of XS from above > T8XD - 0x0f 0x38 opcode version of XD from above > TA - 0x0f 0x3a opcode map with no prefix > TAPS - 0x0f 0x3a opcode map version of PS from above > TAPD - 0x0f 0x3a opcode map version of PD from above > TAXS - 0x0f 0x3a opcode version of XS from above > TAXD - 0x0f 0x3a opcode version of XD from above > > > > > ~Craig > > On Sun, Mar 18, 2018 at 7:39 PM, Gus Smith <gushenrysmith at gmail.com> > wrote: > >> Craig, thanks for the quick response. That helps a lot. I had no clue >> they were buried in there, though I guess I should have looked harder -- >> the hex should have given me a clue, perhaps! >> >> For the sake of my own edification (and not taking up too much of your >> time) I will try to generate it myself. I've found the definition of the >> "I" class at line 358 of llvm/lib/Target/X86/X86InstrFormats.td, which >> helps a lot. >> >> Let's assume I want to produce opcode 0x16 (which I'm using because it >> doesn't seem to be implemented in gem5 otherwise, and would simply produce >> a warning). Then my guess is that I should use something like: >> def CACHEADD : I<0x16, FORMAT, (outs), (ins), >> ASM, [(int_cache_add)]>, PD; >> >> where FORMAT comes from http://legup.eecg.utoront >> o.ca/doxygen/namespacellvm_1_1X86II.html >> and ASM = ??? >> and i deleted IIC_SSE_PREFETCH (because I'm not sure what this flag >> indicates, but I assume it's not needed). >> I'm not sure what that PD is or if it should stay. >> >> Looking for input on this! Clearly it's not correct as-is, but I feel >> like I'm at least understanding parts of it. Thanks! 
>> >> For posterity, this page helped a lot, and probably should have been read >> first: https://llvm.org/docs/TableGen/index.html >> In smaller part, this one helped too, but read the above page first: >> https://llvm.org/docs/TableGen/LangRef.html >> >> On Sun, Mar 18, 2018 at 7:43 PM, Craig Topper <craig.topper at gmail.com> >> wrote: >> >>> Here's a couple examples for mapping an intrinsic to an X86 instruction >>> from X86InstrInfo.td. If you look for int_x86_* in any X86Instr*.td you can >>> find others. >>> >>> let Predicates = [HasCLFLUSHOPT], SchedRW = [WriteLoad] in >>> def CLFLUSHOPT : I<0xAE, MRM7m, (outs), (ins i8mem:$src), >>> "clflushopt\t$src", [(int_x86_clflushopt addr:$src)], >>> IIC_SSE_PREFETCH>, PD; >>> >>> let Predicates = [HasCLWB], SchedRW = [WriteLoad] in >>> def CLWB : I<0xAE, MRM6m, (outs), (ins i8mem:$src), "clwb\t$src", >>> [(int_x86_clwb addr:$src)], IIC_SSE_PREFETCH>, PD; >>> >>> The encoding information for the binary output is buried in these >>> definitions too. If you tell me what opcode you've chosen I can tell you >>> what the right things are to get the binary output. >>> >>> >>> ~Craig >>> >>> On Sun, Mar 18, 2018 at 3:22 PM, Gus Smith via llvm-dev < >>> llvm-dev at lists.llvm.org> wrote: >>> >>>> Hello all. LLVM newbie here. If anything seems glaringly wrong with my >>>> use of LLVM, that's probably why. >>>> >>>> Here's what I'm trying to do. I have modified the gem5 simulator to >>>> accept a "new" x86 instruction. I've done this by just reserving the opcode >>>> in gem5's ISA specification, just as all other instructions are specified. >>>> >>>> I'm trying to get an LLVM backend to generate this opcode during code >>>> generation. My current plan is: >>>> >>>> 1. During an LLVM pass, I'll detect a series of instructions which >>>> can be replaced with this new instruction. 
(The new instruction is a "cache >>>> compute" instruction -- in my passes, I replace a series of loads, >>>> operations, and stores with this single instruction.) This step is complete. >>>> 2. I replace the series of instructions with an intrinsic. I have >>>> added an intrinsic using the instructions here >>>> <https://llvm.org/docs/ExtendingLLVM.html#adding-a-new-intrinsic-function>. >>>> This step is complete. >>>> 3. During code generation, the intrinsic should be converted to >>>> this reserved opcode. This is where I'm stuck. >>>> >>>> I'm stuck on step 3. I have two main questions that should unblock me: >>>> >>>> Question 1: where is the code that maps from intrinsics to >>>> instructions? The link above states: >>>> >>>> "Add support to the .td file for the target(s) of your choice in >>>> lib/Target/*/*.td. This is usually a matter of adding a pattern to the >>>> .td file that matches the intrinsic, though it may obviously require adding >>>> the instructions you want to generate as well. There are lots of examples >>>> in the PowerPC and X86 backend to follow." >>>> >>>> However, looking through these examples isn't illuminating anything for >>>> me. Any more documentation or high-level explanation on this subject would >>>> be really helpful. I have read something about "lowering" of intrinsics; >>>> not sure if that's relevant. >>>> >>>> Question 2: will I be able to generate this opcode directly from the >>>> intrinsic, or will I have to add the opcode as an LLVM IR instruction and >>>> specify how it gets compiled? I can imagine two options: >>>> option 1: I can define a "translation" from intrinsic straight to an >>>> x86 opcode. >>>> option 2: I can define a "translation" (perhaps in a .td file? I think >>>> that's what they're used for) which translates my intrinsic into a new >>>> instruction, and then I can define another translation which will map the >>>> new instruction to my opcode during code gen. 
If this is the case, I'm not >>>> sure there's any point to having an intrinsic; I should just add a new >>>> instruction instead. >>>> >>>> Hoping someone can help! As you can tell, I'm a little lost...the >>>> documentation for LLVM is great, but it's a little above my level right now >>>> :) >>>> >>>> Gus Smith, PSU >>>> >>>> >>>> _______________________________________________ >>>> LLVM Developers mailing list >>>> llvm-dev at lists.llvm.org >>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>> >>>> >>> >> >-- *Gus Smith* Penn State University M.S./B.S. Computer Science and Engineering '18 Microsystems Design Lab - Researcher (570)817-9340 | hfs5022 at psu.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180320/a0de7661/attachment.html>
Gus Smith via llvm-dev
2018-Mar-20 15:28 UTC
[llvm-dev] Generating a custom opcode from an LLVM intrinsic
Great info -- all of this has been incredibly useful. Do you have any links to the documentation from this, or does it just come from your experiential knowledge? FYI, I achieved what I set out to achieve when I wrote this email. I'm moving on to a more complex goal now, but the original question was answered completely, in my opinion. This was the key line: def CACHEOP : I<0x06, RawFrm, (outs), (ins), "cache_op", [(int_cache_op)]>; I added this definition to llvm/lib/Target/X86/X86InstrInfo.td. I also had to comment out an instruction (PUSHES) which overlapped the 0x06 opcode. This was OK in my case (as far as I know) because PUSHES isn't implemented in gem5. Thanks again! Gus On Sun, Mar 18, 2018 at 11:30 PM, Craig Topper <craig.topper at gmail.com> wrote:> ASM is the text output you want printed in a textual listing of the > assembly. The curly braces you see in some text strings like > "adcx{l}\t{$src, $dst|$dst, $src}" are there to provide different operand > orders for at&t syntax vs intel syntax. Anything after $ matches the name > in the outs/in part of the instruction. > > IIC_SSE_PREFETCH is part of the scheduler system to provide > latency/throughput information about the instruction. > > PD indicates the instruction should be on the 0x0f two byte opcode map > with a 0x66 prefix. > > Most common other values in place of PD > TB - 0x0f opcode map no prefix(0x66, 0xf2, 0xf3) and use of one of those > prefixes should be ignored by the disassembler. > PS - 0x0f opcode map no prefix, but if the disassembler sees a prefix it > should not decode to this instruction. 
Should be used when there is another > instruction with the same opcode that uses a prefix > PD - 0x0f opcode map with 0x66 prefix > XS - 0x0f opcode map with 0xf3 prefix > XD - 0x0f opcode map with 0xf2 prefix > T8 - 0x0f 0x38 opcode map with no prefix > T8PS - 0x0f 0x38 opcode map version of PS from above > T8PD - 0x0f 0x38 opcode map version of PD from above > T8XS - 0x0f 0x38 opcode version of XS from above > T8XD - 0x0f 0x38 opcode version of XD from above > TA - 0x0f 0x3a opcode map with no prefix > TAPS - 0x0f 0x3a opcode map version of PS from above > TAPD - 0x0f 0x3a opcode map version of PD from above > TAXS - 0x0f 0x3a opcode version of XS from above > TAXD - 0x0f 0x3a opcode version of XD from above > > > > > ~Craig > > On Sun, Mar 18, 2018 at 7:39 PM, Gus Smith <gushenrysmith at gmail.com> > wrote: > >> Craig, thanks for the quick response. That helps a lot. I had no clue >> they were buried in there, though I guess I should have looked harder -- >> the hex should have given me a clue, perhaps! >> >> For the sake of my own edification (and not taking up too much of your >> time) I will try to generate it myself. I've found the definition of the >> "I" class at line 358 of llvm/lib/Target/X86/X86InstrFormats.td, which >> helps a lot. >> >> Let's assume I want to produce opcode 0x16 (which I'm using because it >> doesn't seem to be implemented in gem5 otherwise, and would simply produce >> a warning). Then my guess is that I should use something like: >> def CACHEADD : I<0x16, FORMAT, (outs), (ins), >> ASM, [(int_cache_add)]>, PD; >> >> where FORMAT comes from http://legup.eecg.utoront >> o.ca/doxygen/namespacellvm_1_1X86II.html >> and ASM = ??? >> and i deleted IIC_SSE_PREFETCH (because I'm not sure what this flag >> indicates, but I assume it's not needed). >> I'm not sure what that PD is or if it should stay. >> >> Looking for input on this! Clearly it's not correct as-is, but I feel >> like I'm at least understanding parts of it. Thanks! 
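
For anyone reading this later, the suffix list in Craig's reply (TB, PS, PD, ..., TAXD) can be written out as a small lookup table. The Python sketch below is only a model of that list, not LLVM's encoder: the names LEADING_BYTES and encode_rawfrm are made up for illustration, and it assumes a RawFrm instruction with no operands, so the output is just the optional prefix byte, the opcode-map escape bytes, and the opcode. It also ignores the disassembler-only difference between TB and PS.

```python
# Model of the TableGen suffix list from Craig's reply (illustrative only).
# Escape bytes select the opcode map; the prefix byte, if any, comes first.
ESCAPES = {"": (0x0F,), "T8": (0x0F, 0x38), "TA": (0x0F, 0x3A)}
PREFIXES = {"PS": (), "PD": (0x66,), "XS": (0xF3,), "XD": (0xF2,)}

# Build PS/PD/XS/XD, T8PS..T8XD, TAPS..TAXD: prefix byte, then escape bytes.
LEADING_BYTES = {
    map_name + prefix_name: prefix + escape
    for map_name, escape in ESCAPES.items()
    for prefix_name, prefix in PREFIXES.items()
}
# TB, T8 and TA carry no prefix byte at all.
LEADING_BYTES.update({"TB": (0x0F,), "T8": (0x0F, 0x38), "TA": (0x0F, 0x3A)})


def encode_rawfrm(opcode, suffix=None):
    """Bytes for an operand-less RawFrm instruction (hypothetical helper).

    suffix=None models a definition with no suffix class, which stays on the
    one-byte opcode map, as in the CACHEOP definition above.
    """
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in one byte")
    lead = () if suffix is None else LEADING_BYTES[suffix]
    return bytes(lead + (opcode,))


if __name__ == "__main__":
    print(encode_rawfrm(0x06).hex(" "))        # CACHEOP as finally defined: 06
    print(encode_rawfrm(0x16, "PD").hex(" "))  # the earlier 0x16 + PD guess: 66 0f 16
```

Under this model, the final CACHEOP definition (no suffix) is the single byte 0x06, while the earlier CACHEADD guess with PD would have come out as 0x66 0x0f 0x16. Instructions with operands, such as CLFLUSHOPT with MRM7m, additionally carry a ModRM byte that this sketch does not model.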