thr3ads.net - llvm dev - [llvm-dev] Generating a custom opcode from an LLVM intrinsic [Mar 2018]

Gus Smith via llvm-dev

2018-Mar-19 02:39 UTC

[llvm-dev] Generating a custom opcode from an LLVM intrinsic

Craig, thanks for the quick response. That helps a lot. I had no clue they
were buried in there, though I guess I should have looked harder -- the hex
should have given me a clue, perhaps!

For the sake of my own edification (and not taking up too much of your
time) I will try to generate it myself. I've found the definition of the
"I" class at line 358 of llvm/lib/Target/X86/X86InstrFormats.td, which
helps a lot.

Let's assume I want to produce opcode 0x16 (which I'm using because it
doesn't seem to be implemented in gem5 otherwise, and would simply produce
a warning). Then my guess is that I should use something like:
def CACHEADD : I<0x16, FORMAT, (outs), (ins),
                   ASM, [(int_cache_add)]>, PD;

where FORMAT comes from
http://legup.eecg.utoronto.ca/doxygen/namespacellvm_1_1X86II.html
and ASM = ???
and I deleted IIC_SSE_PREFETCH (because I'm not sure what this flag
indicates, but I assume it's not needed).
I'm not sure what that PD is or if it should stay.

Looking for input on this! Clearly it's not correct as-is, but I feel like
I'm at least understanding parts of it. Thanks!

For posterity, this page helped a lot, and probably should have been read
first: https://llvm.org/docs/TableGen/index.html
In smaller part, this one helped too, but read the above page first:
https://llvm.org/docs/TableGen/LangRef.html

On Sun, Mar 18, 2018 at 7:43 PM, Craig Topper <craig.topper at gmail.com>
wrote:
> Here's a couple examples for mapping an intrinsic to an X86 instruction
> from X86InstrInfo.td. If you look for int_x86_* in any X86Instr*.td you can
> find others.
>
> let Predicates = [HasCLFLUSHOPT], SchedRW = [WriteLoad] in
> def CLFLUSHOPT : I<0xAE, MRM7m, (outs), (ins i8mem:$src),
>                    "clflushopt\t$src", [(int_x86_clflushopt addr:$src)],
>                    IIC_SSE_PREFETCH>, PD;
>
> let Predicates = [HasCLWB], SchedRW = [WriteLoad] in
> def CLWB       : I<0xAE, MRM6m, (outs), (ins i8mem:$src), "clwb\t$src",
>                    [(int_x86_clwb addr:$src)], IIC_SSE_PREFETCH>, PD;
>
> The encoding information for the binary output is buried in these
> definitions too. If you tell me what opcode you've chosen I can tell you
> what the right things are to get the binary output.
>
>
> ~Craig
>
> On Sun, Mar 18, 2018 at 3:22 PM, Gus Smith via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hello all. LLVM newbie here. If anything seems glaringly wrong with my
>> use of LLVM, that's probably why.
>>
>> Here's what I'm trying to do. I have modified the gem5 simulator to
>> accept a "new" x86 instruction. I've done this by just reserving the
>> opcode in gem5's ISA specification, just as all other instructions are
>> specified.
>>
>> I'm trying to get an LLVM backend to generate this opcode during code
>> generation. My current plan is:
>>
>>    1. During an LLVM pass, I'll detect a series of instructions which
>>    can be replaced with this new instruction. (The new instruction is a
>>    "cache compute" instruction -- in my passes, I replace a series of
>>    loads, operations, and stores with this single instruction.) This
>>    step is complete.
>>    2. I replace the series of instructions with an intrinsic. I have
>>    added an intrinsic using the instructions here
>>    <https://llvm.org/docs/ExtendingLLVM.html#adding-a-new-intrinsic-function>.
>>    This step is complete.
>>    3. During code generation, the intrinsic should be converted to this
>>    reserved opcode. This is where I'm stuck.
>>
>> I'm stuck on step 3. I have two main questions that should unblock me:
>>
>> Question 1: where is the code that maps from intrinsics to instructions?
>> The link above states:
>>
>> "Add support to the .td file for the target(s) of your choice in
>> lib/Target/*/*.td. This is usually a matter of adding a pattern to the
>> .td file that matches the intrinsic, though it may obviously require
>> adding the instructions you want to generate as well. There are lots of
>> examples in the PowerPC and X86 backend to follow."
>>
>> However, looking through these examples isn't illuminating anything for
>> me. Any more documentation or high-level explanation on this subject
>> would be really helpful. I have read something about "lowering" of
>> intrinsics; not sure if that's relevant.
>>
>> Question 2: will I be able to generate this opcode directly from the
>> intrinsic, or will I have to add the opcode as an LLVM IR instruction and
>> specify how it gets compiled? I can imagine two options:
>> option 1: I can define a "translation" from intrinsic straight to an x86
>> opcode.
>> option 2: I can define a "translation" (perhaps in a .td file? I think
>> that's what they're used for) which translates my intrinsic into a new
>> instruction, and then I can define another translation which will map the
>> new instruction to my opcode during code gen. If this is the case, I'm
>> not sure there's any point to having an intrinsic; I should just add a
>> new instruction instead.
>>
>> Hoping someone can help! As you can tell, I'm a little lost...the
>> documentation for LLVM is great, but it's a little above my level right
>> now :)
>>
>> Gus Smith, PSU
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>

Craig Topper via llvm-dev

2018-Mar-19 03:30 UTC

[llvm-dev] Generating a custom opcode from an LLVM intrinsic

ASM is the text you want printed in a textual listing of the assembly.
The curly braces you see in some text strings like
"adcx{l}\t{$src, $dst|$dst, $src}" are there to provide different operand
orders for AT&T syntax vs Intel syntax. Anything after $ matches the name
in the outs/ins part of the instruction.
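
To make the brace syntax concrete, here is a minimal Python sketch (not LLVM
code) of how such a string could be resolved for one syntax. The function
name, the index convention (0 for AT&T, 1 for Intel), and treating a missing
alternative as empty text are assumptions made for illustration; they are not
stated in this thread.

```python
import re

# Assumed convention for this sketch: alternative 0 is AT&T, 1 is Intel.
ATT, INTEL = 0, 1

def expand_asm_variants(asm: str, variant: int) -> str:
    """Replace each {a|b|...} group with the alternative at index `variant`.

    A group with fewer alternatives than `variant` + 1 expands to nothing,
    which is how "{l}" is assumed to vanish in the Intel listing.
    """
    def pick(match: re.Match) -> str:
        alternatives = match.group(1).split("|")
        return alternatives[variant] if variant < len(alternatives) else ""

    return re.sub(r"\{([^{}]*)\}", pick, asm)

template = "adcx{l}\t{$src, $dst|$dst, $src}"
att_text = expand_asm_variants(template, ATT)
intel_text = expand_asm_variants(template, INTEL)
print(repr(att_text))
print(repr(intel_text))
```

The $src and $dst placeholders are left untouched here; in the real definition
they are filled in from the operands named in the outs/ins lists.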

IIC_SSE_PREFETCH is part of the scheduler system to provide
latency/throughput information about the instruction.

PD indicates the instruction should be on the 0x0f two byte opcode map with
a 0x66 prefix.

Most common other values in place of PD:
TB - 0x0f opcode map, no prefix (0x66, 0xf2, 0xf3); if one of those
prefixes is present, the disassembler should ignore it.
PS - 0x0f opcode map, no prefix, but if the disassembler sees a prefix it
should not decode to this instruction. Should be used when there is another
instruction with the same opcode that uses a prefix.
PD - 0x0f opcode map with 0x66 prefix
XS - 0x0f opcode map with 0xf3 prefix
XD - 0x0f opcode map with 0xf2 prefix
T8 - 0x0f 0x38 opcode map with no prefix
T8PS - 0x0f 0x38 opcode map version of PS from above
T8PD - 0x0f 0x38 opcode map version of PD from above
T8XS - 0x0f 0x38 opcode map version of XS from above
T8XD - 0x0f 0x38 opcode map version of XD from above
TA - 0x0f 0x3a opcode map with no prefix
TAPS - 0x0f 0x3a opcode map version of PS from above
TAPD - 0x0f 0x3a opcode map version of PD from above
TAXS - 0x0f 0x3a opcode map version of XS from above
TAXD - 0x0f 0x3a opcode map version of XD from above




~Craig


Gus Smith via llvm-dev

2018-Mar-20 15:27 UTC

[llvm-dev] Generating a custom opcode from an LLVM intrinsic

Great info -- all of this has been incredibly useful. Do you have any links
to the documentation for this, or does it just come from your own
experience?

FYI, I achieved what I set out to achieve when I wrote this email. I'm
moving on to a more complex goal now, but the original question was
answered completely, in my opinion. This was the key line:

def CACHEOP : I<0x06, RawFrm, (outs), (ins), "cache_op",
                [(int_cache_op)]>;

I added this definition to llvm/lib/Target/X86/X86InstrInfo.td. I also had
to comment out an instruction (PUSHES) which overlapped the 0x06 opcode.
This was OK in my case (as far as I know) because PUSHES isn't implemented
in gem5.
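
A quick way to spot this kind of overlap before rebuilding is to group the
definitions by their encoding fields. The Python sketch below is only an
illustration and does not read real .td files: the tuple layout and function
name are hypothetical, the RawFrm form listed for PUSHES is an assumption, and
the key (prefix class, opcode, form) is a deliberately coarse stand-in for
what the backend actually compares.

```python
from collections import defaultdict

def find_overlaps(defs):
    """Group definitions by (prefix class, opcode, form); return groups > 1."""
    by_encoding = defaultdict(list)
    for name, prefix_class, opcode, form in defs:
        by_encoding[(prefix_class, opcode, form)].append(name)
    return {key: names for key, names in by_encoding.items() if len(names) > 1}

# Hand-written table for illustration only.
defs = [
    ("PUSHES",     None, 0x06, "RawFrm"),  # form assumed, not from the thread
    ("CACHEOP",    None, 0x06, "RawFrm"),
    ("CLFLUSHOPT", "PD", 0xAE, "MRM7m"),
    ("CLWB",       "PD", 0xAE, "MRM6m"),
]

overlaps = find_overlaps(defs)
for (prefix_class, opcode, form), names in overlaps.items():
    print(f"{prefix_class} {opcode:#04x} {form}: {', '.join(names)}")
```

CLFLUSHOPT and CLWB share opcode 0xAE and the PD class but differ in form
(MRM7m vs MRM6m), so only the PUSHES/CACHEOP pair is reported.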


Thanks again!
Gus


-- 
*Gus Smith*
Penn State University
M.S./B.S. Computer Science and Engineering '18

Microsystems Design Lab - Researcher

(570)817-9340 | hfs5022 at psu.edu

Gus Smith via llvm-dev

2018-Mar-20 15:28 UTC

head link

[llvm-dev] Generating a custom opcode from an LLVM intrinsic


Great info -- all of this has been incredibly useful. Do you have any links
to the documentation from this, or does it just come from your experiential
knowledge?

FYI, I achieved what I set out to achieve when I wrote this email. I'm
moving on to a more complex goal now, but the original question was
answered completely, in my opinion. This was the key line:

def CACHEOP : I<0x06, RawFrm, (outs), (ins), "cache_op",
[(int_cache_op)]>;

I added this definition to llvm/lib/Target/X86/X86InstrInfo.td. I also had
to comment out an instruction (PUSHES) which overlapped the 0x06 opcode.
This was OK in my case (as far as I know) because PUSHES isn't implemented
in gem5.


Thanks again!
Gus


On Sun, Mar 18, 2018 at 11:30 PM, Craig Topper <craig.topper at gmail.com>
wrote:
> ASM is the text output you want printed in a textual listing of the
> assembly. The curly braces you see in some text strings like
> "adcx{l}\t{$src, $dst|$dst, $src}" are there to provide different
operand
> orders for at&t syntax vs intel syntax. Anything after $ matches the
name
> in the outs/in part of the instruction.
>
> IIC_SSE_PREFETCH is part of the scheduler system to provide
> latency/throughput information about the instruction.
>
> PD indicates the instruction should be on the 0x0f two byte opcode map
> with a 0x66 prefix.
>
> Most common other values in place of PD
> TB - 0x0f opcode map no prefix(0x66, 0xf2, 0xf3) and use of one of those
> prefixes should be ignored by the disassembler.
> PS - 0x0f opcode map no prefix, but if the disassembler sees a prefix it
> should not decode to this instruction. Should be used when there is another
> instruction with the same opcode that uses a prefix
> PD - 0x0f opcode map with 0x66 prefix
> XS - 0x0f opcode map with 0xf3 prefix
> XD - 0x0f opcode map with 0xf2 prefix
> T8 - 0x0f 0x38 opcode map with no prefix
> T8PS - 0x0f 0x38 opcode map version of PS from above
> T8PD - 0x0f 0x38 opcode map version of PD from above
> T8XS - 0x0f 0x38 opcode version of XS from above
> T8XD - 0x0f 0x38 opcode version of XD from above
> TA - 0x0f 0x3a opcode map with no prefix
> TAPS - 0x0f 0x3a opcode map version of PS from above
> TAPD - 0x0f 0x3a opcode map version of PD from above
> TAXS - 0x0f 0x3a opcode version of XS from above
> TAXD - 0x0f 0x3a opcode version of XD from above
>
>
>
>
> ~Craig
>
> On Sun, Mar 18, 2018 at 7:39 PM, Gus Smith <gushenrysmith at
gmail.com>
> wrote:
>
>> Craig, thanks for the quick response. That helps a lot. I had no clue
>> they were buried in there, though I guess I should have looked harder
--
>> the hex should have given me a clue, perhaps!
>>
>> For the sake of my own edification (and not taking up too much of your
>> time) I will try to generate it myself. I've found the definition
of the
>> "I" class at line 358 of
llvm/lib/Target/X86/X86InstrFormats.td, which
>> helps a lot.
>>
>> Let's assume I want to produce opcode 0x16 (which I'm using
because it
>> doesn't seem to be implemented in gem5 otherwise, and would simply
produce
>> a warning). Then my guess is that I should use something like:
>> def CACHEADD : I<0x16, FORMAT, (outs), (ins),
>>                    ASM, [(int_cache_add)]>, PD;
>>
>> where FORMAT comes from http://legup.eecg.utoront
>> o.ca/doxygen/namespacellvm_1_1X86II.html
>> and ASM = ???
>> and i deleted  IIC_SSE_PREFETCH (because I'm not sure what this
flag
>> indicates, but I assume it's not needed).
>> I'm not sure what that PD is or if it should stay.
>>
>> Looking for input on this! Clearly it's not correct as-is, but I
feel
>> like I'm at least understanding parts of it. Thanks!
>>
>> For posterity, this page helped a lot, and probably should have been
read
>> first: https://llvm.org/docs/TableGen/index.html
>> In smaller part, this one helped too, but read the above page first:
>> https://llvm.org/docs/TableGen/LangRef.html
>>
>> On Sun, Mar 18, 2018 at 7:43 PM, Craig Topper <craig.topper at
gmail.com>
>> wrote:
>>
>>> Here's a couple examples for mapping an intrinsic to an X86
instruction
>>> from X86InstrInfo.td. If you look for int_x86_* in any X86Instr*.td
you can
>>> find others.
>>>
>>> let Predicates = [HasCLFLUSHOPT], SchedRW = [WriteLoad] in
>>> def CLFLUSHOPT : I<0xAE, MRM7m, (outs), (ins i8mem:$src),
>>>                    "clflushopt\t$src",
[(int_x86_clflushopt addr:$src)],
>>>                    IIC_SSE_PREFETCH>, PD;
>>>
>>> let Predicates = [HasCLWB], SchedRW = [WriteLoad] in
>>> def CLWB       : I<0xAE, MRM6m, (outs), (ins i8mem:$src),
"clwb\t$src",
>>>                    [(int_x86_clwb addr:$src)],
IIC_SSE_PREFETCH>, PD;
>>>
>>> The encoding information for the binary output is buried in these
>>> definitions too. If you tell me what opcode you've chosen I can
tell you
>>> what the right things are to get the binary output.
>>>
>>>
>>> ~Craig
>>>
>>> On Sun, Mar 18, 2018 at 3:22 PM, Gus Smith via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> Hello all. LLVM newbie here. If anything seems glaringly wrong
with my
>>>> use of LLVM, that's probably why.
>>>>
>>>> Here's what I'm trying to do. I have modified the gem5
simulator to
>>>> accept a "new" x86 instruction. I've done this by
just reserving the opcode
>>>> in gem5's ISA specification, just as all other instructions
are specified.
>>>>
>>>> I'm trying to get an LLVM backend to generate this opcode
during code
>>>> generation. My current plan is:
>>>>
>>>>    1. During an LLVM pass, I'll detect a series of
instructions which
>>>>    can be replaced with this new instruction. (The new
instruction is a "cache
>>>>    compute" instruction -- in my passes, I replace a
series of loads,
>>>>    operations, and stores with this single instruction.) This
step is complete.
>>>>    2. I replace the series of instructions with an intrinsic. I
have
>>>>    added an intrinsic using the instructions here
>>>>   
<https://llvm.org/docs/ExtendingLLVM.html#adding-a-new-intrinsic-function>.
>>>>    This step is complete.
>>>>    3. During code generation, the intrinsic should be converted
to
>>>>    this reserved opcode. This is where I'm stuck.
>>>>
>>>> I'm stuck on step 3. I have two main questions that should
unblock me:
>>>>
>>>> Question 1: where is the code that maps from intrinsics to
>>>> instructions? The link above states:
>>>>
>>>> "Add support to the .td file for the target(s) of your
choice in
>>>> lib/Target/*/*.td. This is usually a matter of adding a pattern
to the
>>>> .td file that matches the intrinsic, though it may obviously
require adding
>>>> the instructions you want to generate as well. There are lots
of examples
>>>> in the PowerPC and X86 backend to follow."
>>>>
>>>> However, looking through these examples isn't illuminating
anything for
>>>> me. Any more documentation or high-level explanation on this
subject would
>>>> be really helpful. I have read something about
"lowering" of intrinsics;
>>>> not sure if that's relevant.
>>>>
>>>> Question 2: will I be able to generate this opcode directly
from the
>>>> intrinsic, or will I have to add the opcode as an LLVM IR
instruction and
>>>> specify how it gets compiled? I can imagine two options:
>>>> option 1: I can define a "translation" from intrinsic
straight to an
>>>> x86 opcode.
>>>> option 2: I can define a "translation" (perhaps in a
.td file? I think
>>>> that's what they're used for) which translates my
intrinsic into a new
>>>> instruction, and then I can define another translation which
will map the
>>>> new instruction to my opcode during code gen. If this is the
case, I'm not
>>>> sure there's any point to having an intrinsic; I should
just add a new
>>>> instruction instead.
>>>>
>>>> Hoping someone can help! As you can tell, I'm a little
lost...the
>>>> documentation for LLVM is great, but it's a little above my
level right now
>>>> :)
>>>>
>>>> Gus Smith, PSU
>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>>
>>>
>>
>

-- 
*Gus Smith*
Penn State University
M.S./B.S. Computer Science and Engineering '18

Microsystems Design Lab - Researcher

(570)817-9340 | hfs5022 at psu.edu
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20180320/a0de7661/attachment.html>

Gus Smith via llvm-dev

2018-Mar-20 15:28 UTC

head link

[llvm-dev] Generating a custom opcode from an LLVM intrinsic

Great info -- all of this has been incredibly useful. Do you have any
links to the documentation from this, or does it just come from your
experiential knowledge?

FYI, I achieved what I set out to achieve when I wrote this email. I'm
moving on to a more complex goal now, but the original question was
answered completely, in my opinion. This was the key line:

def CACHEOP : I<0x06, RawFrm, (outs), (ins), "cache_op",
[(int_cache_op)]>;

I added this definition to llvm/lib/Target/X86/X86InstrInfo.td. I also had
to comment out an instruction (PUSHES) which overlapped the 0x06 opcode.
This was OK in my case (as far as I know) because PUSHES isn't implemented
in gem5.


Thanks again!
Gus

On Sun, Mar 18, 2018 at 11:30 PM, Craig Topper <craig.topper at gmail.com>
wrote:
> ASM is the text output you want printed in a textual listing of the
> assembly. The curly braces you see in some text strings like
> "adcx{l}\t{$src, $dst|$dst, $src}" are there to provide different
operand
> orders for at&t syntax vs intel syntax. Anything after $ matches the
name
> in the outs/in part of the instruction.
>
> IIC_SSE_PREFETCH is part of the scheduler system to provide
> latency/throughput information about the instruction.
>
> PD indicates the instruction should be on the 0x0f two byte opcode map
> with a 0x66 prefix.
>
> Most common other values in place of PD:
> TB - 0x0f opcode map, no prefix (0x66, 0xf2, 0xf3), and use of one of those
> prefixes should be ignored by the disassembler.
> PS - 0x0f opcode map, no prefix, but if the disassembler sees a prefix it
> should not decode to this instruction. Should be used when there is another
> instruction with the same opcode that uses a prefix.
> PD - 0x0f opcode map with 0x66 prefix
> XS - 0x0f opcode map with 0xf3 prefix
> XD - 0x0f opcode map with 0xf2 prefix
> T8 - 0x0f 0x38 opcode map with no prefix
> T8PS - 0x0f 0x38 opcode map version of PS from above
> T8PD - 0x0f 0x38 opcode map version of PD from above
> T8XS - 0x0f 0x38 opcode map version of XS from above
> T8XD - 0x0f 0x38 opcode map version of XD from above
> TA - 0x0f 0x3a opcode map with no prefix
> TAPS - 0x0f 0x3a opcode map version of PS from above
> TAPD - 0x0f 0x3a opcode map version of PD from above
> TAXS - 0x0f 0x3a opcode map version of XS from above
> TAXD - 0x0f 0x3a opcode map version of XD from above
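
The list above can be read as a lookup from class name to the bytes that
precede the opcode. Here is a rough Python sketch of that reading. The
PREFIX_CLASSES table and the leading_bytes helper are hypothetical names for
illustration only, and the sketch ignores everything after the opcode byte
(ModRM, immediates) as well as REX and other prefixes.

```python
# class name -> (mandatory prefix byte or None, opcode-map escape bytes)
PREFIX_CLASSES = {
    "TB": (None, [0x0F]),
    "PS": (None, [0x0F]),
    "PD": (0x66, [0x0F]),
    "XS": (0xF3, [0x0F]),
    "XD": (0xF2, [0x0F]),
    "T8": (None, [0x0F, 0x38]),
    "T8PS": (None, [0x0F, 0x38]),
    "T8PD": (0x66, [0x0F, 0x38]),
    "T8XS": (0xF3, [0x0F, 0x38]),
    "T8XD": (0xF2, [0x0F, 0x38]),
    "TA": (None, [0x0F, 0x3A]),
    "TAPS": (None, [0x0F, 0x3A]),
    "TAPD": (0x66, [0x0F, 0x3A]),
    "TAXS": (0xF3, [0x0F, 0x3A]),
    "TAXD": (0xF2, [0x0F, 0x3A]),
}


def leading_bytes(prefix_class, opcode):
    """Return prefix + escape bytes + opcode; None means the one-byte map."""
    if prefix_class is None:
        return bytes([opcode])
    prefix, escape = PREFIX_CLASSES[prefix_class]
    head = [] if prefix is None else [prefix]
    return bytes(head + escape + [opcode])


print(leading_bytes("PD", 0xAE).hex())   # CLFLUSHOPT / CLWB style definition
print(leading_bytes(None, 0x06).hex())   # a definition with no class suffix
```

In this reading, a definition with no such suffix, like a plain
I<0x06, RawFrm, ...>, starts directly with its opcode byte.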
>
>
>
>
> ~Craig
>
> On Sun, Mar 18, 2018 at 7:39 PM, Gus Smith <gushenrysmith at gmail.com>
> wrote:
>
>> Craig, thanks for the quick response. That helps a lot. I had no clue
>> they were buried in there, though I guess I should have looked harder --
>> the hex should have given me a clue, perhaps!
>>
>> For the sake of my own edification (and not taking up too much of your
>> time) I will try to generate it myself. I've found the definition of the
>> "I" class at line 358 of llvm/lib/Target/X86/X86InstrFormats.td, which
>> helps a lot.
>>
>> Let's assume I want to produce opcode 0x16 (which I'm using because it
>> doesn't seem to be implemented in gem5 otherwise, and would simply produce
>> a warning). Then my guess is that I should use something like:
>> def CACHEADD : I<0x16, FORMAT, (outs), (ins),
>>                    ASM, [(int_cache_add)]>, PD;
>>
>> where FORMAT comes from
>> http://legup.eecg.utoronto.ca/doxygen/namespacellvm_1_1X86II.html
>> and ASM = ???
>> and i deleted IIC_SSE_PREFETCH (because I'm not sure what this flag
>> indicates, but I assume it's not needed).
>> I'm not sure what that PD is or if it should stay.
>>
>> Looking for input on this! Clearly it's not correct as-is, but I feel
>> like I'm at least understanding parts of it. Thanks!
>>
>> For posterity, this page helped a lot, and probably should have been read
>> first: https://llvm.org/docs/TableGen/index.html
>> In smaller part, this one helped too, but read the above page first:
>> https://llvm.org/docs/TableGen/LangRef.html
>>
>> On Sun, Mar 18, 2018 at 7:43 PM, Craig Topper <craig.topper at gmail.com>
>> wrote:
>>
>>> Here's a couple examples for mapping an intrinsic to an X86 instruction
>>> from X86InstrInfo.td. If you look for int_x86_* in any X86Instr*.td you can
>>> find others.
>>>
>>> let Predicates = [HasCLFLUSHOPT], SchedRW = [WriteLoad] in
>>> def CLFLUSHOPT : I<0xAE, MRM7m, (outs), (ins i8mem:$src),
>>>                    "clflushopt\t$src", [(int_x86_clflushopt addr:$src)],
>>>                    IIC_SSE_PREFETCH>, PD;
>>>
>>> let Predicates = [HasCLWB], SchedRW = [WriteLoad] in
>>> def CLWB       : I<0xAE, MRM6m, (outs), (ins i8mem:$src), "clwb\t$src",
>>>                    [(int_x86_clwb addr:$src)], IIC_SSE_PREFETCH>, PD;
>>>
>>> The encoding information for the binary output is buried in these
>>> definitions too. If you tell me what opcode you've chosen I can tell you
>>> what the right things are to get the binary output.
>>>
>>>
>>> ~Craig
>>>
>>> On Sun, Mar 18, 2018 at 3:22 PM, Gus Smith via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> Hello all. LLVM newbie here. If anything seems glaringly wrong with my
>>>> use of LLVM, that's probably why.
>>>>
>>>> Here's what I'm trying to do. I have modified the gem5 simulator to
>>>> accept a "new" x86 instruction. I've done this by just reserving the
>>>> opcode in gem5's ISA specification, just as all other instructions are
>>>> specified.
>>>>
>>>> I'm trying to get an LLVM backend to generate this opcode during code
>>>> generation. My current plan is:
>>>>
>>>>    1. During an LLVM pass, I'll detect a series of instructions which
>>>>    can be replaced with this new instruction. (The new instruction is a
>>>>    "cache compute" instruction -- in my passes, I replace a series of
>>>>    loads, operations, and stores with this single instruction.) This
>>>>    step is complete.
>>>>    2. I replace the series of instructions with an intrinsic. I have
>>>>    added an intrinsic using the instructions here
>>>>    <https://llvm.org/docs/ExtendingLLVM.html#adding-a-new-intrinsic-function>.
>>>>    This step is complete.
>>>>    3. During code generation, the intrinsic should be converted to
>>>>    this reserved opcode. This is where I'm stuck.
>>>>
>>>> I'm stuck on step 3. I have two main questions that should unblock me:
>>>>
>>>> Question 1: where is the code that maps from intrinsics to
>>>> instructions? The link above states:
>>>>
>>>> "Add support to the .td file for the target(s) of your choice in
>>>> lib/Target/*/*.td. This is usually a matter of adding a pattern to the
>>>> .td file that matches the intrinsic, though it may obviously require
>>>> adding the instructions you want to generate as well. There are lots of
>>>> examples in the PowerPC and X86 backend to follow."
>>>>
>>>> However, looking through these examples isn't illuminating anything for
>>>> me. Any more documentation or high-level explanation on this subject
>>>> would be really helpful. I have read something about "lowering" of
>>>> intrinsics; not sure if that's relevant.
>>>>
>>>> Question 2: will I be able to generate this opcode directly from the
>>>> intrinsic, or will I have to add the opcode as an LLVM IR instruction and
>>>> specify how it gets compiled? I can imagine two options:
>>>> option 1: I can define a "translation" from intrinsic straight to an
>>>> x86 opcode.
>>>> option 2: I can define a "translation" (perhaps in a .td file? I think
>>>> that's what they're used for) which translates my intrinsic into a new
>>>> instruction, and then I can define another translation which will map the
>>>> new instruction to my opcode during code gen. If this is the case, I'm
>>>> not sure there's any point to having an intrinsic; I should just add a
>>>> new instruction instead.
>>>>
>>>> Hoping someone can help! As you can tell, I'm a little lost...the
>>>> documentation for LLVM is great, but it's a little above my level right
>>>> now :)
>>>>
>>>> Gus Smith, PSU
>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>>
>>>
>>