search for: gr64_nosp

Displaying 8 results from an estimated 8 matches for "gr64_nosp".

2020 Aug 24
2
Intel AMX programming model discussion.
...n is generated. The name of pseudo instructions have ‘P’ prefix. Now all the AMX pseudo instruction take vtile as register class. Let’s assume %13 is constant 3, %10 is constant 4 and %14 is variable. %1:vtile = PTILELOADDV %13:gr16, %10:gr16, %17:gr64, 1, %18:gr64_nosp, 0, $noreg %2:vtile = PTILELOADDV %10:gr16, %14:gr16, %17:gr64, 1, %18:gr64_nosp, 0, $noreg %3:vtile = PTILELOADDV %13:gr16, %14:gr16, %17:gr64, 1, %18:gr64_nosp, 0, $noreg %21:vtile = PTDPBSSDV %13:gr16, %10:gr16, %14:gr16, %3:vtile(t...
2019 Oct 25
3
register spilling and printing live variables
Hello, I have studied register allocation in theoretical aspects and exploring the same in the implementation level. I need a minimal testcase for register spilling to analyze spilling procedure in llvm. I tried with a testcase taking 20 variables but all the 20 variables are getting stored in the stack using %rbp. Maybe my live variable analysis is wrong. Please help me with a minimal testcase
2020 Sep 04
2
Intel AMX programming model discussion.
...nerated. The name of pseudo instructions have ‘P’ prefix. Now all the AMX pseudo instruction take vtile as register class. Let’s assume %13 is constant 3, %10 is constant 4 and %14 is variable. %1:vtile = PTILELOADDV %13:gr16, %10:gr16, %17:gr64, 1, %18:gr64_nosp, 0, $noreg %2:vtile = PTILELOADDV %10:gr16, %14:gr16, %17:gr64, 1, %18:gr64_nosp, 0, $noreg %3:vtile = PTILELOADDV %13:gr16, %14:gr16, %17:gr64, 1, %18:gr64_nosp, 0, $noreg %21:vtile = PTDPBSSDV %13:gr16, %10:gr16, %14:gr16...
2020 Sep 04
2
Intel AMX programming model discussion.
...the pseudo AMX instruction is generated. The name of pseudo instructions have 'P' prefix. Now all the AMX pseudo instruction take vtile as register class. Let's assume %13 is constant 3, %10 is constant 4 and %14 is variable. %1:vtile = PTILELOADDV %13:gr16, %10:gr16, %17:gr64, 1, %18:gr64_nosp, 0, $noreg %2:vtile = PTILELOADDV %10:gr16, %14:gr16, %17:gr64, 1, %18:gr64_nosp, 0, $noreg %3:vtile = PTILELOADDV %13:gr16, %14:gr16, %17:gr64, 1, %18:gr64_nosp, 0, $noreg %21:vtile = PTDPBSSDV %13:gr16, %10:gr16, %14:gr16, %3:vtile(tied-def 0), %1:vtile, %2:vtile 2. The configuration-p...
2020 Aug 21
2
Intel AMX programming model discussion.
...the pseudo AMX instruction is generated. The name of pseudo instructions have 'P' prefix. Now all the AMX pseudo instruction take vtile as register class. Let's assume %13 is constant 3, %10 is constant 4 and %14 is variable. %1:vtile = PTILELOADDV %13:gr16, %10:gr16, %17:gr64, 1, %18:gr64_nosp, 0, $noreg %2:vtile = PTILELOADDV %10:gr16, %14:gr16, %17:gr64, 1, %18:gr64_nosp, 0, $noreg %3:vtile = PTILELOADDV %13:gr16, %14:gr16, %17:gr64, 1, %18:gr64_nosp, 0, $noreg %21:vtile = PTDPBSSDV %13:gr16, %10:gr16, %14:gr16, %3:vtile(tied-def 0), %1:vtile, %2:vtile 2. The configuration-p...
2018 Feb 28
0
Missed optimization - spill/load generated instead of reg-to-reg move (and two other questions)
On 02/27/2018 10:21 AM, Alex Wang via llvm-dev wrote: > Hello all! I was looking through the results of disassembling a heavily-used short function in the program I'm working on, and ended up wondering why LLVM was generating that assembly and what changes would be necessary to improve the code. I asked on #llvm, but it seems that the people with
2018 Feb 27
2
Missed optimization - spill/load generated instead of reg-to-reg move (and two other questions)
Hello all! I was looking through the results of disassembling a heavily-used short function in the program I'm working on, and ended up wondering why LLVM was generating that assembly and what changes would be necessary to improve the code. I asked on #llvm, but it seems that the people with the necessary expertise weren't around. Here is a condensed version of the code:
2020 Aug 19
3
Intel AMX programming model discussion.
The width and height can be runtime values that we would just copy into 64 byte configuration block we pass to ldtilecfg. So the code doesn't need to be multiversioned. The user code would also use those values to update pointers in the loops they write using the tiles. If we can't determine that two tiles were defined with the same width and height we need to assume the shape is different