Displaying 7 results from an estimated 7 matches for "ymm4".
2013 Aug 28
3
[PATCH] x86: AVX instruction emulation fixes
...+ rc = x86_emulate(&ctxt, &emulops);
+ if ( (rc != X86EMUL_OKAY) || memcmp(res, res + 16, 64) )
+ goto fail;
+ printf("okay\n");
+ }
+ else
+ printf("skipped\n");
+
+ printf("%-40s", "Testing vmovdqu (%edx),%ymm4...");
+ if ( stack_exec && cpu_has_avx )
+ {
+ extern const unsigned char vmovdqu_from_mem[];
+
+#if 0 /* Don't use AVX2 instructions for now */
+ asm volatile ( "vpcmpgtb %%ymm4, %%ymm4, %%ymm4\n"
+#else
+ asm volatile ( "vpcmpgtb %%x...
2016 May 06
3
Unnecessary spill/fill issue
...the constant vectors immediately to stack,
then each use references the stack pointer directly:
Lots of these at top of function:
movabsq $.LCPI0_212, %rbx
vmovaps (%rbx), %ymm0
vmovaps %ymm0, 2816(%rsp) # 32-byte Spill
Later on, each use references the stack pointer:
vpaddd 2816(%rsp), %ymm4, %ymm1 # 32-byte Folded Reload
It seems the spill to stack is unnecessary. In one particularly bad kernel,
I have 128 8-wide constant vectors, and so there is 4KB of stack use just
for these constants. I think a better approach could be to load the
constant vector pointers as needed:
movabsq $.LC...
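As a quick sanity check of the 4KB figure quoted above, the stack cost of spilling every constant can be computed directly. This is only a sketch: it assumes "8-wide" means eight 32-bit lanes (one 256-bit ymm register, matching the "32-byte Spill" comments), and the names `spill_bytes`, `LANES` and `LANE_BYTES` are made up for illustration.

```python
# Assumption: an "8-wide" vector is 8 lanes of 4 bytes = 32 bytes,
# which matches the "32-byte Spill" annotation in the listing.
LANES = 8
LANE_BYTES = 4
NUM_CONSTANTS = 128  # count stated in the post


def spill_bytes(num_vectors, lanes=LANES, lane_bytes=LANE_BYTES):
    """Stack bytes used if every constant vector gets its own spill slot."""
    return num_vectors * lanes * lane_bytes


total = spill_bytes(NUM_CONSTANTS)
print(total, "bytes =", total // 1024, "KB")
```

Under these assumptions the result is 4096 bytes, consistent with the 4KB of stack use reported in the post.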
2020 Sep 01
2
Vector evolution?
...1eb: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
1f0: c5 fc 59 0c 87 vmulps (%rdi,%rax,4),%ymm0,%ymm1
1f5: c5 fc 59 54 87 20 vmulps 0x20(%rdi,%rax,4),%ymm0,%ymm2
1fb: c5 fc 59 5c 87 40 vmulps 0x40(%rdi,%rax,4),%ymm0,%ymm3
201: c5 fc 59 64 87 60 vmulps 0x60(%rdi,%rax,4),%ymm0,%ymm4
207: c5 fc 11 0c 87 vmovups %ymm1,(%rdi,%rax,4)
20c: c5 fc 11 54 87 20 vmovups %ymm2,0x20(%rdi,%rax,4)
212: c5 fc 11 5c 87 40 vmovups %ymm3,0x40(%rdi,%rax,4)
218: c5 fc 11 64 87 60 vmovups %ymm4,0x60(%rdi,%rax,4)
21e: c5 fc 59 8c 87 80 00 vmulps 0x80(%rdi,%rax,4),%ymm0,%ymm1
2...
2012 Jan 09
3
[LLVMdev] Calling conventions for YMM registers on AVX
On Jan 9, 2012, at 10:00 AM, Jakob Stoklund Olesen wrote:
>
> On Jan 8, 2012, at 11:18 PM, Demikhovsky, Elena wrote:
>
>> I'll explain what we see in the code.
>> 1. The caller saves XMM registers across the call if needed (according to DEFS definition).
>> YMMs are not in the set, so caller does not take care.
>
> This is not how the register allocator
2012 Jan 10
0
[LLVMdev] Calling conventions for YMM registers on AVX
...ithSubRegs<"ymm1", [XMM1, XMM1b]>, DwarfRegNum<[18, 22, 22]>;
def YMM2: RegisterWithSubRegs<"ymm2", [XMM2, XMM2b]>, DwarfRegNum<[19, 23, 23]>;
def YMM3: RegisterWithSubRegs<"ymm3", [XMM3, XMM3b]>, DwarfRegNum<[20, 24, 24]>;
def YMM4: RegisterWithSubRegs<"ymm4", [XMM4, XMM4b]>, DwarfRegNum<[21, 25, 25]>;
def YMM5: RegisterWithSubRegs<"ymm5", [XMM5, XMM5b]>, DwarfRegNum<[22, 26, 26]>;
def YMM6: RegisterWithSubRegs<"ymm6", [XMM6, XMM6b]>, DwarfRegNum<[23, 27, 27]&...
2014 Feb 21
2
[LLVMdev] [lldb-dev] How is variable info retrieved in debugging for executables generated by llvm backend?
...size:256;offset:339;encoding:vector;format:vector-uint8;set:Floating
> Point Registers;gcc:19;dwarf:19;#00
> $qRegisterInfo5e#dc
> $name:ymm3;bitsize:256;offset:371;encoding:vector;format:vector-uint8;set:Floating
> Point Registers;gcc:20;dwarf:20;#00
> $qRegisterInfo5f#dd
> $name:ymm4;bitsize:256;offset:403;encoding:vector;format:vector-uint8;set:Floating
> Point Registers;gcc:21;dwarf:21;#00
> $qRegisterInfo60#a8
> $name:ymm5;bitsize:256;offset:435;encoding:vector;format:vector-uint8;set:Floating
> Point Registers;gcc:22;dwarf:22;#00
> $qRegisterInfo61#a9
> $n...
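The qRegisterInfo replies quoted above are semicolon-separated key:value pairs, so they are easy to pick apart by hand. The following is a minimal sketch, not LLDB's own parser: `parse_register_info` is a hypothetical helper, the reply string is the ymm4 line from the log with the leading `$`, the trailing `#00` checksum and the mail quoting removed, and the numeric fields are assumed to be decimal (the 32-byte steps between the quoted offsets suggest this).

```python
def parse_register_info(payload):
    """Split a qRegisterInfo reply payload into a dict of key -> value."""
    fields = {}
    for item in payload.split(";"):
        if not item:
            continue  # skip the empty piece after the final ';'
        key, _, value = item.partition(":")
        fields[key] = value
    return fields


# ymm4 reply from the log above, with packet framing stripped.
reply = ("name:ymm4;bitsize:256;offset:403;encoding:vector;"
         "format:vector-uint8;set:Floating Point Registers;gcc:21;dwarf:21;")

info = parse_register_info(reply)
size_bytes = int(info["bitsize"]) // 8        # 256 bits -> 32 bytes
next_offset = int(info["offset"]) + size_bytes

print(info["name"], "dwarf", info["dwarf"], "next register at", next_offset)
```

With these inputs the computed next offset is 435, which agrees with the offset reported for ymm5 in the same log.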
2014 Feb 20
2
[LLVMdev] [lldb-dev] How is variable info retrieved in debugging for executables generated by llvm backend?
Thank you, Clayton. This is very helpful.
We use the LLDB specific GDB remote extensions, and our debugger server
supports the "qRegisterInfo" packet. "reg 0x3c" is the frame pointer.
In the example mentioned above, we have SP = FP - 40 for current call frame.
And variable "a" is stored at address (FP + -24) from asm instruction [FP +
-24] = R3;;
Thus we can conclude