search for: ymm14

Displaying 20 results from an estimated 20 matches for "ymm14".

Did you mean: ymm1
2012 Jan 09
3
[LLVMdev] Calling conventions for YMM registers on AVX
On Jan 9, 2012, at 10:00 AM, Jakob Stoklund Olesen wrote: > > On Jan 8, 2012, at 11:18 PM, Demikhovsky, Elena wrote: > >> I'll explain what we see in the code. >> 1. The caller saves XMM registers across the call if needed (according to DEFS definition). >> YMMs are not in the set, so caller does not take care. > > This is not how the register allocator
2017 Jul 01
2
KNL Assembly Code for Matrix Multiplication
...gt;>>>> 16. here it should be zmm14**=[3200,3600,40000,.......54000]. but by >>>>> the above computation these indexes are changes??* >>>>> * kxnorw k2, k0, k0 **; **dont understand the need for this step* >>>>> >>>> * vpgatherqd ymm14 {k2}, zmmword ptr [zmm15] **; **here again issues >>>>> with index zmm15. it should be **[0,4000,8000,.......28000] but its >>>>> different due to above computation.* >>>>> * vinserti64x4 zmm0, zmm14, ymm0, 1* >>>>> * kmovw k2, k1* >&...
2012 Mar 01
3
[LLVMdev] Stack alignment on X86 AVX seems incorrect
...; <llvmdev at cs.uiuc.edu> Message-ID: <A0DC88CEB3010344830D52D66533DA8E0C2E7A at HASMSX103.ger.corp.intel.com> Content-Type: text/plain; charset="windows-1252" ./llc -mattr=+avx -stack-alignment=16 < basic.ll | grep movaps | grep ymm | grep rbp vmovaps -176(%rbp), %ymm14 vmovaps -144(%rbp), %ymm11 vmovaps -240(%rbp), %ymm13 vmovaps -208(%rbp), %ymm9 vmovaps -272(%rbp), %ymm7 vmovaps -304(%rbp), %ymm0 vmovaps -112(%rbp), %ymm0 vmovaps -80(%rbp), %ymm1 vmovaps -112(%rbp), %ymm0 vmovaps -80(%rbp), %ymm0...
2012 Jan 10
0
[LLVMdev] Calling conventions for YMM registers on AVX
...lt;"ymm11", [XMM11, XMM11b]>, DwarfRegNum<[28, -2, -2]>; def YMM12: RegisterWithSubRegs<"ymm12", [XMM12, XMM12b]>, DwarfRegNum<[29, -2, -2]>; def YMM13: RegisterWithSubRegs<"ymm13", [XMM13, XMM13b]>, DwarfRegNum<[30, -2, -2]>; def YMM14: RegisterWithSubRegs<"ymm14", [XMM14, XMM14b]>, DwarfRegNum<[31, -2, -2]>; def YMM15: RegisterWithSubRegs<"ymm15", [XMM15, XMM15b]>, DwarfRegNum<[32, -2, -2]>; } - Elena -----Original Message----- From: Jakob Stoklund Olesen [mailto:jolesen at apple...
2012 Mar 01
0
[LLVMdev] Stack alignment on X86 AVX seems incorrect
./llc -mattr=+avx -stack-alignment=16 < basic.ll | grep movaps | grep ymm | grep rbp vmovaps -176(%rbp), %ymm14 vmovaps -144(%rbp), %ymm11 vmovaps -240(%rbp), %ymm13 vmovaps -208(%rbp), %ymm9 vmovaps -272(%rbp), %ymm7 vmovaps -304(%rbp), %ymm0 vmovaps -112(%rbp), %ymm0 vmovaps -80(%rbp), %ymm1 vmovaps -112(%rbp), %ymm0 vmovaps -80(%rbp),...
2012 Mar 01
2
[LLVMdev] Stack alignment in kernel
I'm running in AVX mode, but the stack before call to kernel is aligned to 16 bit. Could you, please, tell me where it should be specified? Thank you. - Elena --------------------------------------------------------------------- Intel Israel (74) Limited This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or
2012 Mar 01
0
[LLVMdev] Stack alignment on X86 AVX seems incorrect
...:A0DC88CEB3010344830D52D66533DA8E0C2E7A at HASMSX103.ger.corp.intel.com>> >> Content-Type: text/plain; charset="windows-1252" >> >> ./llc -mattr=+avx -stack-alignment=16 < basic.ll | grep movaps | grep >> ymm | grep rbp >> vmovaps -176(%rbp), %ymm14 >> vmovaps -144(%rbp), %ymm11 >> vmovaps -240(%rbp), %ymm13 >> vmovaps -208(%rbp), %ymm9 >> vmovaps -272(%rbp), %ymm7 >> vmovaps -304(%rbp), %ymm0 >> vmovaps -112(%rbp), %ymm0 >> vmovaps -80(%rbp), %ymm1 >...
2012 Mar 01
0
[LLVMdev] Stack alignment on X86 AVX seems incorrect
On Thu, Mar 1, 2012 at 5:30 PM, Cameron McInally <cameron.mcinally at nyu.edu>wrote: > Aligning the stack to 32 bytes when there are auto AVX vector variables >> present shouldn't necessarily break the x86-64 ABI, as long as smaller auto >> variables remain properly aligned. A similar approach was taken for i386 >> in GCC in order to support SSE vectors. >>
2012 Mar 01
3
[LLVMdev] Stack alignment on X86 AVX seems incorrect
Even if you explicitly specify –stack-alignment=16 the aligned movs are still generated. It is not an issue related to ABI. See my original mail: ./llc -mattr=+avx -stack-alignment=16 < basic.ll | grep movaps | grep ymm | grep rbp vmovaps -176(%rbp), %ymm14 vmovaps -144(%rbp), %ymm11 vmovaps -240(%rbp), %ymm13 - Elena From: Cameron McInally [mailto:cameron.mcinally at nyu.edu] Sent: Friday, March 02, 2012 01:04 To: Evandro Menezes Cc: Demikhovsky, Elena; llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Stack alignment on X86 AVX seems...
2017 Oct 11
1
[PATCH v1 01/27] x86/crypto: Adapt assembly for PIE support
...ey, x0; \ - vpshufb .Lpack_bswap, x0, x0; \ + vpshufb .Lpack_bswap(%rip), x0, x0; \ \ vpxor x0, y7, y7; \ vpxor x0, y6, y6; \ @@ -1112,7 +1112,7 @@ ENTRY(camellia_ctr_32way) vmovdqu (%rcx), %xmm0; vmovdqa %xmm0, %xmm1; inc_le128(%xmm0, %xmm15, %xmm14); - vbroadcasti128 .Lbswap128_mask, %ymm14; + vbroadcasti128 .Lbswap128_mask(%rip), %ymm14; vinserti128 $1, %xmm0, %ymm1, %ymm0; vpshufb %ymm14, %ymm0, %ymm13; vmovdqu %ymm13, 15 * 32(%rax); @@ -1158,7 +1158,7 @@ ENTRY(camellia_ctr_32way) /* inpack32_pre: */ vpbroadcastq (key_table)(CTX), %ymm15; - vpshufb .Lpack_bswap, %ymm15,...
2012 Mar 01
4
[LLVMdev] Stack alignment on X86 AVX seems incorrect
On Thu, Mar 1, 2012 at 4:29 PM, Evandro Menezes <emenezes at codeaurora.org>wrote: ... > Aligning the stack to 32 bytes when there are auto AVX vector variables > present shouldn't necessarily break the x86-64 ABI, as long as smaller auto > variables remain properly aligned. A similar approach was taken for i386 > in GCC in order to support SSE vectors. > > Perhaps
2014 Feb 21
2
[LLVMdev] [lldb-dev] How is variable info retrieved in debugging for executables generated by llvm backend?
...ize:256;offset:659;encoding:vector;format:vector-uint8;set:Floating > Point Registers;gcc:29;dwarf:29;#00 > $qRegisterInfo68#b0 > $name:ymm13;bitsize:256;offset:691;encoding:vector;format:vector-uint8;set:Floating > Point Registers;gcc:30;dwarf:30;#00 > $qRegisterInfo69#b1 > $name:ymm14;bitsize:256;offset:723;encoding:vector;format:vector-uint8;set:Floating > Point Registers;gcc:31;dwarf:31;#00 > $qRegisterInfo6a#d9 > $name:ymm15;bitsize:256;offset:755;encoding:vector;format:vector-uint8;set:Floating > Point Registers;gcc:32;dwarf:32;#00 > $qRegisterInfo6b#da > $...
2014 Feb 20
2
[LLVMdev] [lldb-dev] How is variable info retrieved in debugging for executables generated by llvm backend?
Thank you, Clayton. This is very helpful. We use the LLDB specific GDB remote extensions, and our debugger server supports "qRegisterInfo" package. "reg 0x3c" is the frame pointer. In the example mentioned above, we have SP = FP - 40 for current call frame. And variable "a" is stored at address (FP + -24) from asm instruction [FP + -24] = R3;; Thus we can conclude
2018 Mar 13
32
[PATCH v2 00/27] x86: PIE support and option to extend KASLR randomization
Changes: - patch v2: - Adapt patch to work post KPTI and compiler changes - Redo all performance testing with latest configs and compilers - Simplify mov macro on PIE (MOVABS now) - Reduce GOT footprint - patch v1: - Simplify ftrace implementation. - Use gcc mstack-protector-guard-reg=%gs with PIE when possible. - rfc v3: - Use --emit-relocs instead of -pie to reduce
2018 Mar 13
32
[PATCH v2 00/27] x86: PIE support and option to extend KASLR randomization
Changes: - patch v2: - Adapt patch to work post KPTI and compiler changes - Redo all performance testing with latest configs and compilers - Simplify mov macro on PIE (MOVABS now) - Reduce GOT footprint - patch v1: - Simplify ftrace implementation. - Use gcc mstack-protector-guard-reg=%gs with PIE when possible. - rfc v3: - Use --emit-relocs instead of -pie to reduce
2017 Oct 04
28
x86: PIE support and option to extend KASLR randomization
These patches make the changes necessary to build the kernel as Position Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below the top 2G of the virtual address space. It allows to optionally extend the KASLR randomization range from 1G to 3G. Thanks a lot to Ard Biesheuvel & Kees Cook on their feedback on compiler changes, PIE support and KASLR in general. Thanks to
2017 Oct 04
28
x86: PIE support and option to extend KASLR randomization
These patches make the changes necessary to build the kernel as Position Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below the top 2G of the virtual address space. It allows to optionally extend the KASLR randomization range from 1G to 3G. Thanks a lot to Ard Biesheuvel & Kees Cook on their feedback on compiler changes, PIE support and KASLR in general. Thanks to
2018 May 23
33
[PATCH v3 00/27] x86: PIE support and option to extend KASLR randomization
Changes: - patch v3: - Update on message to describe longer term PIE goal. - Minor change on ftrace if condition. - Changed code using xchgq. - patch v2: - Adapt patch to work post KPTI and compiler changes - Redo all performance testing with latest configs and compilers - Simplify mov macro on PIE (MOVABS now) - Reduce GOT footprint - patch v1: - Simplify ftrace
2017 Oct 11
32
[PATCH v1 00/27] x86: PIE support and option to extend KASLR randomization
Changes: - patch v1: - Simplify ftrace implementation. - Use gcc mstack-protector-guard-reg=%gs with PIE when possible. - rfc v3: - Use --emit-relocs instead of -pie to reduce dynamic relocation space on mapped memory. It also simplifies the relocation process. - Move the start the module section next to the kernel. Remove the need for -mcmodel=large on modules. Extends
2017 Oct 11
32
[PATCH v1 00/27] x86: PIE support and option to extend KASLR randomization
Changes: - patch v1: - Simplify ftrace implementation. - Use gcc mstack-protector-guard-reg=%gs with PIE when possible. - rfc v3: - Use --emit-relocs instead of -pie to reduce dynamic relocation space on mapped memory. It also simplifies the relocation process. - Move the start the module section next to the kernel. Remove the need for -mcmodel=large on modules. Extends