Displaying 12 results from an estimated 12 matches for "vpbroadcastq".
2017 Jul 01
2
KNL Assembly Code for Matrix Multiplication
...o the value
>> actually present at these locations so zmm22 will contain values not
>> indexes. suppose [8]={1}, [9]={5}, [10]={4}...... so zmm22 will become
>> zmm22={1, 5, 4, 3, 8, 7, 6, 2}......these are those 64 bit values loaded
>> from memory indexes.
>>
>> vpbroadcastq zmm2, qword ptr [rip + .LCPI0_2]; here .LCPI0_2=4000 means
>> broadcast the value stored at this address; e.g. if this location
>> contains 2, then zmm2={2,2,2,2.....2}.
>>
>> vpmuludq zmm14, zmm10, zmm2 ; this step is value multiplication not
>> index, there seems no point in multiplyi...
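The lane semantics described in this excerpt can be sketched in plain Python. This is a minimal illustration, not code from the thread: the function names simply mirror the instruction mnemonics, the lane values given to zmm10 are hypothetical (the excerpt does not show them), and list index 0 is assumed to be the lowest lane.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1
LANES = 8  # a 512-bit zmm register holds eight 64-bit lanes


def vpbroadcastq(value):
    """Copy one 64-bit value into every lane of the destination."""
    return [value & MASK64] * LANES


def vpmuludq(a, b):
    """Per lane: unsigned multiply of the low 32 bits of each source,
    producing a full 64-bit product."""
    return [((x & MASK32) * (y & MASK32)) & MASK64 for x, y in zip(a, b)]


# Hypothetical lane contents for zmm10 (assumption for illustration only).
zmm10 = [1, 5, 4, 3, 8, 7, 6, 2]

# The excerpt's example: the memory operand holds 2, so every lane becomes 2.
zmm2 = vpbroadcastq(2)

# Value multiplication, lane by lane; no addresses or indexes are involved.
zmm14 = vpmuludq(zmm10, zmm2)
```

Because vpmuludq reads only the low 32 bits of each 64-bit lane, any upper bits in the sources are ignored, which matters when the lanes hold values wider than 32 bits.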
2017 Oct 11
1
[PATCH v1 01/27] x86/crypto: Adapt assembly for PIE support
...28 .Lshufb_16x16b(%rip), a0; \
vmovdqu st1, a1; \
vpshufb a0, a2, a2; \
vpshufb a0, a3, a3; \
@@ -521,7 +521,7 @@ ENDPROC(roundsm32_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
#define inpack32_pre(x0, x1, x2, x3, x4, x5, x6, x7, y0, y1, y2, y3, y4, y5, \
y6, y7, rio, key) \
vpbroadcastq key, x0; \
- vpshufb .Lpack_bswap, x0, x0; \
+ vpshufb .Lpack_bswap(%rip), x0, x0; \
\
vpxor 0 * 32(rio), x0, y7; \
vpxor 1 * 32(rio), x0, y6; \
@@ -572,7 +572,7 @@ ENDPROC(roundsm32_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
vmovdqu x0, stack_tmp0; \
\
vpbroadcastq key, x0; \...
2017 Aug 06
2
VBROADCAST Implementation Issues
...05 at gmail.com>
>>>> wrote:
>>>>
>>>>> Sorry to disturb,
>>>>> Now I want to implement an instruction to broadcast scalar register
>>>>> content to a vector.
>>>>>
>>>>> like this;
>>>>> vpbroadcastq zmm0, rsi
>>>>>
>>>>>
>>>>> I tried implementing it as follows;
>>>>>
>>>>> def BROADCASTR_256B : I<0x21, MRMSrcReg, (outs VR_2048:$dst), (ins
>>>>> GR64:$src),
>>>>> "...
2017 Aug 07
2
VBROADCAST Implementation Issues
...>>>
>>>>>>> Sorry to disturb,
>>>>>>> Now I want to implement an instruction to broadcast scalar register
>>>>>>> content to a vector.
>>>>>>>
>>>>>>> like this;
>>>>>>> vpbroadcastq zmm0, rsi
>>>>>>>
>>>>>>>
>>>>>>> I tried implementing it as follows;
>>>>>>>
>>>>>>> def BROADCASTR_256B : I<0x21, MRMSrcReg, (outs VR_2048:$dst), (ins
>>>>>>> GR64:$src...
2017 Aug 07
3
VBROADCAST Implementation Issues
...>>>>>> Now I want to implement an instruction to broadcast scalar register
>>>>>>>>>>> content to a vector.
>>>>>>>>>>>
>>>>>>>>>>> like this;
>>>>>>>>>>> vpbroadcastq zmm0, rsi
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I tried implementing it as follows;
>>>>>>>>>>>
>>>>>>>>>>> def BROADCASTR_256B : I<0...
2018 Mar 13
32
[PATCH v2 00/27] x86: PIE support and option to extend KASLR randomization
Changes:
- patch v2:
- Adapt patch to work post KPTI and compiler changes
- Redo all performance testing with latest configs and compilers
- Simplify mov macro on PIE (MOVABS now)
- Reduce GOT footprint
- patch v1:
- Simplify ftrace implementation.
- Use gcc mstack-protector-guard-reg=%gs with PIE when possible.
- rfc v3:
- Use --emit-relocs instead of -pie to reduce
2017 Oct 04
28
x86: PIE support and option to extend KASLR randomization
These patches make the changes necessary to build the kernel as Position
Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below
the top 2G of the virtual address space. This makes it possible to optionally
extend the KASLR randomization range from 1G to 3G.
Thanks a lot to Ard Biesheuvel & Kees Cook on their feedback on compiler
changes, PIE support and KASLR in general. Thanks to
2018 May 23
33
[PATCH v3 00/27] x86: PIE support and option to extend KASLR randomization
Changes:
- patch v3:
- Update on message to describe longer term PIE goal.
- Minor change on ftrace if condition.
- Changed code using xchgq.
- patch v2:
- Adapt patch to work post KPTI and compiler changes
- Redo all performance testing with latest configs and compilers
- Simplify mov macro on PIE (MOVABS now)
- Reduce GOT footprint
- patch v1:
- Simplify ftrace
2017 Oct 11
32
[PATCH v1 00/27] x86: PIE support and option to extend KASLR randomization
Changes:
- patch v1:
- Simplify ftrace implementation.
- Use gcc mstack-protector-guard-reg=%gs with PIE when possible.
- rfc v3:
- Use --emit-relocs instead of -pie to reduce dynamic relocation space on
mapped memory. It also simplifies the relocation process.
- Move the start of the module section next to the kernel. Remove the need for
-mcmodel=large on modules. Extends