Displaying 4 results from an estimated 4 matches for "401344".
2017 Jun 24
4
AVX Scheduling and Parallelism
...dqu32 zmm0, zmmword ptr [rip + c+64]
vpaddd zmm0, zmm0, zmmword ptr [rip + b+64]
and
eg. 2
mov rax, -393216
.p2align 4, 0x90
.LBB0_1: # %vector.body
# =>This Inner Loop Header: Depth=1
vmovdqu32 zmm1, zmmword ptr [rax + c+401344] ; load c[401344] in zmm1
vmovdqu32 zmm0, zmmword ptr [rax + c+401280] ;load b[401280] in zmm0
vpaddd zmm1, zmm1, zmmword ptr [rax + b+401344] ; zmm1<-zmm1+b[401344]
vmovdqu32 zmmword ptr [rax + a+401344], zmm1 ; store zmm1 in c[401344]
vmovdqu32 zm...
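A note on reading the addresses in this listing: the constants 401344 and 401280 are byte displacements added to a negative rax, not element indices. The sketch below works out what the first iteration touches. It assumes 4-byte elements (inferred from vpaddd, which adds packed 32-bit integers); the helper names are made up for illustration and do not come from the thread.

```python
# Values copied from the listing.
RAX_START = -393216   # mov rax, -393216
DISP_HI = 401344      # displacement in [rax + c+401344]
DISP_LO = 401280      # displacement in [rax + c+401280]
ZMM_BYTES = 64        # a zmm register is 512 bits wide
ELEM_BYTES = 4        # assumption: 32-bit elements, inferred from vpaddd


def first_iteration_offset(disp, rax=RAX_START):
    """Byte offset from the array symbol accessed on the first iteration."""
    return rax + disp


def element_index(byte_offset, elem_bytes=ELEM_BYTES):
    """Element index that a byte offset corresponds to."""
    return byte_offset // elem_bytes


for disp in (DISP_LO, DISP_HI):
    off = first_iteration_offset(disp)
    print(f"disp {disp}: byte offset {off}, element {element_index(off)}")

# The two displacements are exactly one zmm register apart.
print("gap between the two loads:", DISP_HI - DISP_LO, "bytes")
```

Under these assumptions the first iteration reads byte offsets 8064 and 8128 (elements 2016 and 2032), and the 64-byte gap shows that the two loads cover adjacent vectors, which is consistent with the loop body being unrolled by two.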
2017 Jun 25
2
AVX Scheduling and Parallelism
...zmm0, zmm0, zmmword ptr [rip + b+64]
and
eg. 2
mov rax, -393216
.p2align 4, 0x90
.LBB0_1: # %vector.body
# =>This Inner Loop Header: Depth=1
vmovdqu32 zmm1, zmmword ptr [rax + c+401344] ; load c[401344] in zmm1
vmovdqu32 zmm0, zmmword ptr [rax + c+401280] ;load b[401280] in zmm0
vpaddd zmm1, zmm1, zmmword ptr [rax + b+401344] ; zmm1<-zmm1+b[401344]
vmovdqu32 zmmword ptr [rax + a+401344], z...
2017 Jun 25
0
AVX Scheduling and Parallelism
...ntext of targeting the KNL, however, I'm a bit
concerned about the addressing, and specifically, the size of the
resulting encoding:
> vmovdqu32 zmm0, zmmword ptr [rax + c+401280] ;load b[401280] in zmm0
>
> vpaddd zmm1, zmm1, zmmword ptr [rax + b+401344] ; zmm1<-zmm1+b[401344]
The KNL can only deliver 16 bytes per cycle from the icache to the
decoder. Essentially all of the instructions in the loop, as we seem to
generate it, have 10-byte encodings:
10: 62 f1 7e 48 6f 80 00 vmovdqu32 0x0(%rax),%zmm0
17: 00 00...
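The arithmetic behind this concern can be sketched as follows. The byte breakdown is read off the encoding shown above (62 f1 7e 48 is a 4-byte EVEX prefix, 6f the opcode, 80 a ModRM byte selecting [rax] plus a 32-bit displacement, followed by the 4 displacement bytes). The 16 bytes per cycle figure is the one quoted in the message; the function names and the four-instruction count are illustrative assumptions only.

```python
# Encoding breakdown for "vmovdqu32 disp32(%rax), %zmm0", in bytes.
ENCODING = {
    "evex_prefix": 4,   # 62 f1 7e 48
    "opcode": 1,        # 6f
    "modrm": 1,         # 80 -> [rax] + disp32
    "disp32": 4,        # 32-bit displacement
}

FETCH_BYTES_PER_CYCLE = 16   # figure quoted in the message for KNL
INSN_BYTES = sum(ENCODING.values())


def insns_fetched_per_cycle(fetch_bytes=FETCH_BYTES_PER_CYCLE,
                            insn_bytes=INSN_BYTES):
    """Average number of instructions the fetch path can supply per cycle."""
    return fetch_bytes / insn_bytes


def cycles_to_fetch(n_insns, insn_bytes=INSN_BYTES,
                    fetch_bytes=FETCH_BYTES_PER_CYCLE):
    """Minimum cycles to fetch n_insns back-to-back (ceiling division)."""
    total = n_insns * insn_bytes
    return (total + fetch_bytes - 1) // fetch_bytes


print("bytes per instruction:", INSN_BYTES)
print("instructions fetched per cycle:", insns_fetched_per_cycle())
# Hypothetical loop body of four such instructions.
print("cycles to fetch 4 instructions:", cycles_to_fetch(4))
```

With 10-byte encodings the fetch path averages 1.6 instructions per cycle, so it cannot supply two whole instructions every cycle. Most of each encoding's length here comes from the EVEX prefix and the 32-bit displacement.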
2010 Mar 25
0
[Bug 1039] Incomplete application of HostKeyAlias in ssh
https://bugzilla.mindrot.org/show_bug.cgi?id=1039
Darren Tucker <dtucker at zip.com.au> changed:
What   |Removed   |Added
----------------------------------------------------------------------------
Status |RESOLVED  |CLOSED
--- Comment #12 from Darren Tucker <dtucker at zip.com.au> 2010-03-26 10:51:05 EST ---
With the