Displaying 3 results from an estimated 3 matches for "86bbc835".
2013 Jul 10
0
[LLVMdev] unaligned AVX store gets split into two instructions
Thanks for all the info! I'm still in the process of narrowing down the
performance difference in my code. I'm no longer convinced it's related
only to the unaligned loads/stores, since extracting this part of the
kernel makes the performance difference disappear. I will try to narrow
down what is going on, and if it seems related to LLVM, I will post an example.
Thanks again,
Zach
2013 Jul 10
3
[LLVMdev] unaligned AVX store gets split into two instructions
Hi,
Yes. On Sandybridge, 256-bit loads/stores are double pumped. This means that they go in one after the other in two cycles. On Haswell the memory ports are wide enough to allow a 256-bit memory operation in one cycle. So, on Sandybridge we split unaligned memory operations into two 128-bit parts to allow them to execute on two separate ports. This is also what GCC and ICC do.
It is very
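To make the split described above concrete, here is a minimal sketch in C++ with AVX intrinsics. It is not taken from the thread or from LLVM's sources: the helper names copy8_whole and copy8_split are hypothetical, and the per-function target attribute is an assumption that the compiler is GCC or Clang on x86. The first helper issues one unaligned 256-bit store; the second writes the same bytes as two unaligned 128-bit halves, which is roughly the shape of the split form.

```cpp
#include <immintrin.h>  // AVX intrinsics (x86 only)

// Hypothetical helper: copy 8 floats using one unaligned 256-bit load
// and one unaligned 256-bit store.
__attribute__((target("avx")))
void copy8_whole(float* dst, const float* src) {
    __m256 v = _mm256_loadu_ps(src);
    _mm256_storeu_ps(dst, v);
}

// Hypothetical helper: same copy, but the store is written by hand as
// two unaligned 128-bit halves (low half first, then the high half).
__attribute__((target("avx")))
void copy8_split(float* dst, const float* src) {
    __m256 v = _mm256_loadu_ps(src);
    _mm_storeu_ps(dst, _mm256_castps256_ps128(v));        // low 128 bits
    _mm_storeu_ps(dst + 4, _mm256_extractf128_ps(v, 1));  // high 128 bits
}
```

Both helpers leave identical bytes in memory; only the instruction sequence differs. What the compiler finally emits for either one still depends on the target CPU and flags, so inspecting the generated assembly is the only way to confirm which form was chosen.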
2013 Jul 10
2
[LLVMdev] unaligned AVX store gets split into two instructions
...faster. See r172868/r172894.
>>
>> Adding Nadav in case he has anything more to say.
>>
>> -Eli
-------------- next part --------------
A non-text attachment was scrubbed...
Name: harness.cpp
Type: text/x-c++src
Size: 346 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130710/86bbc835/attachment.cpp>
-------------- next part ---...