thr3ads.net - search: "86bbc835"

Displaying 3 results from an estimated 3 matches for "86bbc835".

[LLVMdev] unaligned AVX store gets split into two instructions

2013 Jul 10

Thanks for all the info! I'm still in the process of narrowing down the performance difference in my code. I'm no longer convinced it's related only to the unaligned loads/stores, since extracting this part of the kernel makes the performance difference disappear. I will try to narrow down what is going on, and if it seems related to LLVM, I will post an example. Thanks again, Zach

[LLVMdev] unaligned AVX store gets split into two instructions

2013 Jul 10

Hi, Yes. On Sandy Bridge, 256-bit loads/stores are double pumped. This means that they go in one after the other, in two cycles. On Haswell the memory ports are wide enough to allow a 256-bit memory operation in one cycle. So, on Sandy Bridge we split unaligned memory operations into two 128-bit parts to allow them to execute on two separate ports. This is also what GCC and ICC do. It is very
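To make the split concrete, here is a minimal sketch of the same unaligned 8-float store written two ways: as a single 256-bit operation and as two 128-bit halves. The function names store256_whole and store256_split are hypothetical and are not taken from the thread or from LLVM. The sketch only shows the shape of the transformation; it is not the code the compiler generates, and it makes no claim about which form is faster on a given CPU. The plain memcpy fallback is an assumption added so the sketch also builds when AVX is not enabled.

```cpp
#include <cstring>
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Hypothetical helper: copy 8 floats to a possibly unaligned address
// using one 256-bit unaligned load and one 256-bit unaligned store.
inline void store256_whole(float* dst, const float* src) {
#if defined(__AVX__)
    _mm256_storeu_ps(dst, _mm256_loadu_ps(src));
#else
    // Fallback (assumption): no AVX enabled at compile time, copy bytes.
    std::memcpy(dst, src, 8 * sizeof(float));
#endif
}

// Hypothetical helper: the same copy, but the store is split into
// two 128-bit unaligned stores (low half, then high half).
inline void store256_split(float* dst, const float* src) {
#if defined(__AVX__)
    __m256 v = _mm256_loadu_ps(src);
    // Low 128 bits: elements 0..3.
    _mm_storeu_ps(dst, _mm256_castps256_ps128(v));
    // High 128 bits: elements 4..7.
    _mm_storeu_ps(dst + 4, _mm256_extractf128_ps(v, 1));
#else
    // Fallback (assumption): two 16-byte copies stand in for the halves.
    std::memcpy(dst, src, 4 * sizeof(float));
    std::memcpy(dst + 4, src + 4, 4 * sizeof(float));
#endif
}
```

Both helpers write the same 32 bytes, so swapping one for the other in a kernel and timing it is one way to check whether the split matters for a particular workload.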

[LLVMdev] unaligned AVX store gets split into two instructions

2013 Jul 10

...faster. See r172868/r172894.
>> Adding Nadav in case he has anything more to say.
>> -Eli

Attachment: harness.cpp (text/x-c++src, 346 bytes) <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130710/86bbc835/attachment.cpp>