Cameron McInally
2012-Nov-07 04:38 UTC
[LLVMdev] AVX broadcast Vs. vector constant pool load
Hey guys,

I'm currently investigating broadcasts from the constant pool on Sandy Bridge. I see this comment in llvm/lib/Target/X86/X86ISelLowering.cpp:

    // Handle the broadcasting a single constant scalar from the constant pool
    // into a vector. On Sandybridge it is still better to load a constant vector
    // from the constant pool and not to broadcast it from a scalar.

Would anyone be able to explain why it is better to load a vector from the constant pool rather than broadcast a scalar?

I checked out Agner Fog's tables, but it wasn't so obvious to me...

    vmovaps y, m256:
      Uops: 1
      Lat: 4
      Throughput: 1

    vbroadcastsd y, m64:
      Uops: 2
      Lat: [Not or cannot be measured]
      Throughput: 1

Thanks in advance,
Cameron
I don't remember exactly why I did this. I vaguely remember looking at this with one of the Sandy Bridge architects and following his suggestion. Looking at it now, it seems that broadcasting the scalar would be faster, because the 256-bit load on Sandy Bridge is double-pumped. I am CC-ing Elena, who should be able to tell.
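For readers who want to see the two alternatives under discussion side by side, here is a minimal C sketch using AVX intrinsics. It is only an illustration, not code from LLVM: the constants kScalar and kVector and both function names are hypothetical, and the compiler is free to pick different instructions than the ones named in the comments. When the file is built without AVX enabled (for example, without -mavx), a plain scalar fallback is used so that it still compiles and runs.

```c
#include <assert.h>
#include <string.h>
#if defined(__AVX__)
#include <immintrin.h>
#endif

/* Hypothetical constant-pool entries, for illustration only. */
static const double kScalar = 1.5;                  /* 8 bytes in the pool  */
static _Alignas(32) const double kVector[4] =       /* 32 bytes in the pool */
    {1.5, 1.5, 1.5, 1.5};

/* Option 1: keep one scalar in memory and broadcast it to all four lanes. */
void splat_via_broadcast(double out[4]) {
#if defined(__AVX__)
    __m256d v = _mm256_broadcast_sd(&kScalar);      /* typically vbroadcastsd ymm, m64 */
    _mm256_storeu_pd(out, v);
#else
    for (int i = 0; i < 4; ++i) out[i] = kScalar;   /* scalar fallback */
#endif
}

/* Option 2: keep the full vector in memory and load it with one aligned load. */
void splat_via_vector_load(double out[4]) {
#if defined(__AVX__)
    __m256d v = _mm256_load_pd(kVector);            /* typically an aligned 256-bit vmovap* load */
    _mm256_storeu_pd(out, v);
#else
    memcpy(out, kVector, sizeof kVector);           /* scalar fallback */
#endif
}

/* Sanity check: both options must produce the same four lanes. */
void check_splat_paths(void) {
    double a[4], b[4];
    splat_via_broadcast(a);
    splat_via_vector_load(b);
    assert(memcmp(a, b, sizeof a) == 0);
}
```

Both functions produce the same register contents; the difference is the instruction used and the constant-pool footprint (8 bytes for the scalar versus 32 bytes for the full vector). Which one is faster on Sandy Bridge is exactly the open question in this thread, so the sketch makes no claim either way.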