thr3ads.net - search: "perfoptim"

Displaying 2 results from an estimated 2 matches for "perfoptim".

[LLVMdev] Is PIC code defeating the branch predictor?

2011 Jan 04

> ...r returns in the function because calls and returns no longer are matched.

According to benchmarks by Apple, it's nevertheless faster on modern x86 processors than the trampoline-based alternative (except maybe on Atom, as mentioned in another reply):
http://lists.apple.com/archives/perfoptimization-dev/2007/Nov/msg00005.html

At the time of that post, Apple's version of GCC still generated trampolines (hence the remark). They switched to the above pattern afterwards.

Jonas

[LLVMdev] Is PIC code defeating the branch predictor?

2011 Jan 04

I noticed that we generate code like this for i386 PIC:

        calll   L0$pb
    L0$pb:
        popl    %eax
        movl    %eax, -24(%ebp)   ## 4-byte Spill

I worry that this defeats the return address prediction for returns in the function because calls and returns no longer are matched. From Intel's Optimization Reference Manual: "The return address stack mechanism augments the static and dynamic
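The worry in the post above can be illustrated with a toy model. This is only a sketch under stated assumptions, not a description of any real processor: it assumes a predictor that pushes a return address on every call and pops one on every ret, while a plain pop of the machine stack goes unseen by the predictor. The function name, the event encoding, the addresses, and the `special_case_call_next` switch are all hypothetical; the switch models a predictor that ignores a call whose target is the very next instruction, and nothing in the snippet says which processors, if any, behave that way.

```python
# Toy model of a return address stack (RAS). Hypothetical, for illustration only.
def count_return_mispredictions(events, special_case_call_next=False):
    """Count mispredicted returns for a trace of call/pop/ret events.

    events is a list of tuples:
      ("call", return_addr, target_addr)  pushes return_addr on the machine stack
      ("pop",)                            pops the machine stack without returning
      ("ret",)                            returns to the address on the machine stack
    """
    machine_stack = []   # return addresses actually stored in memory
    predictor = []       # the predictor's private copy of return addresses
    mispredicted = 0
    for event in events:
        kind = event[0]
        if kind == "call":
            _, return_addr, target_addr = event
            machine_stack.append(return_addr)
            # Assumption: optionally skip the push for a call to the next instruction.
            is_call_next = target_addr == return_addr
            if not (special_case_call_next and is_call_next):
                predictor.append(return_addr)
        elif kind == "pop":
            machine_stack.pop()          # the predictor does not observe this
        elif kind == "ret":
            actual = machine_stack.pop()
            predicted = predictor.pop() if predictor else None
            if predicted != actual:
                mispredicted += 1
    return mispredicted


# A caller invokes a function that uses the call/pop idiom, then returns.
# All addresses are made up.
trace = [
    ("call", 0x105, 0x200),   # caller -> function, return address 0x105
    ("call", 0x205, 0x205),   # calll L0$pb: target equals the return address
    ("pop",),                 # popl %eax
    ("ret",),                 # function returns to 0x105
]

naive = count_return_mispredictions(trace)
with_special_case = count_return_mispredictions(trace, special_case_call_next=True)
print(naive, with_special_case)
```

In this model the naive predictor is left holding the stale address from the call/pop pair, so the function's return is predicted wrongly; with the hypothetical special case enabled, the return is predicted correctly.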
