search for: palign

Displaying 4 results from an estimated 4 matches for "palign".

Did you mean: align
2016 Jun 22
2
x86: How to Force 2-byte `jmp` instruction in lowering
I have a bit of a riddle: In http://reviews.llvm.org/D19904 I'm trying to spell the following assembly: .palign 2, 0x90 jmp +0x9 nopw 512(%rax,%rax,1) // rest of the code I try the following snippet to accomplish this: OutStreamer->EmitLabel(CurSled); OutStreamer->EmitCodeAlignment(4); auto Target = OutContext.createLinkerPrivateTempSymbol(); // Use a two-byte `jmp`. This version of JM...
2014 Sep 10
13
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
On Tue, Sep 9, 2014 at 11:39 PM, Chandler Carruth <chandlerc at google.com> wrote: > Awesome, thanks for all the information! > > See below: > > On Tue, Sep 9, 2014 at 6:13 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> > wrote: >> >> You have already mentioned how the new shuffle lowering is missing >> some features; for example, you explicitly
2010 May 11
0
[LLVMdev] How does SSEDomainFix work?
...to the int domain because the add forced them. > Please tell me if something would be wrong for me. You should measure if LLVM's code is actually slower that the code you want. If it is, I would like to hear. Our weakness is the shufflevector instruction. It is selected into shufps/pshufd/palign/... only by looking at patterns. The instruction selector does not consider execution domains. This can be a problem because these instructions cannot be freely interchanged by the SSE execution domain pass. > foo.ll: > define <4 x i32> @foo(<4 x i32> %x, <4 x i32> %y, &lt...
2010 May 11
2
[LLVMdev] How does SSEDomainFix work?
Hello. This is my 1st post. I have tried SSE execution domain fixup pass. But I am not able to see any improvements. I expect for the example below to use MOVDQA, PAND &c. (On nehalem, ANDPS is extremely slower than PAND) Please tell me if something would be wrong for me. Thank you. Takumi Host: i386-mingw32 Build: trunk at 103373 foo.ll: define <4 x i32> @foo(<4 x i32> %x,