2016 Jun 22
2
x86: How to Force 2-byte `jmp` instruction in lowering
I have a bit of a riddle:
In http://reviews.llvm.org/D19904 I'm trying to spell the following
assembly:
.palign 2, 0x90
jmp +0x9
nopw 512(%rax,%rax,1)
// rest of the code
I try the following snippet to accomplish this:
OutStreamer->EmitLabel(CurSled);
OutStreamer->EmitCodeAlignment(4);
auto Target = OutContext.createLinkerPrivateTempSymbol();
// Use a two-byte `jmp`. This version of JM...
2014 Sep 10
13
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
On Tue, Sep 9, 2014 at 11:39 PM, Chandler Carruth <chandlerc at google.com> wrote:
> Awesome, thanks for all the information!
>
> See below:
>
> On Tue, Sep 9, 2014 at 6:13 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com>
> wrote:
>>
>> You have already mentioned how the new shuffle lowering is missing
>> some features; for example, you explicitly
2010 May 11
0
[LLVMdev] How does SSEDomainFix work?
...to the int domain because the add forced them.
> Please tell me if I am doing something wrong.
You should measure whether LLVM's code is actually slower than the code you want. If it is, I would like to hear about it.
Our weakness is the shufflevector instruction. It is selected into shufps/pshufd/palign/... only by looking at patterns. The instruction selector does not consider execution domains. This can be a problem because these instructions cannot be freely interchanged by the SSE execution domain pass.
> foo.ll:
> define <4 x i32> @foo(<4 x i32> %x, <4 x i32> %y, <...
2010 May 11
2
[LLVMdev] How does SSEDomainFix work?
Hello. This is my first post.
I have tried the SSE execution domain fixup pass,
but I do not see any improvement.
I expected the example below to use MOVDQA, PAND, etc.
(On Nehalem, ANDPS is much slower than PAND.)
Please tell me if I am doing something wrong.
Thank you.
Takumi
Host: i386-mingw32
Build: trunk at 103373
foo.ll:
define <4 x i32> @foo(<4 x i32> %x,