Displaying 20 results from an estimated 10000 matches similar to: "[LLVMdev] Keeping values in memory"
2010 Sep 08
8
[LLVMdev] LLVM 2.8 and MMX
On Wed, Sep 8, 2010 at 12:35 AM, Nicolas Capens
<nicolas.capens at gmail.com> wrote:
> Hi Chris,
>
> It's not broken, but the performance is crippled.
>
> I noticed that the code still contains some MMX instructions, but several
> operations get expanded (apparently swizzling and such get expanded to a
> large number of byte moves).
I think some changes related to
2010 Sep 22
1
[LLVMdev] LLVM 2.8 and MMX
Assign the bug to me and I'll fix it in TOT next week! Thanks for
narrowing it down!
On Wednesday, September 22, 2010, Nicolas Capens
<nicolas.capens at gmail.com> wrote:
> Hi all,
>
> I think I figured it out:
> 112804 causes 64-bit UNPCKLBW to no longer be selected for certain cases.
> 112805 is benign.
> 112806 causes 64-bit UNPCKHBW to no longer be selected for
2010 Sep 21
1
[LLVMdev] LLVM 2.8 and MMX
This thread confuses me. I thought Chris said that LLVM 2.8 will not
lower generic vectors to MMX because it breaks x87 code, and I didn't
see an answer to your question about a switch to tell the code
generator otherwise. However, you're complaining that MMX performance
is subpar, even though LLVM 2.8 isn't supposed to generate MMX
instructions.
Can someone clarify the situation
2010 Sep 21
0
[LLVMdev] LLVM 2.8 and MMX
On Sep 21, 2010, at 10:23 AM, Nicolas Capens wrote:
> Hi all,
>
> Sorry for the late reply. I got sidetracked by other fun projects. ;-)
>
> I found that the performance regression is caused by revisions 112804,
> 112805 and 112806. Those changes were made 2 days prior to the 2.8
> branching, so it may not have been the intention to include them there?
> Either way they
2010 Sep 22
0
[LLVMdev] LLVM 2.8 and MMX
LLVM isn't going to stop generating MMX instructions altogether. We can't do that. :-) If the user specifically wants MMX (by, say, using the builtins), we have to support that still. The plan to cease generating MMX for generic vectors is a work in progress right now. It's not in 2.8.
-bw
On Sep 21, 2010, at 4:24 PM, Reid Kleckner wrote:
> This thread confuses me. I thought
2010 Sep 22
1
[LLVMdev] LLVM 2.8 and MMX
On Sep 21, 2010, at 5:30 PM PDT, Bill Wendling wrote:
> LLVM isn't going to stop generating MMX instructions altogether. We can't do that. :-) If the user specifically wants MMX (by, say, using the builtins), we have to support that still. The plan to cease generating MMX for generic vectors is a work in progress right now. It's not in 2.8.
>
> -bw
Right, early on there
2010 Sep 08
0
[LLVMdev] LLVM 2.8 and MMX
On Sep 8, 2010, at 7:24 AM, Eli Friedman wrote:
> On Wed, Sep 8, 2010 at 12:35 AM, Nicolas Capens
> <nicolas.capens at gmail.com> wrote:
>> Hi Chris,
>>
>> It's not broken, but the performance is crippled.
>>
>> I noticed that the code still contains some MMX instructions, but several
>> operations get expanded (apparently swizzling and such
2008 Jul 07
0
[LLVMdev] Eager JIT
Sure, you can turn off lazy compilation. Take a look at
NoLazyCompilation in lli.cpp.
Evan
On Jul 7, 2008, at 6:08 AM, Nicolas Capens wrote:
> Hi all,
>
> Is there any way to generate the binary code for a whole module at
> once? Currently I always get lazy compilation one function at a time.
>
> The reason I would like to generate the whole module at once is
>
2010 Sep 21
1
[LLVMdev] LLVM 2.8 and MMX
On Sep 21, 2010, at 10:23 AM PDT, Nicolas Capens wrote:
> Hi all,
>
> Sorry for the late reply. I got sidetracked by other fun projects. ;-)
>
> I found that the performance regression is caused by revisions 112804,
> 112805 and 112806. Those changes were made 2 days prior to the 2.8
> branching, so it may not have been the intention to include them there?
> Either way
2008 May 08
7
[LLVMdev] Vector code
Hi all,
I'm trying to use LLVM to generate SIMD code at runtime (in particular Intel
SSE). But I'm having a bit of trouble understanding how to create even the
simplest function; adding two vectors of four single-precision
floating-point elements. I can get it to add the elements one at a time but
not using one vector instruction.
All help much appreciated!
Nicolas Capens
2008 Jul 07
2
[LLVMdev] Eager JIT
Hi all,
Is there any way to generate the binary code for a whole module at once?
Currently I always get lazy compilation one function at a time.
The reason I would like to generate the whole module at once is because I
create some functions at run-time and then minimize the memory footprint by
deallocating all LLVM objects. I've written my own JITMemoryManager to
ensure that the binary
2008 Jul 31
2
[LLVMdev] Generating movq2dq using IRBuilder
Hi all,
How do I generate the movq2dq SSE2 instruction using the IRBuilder? There is
no zext from 64-bit to 128-bit (corresponding to MMX to XMM register
transfer) as far as I can tell. So I've tried inserting an i64 into a v2i64,
which generates valid code but rather a number of stores and loads on the
stack instead of a single movq2dq.
Looking through the code, I found a pattern for
2010 Sep 21
0
[LLVMdev] LLVM 2.8 and MMX
Hi Dale,
I suspect that these patches were intended to improve 128-bit vector
performance but caused certain 64-bit vector operations to no longer lower
to MMX instructions. Anyway, now that I've narrowed it down to these patches
I think I can narrow it down further to a specific case so I can file a
bug...
Will Bruno be back soon or is he no longer working on the project for good?
Cheers,
2010 Sep 07
1
[LLVMdev] LLVM 2.8 and MMX
On Sep 7, 2010, at 7:45 AM, Nicolas Capens wrote:
> Hi all,
>
> I've tested a recent revision and noticed that using 64-bit vectors became very slow. It looks like they are expanded to non-MMX instructions to avoid breaking code which does not clear the MMX state using emms?
>
> For my project I'm already manually inserting emms instructions in the right places, so
2008 Jul 31
5
[LLVMdev] Generating movq2dq using IRBuilder
On Jul 31, 2008, at 7:22 AM, Nicolas Capens wrote:
> In the same breath I’d also like to kindly ask if someone could have
> a look at the reverse operations, namely trunc from 128 to 64 bit
> using movdq2q, and 128 to 32 and 64 to 32 using movd. This also
> seems related to Bug 2585. Thanks again.
The operations you're describing can be represented as insertelement
and
2008 May 08
0
[LLVMdev] Vector code
On Thu, 8 May 2008, Nicolas Capens wrote:
> Thanks for the advice, but I'm actually not trying to compile code from
> text. For now I'm just trying to construct the function directly. Think of
> it as the vector equivalent of the HowToUseJIT.cpp example.
There is a one to one mapping between text and IR. If you understand what
to generate it is much easier to generate it.
2008 May 08
2
[LLVMdev] Vector code
Hi Chris,
Thanks for the advice, but I'm actually not trying to compile code from
text. For now I'm just trying to construct the function directly. Think of
it as the vector equivalent of the HowToUseJIT.cpp example.
Cheers,
-Nicolas
-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On
Behalf Of Chris Lattner
Sent: Thursday, 08
2008 Jul 31
0
[LLVMdev] Generating movq2dq using IRBuilder
In the same breath I'd also like to kindly ask if someone could have a look
at the reverse operations, namely trunc from 128 to 64 bit using movdq2q,
and 128 to 32 and 64 to 32 using movd. This also seems related to Bug 2585.
Thanks again.
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On
Behalf Of Nicolas Capens
Sent: Thursday, 31 July, 2008 16:03
To:
2008 Jul 31
0
[LLVMdev] Generating movq2dq using IRBuilder
On 31-Jul-08, at 2:38 PM, Dan Gohman wrote:
> On Jul 31, 2008, at 7:22 AM, Nicolas Capens wrote:
>> In the same breath I’d also like to kindly ask if someone could have
>> a look at the reverse operations, namely trunc from 128 to 64 bit
>> using movdq2q, and 128 to 32 and 64 to 32 using movd. This also
>> seems related to Bug 2585. Thanks again.
>
> The operations
2008 May 20
2
[LLVMdev] Making use of SSE intrinsics
Hi all,
I'd like to make use of some specific x86 Streaming SIMD Extension
instructions, but I don't know where to start. For instance the 'rcpps'
instruction computes a low-precision but fast reciprocal. I've noticed that
LLVM supports intrinsics, but I couldn't find any information on how to use
them. I've tried digging through the LLVM-GCC code but it's just