Displaying 20 results from an estimated 3000 matches similar to: "[LLVMdev] extractelement causes memory access violation - what to do?"
2015 Jun 26
2
[LLVMdev] extractelement causes memory access violation - what to do?
On 06/26/2015 08:42 AM, David Majnemer wrote:
>
>
> On Fri, Jun 26, 2015 at 7:00 AM, Paweł Bylica <chfast at gmail.com
> <mailto:chfast at gmail.com>> wrote:
>
> Hi,
>
> Let's have a simple program:
> define i32 @main(i32 %n, i64 %idx) {
> %idxSafe = trunc i64 %idx to i5
> %r = extractelement <4 x i32> <i32 -1, i32
2015 Jun 30
2
[LLVMdev] extractelement causes memory access violation - what to do?
On Fri, Jun 26, 2015 at 5:42 PM David Majnemer <david.majnemer at gmail.com>
wrote:
> On Fri, Jun 26, 2015 at 7:00 AM, Paweł Bylica <chfast at gmail.com> wrote:
>
>> Hi,
>>
>> Let's have a simple program:
>> define i32 @main(i32 %n, i64 %idx) {
>> %idxSafe = trunc i64 %idx to i5
>> %r = extractelement <4 x i32> <i32 -1, i32
2015 Jun 30
2
[LLVMdev] extractelement causes memory access violation - what to do?
On Tue, Jun 30, 2015 at 11:03 PM Hal Finkel <hfinkel at anl.gov> wrote:
> ----- Original Message -----
> > From: "Paweł Bylica" <chfast at gmail.com>
> > To: "David Majnemer" <david.majnemer at gmail.com>
> > Cc: "LLVMdev" <llvmdev at cs.uiuc.edu>
> > Sent: Tuesday, June 30, 2015 5:42:24 AM
> > Subject: Re:
2015 Jul 01
2
[LLVMdev] extractelement causes memory access violation - what to do?
----- Original Message -----
> From: "Pete Cooper" <peter_cooper at apple.com>
> To: "Paweł Bylica" <chfast at gmail.com>
> Cc: "Hal Finkel" <hfinkel at anl.gov>, "LLVMdev" <llvmdev at cs.uiuc.edu>
> Sent: Wednesday, July 1, 2015 12:08:37 PM
> Subject: Re: [LLVMdev] extractelement causes memory access violation - what
2015 Jul 01
3
[LLVMdev] extractelement causes memory access violation - what to do?
----- Original Message -----
> From: "Pete Cooper" <peter_cooper at apple.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "LLVMdev" <llvmdev at cs.uiuc.edu>, "Paweł Bylica" <chfast at gmail.com>
> Sent: Wednesday, July 1, 2015 6:42:41 PM
> Subject: Re: [LLVMdev] extractelement causes memory access violation - what to
2015 Jul 02
2
[LLVMdev] extractelement causes memory access violation - what to do?
----- Original Message -----
> From: "David Majnemer" <david.majnemer at gmail.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Pete Cooper" <peter_cooper at apple.com>, "LLVMdev" <llvmdev at cs.uiuc.edu>
> Sent: Wednesday, July 1, 2015 7:17:19 PM
> Subject: Re: [LLVMdev] extractelement causes memory access violation
2015 Jul 27
3
[LLVMdev] i1* function argument on x86-64
I am running into a problem with 'i1*' as a function's argument which
seems to have appeared since I switched to LLVM 3.6 (but can have other
source, of course). If I look at the assembler that the MCJIT generates
for an x86-64 target I see that the array 'i1*' is taken as a sequence
of 1 bit wide elements. (I guess that's correct). However, I used to
call the function
2010 May 11
2
[LLVMdev] How does SSEDomainFix work?
Hello. This is my 1st post.
I have tried SSE execution domain fixup pass.
But I am not able to see any improvements.
I expect for the example below to use MOVDQA, PAND &c.
(On nehalem, ANDPS is extremely slower than PAND)
Please tell me if something would be wrong for me.
Thank you.
Takumi
Host: i386-mingw32
Build: trunk at 103373
foo.ll:
define <4 x i32> @foo(<4 x i32> %x,
2014 Oct 13
2
[LLVMdev] Unexpected spilling of vector register during lane extraction on some x86_64 targets
Hello,
Depending on how I extract integer lanes from an x86_64 xmm register, the
backend may spill that register in order to load scalars. The effect was
observed on two targets: corei7-avx and btver1 (I haven't checked other
targets).
Here's a test case with spilling/no-spilling code put on conditional
compile:
#if __SSE4_1__ != 0
#include <smmintrin.h>
#else
#include
2010 May 11
0
[LLVMdev] How does SSEDomainFix work?
On May 10, 2010, at 9:07 PM, NAKAMURA Takumi wrote:
> Hello. This is my 1st post.
ようこそ!
> I have tried SSE execution domain fixup pass.
> But I am not able to see any improvements.
Did you actually measure runtime, or did you look at assembly?
> I expect for the example below to use MOVDQA, PAND &c.
> (On nehalem, ANDPS is extremely slower than PAND)
Are you sure? The
2014 Jul 23
4
[LLVMdev] the clang 3.5 loop optimizer seems to jump in unintentional for simple loops
the clang 3.5 loop optimizer seems to jump in unintentional for simple loops
the very simple example
----
const int SIZE = 3;
int the_func(int* p_array)
{
int dummy = 0;
#if defined(ITER)
for(int* p = &p_array[0]; p < &p_array[SIZE]; ++p) dummy += *p;
#else
for(int i = 0; i < SIZE; ++i) dummy += p_array[i];
#endif
return dummy;
}
int main(int argc, char** argv)
{
2015 Jul 24
2
[LLVMdev] SIMD for sdiv <2 x i64>
On 07/24/2015 03:42 AM, Benjamin Kramer wrote:
>> On 24.07.2015, at 08:06, zhi chen <zchenhn at gmail.com> wrote:
>>
>> It seems that that it's hard to vectorize int64 in LLVM. For example, LLVM 3.4 generates very complicated code for the following IR. I am running on a Haswell processor. Is it because there is no alternative AVX/2 instructions for int64? The same thing
2011 Feb 21
4
[LLVMdev] How to force stack alignment for particular target triple in JIT?
I get SEGV in gcc-compiled procedure in Solaris10-i386. This procedure
is called from llvm JIT code.
Exact instruction that crashes is this: movdqa %xmm0, 0x10(%esp)
%esp is 8-aligned, and by definition of movdqa it expects 16-aligned stack.
This leads me to believe that llvm uses wrong ABI when calling external
procedures and doesn't align stack properly.
llvm module executing in JIT has
2016 May 26
2
enabling interleaved access loop vectorization
Interleaved access is not enabled on X86 yet.
We looked at this feature and got into conclusion that interleaving (as loads + shuffles) is not always profitable on X86. We should provide the right cost which depends on number of shuffles. Number of shuffles depends on permutations (shuffle mask). And even if we estimate the number of shuffles, the shuffles are not generated in-place. Vectorizer
2016 Aug 05
3
enabling interleaved access loop vectorization
Hi Michael,
Sometime back I did some experiments with interleave vectorizer and did not found any degrade,
probably my tests/benchmarks are not extensive enough to cover much.
Elina is the right person to comment on it as she already experienced cases where it hinders performance.
For interleave vectorizer on X86 we do not have any specific costing, it goes to BasicTTI where the costing is not
2015 Jul 24
0
[LLVMdev] SIMD for sdiv <2 x i64>
------------------------------------ IR
------------------------------------------------------------------
if.then.i.i.i.i.i.i: ; preds = %if.then4
%S25_D = zext <2 x i32> %splatLDS17_D.splat to <2 x i64>
%umul_with_overflow.i.iS26_D = shl <2 x i64> %S25_D, <i64 3, i64 3>
%extumul_with_overflow.i.iS26_D = extractelement <2 x i64>
2015 Jul 24
1
[LLVMdev] SIMD for sdiv <2 x i64>
This snippet of IR is interesting:
%sub.ptr.div.iS37_D = sdiv <2 x i64> %sub.ptr.sub.iS36_D, <i64 24,
i64 24>
%cmp10S38_D = icmp ugt <2 x i64> %sub.ptr.div.iS37_D,
%splatInsMapS1_D.splat
%zextS39_D = sext <2 x i1> %cmp10S38_D to <2 x i64>
%BCS39_D = bitcast <2 x i64> %zextS39_D to i128
%mskS39_D = icmp ne i128 %BCS39_D, 0
br i1 %mskS39_D,
2016 May 26
0
enabling interleaved access loop vectorization
On 26 May 2016 at 19:12, Sanjay Patel via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> Is there a compile-time and/or potential runtime cost that makes
> enableInterleavedAccessVectorization() default to 'false'?
>
> I notice that this is set to true for ARM, AArch64, and PPC.
>
> In particular, I'm wondering if there's a reason it's not enabled for
2012 Feb 17
0
[LLVMdev] Folding an insertelt chain
On Feb 17, 2012, at 12:50 AM, Ivan Llopard wrote:
> Hello,
>
> I've added a little combining operation in DAGCombiner to fold a chain of insertelt nodes if that chain is proved to fully overwrite the very first source vector. In which case, I supposed a build_vector is better. It seems to be safe but I don't know if it is correctly implemented or if it is already done somewhere
2014 Sep 05
3
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
On Fri, Sep 5, 2014 at 9:32 AM, Robert Lougher <rob.lougher at gmail.com>
wrote:
> Unfortunately, another team, while doing internal testing has seen the
> new path generating illegal insertps masks. A sample here:
>
> vinsertps $256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
> vinsertps $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
>