similar to: [LLVMdev] Using intrinsics with memory operands

Displaying 20 results from an estimated 2000 matches similar to: "[LLVMdev] Using intrinsics with memory operands"

2008 Aug 01
0
[LLVMdev] Using intrinsics with memory operands
On Fri, Aug 1, 2008 at 12:10 AM, Nicolas Capens <nicolas at capens.net> wrote: > I was wondering how to use variations of intrinsic functions that take a > memory operand. Often, for intrinsics where it matters, there's a variant of the intrinsic that takes a pointer operand that you can use, although it looks like there isn't one here. > Take for example the SSE4.1
2016 Apr 12
2
X86 TRUNCATE cost for AVX & AVX2 mode
<Copied Cong> Thanks Elena. Mostly I was interested in why such a high cost of 30 is kept for TRUNCATE v16i32 to v16i8 in SSE41. Looking at the code, it appears that TRUNCATE v16i32 to v16i8 in SSE41 is very expensive compared with SSE2. I feel this number should be the same as, or close to, the cost given for the same operation in SSE2ConversionTbl. The patch below from Cong Hou reduces the cost for the same operation in SSE2
2016 Apr 11
2
X86 TRUNCATE cost for AVX & AVX2 mode
Hi, I was going through X86TTIImpl::getCastInstrCost and have a question about the cost calculation for the TRUNCATE instruction in AVX mode. In the AVX2ConversionTbl and AVXConversionTbl tables there is no cost defined for TRUNCATE v16i32 to v16i8; as a fallback it goes to the SSE41ConversionTbl table, where it finds a cost of 30 for this operation. A cost of 30 for this operation looks very high. Wondering why
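The fallback this message describes can be pictured as a chain of table lookups, tried from the newest feature level down. The following is only a minimal sketch in plain C, not LLVM's actual implementation: `CostEntry`, `CostTable`, `cast_cost` and `example_truncate_cost` are hypothetical names, the AVX2 and AVX tables are left empty for this one query, and the single SSE41 row carries the cost of 30 quoted in the message.

```c
#include <stddef.h>
#include <string.h>

/* One row of a hypothetical conversion-cost table (illustrative only). */
typedef struct {
    const char *op;   /* operation name, e.g. "TRUNCATE" */
    const char *dst;  /* destination vector type */
    const char *src;  /* source vector type */
    int cost;
} CostEntry;

/* A table is a row pointer plus a row count; an empty table has count 0. */
typedef struct {
    const CostEntry *rows;
    size_t count;
} CostTable;

/* Search one table; return the cost, or -1 if no row matches. */
static int lookup(const CostTable *t, const char *op,
                  const char *dst, const char *src) {
    for (size_t i = 0; i < t->count; ++i) {
        const CostEntry *e = &t->rows[i];
        if (strcmp(e->op, op) == 0 && strcmp(e->dst, dst) == 0 &&
            strcmp(e->src, src) == 0)
            return e->cost;
    }
    return -1;
}

/* Try each table in order and return the first hit, or -1 if none match. */
int cast_cost(const CostTable *tables, size_t n, const char *op,
              const char *dst, const char *src) {
    for (size_t i = 0; i < n; ++i) {
        int c = lookup(&tables[i], op, dst, src);
        if (c >= 0)
            return c;
    }
    return -1;
}

/* Models the case from the message: nothing in the AVX2 or AVX tables,
 * so the query falls through to the SSE41 row. */
int example_truncate_cost(void) {
    static const CostEntry sse41_rows[] = {
        {"TRUNCATE", "v16i8", "v16i32", 30}, /* cost quoted in the message */
    };
    const CostTable chain[] = {
        {NULL, 0},       /* stand-in for the AVX2 table: no matching row */
        {NULL, 0},       /* stand-in for the AVX table: no matching row */
        {sse41_rows, 1}, /* stand-in for the SSE41 table */
    };
    return cast_cost(chain, 3, "TRUNCATE", "v16i8", "v16i32");
}
```

In this model, adding a cheaper row for the same conversion to one of the earlier tables would shadow the SSE41 entry, since the first hit wins.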
2016 Jul 29
2
Help with ISEL matching for an SDAG
I have the following selection DAG: SelectionDAG has 9 nodes: t0: ch = EntryToken t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0 t16: i32,ch = load<LD1[%ptr](tbaa=<0x10023c9f448>), anyext from i8> t0, t2, undef:i64 t15: v16i8 = BUILD_VECTOR t16, t16, t16, t16, t16, t16, t16, t16, t16, t16, t16, t16, t16, t16, t16, t16 t11: ch,glue = CopyToReg t0, Register:v16i8 %V2, t15
2008 Nov 20
4
[LLVMdev] changing -mattr behavior with mmx and sse
Hi, When setting -mattr option on X86, I would like to treat MMX separately from SSE levels. This would allow a client who sets the attributes directly to set the SSE level independent of MMX, e.g., llc -march=x86 -mattr=sse41, one would get sse4.1 with mmx disabled while llc -march=x86 -mattr=mmx -mattr=sse42 will get mmx and sse42. If anyone objects to this change, please let me
2017 May 08
2
LLVM and Xeon Skylake v5
getProcessTriple just determines the operating system and architecture. It doesn't deal with specific instruction set features. The CPU should be controlled by MCPU on the EngineBuilder, I think. The CPU autodetection code lives in getHostCPUName in lib/Support/Host.cpp, but I don't think the JIT calls into it. I think it's expected the user would call it or pass a specific CPU string to the MCPU
2012 Sep 04
2
[LLVMdev] branch on vector compare?
Roland Scheidegger <sroland <at> vmware.com> writes: > This looks quite similar to something I filed a bug on (12312). Michael > Liao submitted fixes for this, so I think > if you change it to > %16 = fcmp ogt <4 x float> %15, %cr > %17 = sext <4 x i1> %16 to <4 x i32> > %18 = bitcast <4 x i32> %17 to i128 > %19 = icmp ne i128 %18, 0
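What the suggested IR sequence computes can be modelled in scalar C. This is only a sketch of the semantics (compare each lane, widen each result to an all-ones or all-zeros 32-bit mask, then test whether any bit of the combined 128 bits is set); it says nothing about which instructions a backend would select. The name `any_lane_gt` is hypothetical.

```c
#include <stdint.h>

/* Scalar model of:
 *   fcmp ogt <4 x float>      -> per-lane ordered greater-than
 *   sext <4 x i1> to <4 x i32> -> each lane becomes -1 (all ones) or 0
 *   bitcast to i128, icmp ne 0 -> true if any lane mask is non-zero
 * Returns 1 if at least one lane of a is greater than the same lane of b. */
int any_lane_gt(const float a[4], const float b[4]) {
    uint32_t bits = 0;
    for (int i = 0; i < 4; ++i) {
        /* C's > is false when either operand is NaN, matching "ogt". */
        int32_t mask = (a[i] > b[i]) ? -1 : 0;
        bits |= (uint32_t)mask; /* OR the lanes instead of building an i128 */
    }
    return bits != 0;
}
```

OR-ing the four lane masks gives the same answer as comparing the 128-bit value against zero, because that value is non-zero exactly when some lane is non-zero.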
2012 Sep 05
0
[LLVMdev] branch on vector compare?
On 05.09.2012 at 00:24, Stephen wrote: > Roland Scheidegger <sroland <at> vmware.com> writes: >> This looks quite similar to something I filed a bug on (12312). Michael >> Liao submitted fixes for this, so I think >> if you change it to >> %16 = fcmp ogt <4 x float> %15, %cr >> %17 = sext <4 x i1> %16 to <4 x i32> >> %18 =
2017 May 08
2
LLVM and Xeon Skylake v5
Thank you. I'm letting it auto detect by setting the target using getProcessTarget. I disabled avx512 support by passing -avx512f (and the other variants) to setMAttrs on EngineBuilder. I can see refs to avx512 in X86.td. It's the exact same executable running on Kabylake. What does the Cannot select: specifically mean? Is there some table that doesn't have a definition for a key in
2008 Nov 20
0
[LLVMdev] changing -mattr behavior with mmx and sse
Might you instead consider just adding a -disable-mmx option? Preston On Thu, 2008-20-11 at 02:57 -0500, Mon Ping Wang wrote: > Hi, > > When setting -mattr option on X86, I would like to treat MMX > separately from SSE levels. This would allow a client who sets the > attributes directly to set the SSE level independent of MMX, e.g., llc > -march=x86 -mattr=sse41, one would get
2008 Nov 20
0
[LLVMdev] changing -mattr behavior with mmx and sse
On Nov 19, 2008, at 11:57 PM PST, Mon Ping Wang wrote: > Hi, > > When setting -mattr option on X86, I would like to treat MMX > separately from SSE levels. This would allow a client who sets the > attributes directly to set the SSE level independent of MMX, e.g., llc > -march=x86 -mattr=sse41, one would get sse4.1 with mmx disabled while > llc -march=x86 -mattr=mmx
2008 Nov 20
1
[LLVMdev] changing -mattr behavior with mmx and sse
Hi Dale, I will not change the default. I would dislike to see any regressions due to this type of change. -- Mon Ping On Nov 20, 2008, at 10:12 AM, Dale Johannesen wrote: > > On Nov 19, 2008, at 11:57 PM PST, Mon Ping Wang wrote: > >> Hi, >> >> When setting -mattr option on X86, I would like to treat MMX >> separately from SSE levels. This would allow a
2008 Nov 20
1
[LLVMdev] changing -mattr behavior with mmx and sse
On Nov 20, 2008, at 8:31 AM, Preston Gurd wrote: > Might you instead consider just adding a -disable-mmx option? I agree, this is a better approach. This distinguishes between capabilities of the chip and the desire to codegen specific vectors one way or another. -Chris > > Preston > > On Thu, 2008-20-11 at 02:57 -0500, Mon Ping Wang wrote: >> Hi, >> >>
2014 Mar 11
2
x86_64 SSE2/SSE41 optim not used
Hi Guys, In stream_decoder.c, when assigning the lpc restore function, only IA32 processors benefit from the SSE2 and SSE4.1 optimizations. Shouldn't that be the case for x86_64 processors as well? Thanks, -- Olivier TRISTAN uvi.net
2014 Sep 09
5
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
Hi Chandler, Thanks for fixing the problem with the insertps mask. Generally the new shuffle lowering looks promising, however there are some cases where the codegen is now worse causing runtime performance regressions in some of our internal codebase. You have already mentioned how the new shuffle lowering is missing some features; for example, you explicitly said that we currently lack of
2017 Jan 02
1
FLAC 1.3.2 has been released
Janne Hyvärinen wrote: > That shouldn't matter. I realise that, but I wanted a better idea about how many people this is likely to affect. If you were on Windows XP on some old processor this would probably not affect many people, but since you are on Windows 10 with a Core i7 that's a different matter. Erik -- ---------------------------------------------------------------------- Erik de
2014 Sep 08
2
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
> On Sep 7, 2014, at 8:49 PM, Quentin Colombet <qcolombet at apple.com> wrote: > > Sure, > > Here is the command line: > clang -cc1 -triple x86_64-apple-macosx -S -disable-free -disable-llvm-verifier -main-file-name tmp.i -mrelocation-model pic -pic-level 2 -mdisable-fp-elim -masm-verbose -munwind-tables -target-cpu core-avx-i -O3 -ferror-limit 19 -fmessage-length 114
2008 Jul 31
5
[LLVMdev] Generating movq2dq using IRBuilder
On Jul 31, 2008, at 7:22 AM, Nicolas Capens wrote: > In the same breath I’d also like to kindly ask if someone could have > a look at the reverse operations, namely trunc from 128 to 64 bit > using movdq2q, and 128 to 32 and 64 to 32 using movd. This also > seems related to Bug 2585. Thanks again. The operations you're describing can be represented as insertelement and
2008 Jun 21
3
[LLVMdev] Vector cast
Hi all, I seem to be unable to cast a vector of integers to a vector of floats (uitofp [4 x i8] to [4 x float], to be exact). It hits an assert in LegalizeDAG.cpp line 5433: "Unknown int value type". The Assembly Language Reference Manual's definition of uitofp doesn't indicate that this is unsupported, so it looks like a bug to me. I'm on an x86 system by the way. My
2012 Sep 03
3
[LLVMdev] branch on vector compare?
> > which goes through memory. Is there some idiom I'm missing so that it would use > > for instance movmsk for SSE or vcmpgt & cr6 for altivec? > > I don't think you are missing anything: LLVM IR has no support for horizontal > operations like or'ing the elements of a vector of boolean together. The code > generators do try to recognize a few idioms and