similar to: [LLVMdev] AVX2 Cost Table in X86TargetTransformInfo

2015 May 04
3
[LLVMdev] AVX2 Cost Table in X86TargetTransformInfo
Thanks, Nadav, for the info. It answers my query :) Yes, it's an integer ADD, and since AVX2 supports 256-bit integer arithmetic, its cost is lower than on AVX1. One query though: shouldn't the cost of integer ADD/SUB/MUL (which would be 1) be explicitly specified in the AVX2 cost table? Right now this entry is missing and the cost of these operations is taken from BaseTTI (which is
2014 Dec 11
2
[LLVMdev] Phabricator update
Hi Manuel, Thanks for the help. The problem still persists for me too. Instead of waiting indefinitely, now I get this error: Unhandled Exception ("AphrontDeadlockQueryException") #1205: Lock wait timeout exceeded; try restarting transaction On Thu, Dec 11, 2014 at 11:26 AM, suyog sarda <sardask01 at gmail.com> wrote: > The problem still persists :( > > On 12/11/14, Manuel Klimek
2014 Dec 11
2
[LLVMdev] Phabricator update
Another php type problem; can you please try again. Thanks! On Thu Dec 11 2014 at 1:37:32 PM Bruno Cardoso Lopes < bruno.cardoso at gmail.com> wrote: > I'm facing the same problem. > > On Thu, Dec 11, 2014 at 10:16 AM, suyog sarda <sardask01 at gmail.com> wrote: > > Hi, > > I am facing problem while submitting patch on phab. All things go smooth > - >
2014 Dec 11
2
[LLVMdev] Phabricator update
Hi, I am facing a problem while submitting a patch on Phab. Everything goes smoothly: creating the diff, creating the revision, specifying the title and comments. However, when I try to submit the diff by clicking the "save" button, it takes a long time and eventually times out, failing to submit the patch. Any help on this? On Thursday, December 11, 2014, Manuel Klimek <klimek at google.com> wrote: >
2014 Aug 07
4
[LLVMdev] Efficient Pattern matching in Instruction Combine
Hi All, Duncan, Rafael, David, Nick. This is regarding pattern matching in the InstructionCombine pass. We use the 'match' functions many times, but they don't do the pattern matching efficiently. E.g., let's take the patterns: (A ^ B) | ((B ^ C) ^ A) -> (A ^ B) | C and (B ^ A) | ((B ^ C) ^ A) -> (A ^ B) | C. Both patterns above are the same, since ^ is commutative in Op0. But,
2014 Dec 11
3
[LLVMdev] [cfe-dev] Phabricator update
On Wed, Dec 10, 2014 at 2:38 PM, Jonathan Roelofs <jonathan at codesourcery.com > wrote: > I think the send-email part of phab has yet to come back up. > Yes, restarting it would be very helpful. > > > Cheers, > > Jon > > > On 12/10/14 1:59 PM, Manuel Klimek wrote: > >> Phab is back up - it's still a little slow (the mysql database we use is
2014 Dec 10
2
[LLVMdev] Phabricator update
Phab is back up - it's still a little slow (the mysql database we use is doing some cleanups). On Wed Dec 10 2014 at 5:07:07 PM suyog sarda <sardask01 at gmail.com> wrote: > And i was thinking something wrong with my proxy configuration :P > > On Wed, Dec 10, 2014 at 6:47 PM, Manuel Klimek <klimek at google.com> wrote: > >> Heya, >> >> if you wonder
2015 Apr 10
2
[LLVMdev] MMX/SSE subtarget feature in IR
Your clang invocation below works for me, and generates target triple in the llvm IR of i386. And then in the specific options for the functions it generates the following: ; Function Attrs: nounwind define float @foo() #0 { entry: ret float 1.000000e+00 } attributes #0 = { nounwind "less-precise-fpmad"="false" "no-frame-pointer-elim"= "true"
2015 Jun 26
2
[LLVMdev] Can LLVM vectorize <2 x i32> type
For example, I have the following IR code, for.cond.preheader: ; preds = %if.end18 %mul = mul i32 %12, %3 %cmp21128 = icmp sgt i32 %mul, 0 br i1 %cmp21128, label %for.body.preheader, label %return for.body.preheader: ; preds = %for.cond.preheader %19 = mul i32 %12, %3 %20 = add i32 %19, -1 %21 = zext i32 %20 to i64 %22 =
2014 Dec 11
3
[LLVMdev] [cfe-dev] Phabricator update
On Thu, Dec 11, 2014 at 1:29 AM, Manuel Klimek <klimek at google.com> wrote: > On Thu Dec 11 2014 at 2:16:00 AM Alexey Samsonov <vonosmas at gmail.com> > wrote: > >> On Wed, Dec 10, 2014 at 2:38 PM, Jonathan Roelofs < >> jonathan at codesourcery.com> wrote: >> >>> I think the send-email part of phab has yet to come back up. >>>
2014 Aug 08
4
[LLVMdev] Efficient Pattern matching in Instruction Combine
Hi Duncan, David, Sean. Thanks for your reply. > It'd be interesting if you could find a design that also treated these > the same: > > (B ^ A) | ((A ^ B) ^ C) -> (A ^ B) | C > (B ^ A) | ((B ^ C) ^ A) -> (A ^ B) | C > (B ^ A) | ((C ^ A) ^ B) -> (A ^ B) | C > > I.e., `^` is also associative. I agree with Duncan on including associative operations too.
2014 Aug 13
2
[LLVMdev] Efficient Pattern matching in Instruction Combine
Thanks Sean for the reference. I will go through it and see if i can implement it for generic boolean expression minimization. Regards, Suyog On Wed, Aug 13, 2014 at 2:30 AM, Sean Silva <chisophugis at gmail.com> wrote: > Re-adding the mailing list (remember to hit "reply all") > > > On Tue, Aug 12, 2014 at 9:36 AM, suyog sarda <sardask01 at gmail.com> wrote:
2014 Dec 10
2
[LLVMdev] Phabricator update
Heya, if you wonder why phabricator is down - it's an upgrade that is running a database update that takes a while (probably 3-5 more hours). I'll update this thread once it's finished and phab is up again. Cheers, /Manuel
2014 Nov 10
2
[LLVMdev] [Vectorization] Mis match in code generated
Hi Suyog, Thanks for looking at this. This has recently got itself onto my TODO list too. > I am not sure how much all this will improve the code quality for horizontal reduction > (donno how frequently such pattern of horizontal reduction from same array occurs in real world/SPECS). Actually the main loop of 470.lbm can be SLP vectorized like this. We have three parts to it: A fully
2014 Dec 11
2
[LLVMdev] [cfe-dev] Phabricator update
Heya, I'll look into it first thing tomorrow - probably a problem with the encoding settings. On Thu Dec 11 2014 at 9:17:40 PM Robinson, Paul < Paul_Robinson at playstation.sony.com> wrote: > What I'm seeing is that Phabricator emails double-space *everything* > (not just the diffs). > > --paulr > > > > *From:* cfe-dev-bounces at cs.uiuc.edu
2015 Apr 09
2
[LLVMdev] MMX/SSE subtarget feature in IR
Thanks Kevin for the reply. I got the point now :) On 10 Apr 2015 00:18, "Smith, Kevin B" <kevin.b.smith at intel.com> wrote: > For x86_64 ABI, a minimum feature set of SSE2 is required. > > > > Kevin > > > > *From:* llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] *On > Behalf Of *suyog sarda > *Sent:* Thursday, April 09,
2014 Sep 19
3
[LLVMdev] [Vectorization] Mis match in code generated
Hi Arnold, Thanks for your reply. I tried the test case you suggested: void foo(int *a, int *sum) { *sum = a[0]+a[1]+a[2]+a[3]+a[4]+a[5]+a[6]+a[7]+a[8]+a[9]+a[10]+a[11]+a[12]+a[13]+a[14]+a[15]; } so that it has a 'store' in its IR. IR before vectorization: target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128" target triple =
2013 Dec 19
4
[LLVMdev] LLVM ARM VMLA instruction
Hi Tim, > > cortex-a15 vfpv4 : vmla instruction emitted (which is a NEON instruction) > > I get a VFP vmla here rather than a NEON one (clang -target > armv7-linux-gnueabihf -mcpu=cortex-a15): "vmla.f32 s0, s1, s2". Are > you seeing something different? > As per Renato's comment above, vmla is a NEON instruction while vfma is a VFP instruction. Correct
2015 Jun 24
2
[LLVMdev] Can LLVM vectorize <2 x i32> type
Hi, Is LLVM able to generate code for the following? %mul = mul <2 x i32> %1, %2, where %1 and %2 are of type <2 x i32>. I am running it on a Haswell processor with LLVM-3.4.2. It seems to generate really complicated code with vpaddq, vpmuludq, vpsllq, vpsrlq. Thanks, Zhi
2016 Feb 25
2
how to force llvm generate gather intrinsic
Yes, masked load/store/gather/scatter are completed. - Elena From: zhi chen [mailto:zchenhn at gmail.com] Sent: Thursday, February 25, 2016 01:20 To: Demikhovsky, Elena <elena.demikhovsky at intel.com> Cc: Sanjay Patel <spatel at rotateright.com>; Nema, Ashutosh <Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org> Subject: Re: [llvm-dev] how to