Displaying 20 results from an estimated 5000 matches similar to: "[LLVMdev] Efficient Pattern matching in Instruction Combine"
2014 Aug 08
4
[LLVMdev] Efficient Pattern matching in Instruction Combine
Hi Duncan, David, Sean.
Thanks for your reply.
> It'd be interesting if you could find a design that also treated these
> the same:
>
> (B ^ A) | ((A ^ B) ^ C) -> (A ^ B) | C
> (B ^ A) | ((B ^ C) ^ A) -> (A ^ B) | C
> (B ^ A) | ((C ^ A) ^ B) -> (A ^ B) | C
>
> I.e., `^` is also associative.
Agree with Duncan on including associative operation too.
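
The three rewrites quoted above can be sanity-checked by brute force. The following is a minimal sketch, not InstCombine code: the function name xor_or_patterns_agree is hypothetical, and the check relies on the fact that bitwise operators act on each bit independently, so the eight single-bit input combinations are enough to cover any integer width.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true when all three source patterns
   evaluate to (A ^ B) | C for every combination of single-bit inputs. */
bool xor_or_patterns_agree(void) {
    for (unsigned a = 0; a <= 1; ++a)
        for (unsigned b = 0; b <= 1; ++b)
            for (unsigned c = 0; c <= 1; ++c) {
                unsigned expected = (a ^ b) | c;
                unsigned p1 = (b ^ a) | ((a ^ b) ^ c);
                unsigned p2 = (b ^ a) | ((b ^ c) ^ a);
                unsigned p3 = (b ^ a) | ((c ^ a) ^ b);
                if (p1 != expected || p2 != expected || p3 != expected)
                    return false;
            }
    return true;
}
```

All three patterns reduce to the same result because `^` is both commutative and associative, so each inner expression is just A ^ B ^ C with the operands reordered.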
2014 Aug 13
2
[LLVMdev] Efficient Pattern matching in Instruction Combine
Thanks Sean for the reference.
I will go through it and see if I can implement it for generic boolean
expression minimization.
Regards,
Suyog
On Wed, Aug 13, 2014 at 2:30 AM, Sean Silva <chisophugis at gmail.com> wrote:
> Re-adding the mailing list (remember to hit "reply all")
>
>
> On Tue, Aug 12, 2014 at 9:36 AM, suyog sarda <sardask01 at gmail.com> wrote:
2014 Aug 13
2
[LLVMdev] Efficient Pattern matching in Instruction Combine
Even if you can't implement such an algorithm sanely, ISTM that
auto-generating this code from a table (or whatever), and choosing
canonical results (to avoid a fixpoint issue), rather than what seems
to be hand-additions of every possible set of minimizations on three
variables, is still a better solution, no?
At least then you wouldn't have human errors, and a growing file that
makes
2015 Jun 24
2
[LLVMdev] Can LLVM vectorize <2 x i32> type
Hi,
Is LLVM able to generate efficient code for the following?
%mul = mul <2 x i32> %1, %2, where %1 and %2 are of type <2 x i32>.
I am running it on a Haswell processor with LLVM-3.4.2. It seems to
generate really complicated code with vpaddq, vpmuludq, vpsllq, and
vpsrlq.
Thanks,
Zhi
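
One plausible explanation, stated here as an assumption rather than something confirmed in this thread: if the backend widens each i32 lane to i64, the multiply has to be expanded, because Haswell (AVX2) has no 64-bit vector multiply instruction. The scalar sketch below shows that expansion for a single lane; the function name mul64_via_32bit_parts is hypothetical, and the comments map each step to the instruction that would plausibly perform it.

```c
#include <stdint.h>

/* 64-bit multiply (modulo 2^64) built only from 32x32->64 multiplies,
   shifts, and adds. */
uint64_t mul64_via_32bit_parts(uint64_t x, uint64_t y) {
    uint64_t x_lo = x & 0xFFFFFFFFu;  /* low half, read implicitly by vpmuludq */
    uint64_t x_hi = x >> 32;          /* vpsrlq */
    uint64_t y_lo = y & 0xFFFFFFFFu;
    uint64_t y_hi = y >> 32;          /* vpsrlq */

    uint64_t lo_lo = x_lo * y_lo;                /* vpmuludq */
    uint64_t cross = x_lo * y_hi + x_hi * y_lo;  /* two vpmuludq, one vpaddq */

    /* Only the low 32 bits of the cross terms survive the shift. */
    return lo_lo + (cross << 32);                /* vpsllq, vpaddq */
}
```

For a true 32-bit multiply only the low 32 bits of the result are needed, which is why the expanded sequence looks far heavier than the source operation.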
2015 Jun 26
2
[LLVMdev] Can LLVM vectorize <2 x i32> type
For example, I have the following IR code,
for.cond.preheader: ; preds = %if.end18
%mul = mul i32 %12, %3
%cmp21128 = icmp sgt i32 %mul, 0
br i1 %cmp21128, label %for.body.preheader, label %return
for.body.preheader: ; preds = %for.cond.preheader
%19 = mul i32 %12, %3
%20 = add i32 %19, -1
%21 = zext i32 %20 to i64
%22 =
2014 Dec 11
2
[LLVMdev] Phabricator update
Another php type problem; can you please try again. Thanks!
On Thu Dec 11 2014 at 1:37:32 PM Bruno Cardoso Lopes <bruno.cardoso at gmail.com> wrote:
> I'm facing the same problem.
>
> On Thu, Dec 11, 2014 at 10:16 AM, suyog sarda <sardask01 at gmail.com> wrote:
> > Hi,
> > I am facing problem while submitting patch on phab. All things go smooth
> -
>
2014 Dec 11
2
[LLVMdev] Phabricator update
Hi Manuel,
Thanks for the help. The problem still persists for me too. Instead of
waiting indefinitely, I now get this error:
Unhandled Exception ("AphrontDeadlockQueryException")
#1205: Lock wait timeout exceeded; try restarting transaction
On Thu, Dec 11, 2014 at 11:26 AM, suyog sarda <sardask01 at gmail.com> wrote:
> The problem still persist :(
>
> On 12/11/14, Manuel Klimek
2015 May 04
3
[LLVMdev] AVX2 Cost Table in X86TargetTransformInfo
Thanks Nadav for the info. It clears my query :)
Yes, it's an integer ADD, and since AVX2 supports 256-bit integer
arithmetic, its cost is lower than with AVX1.
One query though: shouldn't the cost of integer ADD/SUB/MUL (which
would be 1) then be explicitly specified in the AVX2 cost table? Right now
this entry is missing and the cost of these operations is taken from BaseTTI
(which is
2014 Dec 11
2
[LLVMdev] Phabricator update
Hi,
I am facing a problem while submitting a patch on Phab. Everything goes
smoothly: create diff, create revision, specify title and comments. However,
when I try to submit the diff by clicking the "save" button, it takes a long
time and eventually times out, failing to submit the patch.
Any help on this?
On Thursday, December 11, 2014, Manuel Klimek <klimek at google.com> wrote:
>
2014 Dec 11
3
[LLVMdev] [cfe-dev] Phabricator update
On Wed, Dec 10, 2014 at 2:38 PM, Jonathan Roelofs <jonathan at codesourcery.com> wrote:
> I think the send-email part of phab has yet to come back up.
>
Yes, restarting it would be very helpful.
>
>
> Cheers,
>
> Jon
>
>
> On 12/10/14 1:59 PM, Manuel Klimek wrote:
>
>> Phab is back up - it's still a little slow (the mysql database we use is
2015 May 04
2
[LLVMdev] AVX2 Cost Table in X86TargetTransformInfo
Hi all,
I have a query regarding the cost table for AVX2 in TargetTransformInfo.
The table consists of entries for shift and div operations only. There are
no entries for ADD, SUB and MUL in the AVX2 cost table, although those
entries are present in the cost table for AVX.
The reason for the query: when my subtarget feature is AVX2, in SLP
vectorization, while calculating the scalar cost of ADD, it doesn't see
2015 Apr 10
2
[LLVMdev] MMX/SSE subtarget feature in IR
Your clang invocation below works for me, and generates a target triple of
i386 in the LLVM IR.
In the function-specific attributes it then generates the following:
; Function Attrs: nounwind
define float @foo() #0 {
entry:
ret float 1.000000e+00
}
attributes #0 = { nounwind "less-precise-fpmad"="false" "no-frame-pointer-elim"="true"
2014 Dec 10
2
[LLVMdev] Phabricator update
Phab is back up - it's still a little slow (the mysql database we use is
doing some cleanups).
On Wed Dec 10 2014 at 5:07:07 PM suyog sarda <sardask01 at gmail.com> wrote:
> And i was thinking something wrong with my proxy configuration :P
>
> On Wed, Dec 10, 2014 at 6:47 PM, Manuel Klimek <klimek at google.com> wrote:
>
>> Heya,
>>
>> if you wonder
2014 Nov 10
2
[LLVMdev] [Vectorization] Mis match in code generated
Hi Suyog,
Thanks for looking at this. This has recently got itself onto my TODO list
too.
> I am not sure how much all this will improve the code quality for
> horizontal reduction
> (donno how frequently such pattern of horizontal reduction from same
> array occurs in real world/SPECS).
Actually the main loop of 470.lbm can be SLP vectorized like this. We have
three parts to it: A fully
2014 Dec 11
3
[LLVMdev] [cfe-dev] Phabricator update
On Thu, Dec 11, 2014 at 1:29 AM, Manuel Klimek <klimek at google.com> wrote:
> On Thu Dec 11 2014 at 2:16:00 AM Alexey Samsonov <vonosmas at gmail.com>
> wrote:
>
>> On Wed, Dec 10, 2014 at 2:38 PM, Jonathan Roelofs <
>> jonathan at codesourcery.com> wrote:
>>
>>> I think the send-email part of phab has yet to come back up.
>>>
2014 Sep 19
3
[LLVMdev] [Vectorization] Mis match in code generated
Hi Arnold,
Thanks for your reply.
I tried test case as suggested by you.
void foo(int *a, int *sum) {
  *sum = a[0]+a[1]+a[2]+a[3]+a[4]+a[5]+a[6]+a[7]+a[8]+a[9]+a[10]+a[11]+a[12]+a[13]+a[14]+a[15];
}
so that it has a 'store' in its IR.
IR before vectorization:
target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
target triple =
2015 Apr 09
2
[LLVMdev] MMX/SSE subtarget feature in IR
Hi all,
I have a sample test case :
$ cat 1.c
int foo(int x, int y){
int z = x + y;
return z/2;
}
I tried to get its IR with clang, specifying mmx as the subtarget feature
for target x86_64:
$ clang -O3 -mmmx 1.c -S -emit-llvm
In the generated IR I can see the subtarget feature as a function attribute:
"target-features"="+mmx"
In the SelectionDAG phase in file
2015 May 04
2
[LLVMdev] Modifying LoopUnrollingPass
Optimization passes running before LoopVectorizer should be able to combine
the two statements (this should already happen at -O1; please check)
arr[i] = a + i
sum += arr[i]
to
sum += a + i
Not sure why you are using the array there.
- Suyog
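
A minimal sketch of the two forms discussed above, to make the equivalence concrete. The function names, the `long` accumulator, and the explicit loop bounds are assumptions for illustration; the original loop is not shown in this snippet.

```c
#include <stddef.h>

/* Original form: store a + i into the array, then read it back. */
long sum_via_array(int *arr, size_t n, int a) {
    long sum = 0;
    for (size_t i = 0; i < n; ++i) {
        arr[i] = a + (int)i;
        sum += arr[i];
    }
    return sum;
}

/* Forwarded form: the stored value feeds the sum directly, no reload. */
long sum_forwarded(size_t n, int a) {
    long sum = 0;
    for (size_t i = 0; i < n; ++i)
        sum += a + (int)i;
    return sum;
}
```

Both functions return the same sum. Note that the store itself can only be dropped if nothing else reads the array afterwards; forwarding the stored value to the load is valid either way.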
On 4 May 2015 23:11, "Michael Zolotukhin" <mzolotukhin at apple.com> wrote:
> Hi Yaduveer,
>
> Vectorizer probably fails because it
2014 Dec 10
2
[LLVMdev] Phabricator update
Heya,
if you wonder why phabricator is down - it's an upgrade that is running a
database update that takes a while (probably 3-5 more hours). I'll update
this thread once it's finished and phab is up again.
Cheers,
/Manuel
2015 Apr 09
2
[LLVMdev] MMX/SSE subtarget feature in IR
Thanks Kevin for the reply. I got the point now :)
On 10 Apr 2015 00:18, "Smith, Kevin B" <kevin.b.smith at intel.com> wrote:
> For x86_64 ABI, a minimum feature set of SSE2 is required.
>
>
>
> Kevin
>
>
>
> *From:* llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] *On
> Behalf Of *suyog sarda
> *Sent:* Thursday, April 09,