similar to: [LLVMdev] removing unnecessary ZEXT

Displaying 20 results from an estimated 1100 matches similar to: "[LLVMdev] removing unnecessary ZEXT"

2013 Sep 10
0
[LLVMdev] removing unnecessary ZEXT
Hi, A bit more information. I believe my problem lies with the fact that the load is left as 'anyext from i8'. On the XCore target we know this will become an 8bit zext load - as there is no 8bit sign extended load! If BB#1 were to force the load to a "zext from i8" would this information be available in BB#2? BB#1: 0x268c1b0: i32 = Register %vreg1 [ID=3] 0x2689d80:
2013 Sep 11
2
[LLVMdev] removing unnecessary ZEXT
On Sep 10, 2013, at 8:59 AM, Robert Lytton <robert at xmos.com> wrote: > Hi, > > A bit more information. > I believe my problem lies with the fact that the load is left as 'anyext from i8'. > On the XCore target we know this will become an 8bit zext load - as there is no 8bit sign extended load! > If BB#1 were to force the load to a "zext from i8" would
2013 Sep 11
0
[LLVMdev] removing unnecessary ZEXT
Hi Andrew, Thank you for the suggestion. I've looked at CodeGenPrepare.cpp and MoveExtToFormExtLoad() is never run. I also notice that the ARM target produces the same additional register usage (copy) and zero extending (of the copy). (See the usage of r3 & r5 and also r12 & r4 in attached file arm-strcspn.s, my understanding is that 'ldrb' is zero extending.) Here is a
2013 Jan 25
2
[LLVMdev] TargetLowering vs. TargetTransform
Hi all, I'm looking for a place where to put the costs of vector (and scalar) cast operations for ARM, but I noticed the TargetTransform methods call the TargetLowering ones when unsure. Now, I'm not sure... Many casts on ARM are free, and I could build a list of cases where it is true, but should I put this on the lowering or the transform? My main motivation is to get the costs right
2013 Jan 25
0
[LLVMdev] TargetLowering vs. TargetTransform
Hi Renato, I think that we need to improve ::isTruncateFree, ::isZextFree, etc to include all of the free conversions. Vector and Scalar. Non-free conversions are marked with setOperationAction so the generic parts of TTI should be able to give a reasonable cost estimation. The cost tables should contain cases that are not handled by TTI. So, if we have a clever DAGCombine optimization (that
2014 Jan 13
2
[LLVMdev] test suite 'owner'
... and so (I infer from that) it should not be patched let alone need any changes. Assuming my inference is correct, any patching should only affect the XCore target and only if there is a good reason why the XCore requires the change. So, is #ifdef around all/most changes the correct way to submit a patch? Robert ________________________________ From: Eric Christopher [echristo at gmail.com]
2015 Sep 30
2
InstCombine wrongful (?) optimization on BinOp with SameOperands
Hi all, I have been looking at the way LLVM optimizes code before forwarding it to the backend I develop for my company and while building define i32 @test_extract_subreg_func(i32 %x, i32 %y) #0 { entry: %conv = zext i32 %x to i64 %conv1 = zext i32 %y to i64 %mul = mul nuw i64 %conv1, %conv %shr = lshr i64 %mul, 32 %xor = xor i64 %shr, %mul %conv2 = trunc i64 %xor to i32
2014 Jan 13
4
[LLVMdev] test suite 'owner'
Hi Eric, Could you explain the intent and policy regarding the test-suite body of code. Should the test be left as much as possible as-is (even if technically incorrect)? Should changes only affect the XCore target (#ifdef) or should all targets get the changes? Taking "int32_t main" as an example. The correct return type & argc for main is 'int'. In the XCore tool chain,
2014 Feb 26
3
[LLVMdev] test-suite wrongly using big-endian results
On 26 February 2014 14:44, Robert Lytton <robert at xmos.com> wrote: > This is related to a patch I submitted a little while ago (still pending): > http://llvm-reviews.chandlerc.com/D2760 > > If accepted, would it make this patch (and a others) unnecessary? Hi Robert, It is, but hijacking your patch a little, why not use __ORDER_LITTLE_ENDIAN__? Why do we need to create
2019 May 14
2
weakforced and GeoIP lookups
Hi Tobi, it should just work, but depends on the OS version. ./configure --help tells you all the configure options, including: --with-maxminddb-includedir path to maxminddb include directory [default=auto] --with-maxminddb-libdir path to maxminddb library directory [default=auto] Neil > On 14 May 2019, at 17:44, Tobi via dovecot <dovecot at dovecot.org>
2019 May 14
2
weakforced and GeoIP lookups
Hi Tobi, This looks like you haven't included the libmaxmind libraries before running configure. GeoIP support is only compiled in if it finds the right libs. This would be libmaxminddb-dev on Ubuntu for example. Neil >> Hi list >> >> hope it's okay to ask weakforced questions here as well, but I could not >> find a dedicated mailinglist for wforce. >>
2014 Jan 15
2
[LLVMdev] test suite 'owner'
thank you. I'll submit the patch without #ifdef in this case. Robert ________________________________ From: dblaikie at gmail.com [dblaikie at gmail.com] Sent: 14 January 2014 17:03 To: Robert Lytton; echristo at gmail.com; llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] test suite 'owner' On Mon Jan 13 2014 at 12:25:14 PM, Robert Lytton <robert at xmos.com<mailto:robert at
2012 Mar 28
2
[LLVMdev] Remove subreg copies
Hi, I'm facing a problem in my BE while trying to remove certain copies. Here is a code snippet which I would like to optimize %vreg1<def> = READF32r; vRRegs:%vreg1 %vreg2<def> = COPY %vreg1:rsub_h; iRSubRegs:%vreg2 vRRegs:%vreg1 %vreg3<def> = COPY %vreg1:rsub_l; iRSubRegs:%vreg3 vRRegs:%vreg1 This code produces subreg-to-subreg copies but I would like to have direct
2012 Oct 25
3
[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.
Hi Vincent, On 25/10/2012 18:14, Vincent Lejeune wrote: > When examining the debug output of regalloc, it seems that joining 32bits reg also joins 128 parent reg. > > If I look at the : > %vreg34<def> = COPY %vreg6:sel_y; R600_Reg32:%vreg34 R600_Reg128:%vreg6 > > instructions ; it gets joined to : > 928B%vreg34<def> = COPY %vreg48:sel_y; > > when vreg6 and
2017 Jun 28
3
Ok with mismatch between dead-markings in BUNDLE and bundled instructions?
Not sure if I could follow everything in this discussion regarding subregisters. But I think the problem posted by Mikael just happened to involve subregisters, and the discussion about subregisters is confusing when it comes to Mikael's original question/problem. I think that the bundle could look something like this just as well: BUNDLE %vreg1<def,dead> * %vreg1<def> =
2012 Jun 13
2
[LLVMdev] Assert in live update from MI scheduler.
On Jun 13, 2012, at 1:15 PM, Sergei Larin <slarin at codeaurora.org> wrote: > Andy, > > You are probably right here – look at this – before phi elimination this code looks much more sane: > > # *** IR Dump After Live Variable Analysis ***: > # Machine code for function push: SSA > Function Live Outs: %R0 > > BB#0: derived from LLVM BB %entry >
2013 Oct 08
2
[LLVMdev] Subregister liveness tracking
Currently it will always spill / restore the whole vreg but only spilling the parts that are actually live would be a nice addition in the future. Looking at r192119': if "mtlo" writes to $LO and sets $HI to an unpredictable value, then it should just have an additional (dead) def operand for $hi, shouldn't it? Greetings Matthias Am 10/8/13, 11:03 AM, schrieb Akira
2012 Oct 25
0
[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.
Thanks for your help. You're right, merging vreg32 and vreg48 is perfectly fine, sorry I missed that. I "brute force" debugged by adding a MachineFunction dump after each join, and I think I found the issue: it's when vreg32 and vreg10 are merged. vreg10 only appears in BB#3, and the join apparently only occurs in BB#3 even though vreg32 lives in all 4 machine blocks. After joining, there
2013 Oct 07
1
[LLVMdev] Subregister liveness tracking
I've been working on patches to improve subregister liveness tracking on llvm and I wanted to inform the llvm community about the overall design/motivation for them. I will send the patches to llvm-commits later today. Greetings Matthias Braun Subregisters in llvm ==================== Some targets can access registers in different ways resulting in wider or narrower accesses. For
2013 Jul 05
4
[LLVMdev] making a copy of a byval aggregate on the callee's frame
Hi Tim, Thought about it last night and was coming to the same conclusion. 1. it can't be done at the end during lowering (target backend). 2. it should be part of llvm as the byVal needs to be handled. As a twist, I have been told that llvm-gcc can lower byVal into memcpy in the callee. I may take a look at this. I wonder if it ever emits 'byVal'... I still feel I don't understand