thr3ads.net - similar to: "Redundant copies"

Displaying 20 results from an estimated 1000 matches similar to: "Redundant copies"

2020 Mar 16

Redundant copies

Hi Sjoerd, I'm already using RDA in the pass I mentioned and it works great. Thanks Sam! Regarding the root cause, I didn't see anything obviously suboptimal in either the copy coalescing or the register allocation, at least in my previous example. Alternatively we might want to improve what we pass onto RA: i.e. remove the redundant copy earlier. At this point however it doesn't

2020 Mar 16

Redundant copies

Yep, exactly that. We see quite a lot of them, most of them get cleaned up, but not always... Cheers. ________________________________ From: Roger Ferrer Ibáñez <rofirrim at gmail.com> Sent: 16 March 2020 08:53 To: Sjoerd Meijer <Sjoerd.Meijer at arm.com> Cc: LLVM-Dev <llvm-dev at lists.llvm.org>; Sam Parker <Sam.Parker at arm.com> Subject: Re: [llvm-dev] Redundant copies

2020 Nov 06

Loop-vectorizer prototype for the EPI Project based on the RISC-V Vector Extension (Scalable vectors)

On 11/6/20 12:39 PM, Sjoerd Meijer wrote: Hello Simon, Thanks for your replies, very useful. And yes, thanks for the example and making the target differences clear: ; Some examples: ; RISC-V V & VE(*): ; %mask = (splat i1 1) ; %evl = min(256, %n - %i) ; MVE/SVE : ; %mask = get.active.lane.mask(%i, %n) ; %evl = call @llvm.vscale() ; AVX: ; %mask = icmp (%i + (seq

2020 Nov 06

Loop-vectorizer prototype for the EPI Project based on the RISC-V Vector Extension (Scalable vectors)

On 11/6/20 8:49 AM, Roger Ferrer Ibáñez wrote: Hi Sjoerd, Trying to remember how everything fits together here, but could get.active.lane.mask not create the %mask of the VP intrinsics? Or in other words, in the vectoriser, who's producing the %mask and %evl that is consumed by the VP intrinsics? I'm not sure what would be the best way here. I think about the Loop Vectorizer. I imagine

2020 May 19

LV: predication

Hi Simon, Thanks for reposting the example, and looking at it more carefully, I think it is very similar to my first proposal. This was met with some resistance here because it dumps loop information in the vector preheader. Doing it this early (we want to emit this in the vectoriser) puts a restriction on (future) optimisations that transform vector loops to honour/update/support this intrinsic

2020 May 18

LV: predication

> You have similar problems with https://reviews.llvm.org/D79100 The new revision D79100<https://reviews.llvm.org/D79100> solves your comment 1), and I don't think your comments 2) and 3) apply as there are no vendor specific intrinsics involved at all here. Just to quickly discuss the optimisation pipeline, D79100<https://reviews.llvm.org/D79100> is a small extension for the

2020 May 19

LV: predication

Invitation accepted, I am happy to help out with reviews, like I did with the previous VP patches. And of course agreed that things should be well defined, and that we shouldn't paint ourselves in a corner, but I don't think that this is the case. And it's not that I am in a rush, but I don't think this change needs to be predicated on a big change landing first like the LV

2020 May 18

LV: predication

Hi, I abandoned that approach and followed Eli's suggestion, see somewhere earlier in this thread, and emit an intrinsic that represents/calculates the active mask. I've just uploaded a new revision for D79100 that implements this. Cheers. ________________________________ From: Simon Moll <Simon.Moll at EMEA.NEC.COM> Sent: 18 May 2020 13:32 To: Sjoerd Meijer <Sjoerd.Meijer at

2020 May 04

LV: predication

Hi Roger, That's a good example, that shows most of the moving parts involved here. In a nutshell, the difference, and what we would like to make explicit, is the vector trip versus the scalar loop trip count. In your IR example, the loads/stores are predicated on a mask that is calculated from a splat induction variable, which is compared with the vector trip count. Illustrated with your

2020 Nov 05

Loop-vectorizer prototype for the EPI Project based on the RISC-V Vector Extension (Scalable vectors)

For RISC-V V and VE being explicit about %evl is important for performance & correctness and that is what VP does. The get.active.lane.mask intrinsic is used as a hint for the MVE, SVE backends to use hardware tail-predication (the backends reverse engineer that hint by pattern matching for get.active.lane.mask in the mask parameter of "some" masked intrinsics). IMHO, it's more

2018 Aug 28

(no subject)

Dear Alex, all, I was looking for fcvt.d.{w,l}{,u} in RISCVInstrInfoD and I'm not sure I understand the current definitions: def FCVT_D_W : FPUnaryOp_r<0b1101001, 0b000, FPR64, GPR, "fcvt.d.w"> { let rs2 = 0b00000; } def FCVT_D_WU : FPUnaryOp_r<0b1101001, 0b000, FPR64, GPR, "fcvt.d.wu"> { let rs2 =

2018 Jul 10

[RISCV][PIC] Lowering pseudo instructions in MCCodeEmitter vs AsmPrinter

Hi all, I'm looking at generating PIC code for RISC-V in the context of Linux. Not sure if anyone is working on this already, any inputs are very welcome. I'm now looking at function calls which in the RISCV backend are represented via two pseudoinstructions RISCV::TAIL and RISCV::CALL. Currently those pseudos are lowered in MCCodeEmitter. They are expanded into AUIPC and JALR

"Earlyclobber" but for a subset of the inputs

2020 May 05

"Earlyclobber" but for a subset of the inputs

Hi Quentin, > It sounds like you only need the earlyclobber description for the N, N > variant. > In other words, as long as you use different opcodes for widen-op NN and > widen-op WN, you model exactly what you want. > > What am I missing? > we are using different opcodes for widen-op NN and widen-op WN. My understanding is that not setting earlyclobber to the W, N

"Earlyclobber" but for a subset of the inputs

2020 May 04

"Earlyclobber" but for a subset of the inputs

Hi all, I'm working on a target whose registers have equal-sized subregisters and all of those subregisters can be named (or the other way round: registers can be grouped into super registers). So for instance we've got 16 registers W (as in wide) W0..W15 and 32 registers N (as in narrow) N0..N31. This way, W0 is made by grouping N0 and N1, W1 is N2 and N3, W2 is N4 and N5, ..., W15 is

2020 May 04

LV: predication

> The harm comes if the intrinsic ends up with the wrong value, or attached to the wrong loop. The intrinsic is marked as IntrNoDuplicate, so I wasn't worried about it ending up somewhere else. Also, it is a property of a specific loop, a tail-folded vector loop, that holds even after it is transformed I think. I.e. unrolling a vector loop is probably not what you want, but even if you do

2018 Jun 11

Question about the status of the IR extensions for OpenMP

[Apologies if you received this email twice, the first time I sent it from the wrong email account] Hi all, some time ago Intel proposed a set of minimal IR extensions to improve the support of OpenMP in LLVM [1][2]. I wonder if there has been any progress on this and if it is going to be upstreamed. Also the previous proposal[2] and communications to the llvm-dev[3] mention the following

2020 Apr 07

Questions about vscale

Hi, Looking at the language reference, vscale is an integer. This might pose a problem for fractional vscale. Furthermore, I believe that vscale is constant throughout the life of the program; so if RISC-V vscale can vary from instruction to instruction that may also be problematic unless you can just commit to one specific value of vscale. Also, I had a question about your table. Based

2010 Jul 05

data.frame: adding a column that is based on ranges of values in another column

Dear List, I've been looking tirelessly for a solution to this dilemma but without success. Perhaps someone has an idea that will guide me in the right direction. Suppose I have the following data.frame: DF = data.frame(X = c(114.5508, 114.6468, 114.6596, 114.6957, 114.6828, 114.8903, 114.9519, 114.8842, 114.8579, 114.8489), Y = c(47.14094, 46.98874, 46.91235, 46.88265, 46.80584, 46.67022,

2013 Jan 01

Behavior or as.environment in function arguments/call (and force() behaviors...)

Happy 2013! Can someone with more knowledge of edge case scoping/eval rules explain what is happening below? Happens in all the versions of R I have on hand. Behavior itself is confusing, but ?as.environment also provides no clue. The term used in that doc is 'search list', which is ambiguous, but the see also section mentions search(), so I would *think* that is what is intended.

2006 Jun 22

weights in lm, glm (PR#9023)

Full_Name: James Signorovitch Version: 2.2.1 OS: WinXP Submission from: (NULL) (134.174.182.203) In the code below, fn1() and fn2() fail with the messages given in the comments. Strangely, fn2() fails for all data sets I've tried except for those with 100 rows. The same errors occur if glm() is used in place of lm(), or if R 2.1.1 is used on a unix system. Thanks for looking into this.
