thr3ads.net - similar to: "Optimizing memory allocation for custom allocators and non C code"

Displaying 20 results from an estimated 2000 matches similar to: "Optimizing memory allocation for custom allocators and non C code"

Can someone give me some pointer on alias analysis ?

2016 Jan 04

> On Jan 4, 2016, at 9:55 AM, Amaury SECHET via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> 2016-01-04 18:21 GMT+01:00 Philip Reames <listmail at philipreames.com>:
> On 01/04/2016 07:32 AM, Amaury SECHET wrote:
>> After a bit more investigation, it turns out that because %0 is stored into %1 (after

Can someone give me some pointer on alias analysis ?

2016 Jan 04

2015-12-26 18:32 GMT+01:00 Philip Reames <listmail at philipreames.com>:
> On 12/26/2015 02:17 AM, Amaury SECHET via llvm-dev wrote:
>
> I'm trying to fix that bug: https://llvm.org/bugs/show_bug.cgi?id=20049
>
> It turns out this is the kind of optimization that I really need, as when it isn't done, all kinds of other optimization opportunities down the road

Can someone give me some pointer on alias analysis ?

2016 Jan 04

On 01/04/2016 07:32 AM, Amaury SECHET wrote:
> After a bit more investigation, it turns out that because %0 is stored into %1 (after bitcast), %3 may have access to it and clobber it.

Can you give a bit more context? I'm not sure which of the examples you're talking about.

> After a bit of thought, it is correct in the general case, but definitely something

Optimizing diamond pattern in DAGCombine

2017 May 22

Explicitly re-adding a node to be processed doesn't work, because the processing order is canonical.

2017-05-22 11:39 GMT-07:00 Nirav Davé <niravd at google.com>:
> You can always explicitly add D to the worklist when you make the transformation with AddToWorklist. Presumably this was the cause for your infinite loop.
>
> -Nirav
>
> On Mon, May 22, 2017 at

Optimizing diamond pattern in DAGCombine

2017 May 22

The root problem is that, when A gets modified, D doesn't get added back to the worklist. I could match the pattern on A, but the problem remains: when D gets modified, A does not get added back to the worklist. I also considered doing several rounds of DAGCombine, but it is very easy to run into infinite loops, even with a fair amount of sanity checks.

2017-05-22 7:30 GMT-07:00 Nirav Davé

Can someone give me some pointer on alias analysis ?

2015 Dec 26

I'm trying to fix that bug: https://llvm.org/bugs/show_bug.cgi?id=20049

It turns out this is the kind of optimization that I really need, as when it isn't done, all kinds of other optimization opportunities down the road are not realized because they are not exposed. I have no idea where to start digging for this. I assume there is some kind of interaction between memory dependency and alias

Problem ScheduleDAG on PowerPC, X86 works fine.

2017 Feb 08

I don't think that'd work, because it leaves all other backends broken. AFAICT, your transform is simply not a legal transform, with the way the ADDC/ADDE opcodes are currently defined, and to do it you really need to fix the opcode definitions to not involve glue, first. I also note that your transform doesn't actually trigger at all on this particular test case on x86, because the

Problem ScheduleDAG on PowerPC, X86 works fine.

2017 Feb 09

I'd think i1 would be the proper and correct choice for a carry flag for the generic instruction. I expect that would also make UADDO/USUBO redundant with ADDC/SUBC (which would seem a good outcome). You'd need to make sure the right thing happened when converting from ADDC's 1-bit carry in/out to X86ISD::AD[DC]'s EFLAGS i/o. Right now the conversion can get away with assuming

Problem ScheduleDAG on PowerPC, X86 works fine.

2017 Feb 07

Would it not make sense to refactor the code so those don't use glue, rather than emitting them with glue and then getting rid of it? There are times when we would like to emit these in separate blocks but can't (presumably because of the glue).

On Tue, Feb 7, 2017 at 9:15 PM, James Y Knight via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> That seems really odd that

Problem ScheduleDAG on PowerPC, X86 works fine.

2017 Feb 07

Long story short: https://llvm.org/bugs/show_bug.cgi?id=31890

The backend fails to schedule a given DAG, the reason being that there is an instruction and its glue that need to be broken apart, as they can't be scheduled consecutively. See attached file for a picture of the DAG. Not sure what the best course of action is, and not sure why this isn't a problem for the X86 backend

rL296252 Made large integer operation codegen significantly worse.

2017 Feb 25

Hi, I'm working with a workload where the bottleneck is cryptographic signature checks. Or, in compiler terms, mostly large integer operations. Looking at rL296252, the state of affairs in that area degraded quite significantly; see test/CodeGen/X86/i256-add.ll for instance. Is there some kind of work in progress here, and is it expected to get better? Because if not, that's a big problem.
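For readers unfamiliar with why wide integer additions are sensitive to how carries are handled, here is a minimal sketch of a 256-bit addition split into four 64-bit limbs with an explicit 1-bit carry. It is plain integer arithmetic, not LLVM code, and it says nothing about what rL296252 changed. The names `MASK64`, `add_with_carry`, and `add_wide` are hypothetical, chosen only for illustration.

```python
MASK64 = (1 << 64) - 1  # hypothetical helper: mask for one 64-bit limb


def add_with_carry(a, b, carry_in):
    """Add two 64-bit limbs plus a 1-bit carry; return (sum, carry_out)."""
    total = a + b + carry_in
    return total & MASK64, total >> 64


def add_wide(x_limbs, y_limbs):
    """Add two equal-length little-endian limb lists; return (limbs, carry)."""
    carry = 0
    out = []
    for a, b in zip(x_limbs, y_limbs):
        # Each limb addition consumes the carry produced by the previous one,
        # so the limbs form a serial dependency chain.
        s, carry = add_with_carry(a, b, carry)
        out.append(s)
    return out, carry


# 256-bit example: (2**64 - 1) + 1 carries into the second limb.
limbs, carry = add_wide([MASK64, 0, 0, 0], [1, 0, 0, 0])
print(limbs, carry)
```

Because every limb depends on the carry out of the previous limb, code generation that breaks or lengthens this chain directly affects the cost of the whole operation.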

Optimizing diamond pattern in DAGCombine

2017 May 22

I'm trying to optimize a pattern that goes roughly as:

      A
     / \
    B   C
     \ /
      D

The problem is, when A gets modified, B and C get added back to the worklist, but D doesn't. Re-adding D to the worklist just creates an infinite loop where D is processed again and again. Is there a proper way to make this work?
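The worklist behaviour described in this post can be illustrated with a toy model. The sketch below is not LLVM's DAGCombiner. The node names A to D follow the post, while `users`, `requeued_after_change`, and the `transitive` switch are hypothetical names made up for illustration. It assumes that modifying a node re-queues only its direct users, which is what the post reports.

```python
from collections import deque

# Toy diamond from the post: A feeds B and C, which both feed D.
users = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}


def requeued_after_change(changed, transitive=False):
    """Return the nodes put back on the worklist after `changed` is modified.

    With transitive=False only direct users are re-queued (the behaviour
    described in the post). With transitive=True users of users are also
    re-queued, each node at most once, so the walk terminates.
    """
    queued = []
    seen = {changed}
    pending = deque(users[changed])
    while pending:
        node = pending.popleft()
        if node in seen:
            continue
        seen.add(node)
        queued.append(node)
        if transitive:
            pending.extend(users[node])
    return queued


direct = requeued_after_change("A")
everything = requeued_after_change("A", transitive=True)
print(direct)
print(everything)
```

In this model D is only revisited when users of users are followed. The `seen` set merely bounds this toy walk; it does not claim to address the infinite loop the post reports in the real combiner.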

Status of the official LLVM APT repositories

2016 Apr 13

On Wed, 13 Apr 2016 at 09:38 Amaury SECHET <deadalnix at gmail.com> wrote:
> I'd be happy to do it, but this is a bit too high level for me to be actionable. Can you explain to me what I should do to reintroduce them in the Debian packaging?

On the CMake side, I'm not sure. I think it's just a matter of using the "install()" functions to install them

Status of the official LLVM APT repositories

2016 May 02

On Sun, 1 May 2016 at 16:12 Amaury SECHET <deadalnix at gmail.com> wrote:
> Some update on this.
>
> 2016-04-12 18:48 GMT-07:00 Andrew Wilkins <axwalk at gmail.com>:
>> On Wed, 13 Apr 2016 at 09:38 Amaury SECHET <deadalnix at gmail.com> wrote:
>>> I'd be happy to do it, but this is a bit too high level for me to be

RFC: Remove inaccessiblememonly from 3.8 branch

2016 Feb 10

A while back, we introduced new attributes to model functions which read from and write to locations not otherwise visible in the IR. The motivation was to enable much more aggressive aliasing around such calls. Since that time, we've reverted the original patch and AFAIK we have no transformation or analysis passes in tree which use this information. Given this, I don't think it's

RFC: Add guard intrinsics to LLVM

2016 Feb 23

On Mon, Feb 22, 2016 at 9:40 PM, Andrew Trick <atrick at apple.com> wrote:
> I actually see fences as a proxy for potential inter-process communication and I/O. It's important that any opaque library call could contain a fence.

This makes perfect sense to me now, especially if you want to use @trap_on for safety checks. Without re-ordering restrictions, a failed @trap_on

[LLVMdev] LLVM commit 410f38e01597120b41e406ec1cea69127463f9e5

2014 Jul 05

Hi, I'm working on a target which has a variable size for CC (the same size as the arguments). As a result, getSetCCResultType returns a variable size. In this commit, at the line DAG.getSExtOrTrunc(SetCC, DL, SelectVT), on my target, you end up generating the node you are replacing, and so creating a loop in the DAG, which gives a whole new meaning to the A in the acronym. Subsequent code

Reloc::Default should trigger PIC on platforms where PIE is the default

2017 Jan 09

Pretty much everything is in the title. Is there a reason why this isn't done? With Debian and Ubuntu switching to PIE by default, it is becoming more and more of a hassle to get this working properly, and I'd rather see that fixed in LLVM than in each driver that does something with it. Has someone already looked into this? If so, what are the conclusions?

Early legalization pass ? Doing early legalization in an existing pass ?

2017 Jan 24

I may be wrong here, but legalizing early seems like something that is more likely to prevent optimizations than it is to encourage them. But I guess I don't follow why things like TTI, TII and TLI queries don't suffice for this. CodeGenPrepare will break this sequence up. I would imagine that if the target returns false for isCheapToSpeculateCtlz() and false for canInsertSelect(), the

Status of the official LLVM APT repositories

2016 Apr 13

On Wed, 13 Apr 2016 at 08:10 Amaury SECHET via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> I'd like to chime in here. These apt repositories used to contain packages named llvm-3.8-tools containing, amongst other things, the lit Python library used to test LLVM. It seems that it went away recently and I have Travis builds failing because of this.
>
> What is
