thr3ads.net - similar to: "[LLVMdev] Unaligned load/store for callee-saved 128-bit registers"

Displaying 20 results from an estimated 1000 matches similar to: "[LLVMdev] Unaligned load/store for callee-saved 128-bit registers"

[LLVMdev] Unaligned load/store for callee-saved 128-bit registers

2013 Nov 21

----- Original Message -----
> From: "Hal Finkel" <hfinkel at anl.gov>
> To: "Francois Pichet" <pichet2000 at gmail.com>
> Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Monday, November 18, 2013 2:45:53 PM
> Subject: Re: [LLVMdev] Unaligned load/store for callee-saved 128-bit registers
>
> ----- Original

[LLVMdev] Unaligned load/store for callee-saved 128-bit registers

2013 Nov 21

----- Original Message -----
> From: "Francois Pichet" <pichet2000 at gmail.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Chad Rosier" <mcrosier at codeaurora.org>, "Jakob Stoklund Olesen" <jolesen at apple.com>, "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Thursday, November

[LLVMdev] Unaligned load/store for callee-saved 128-bit registers

2013 Nov 18

----- Original Message -----
> From: "Francois Pichet" <pichet2000 at gmail.com>
> To: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Monday, November 18, 2013 2:26:30 PM
> Subject: [LLVMdev] Unaligned load/store for callee-saved 128-bit registers
>
> On my (out-of-tree) target I have 16 128-bit registers.
>

[LLVMdev] Unaligned load/store for callee-saved 128-bit registers

2013 Nov 21

BTW I managed to get around this problem by flagging all the 128-bit registers as caller-saved only. On my system, vector registers are more likely to be used in leaf functions anyway.

On Thu, Nov 21, 2013 at 3:24 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> ----- Original Message -----
> > From: "Hal Finkel" <hfinkel at anl.gov>
> > To: "Francois

[LLVMdev] Question about callee saved registers in x86

2014 May 27

Hi llvmdev, I'm trying to figure out how LLVM remembers the stack slots allotted to callee-saved registers on x86. In particular, LLVM pushes registers in decreasing order of FrameIdxs [1], so the offsets they get (as returned by MFI->getObjectOffset) don't directly correspond to their actual stack locations. In X86FrameLowering's emitCalleeSavedFrameMoves, when emitting DWARF

[LLVMdev] Question about callee saved registers in x86

2014 May 30

On 31.5.2014 2:04, Pasi Parviainen wrote:
> On 28.5.2014 2:57, Sanjoy Das wrote:
>> Hi llvmdev,
>>
>> I'm trying to figure how llvm remembers stack slots allotted to callee
>> saved registers on x86. In particular, llvm pushes registers in
>> decreasing order of FrameIdxs [1], so the offsets they get (as
>> returned by MFI->getObjectOffset) don't

[LLVMdev] Disable vectorization for unaligned data

2013 Jul 19

What is the proper solution to disable auto-vectorization for unaligned data? I have an out-of-tree target and I added this:

    bool OpusTargetLowering::allowsUnalignedMemoryAccesses(EVT VT, bool *Fast) const {
      if (VT.isVector())
        return false;
      ....
    }

After that, I could see that vectorization is still done on unaligned data, except that llvm will copy the data back and forth from the source

[LLVMdev] Disable vectorization for unaligned data

2013 Jul 21

OK, any quick workaround to limit vectorization to 16-byte aligned 128-bit data then? All the memory copying done by ExpandUnalignedStore/ExpandUnalignedLoad is just too expensive.

On Sat, Jul 20, 2013 at 12:52 PM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
>
> On Jul 19, 2013, at 3:14 PM, Francois Pichet <pichet2000 at gmail.com> wrote:
> >
> >

[LLVMdev] Improving the quality of debug locations / DbgValueHistoryCalculator

2016 May 11

The most obvious place where it is lacking at the moment is that it only supports DBG_VALUEs in registers. Adding support for constant values, memory locations, and fp constants would be a big win!

thanks,
Adrian

> On May 11, 2016, at 2:52 PM, Francois Pichet <pichet2000 at gmail.com> wrote:
>
> In retrospect I totally agree with you. I am looking at LiveDebugValue again to see

[LLVMdev] Disable vectorization for unaligned data

2013 Jul 20

On Jul 19, 2013, at 3:14 PM, Francois Pichet <pichet2000 at gmail.com> wrote:
>
> What is the proper solution to disable auto-vectorization for unaligned data?
>
> I have an out of tree target and I added this:
>
> bool OpusTargetLowering::allowsUnalignedMemoryAccesses(EVT VT, bool *Fast) const {
>   if (VT.isVector())
>     return false;
>   ....
> }
>

[LLVMdev] Disable vectorization for unaligned data

2013 Jul 19

On Fri, Jul 19, 2013 at 1:14 PM, Francois Pichet <pichet2000 at gmail.com> wrote:
>
> What is the proper solution to disable auto-vectorization for unaligned
> data?

Why are you trying to do this? If auto-vectorization is making a given loop slower on your target, that means the cost metrics are off, and we should fix them. If code size is an issue, you should tell the optimizer

[LLVMdev] Improving the quality of debug locations / DbgValueHistoryCalculator

2016 May 12

> On May 12, 2016, at 11:00 AM, Francois Pichet <pichet2000 at gmail.com> wrote:
>
> Here is a specific case that degrades the debugging experience on my target.
> This is a simplified CFG of a loop:
>
> BB#0:
>   %R5<def> = OR_rr %R0, %R49 // this is %R5's only def.
>   DBG_VALUE %R5, %noreg, !"argc", <!18>; line no:4
>   Successors

[LLVMdev] Improving the quality of debug locations / DbgValueHistoryCalculator

2016 May 11

> On May 11, 2016, at 2:09 PM, Francois Pichet <pichet2000 at gmail.com> wrote:
>
> Good point.
>
> Currently, yes, a DEBUG_VALUE "x", vreg0 will be added in BB2. Now I realize this might be wrong in some (corner?) cases where vreg0 no longer refers to "x"
>
> My fix would be to propagate the DEBUG_VALUE only if "x" is associated with only a

[LLVMdev] 64-bit add using 2 32-bit operations, guarantee of stuck together?

2013 Apr 15

I really have to force them to stay together; otherwise the carry will just not work. How about wrapping the 2 instructions in a bundle? Would that be a way? http://llvm.org/docs/CodeGenerator.html#machineinstr-bundles

On Mon, Apr 15, 2013 at 5:24 PM, Quentin Colombet <qcolombet at apple.com> wrote:
> Hi Francois,
>
> If you model the effect of your carry on the instructions, the

[LLVMdev] 64-bit add using 2 32-bit operations, guarantee of stuck together?

2013 Apr 15

Using bundles here looks like a fragile way to handle that, IMHO. Really, using a pseudo instruction seems the best approach for you. For instance, you can match your add64 during isel with your pseudo instruction and expand it just before emitting the assembly file (add a pass using the hook: addPreEmitPass on your target).

-Quentin

On Apr 15, 2013, at 2:37 PM, Francois Pichet <pichet2000

[LLVMdev] Improving the quality of debug locations / DbgValueHistoryCalculator

2016 May 11

> On May 11, 2016, at 1:12 PM, Francois Pichet <pichet2000 at gmail.com> wrote:
>
> Hello,
>
> Regarding the problem of debug range for optimized code.
> Currently a DEBUG_VALUE will be inserted after the <def>vregX
> DEBUG_VALUE are only valid until the end of the current MachineBasicBlock. That's the main problem.
> Why not simply iterate over all uses

[LLVMdev] 64-bit add using 2 32-bit operations, guarantee of stuck together?

2013 Apr 15

Hi, Let's say we have a 32-bit architecture where 64-bit additions are done using 2 operations. Instructions are defined as follows in TableGen:

    defm ADD64 : ALU32<"add", 1, 1, addc>;
    defm ADD64C : ALU32<"addrc", 1, 2, adde>;

Let's assume that the carry bit is implicit and that the 2 operations must *always* be stuck together for the 64-bit add to
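As a rough illustration of why the two operations depend on each other, here is a minimal C++ sketch of a 64-bit add built from two 32-bit adds. It is not code from the thread: the names add_lo, add_hi and add64_via_halves are hypothetical, and the sketch assumes the first instruction produces a carry-out (like the addc node) and the second consumes it (like the adde node). In the sketch the carry is an explicit value; on the target described in the post it is implicit state, so anything scheduled between the two instructions that touches the carry would corrupt the high half.

```cpp
#include <cstdint>

// Result of the low-half add: the 32-bit sum plus the carry-out flag.
struct Add32Result {
    std::uint32_t value;
    bool carry;
};

// Hypothetical model of the first instruction ("add"): adds the low halves
// and produces a carry-out.
inline Add32Result add_lo(std::uint32_t a, std::uint32_t b) {
    std::uint32_t sum = a + b;   // unsigned arithmetic wraps modulo 2^32
    return {sum, sum < a};       // the sum wrapped iff it is smaller than an operand
}

// Hypothetical model of the second instruction ("addrc"): adds the high
// halves plus the carry produced by the low-half add.
inline std::uint32_t add_hi(std::uint32_t a, std::uint32_t b, bool carry_in) {
    return a + b + (carry_in ? 1u : 0u);
}

// 64-bit add composed of the two 32-bit steps. The carry from add_lo must
// reach add_hi unmodified, which is why the pair has to stay together.
inline std::uint64_t add64_via_halves(std::uint64_t x, std::uint64_t y) {
    Add32Result lo = add_lo(static_cast<std::uint32_t>(x),
                            static_cast<std::uint32_t>(y));
    std::uint32_t hi = add_hi(static_cast<std::uint32_t>(x >> 32),
                              static_cast<std::uint32_t>(y >> 32),
                              lo.carry);
    return (static_cast<std::uint64_t>(hi) << 32) | lo.value;
}
```

For example, adding 1 to 0x00000000FFFFFFFF wraps the low half to 0 and sets the carry, and the high-half add then turns that carry into 0x0000000100000000.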

[LLVMdev] Question about shouldMergeGEPs in InstructionCombining

2015 Feb 24

On Mon, Feb 23, 2015 at 2:17 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> ----- Original Message -----
> > From: "Francois Pichet" <pichet2000 at gmail.com>
> > To: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> > Sent: Sunday, February 22, 2015 5:34:11 PM
> > Subject: [LLVMdev] Question about shouldMergeGEPs in

[LLVMdev] Value of structure passed byval to a recurse function not initialized when accessed through GDB

2012 Dec 06

Hi David, I think it might not be exactly PR13303 that is causing the corruption of the struct when accessed through GDB. This seems to be an ABI problem in clang. The problem seems to be that when a struct is passed by value (having indirect arguments), the stack is not aligned properly. I tried realigning the stack for indirect arguments in (TargetInfo.cpp) - ABIArgInfo

[LLVMdev] Question about shouldMergeGEPs in InstructionCombining

2015 Mar 12

I think it would make sense for (1) and (2). I am not sure if (3) is feasible in instcombine. (I am not too familiar with LoopInfo.) For Octasic's Opus platform, I modified shouldMergeGEPs in our fork to:

    if (GEP.hasAllZeroIndices() && !Src.hasAllZeroIndices() &&
        !Src.hasOneUse())
      return false;
    return Src.hasAllConstantIndices(); // was return false;
