similar to: [LLVMdev] Making Sense of ISel DAG Output

Displaying 20 results from an estimated 200 matches similar to: "[LLVMdev] Making Sense of ISel DAG Output"

2008 Oct 02
0
[LLVMdev] Making Sense of ISel DAG Output
On Thursday 02 October 2008 11:37, David Greene wrote: > I'll try ot write a small example and send it in a bit. Ok, here's what I'm trying to do: let AddedComplexity = 40 in { def : Pat<(v2f64 (vector_shuffle (v2f64 (scalar_to_vector (loadf64 addr: $src1))), (v2f64 (scalar_to_vector (loadf64 addr: $src2))),
2008 Oct 02
6
[LLVMdev] Making Sense of ISel DAG Output
On Thursday 02 October 2008 12:42, David Greene wrote: > But let's say you _could_ write such a pattern (because I can). The input > DAG looks like this: > > 0x391a220: <multiple use> > 0x391c970: v2f64 = scalar_to_vector 0x391a220 srcLineNum= 10 > 0x391ac10: <multiple use> > 0x391c8b0: v2f64 = scalar_to_vector
2008 Oct 02
0
[LLVMdev] Making Sense of ISel DAG Output
On Oct 2, 2008, at 9:37 AM, David Greene wrote: > I'm debugging some X86 patterns and I want to understand the debug > dumps from > isel better. > > Here's some example output: > > 0x391bc40: i64,ch = load 0x3922c50, 0x391b8d0, 0x38dc530 > <0x39053e0:0> <sext > i32> alignment=4 srcLineNum= 10 > 0x3922c50: <multiple use> >
2008 Oct 03
0
[LLVMdev] Making Sense of ISel DAG Output
On Oct 2, 2008, at 2:19 PM, David Greene wrote: > On Thursday 02 October 2008 12:42, David Greene wrote: > >> But let's say you _could_ write such a pattern (because I can). >> The input >> DAG looks like this: >> >> 0x391a220: <multiple use> >> 0x391c970: v2f64 = scalar_to_vector 0x391a220 srcLineNum= 10 >>
2009 Jan 30
2
[LLVMdev] undefs in phis
On Jan 30, 2009, at 1:52 PM, David Greene wrote: > On Friday 30 January 2009 15:10, David Greene wrote: > >> This still looks correct. The coalescer then says: >> >> 4360 %reg1177<def> = FsMOVAPSrr %reg1176<kill> ; srcLine 0 >> Inspecting %reg1176,0 = [2702,4362:0) 0 at 2702-(4362) and >> %reg1177,0 = >>
2009 Jan 30
0
[LLVMdev] undefs in phis
On Jan 29, 2009, at 5:29 PM, David Greene wrote: > On Thursday 29 January 2009 18:04, Eli Friedman wrote: >> On Thu, Jan 29, 2009 at 2:47 PM, David Greene <dag at cray.com> wrote: >>> After phi elimination we have: >>> >>> bb134: >>> %reg1645 = 1.0 >>> >>> bb74: >>> %reg1176 = MOVAPS %reg1645 >>> %reg1177 =
2009 Jan 30
2
[LLVMdev] undefs in phis
On Thursday 29 January 2009 18:04, Eli Friedman wrote: > On Thu, Jan 29, 2009 at 2:47 PM, David Greene <dag at cray.com> wrote: > > After phi elimination we have: > > > > bb134: > > %reg1645 = 1.0 > > > > bb74: > > %reg1176 = MOVAPS %reg1645 > > %reg1177 = MOVAPS %reg1646 > > [...] > > > > bb108: > > %reg1645 =
2009 Feb 02
0
[LLVMdev] undefs in phis
On Friday 30 January 2009 16:54, Evan Cheng wrote: > I don't have the whole context to understand why you think this is a > bug. An implicit_def doesn't actually define any value. So we don't > care if a live interval overlaps live ranges defined by an implicit_def. It's a bug because the coalerscer does illegal coaescing. Our last episode left us here: bb134: 2696
2008 Apr 04
2
[LLVMdev] InstCombine Question
On Friday 04 April 2008 13:07, Chris Lattner wrote: > > So how does the undef store to null appear in the IR when it isn't > > attached anywhere and how can I get rid of it? > > Don't do undefined behavior? :) I don't think it's undefined behavior. Right before instcombine, we have this: %r60 = load <2 x i64>* %"$LCS_1", align 16 ; <<2
2008 Apr 04
0
[LLVMdev] InstCombine Question
On Fri, 4 Apr 2008, David Greene wrote: > I am confused by this bit of code in instcombine: > > 09789 if (GetElementPtrInst *GEPI = dyn_cast<GetElementPtrInst>(Op)) { > 09790 const Value *GEPI0 = GEPI->getOperand(0); > 09791 // TODO: Consider a target hook for valid address spaces for this > xform. > 09792 if (isa<ConstantPointerNull>(GEPI0)
2008 Apr 04
2
[LLVMdev] InstCombine Question
I am confused by this bit of code in instcombine: 09789 if (GetElementPtrInst *GEPI = dyn_cast<GetElementPtrInst>(Op)) { 09790 const Value *GEPI0 = GEPI->getOperand(0); 09791 // TODO: Consider a target hook for valid address spaces for this xform. 09792 if (isa<ConstantPointerNull>(GEPI0) && 09793
2009 Jan 30
2
[LLVMdev] undefs in phis
On Friday 30 January 2009 01:41, Evan Cheng wrote: > >> I find it a little strange that the IMPLICIT_DEF disappears. Besides > >> that, it looks okay up to here. > > > > I just verified that it does disappear. > > It's intentional. We don't want a live interval defined by an > implicit_def. It unnecessarily increases register pressure. Ah, I see.
2008 Oct 07
2
[LLVMdev] Making Sense of ISel DAG Output
On Friday 03 October 2008 12:06, Dan Gohman wrote: > On Fri, October 3, 2008 9:10 am, David Greene wrote: > > On Thursday 02 October 2008 19:32, Dan Gohman wrote: > >> Looking at your dump() output above, it looks like the pre-selection > >> loads have multiple uses, so even though you've managed to match a > >> larger pattern that incorporates them, they
2008 Oct 03
0
[LLVMdev] Making Sense of ISel DAG Output
On Fri, October 3, 2008 9:10 am, David Greene wrote: > On Thursday 02 October 2008 19:32, Dan Gohman wrote: > >> Looking at your dump() output above, it looks like the pre-selection >> loads have multiple uses, so even though you've managed to match a >> larger pattern that incorporates them, they still need to exist to >> satisfy some other users. > > Yes,
2008 Oct 03
3
[LLVMdev] Making Sense of ISel DAG Output
On Thursday 02 October 2008 19:32, Dan Gohman wrote: > Looking at your dump() output above, it looks like the pre-selection > loads have multiple uses, so even though you've managed to match a > larger pattern that incorporates them, they still need to exist to > satisfy some other users. Yes, I looked at that too. It looks like these other uses end up being chains to
2009 Nov 10
4
[LLVMdev] speed up memcpy intrinsic using ARM Neon registers
I tried to speed up Dhrystone on ARM Cortex-A8 by optimizing the memcpy intrinsic. I used the Neon load multiple instruction to move up to 48 bytes at a time . Over 15 scalar instructions collapsed down into these 2 Neon instructions. fldmiad r3, {d0, d1, d2, d3, d4, d5} @ SrcLine dhrystone.c 359 fstmiad r1, {d0, d1, d2, d3, d4, d5} It seems like this should be faster. But I did
2011 Jun 20
1
Quote the path of graphics output in Sweave when it contains spaces
Hi, I'm aware of the definition of a "valid filename" in .SweaveValidFilenameRegexp, but I think it might be better to quote the filename when it contains spaces instead of just giving a warning. This should bring us safer LaTeX code (although I never use spaces in paths). Here is the simple patch: Index: src/library/utils/R/SweaveDrivers.R
2009 Jan 30
0
[LLVMdev] undefs in phis
On Friday 30 January 2009 15:10, David Greene wrote: > This still looks correct. The coalescer then says: > > 4360 %reg1177<def> = FsMOVAPSrr %reg1176<kill> ; srcLine 0 > Inspecting %reg1176,0 = [2702,4362:0) 0 at 2702-(4362) and %reg1177,0 = > [2700,3712:0)[3768,3878:0)[4362,4372:0) 0 at 4362-(3878): > Joined. Result = %reg1177,0 = [2700,4372:0) 0 at
2009 Nov 10
0
[LLVMdev] speed up memcpy intrinsic using ARM Neon registers
On Nov 9, 2009, at 7:34 PM, Neel Nagar wrote: > I tried to speed up Dhrystone on ARM Cortex-A8 by optimizing the > memcpy intrinsic. I used the Neon load multiple instruction to move up > to 48 bytes at a time . Over 15 scalar instructions collapsed down > into these 2 Neon instructions. > > fldmiad r3, {d0, d1, d2, d3, d4, d5} @ SrcLine dhrystone.c 359 > fstmiad
2009 Nov 10
3
[LLVMdev] speed up memcpy intrinsic using ARM Neon registers
On Nov 9, 2009, at 5:59 PM, David Conrad wrote: > On Nov 9, 2009, at 7:34 PM, Neel Nagar wrote: > >> I tried to speed up Dhrystone on ARM Cortex-A8 by optimizing the >> memcpy intrinsic. I used the Neon load multiple instruction to move >> up >> to 48 bytes at a time . Over 15 scalar instructions collapsed down >> into these 2 Neon instructions. Nice. Thanks