thr3ads.net - similar to: "[LLVMdev] Making Sense of ISel DAG Output"

Displaying 20 results from an estimated 200 matches similar to: "[LLVMdev] Making Sense of ISel DAG Output"

[LLVMdev] Making Sense of ISel DAG Output

2008 Oct 02

On Thursday 02 October 2008 11:37, David Greene wrote: > I'll try to write a small example and send it in a bit. Ok, here's what I'm trying to do: let AddedComplexity = 40 in { def : Pat<(v2f64 (vector_shuffle (v2f64 (scalar_to_vector (loadf64 addr:$src1))), (v2f64 (scalar_to_vector (loadf64 addr:$src2))),

[LLVMdev] Making Sense of ISel DAG Output

2008 Oct 02

On Thursday 02 October 2008 12:42, David Greene wrote: > But let's say you _could_ write such a pattern (because I can). The input > DAG looks like this: > > 0x391a220: <multiple use> > 0x391c970: v2f64 = scalar_to_vector 0x391a220 srcLineNum= 10 > 0x391ac10: <multiple use> > 0x391c8b0: v2f64 = scalar_to_vector

[LLVMdev] Making Sense of ISel DAG Output

2008 Oct 02

On Oct 2, 2008, at 9:37 AM, David Greene wrote: > I'm debugging some X86 patterns and I want to understand the debug > dumps from > isel better. > > Here's some example output: > > 0x391bc40: i64,ch = load 0x3922c50, 0x391b8d0, 0x38dc530 > <0x39053e0:0> <sext > i32> alignment=4 srcLineNum= 10 > 0x3922c50: <multiple use> >

[LLVMdev] Making Sense of ISel DAG Output

2008 Oct 03

On Oct 2, 2008, at 2:19 PM, David Greene wrote: > On Thursday 02 October 2008 12:42, David Greene wrote: > >> But let's say you _could_ write such a pattern (because I can). >> The input >> DAG looks like this: >> >> 0x391a220: <multiple use> >> 0x391c970: v2f64 = scalar_to_vector 0x391a220 srcLineNum= 10 >>

[LLVMdev] undefs in phis

2009 Jan 30

On Jan 30, 2009, at 1:52 PM, David Greene wrote: > On Friday 30 January 2009 15:10, David Greene wrote: > >> This still looks correct. The coalescer then says: >> >> 4360 %reg1177<def> = FsMOVAPSrr %reg1176<kill> ; srcLine 0 >> Inspecting %reg1176,0 = [2702,4362:0) 0 at 2702-(4362) and >> %reg1177,0 = >>

[LLVMdev] undefs in phis

2009 Jan 30

On Jan 29, 2009, at 5:29 PM, David Greene wrote: > On Thursday 29 January 2009 18:04, Eli Friedman wrote: >> On Thu, Jan 29, 2009 at 2:47 PM, David Greene <dag at cray.com> wrote: >>> After phi elimination we have: >>> >>> bb134: >>> %reg1645 = 1.0 >>> >>> bb74: >>> %reg1176 = MOVAPS %reg1645 >>> %reg1177 =

[LLVMdev] undefs in phis

2009 Jan 30

On Thursday 29 January 2009 18:04, Eli Friedman wrote: > On Thu, Jan 29, 2009 at 2:47 PM, David Greene <dag at cray.com> wrote: > > After phi elimination we have: > > > > bb134: > > %reg1645 = 1.0 > > > > bb74: > > %reg1176 = MOVAPS %reg1645 > > %reg1177 = MOVAPS %reg1646 > > [...] > > > > bb108: > > %reg1645 =

[LLVMdev] undefs in phis

2009 Feb 02

On Friday 30 January 2009 16:54, Evan Cheng wrote: > I don't have the whole context to understand why you think this is a > bug. An implicit_def doesn't actually define any value. So we don't > care if a live interval overlaps live ranges defined by an implicit_def. It's a bug because the coalescer does illegal coalescing. Our last episode left us here: bb134: 2696

[LLVMdev] InstCombine Question

2008 Apr 04

On Friday 04 April 2008 13:07, Chris Lattner wrote: > > So how does the undef store to null appear in the IR when it isn't > > attached anywhere and how can I get rid of it? > > Don't do undefined behavior? :) I don't think it's undefined behavior. Right before instcombine, we have this: %r60 = load <2 x i64>* %"$LCS_1", align 16 ; <<2

[LLVMdev] InstCombine Question

2008 Apr 04

On Fri, 4 Apr 2008, David Greene wrote: > I am confused by this bit of code in instcombine: > > if (GetElementPtrInst *GEPI = dyn_cast<GetElementPtrInst>(Op)) { > const Value *GEPI0 = GEPI->getOperand(0); > // TODO: Consider a target hook for valid address spaces for this > xform. > if (isa<ConstantPointerNull>(GEPI0)

[LLVMdev] InstCombine Question

2008 Apr 04

I am confused by this bit of code in instcombine: if (GetElementPtrInst *GEPI = dyn_cast<GetElementPtrInst>(Op)) { const Value *GEPI0 = GEPI->getOperand(0); // TODO: Consider a target hook for valid address spaces for this xform. if (isa<ConstantPointerNull>(GEPI0) &&

[LLVMdev] undefs in phis

2009 Jan 30

On Friday 30 January 2009 01:41, Evan Cheng wrote: > >> I find it a little strange that the IMPLICIT_DEF disappears. Besides > >> that, it looks okay up to here. > > > > I just verified that it does disappear. > > It's intentional. We don't want a live interval defined by an > implicit_def. It unnecessarily increases register pressure. Ah, I see.

[LLVMdev] Making Sense of ISel DAG Output

2008 Oct 07

On Friday 03 October 2008 12:06, Dan Gohman wrote: > On Fri, October 3, 2008 9:10 am, David Greene wrote: > > On Thursday 02 October 2008 19:32, Dan Gohman wrote: > >> Looking at your dump() output above, it looks like the pre-selection > >> loads have multiple uses, so even though you've managed to match a > >> larger pattern that incorporates them, they

[LLVMdev] Making Sense of ISel DAG Output

2008 Oct 03

On Fri, October 3, 2008 9:10 am, David Greene wrote: > On Thursday 02 October 2008 19:32, Dan Gohman wrote: > >> Looking at your dump() output above, it looks like the pre-selection >> loads have multiple uses, so even though you've managed to match a >> larger pattern that incorporates them, they still need to exist to >> satisfy some other users. > > Yes,

[LLVMdev] Making Sense of ISel DAG Output

2008 Oct 03

On Thursday 02 October 2008 19:32, Dan Gohman wrote: > Looking at your dump() output above, it looks like the pre-selection > loads have multiple uses, so even though you've managed to match a > larger pattern that incorporates them, they still need to exist to > satisfy some other users. Yes, I looked at that too. It looks like these other uses end up being chains to

[LLVMdev] speed up memcpy intrinsic using ARM Neon registers

2009 Nov 10

I tried to speed up Dhrystone on ARM Cortex-A8 by optimizing the memcpy intrinsic. I used the Neon load multiple instruction to move up to 48 bytes at a time. Over 15 scalar instructions collapsed down into these 2 Neon instructions. fldmiad r3, {d0, d1, d2, d3, d4, d5} @ SrcLine dhrystone.c 359 fstmiad r1, {d0, d1, d2, d3, d4, d5} It seems like this should be faster. But I did

Quote the path of graphics output in Sweave when it contains spaces

2011 Jun 20

Hi, I'm aware of the definition of a "valid filename" in .SweaveValidFilenameRegexp, but I think it might be better to quote the filename when it contains spaces instead of just giving a warning. This should bring us safer LaTeX code (although I never use spaces in paths). Here is the simple patch: Index: src/library/utils/R/SweaveDrivers.R

[LLVMdev] undefs in phis

2009 Jan 30

On Friday 30 January 2009 15:10, David Greene wrote: > This still looks correct. The coalescer then says: > > 4360 %reg1177<def> = FsMOVAPSrr %reg1176<kill> ; srcLine 0 > Inspecting %reg1176,0 = [2702,4362:0) 0 at 2702-(4362) and %reg1177,0 = > [2700,3712:0)[3768,3878:0)[4362,4372:0) 0 at 4362-(3878): > Joined. Result = %reg1177,0 = [2700,4372:0) 0 at

[LLVMdev] speed up memcpy intrinsic using ARM Neon registers

2009 Nov 10

On Nov 9, 2009, at 7:34 PM, Neel Nagar wrote: > I tried to speed up Dhrystone on ARM Cortex-A8 by optimizing the > memcpy intrinsic. I used the Neon load multiple instruction to move up > to 48 bytes at a time. Over 15 scalar instructions collapsed down > into these 2 Neon instructions. > > fldmiad r3, {d0, d1, d2, d3, d4, d5} @ SrcLine dhrystone.c 359 > fstmiad

[LLVMdev] speed up memcpy intrinsic using ARM Neon registers

2009 Nov 10

On Nov 9, 2009, at 5:59 PM, David Conrad wrote: > On Nov 9, 2009, at 7:34 PM, Neel Nagar wrote: > >> I tried to speed up Dhrystone on ARM Cortex-A8 by optimizing the >> memcpy intrinsic. I used the Neon load multiple instruction to move >> up >> to 48 bytes at a time. Over 15 scalar instructions collapsed down >> into these 2 Neon instructions. Nice. Thanks
