Displaying 20 results from an estimated 200 matches similar to: "[LLVMdev] Making Sense of ISel DAG Output"
2008 Oct 02
0
[LLVMdev] Making Sense of ISel DAG Output
On Thursday 02 October 2008 11:37, David Greene wrote:
> I'll try ot write a small example and send it in a bit.
Ok, here's what I'm trying to do:
let AddedComplexity = 40 in {
def : Pat<(v2f64 (vector_shuffle (v2f64 (scalar_to_vector (loadf64 addr:
$src1))),
(v2f64 (scalar_to_vector (loadf64 addr:
$src2))),
2008 Oct 02
6
[LLVMdev] Making Sense of ISel DAG Output
On Thursday 02 October 2008 12:42, David Greene wrote:
> But let's say you _could_ write such a pattern (because I can). The input
> DAG looks like this:
>
> 0x391a220: <multiple use>
> 0x391c970: v2f64 = scalar_to_vector 0x391a220 srcLineNum= 10
> 0x391ac10: <multiple use>
> 0x391c8b0: v2f64 = scalar_to_vector
2008 Oct 02
0
[LLVMdev] Making Sense of ISel DAG Output
On Oct 2, 2008, at 9:37 AM, David Greene wrote:
> I'm debugging some X86 patterns and I want to understand the debug
> dumps from
> isel better.
>
> Here's some example output:
>
> 0x391bc40: i64,ch = load 0x3922c50, 0x391b8d0, 0x38dc530
> <0x39053e0:0> <sext
> i32> alignment=4 srcLineNum= 10
> 0x3922c50: <multiple use>
>
2008 Oct 03
0
[LLVMdev] Making Sense of ISel DAG Output
On Oct 2, 2008, at 2:19 PM, David Greene wrote:
> On Thursday 02 October 2008 12:42, David Greene wrote:
>
>> But let's say you _could_ write such a pattern (because I can).
>> The input
>> DAG looks like this:
>>
>> 0x391a220: <multiple use>
>> 0x391c970: v2f64 = scalar_to_vector 0x391a220 srcLineNum= 10
>>
2009 Jan 30
2
[LLVMdev] undefs in phis
On Jan 30, 2009, at 1:52 PM, David Greene wrote:
> On Friday 30 January 2009 15:10, David Greene wrote:
>
>> This still looks correct. The coalescer then says:
>>
>> 4360 %reg1177<def> = FsMOVAPSrr %reg1176<kill> ; srcLine 0
>> Inspecting %reg1176,0 = [2702,4362:0) 0 at 2702-(4362) and
>> %reg1177,0 =
>>
2009 Jan 30
0
[LLVMdev] undefs in phis
On Jan 29, 2009, at 5:29 PM, David Greene wrote:
> On Thursday 29 January 2009 18:04, Eli Friedman wrote:
>> On Thu, Jan 29, 2009 at 2:47 PM, David Greene <dag at cray.com> wrote:
>>> After phi elimination we have:
>>>
>>> bb134:
>>> %reg1645 = 1.0
>>>
>>> bb74:
>>> %reg1176 = MOVAPS %reg1645
>>> %reg1177 =
2009 Jan 30
2
[LLVMdev] undefs in phis
On Thursday 29 January 2009 18:04, Eli Friedman wrote:
> On Thu, Jan 29, 2009 at 2:47 PM, David Greene <dag at cray.com> wrote:
> > After phi elimination we have:
> >
> > bb134:
> > %reg1645 = 1.0
> >
> > bb74:
> > %reg1176 = MOVAPS %reg1645
> > %reg1177 = MOVAPS %reg1646
> > [...]
> >
> > bb108:
> > %reg1645 =
2009 Feb 02
0
[LLVMdev] undefs in phis
On Friday 30 January 2009 16:54, Evan Cheng wrote:
> I don't have the whole context to understand why you think this is a
> bug. An implicit_def doesn't actually define any value. So we don't
> care if a live interval overlaps live ranges defined by an implicit_def.
It's a bug because the coalerscer does illegal coaescing.
Our last episode left us here:
bb134:
2696
2008 Apr 04
2
[LLVMdev] InstCombine Question
On Friday 04 April 2008 13:07, Chris Lattner wrote:
> > So how does the undef store to null appear in the IR when it isn't
> > attached anywhere and how can I get rid of it?
>
> Don't do undefined behavior? :)
I don't think it's undefined behavior. Right before instcombine, we have
this:
%r60 = load <2 x i64>* %"$LCS_1", align 16 ; <<2
2008 Apr 04
0
[LLVMdev] InstCombine Question
On Fri, 4 Apr 2008, David Greene wrote:
> I am confused by this bit of code in instcombine:
>
> 09789 if (GetElementPtrInst *GEPI = dyn_cast<GetElementPtrInst>(Op)) {
> 09790 const Value *GEPI0 = GEPI->getOperand(0);
> 09791 // TODO: Consider a target hook for valid address spaces for this
> xform.
> 09792 if (isa<ConstantPointerNull>(GEPI0)
2008 Apr 04
2
[LLVMdev] InstCombine Question
I am confused by this bit of code in instcombine:
09789 if (GetElementPtrInst *GEPI = dyn_cast<GetElementPtrInst>(Op)) {
09790 const Value *GEPI0 = GEPI->getOperand(0);
09791 // TODO: Consider a target hook for valid address spaces for this
xform.
09792 if (isa<ConstantPointerNull>(GEPI0) &&
09793
2009 Jan 30
2
[LLVMdev] undefs in phis
On Friday 30 January 2009 01:41, Evan Cheng wrote:
> >> I find it a little strange that the IMPLICIT_DEF disappears. Besides
> >> that, it looks okay up to here.
> >
> > I just verified that it does disappear.
>
> It's intentional. We don't want a live interval defined by an
> implicit_def. It unnecessarily increases register pressure.
Ah, I see.
2008 Oct 07
2
[LLVMdev] Making Sense of ISel DAG Output
On Friday 03 October 2008 12:06, Dan Gohman wrote:
> On Fri, October 3, 2008 9:10 am, David Greene wrote:
> > On Thursday 02 October 2008 19:32, Dan Gohman wrote:
> >> Looking at your dump() output above, it looks like the pre-selection
> >> loads have multiple uses, so even though you've managed to match a
> >> larger pattern that incorporates them, they
2008 Oct 03
0
[LLVMdev] Making Sense of ISel DAG Output
On Fri, October 3, 2008 9:10 am, David Greene wrote:
> On Thursday 02 October 2008 19:32, Dan Gohman wrote:
>
>> Looking at your dump() output above, it looks like the pre-selection
>> loads have multiple uses, so even though you've managed to match a
>> larger pattern that incorporates them, they still need to exist to
>> satisfy some other users.
>
> Yes,
2008 Oct 03
3
[LLVMdev] Making Sense of ISel DAG Output
On Thursday 02 October 2008 19:32, Dan Gohman wrote:
> Looking at your dump() output above, it looks like the pre-selection
> loads have multiple uses, so even though you've managed to match a
> larger pattern that incorporates them, they still need to exist to
> satisfy some other users.
Yes, I looked at that too. It looks like these other uses end up being
chains to
2009 Nov 10
4
[LLVMdev] speed up memcpy intrinsic using ARM Neon registers
I tried to speed up Dhrystone on ARM Cortex-A8 by optimizing the
memcpy intrinsic. I used the Neon load multiple instruction to move up
to 48 bytes at a time . Over 15 scalar instructions collapsed down
into these 2 Neon instructions.
fldmiad r3, {d0, d1, d2, d3, d4, d5} @ SrcLine dhrystone.c 359
fstmiad r1, {d0, d1, d2, d3, d4, d5}
It seems like this should be faster. But I did
2011 Jun 20
1
Quote the path of graphics output in Sweave when it contains spaces
Hi,
I'm aware of the definition of a "valid filename" in
.SweaveValidFilenameRegexp, but I think it might be better to quote
the filename when it contains spaces instead of just giving a warning.
This should bring us safer LaTeX code (although I never use spaces in
paths).
Here is the simple patch:
Index: src/library/utils/R/SweaveDrivers.R
2009 Jan 30
0
[LLVMdev] undefs in phis
On Friday 30 January 2009 15:10, David Greene wrote:
> This still looks correct. The coalescer then says:
>
> 4360 %reg1177<def> = FsMOVAPSrr %reg1176<kill> ; srcLine 0
> Inspecting %reg1176,0 = [2702,4362:0) 0 at 2702-(4362) and %reg1177,0 =
> [2700,3712:0)[3768,3878:0)[4362,4372:0) 0 at 4362-(3878):
> Joined. Result = %reg1177,0 = [2700,4372:0) 0 at
2009 Nov 10
0
[LLVMdev] speed up memcpy intrinsic using ARM Neon registers
On Nov 9, 2009, at 7:34 PM, Neel Nagar wrote:
> I tried to speed up Dhrystone on ARM Cortex-A8 by optimizing the
> memcpy intrinsic. I used the Neon load multiple instruction to move up
> to 48 bytes at a time . Over 15 scalar instructions collapsed down
> into these 2 Neon instructions.
>
> fldmiad r3, {d0, d1, d2, d3, d4, d5} @ SrcLine dhrystone.c 359
> fstmiad
2009 Nov 10
3
[LLVMdev] speed up memcpy intrinsic using ARM Neon registers
On Nov 9, 2009, at 5:59 PM, David Conrad wrote:
> On Nov 9, 2009, at 7:34 PM, Neel Nagar wrote:
>
>> I tried to speed up Dhrystone on ARM Cortex-A8 by optimizing the
>> memcpy intrinsic. I used the Neon load multiple instruction to move
>> up
>> to 48 bytes at a time . Over 15 scalar instructions collapsed down
>> into these 2 Neon instructions.
Nice. Thanks