thr3ads.net - llvm dev - [LLVMdev] sinking address computing in CodeGenPrepare [Nov 2013]

Hal Finkel

2013-Nov-21 06:38 UTC

[LLVMdev] sinking address computing in CodeGenPrepare

----- Original Message -----
> From: "Evan Cheng" <evan.cheng at apple.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "LLVM" <llvmdev at cs.uiuc.edu>, "Junbum Lim" <junbums at gmail.com>
> Sent: Wednesday, November 20, 2013 7:48:13 PM
> Subject: Re: [LLVMdev] sinking address computing in CodeGenPrepare
> 
> 
> On Nov 20, 2013, at 5:38 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> > ----- Original Message -----
> >> From: "Evan Cheng" <evan.cheng at apple.com>
> >> To: "Junbum Lim" <junbums at gmail.com>
> >> Cc: llvmdev at cs.uiuc.edu
> >> Sent: Wednesday, November 20, 2013 7:01:49 PM
> >> Subject: Re: [LLVMdev] sinking address computing in CodeGenPrepare
> >> 
> >> 
> >> On Nov 20, 2013, at 3:10 PM, Junbum Lim <junbums at gmail.com> wrote:
> >> 
> >>> 
> >>> When multiple GEPs or other operations are used for the address
> >>> calculation, OptimizeMemoryInst() performs address matching and
> >>> determines a final addressing expression in a simple form (e.g.,
> >>> ptrtoint/add/inttoptr), then sinks it into the user's block so that
> >>> ISel has a better chance of folding the address computation into
> >>> LDRs and STRs. However, OptimizeMemoryInst() seems to do this
> >>> transformation even when the address calculation is derived from a
> >>> single GEP, resulting in poor alias analysis because the GEP is no
> >>> longer used.
> >> 
> >> I don't follow your last statement. How does this impact AA?
> >> CodeGenPrep is run late, after AA is done.
> > 
> > I don't know if this is relevant for Lim or not, but some targets
> > use AA during CodeGen (instruction scheduling mostly, but SDAG
> > too).
> 
> MachineSched uses AA to determine if something is loop invariant,
> which basically boils down to looking at the machine operand and seeing
> whether it's pointing to constant memory. I don't see how that's
> impacted by GEP vs. ADDS + MUL.

MachineSched can use AA for a lot more than that. I use AA during scheduling
because, in addition to picking up loads from constant memory, it lets me do a
kind of modulo scheduling for unrolled loops. AA can tell that loads and stores
to different arrays don't alias, and loads and stores to different offsets
of the same array don't alias.

> Also, the analysis should have already been done
> and cached.

BasicAA has a cache internally, but as far as I can tell, only to guard against
recursion (and it is emptied after each query). Am I missing something?

 -Hal
> 
> Evan
> 
> > 
> > -Hal
> > 
> >> 
> >> Evan
> >> 
> >>> 
> >>> So, do you think a possible workaround is to sink the GEP without
> >>> converting it into a set of integer operations
> >>> (ptrtoint/add/inttoptr) when the address mode is derived only from
> >>> a single GEP?
> >>> 
> >>> Thanks,
> >>> Jun
> >>> 
> >>> 
> >>> On Nov 12, 2013, at 7:14 PM, Evan Cheng <evan.cheng at apple.com> wrote:
> >>> 
> >>>> 
> >>>> On Nov 12, 2013, at 11:24 AM, Junbum Lim <junbums at gmail.com> wrote:
> >>>> 
> >>>>> 
> >>>>> I wonder why CodeGenPrepare breaks the GEP into integer calculations
> >>>>> (ptrtoint/add/inttoptr) instead of directly sinking the address
> >>>>> calculation using the GEP into the user's block.
> >>>> 
> >>>> I believe it's primarily for address mode matching where only part
> >>>> of the GEP can be folded (depending on the instruction set).
> >>>> 
> >>>> Evan
> >>>> 
> >>>>> 
> >>>>> Thanks,
> >>>>> Jun
> >>>>> 
> >>>>> 
> >>>>> On Nov 12, 2013, at 12:07 PM, Evan Cheng <evan.cheng at apple.com> wrote:
> >>>>> 
> >>>>>> The reason for this is to allow folding of address computation
> >>>>>> into loads and stores. A lot of modern architectures, e.g. X86 and
> >>>>>> ARM, have complex addressing modes.
> >>>>>> 
> >>>>>> Evan
> >>>>>> 
> >>>>>> Sent from my iPad
> >>>>>> 
> >>>>>>> On Nov 12, 2013, at 8:39 AM, Junbum Lim <junbums at gmail.com> wrote:
> >>>>>>> 
> >>>>>>> Hi All,
> >>>>>>> 
> >>>>>>> In the CodeGenPrepare pass, OptimizeMemoryInst() tries to sink
> >>>>>>> address computation into the users' block by converting a GEP to
> >>>>>>> integer operations. It appears to affect ISel's result, but
> >>>>>>> I'm not clear about the main purpose of the transformation.
> >>>>>>> 
> >>>>>>> FROM :
> >>>>>>> for.body.lr.ph:
> >>>>>>>           %zzz = getelementptr inbounds %struct.SS* %a2, i32 0, i32 35
> >>>>>>> 
> >>>>>>> for.body:
> >>>>>>>           %4 = load double* %zzz, align 8, !tbaa !0
> >>>>>>> 
> >>>>>>> TO :
> >>>>>>> for.body:
> >>>>>>> %sunkaddr27 = ptrtoint %struct.SS* %a2 to i32   <----- sink address computing into user's block
> >>>>>>> %sunkaddr28 = add i32 %sunkaddr27, 272
> >>>>>>> %sunkaddr29 = inttoptr i32 %sunkaddr28 to double*
> >>>>>>> %4 = load double* %sunkaddr29, align 8, !tbaa !8
> >>>>>>> 
> >>>>>>> 
> >>>>>>> From what I observed, this transformation can cause poor alias
> >>>>>>> analysis results because the GEP is no longer used. So, I want to
> >>>>>>> see whether there is any way to avoid this conversion.
> >>>>>>> 
> >>>>>>> My questions are:
> >>>>>>> 1. Why do we need to sink address computation into the users'
> >>>>>>> block? What is the benefit of this conversion?
> >>>>>>> 2. Can we directly use the GEP instead of breaking it into
> >>>>>>> integer calculations?
> >>>>>>> 
> >>>>>>> Thanks,
> >>>>>>> Jun
> >>>>>>> _______________________________________________
> >>>>>>> LLVM Developers mailing list
> >>>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> >>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >>>>> 
> >>>> 
> >>> 
> >> 
> >> 
> > 
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> 
> 
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Evan Cheng

2013-Nov-22 00:47 UTC

[LLVMdev] sinking address computing in CodeGenPrepare

On Nov 20, 2013, at 10:38 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> MachineSched can use AA for a lot more than that. I use AA during
> scheduling because, in addition to picking up loads from constant memory, it
> lets me do a kind of modulo scheduling for unrolled loops. AA can tell that
> loads and stores to different arrays don't alias, and loads and stores to
> different offsets of the same array don't alias.

I still don't understand what this has to do with whether GEP is lowered in
codegenprep though.

>> Also, the analysis should have already been done
>> and cached.
> 
> BasicAA has a cache internally, but as far as I can tell, only to guard
> against recursion (and it is emptied after each query). Am I missing something?

It's not clear to me how AA is used in codegen. I understand some
information is transferred to memoperands during the LLVM IR to SDISel conversion.
Is AA actually being recomputed using LLVM IR during codegen?

Evan

Hal Finkel

2013-Nov-22 03:25 UTC

[LLVMdev] sinking address computing in CodeGenPrepare

----- Original Message -----
> From: "Evan Cheng" <evan.cheng at apple.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "LLVM" <llvmdev at cs.uiuc.edu>, "Junbum Lim" <junbums at gmail.com>, "Andrew Trick" <atrick at apple.com>
> Sent: Thursday, November 21, 2013 6:47:40 PM
> Subject: Re: [LLVMdev] sinking address computing in CodeGenPrepare
> 
> 
> On Nov 20, 2013, at 10:38 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> > MachineSched can use AA for a lot more than that. I use AA during
> > scheduling because, in addition to picking up loads from constant
> > memory, it lets me do a kind of modulo scheduling for unrolled
> > loops. AA can tell that loads and stores to different arrays don't
> > alias, and loads and stores to different offsets of the same array
> > don't alias.
> 
> I still don't understand what this has to do with whether GEP is
> lowered in codegenprep though.

As I recall, BasicAA does not look through int <-> ptr conversions.
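
To make that point concrete, here is a minimal toy model in Python. It is not LLVM code: the `Value` class and the `decompose` and `alias` helpers are hypothetical names invented for this illustration. The sketch assumes an analysis that walks GEP-like nodes back to a base object and gives up when it reaches an inttoptr, which is the behaviour recalled above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(eq=False)  # compare by identity: two allocas are distinct objects
class Value:
    """Hypothetical stand-in for an IR pointer value (not an LLVM class)."""
    kind: str                          # "alloca", "gep", or "inttoptr"
    operand: Optional["Value"] = None  # pointer operand of a gep
    offset: int = 0                    # constant byte offset added by a gep


def decompose(ptr: Value) -> Tuple[Optional[Value], int]:
    """Walk gep nodes back to a base object, summing constant offsets.

    Returns (None, 0) when the chain ends in an inttoptr, because this toy
    analysis does not look through int <-> ptr conversions.
    """
    total = 0
    while ptr.kind == "gep":
        total += ptr.offset
        ptr = ptr.operand
    if ptr.kind == "inttoptr":
        return None, 0
    return ptr, total


def alias(a: Value, b: Value, size: int) -> str:
    """Classify two same-sized accesses as NoAlias, MustAlias, or MayAlias."""
    base_a, off_a = decompose(a)
    base_b, off_b = decompose(b)
    if base_a is None or base_b is None:
        return "MayAlias"   # unknown provenance: stay conservative
    if base_a is not base_b:
        return "NoAlias"    # distinct allocations never overlap
    if off_a == off_b:
        return "MustAlias"
    if abs(off_a - off_b) >= size:
        return "NoAlias"    # same object, disjoint byte ranges
    return "MayAlias"       # partial overlap


array = Value("alloca")
elt0 = Value("gep", array, 0)
elt1 = Value("gep", array, 8)
sunk = Value("inttoptr")    # stands in for a ptrtoint/add/inttoptr chain

print(alias(elt0, elt1, 8))  # NoAlias
print(alias(elt0, sunk, 8))  # MayAlias
```

In this model the two GEP-based accesses are proven disjoint, while the access whose address went through an integer round trip can only be answered conservatively.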
> 
> > 
> >> Also, the analysis should have already been done
> >> and cached.
> > 
> > BasicAA has a cache internally, but as far as I can tell, only to
> > guard against recursion (and it is emptied after each query). Am I
> > missing something?
> 
> It's not clear to me how AA is used in codegen. I understand some
> information is transferred to memoperands during the LLVM IR to SDISel
> conversion. Is AA actually being recomputed using LLVM IR during
> codegen?

It depends on what the (sub)target requests. By default, no. But if the target
overrides TargetSubtargetInfo::useAA to return true, then yes.

 -Hal
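
The opt-in described here can be pictured with a small Python sketch. Everything in it is a hypothetical model: `SubtargetInfo`, `use_aa`, and `may_reorder` are illustrative names, not LLVM's C++ interface. The only behaviour taken from the thread is that the hook defaults to "no" and a target overrides it to return true.

```python
from typing import Callable


class SubtargetInfo:
    """Hypothetical model of a subtarget with an opt-in hook."""

    def use_aa(self) -> bool:
        # Default: codegen does not query alias analysis.
        return False


class MySubtarget(SubtargetInfo):
    """A made-up target that opts in by overriding the hook."""

    def use_aa(self) -> bool:
        return True


def may_reorder(subtarget: SubtargetInfo, alias_query: Callable[[], str]) -> bool:
    """Decide whether two memory operations may be reordered.

    Without AA the sketch stays conservative and keeps program order; with AA
    it reorders only when the query proves the accesses do not alias.
    """
    if not subtarget.use_aa():
        return False
    return alias_query() == "NoAlias"


print(may_reorder(SubtargetInfo(), lambda: "NoAlias"))  # False
print(may_reorder(MySubtarget(), lambda: "NoAlias"))    # True
print(may_reorder(MySubtarget(), lambda: "MayAlias"))   # False
```

The third call shows why the form of the address matters for a target that opts in: a conservative answer from the query blocks the reordering just as a disabled hook does.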

Andrew Trick

2013-Nov-22 03:37 UTC

[LLVMdev] sinking address computing in CodeGenPrepare

On Nov 21, 2013, at 4:47 PM, Evan Cheng <evan.cheng at apple.com> wrote:
> 
> On Nov 20, 2013, at 10:38 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
>> MachineSched can use AA for a lot more than that. I use AA during
>> scheduling because, in addition to picking up loads from constant memory, it
>> lets me do a kind of modulo scheduling for unrolled loops. AA can tell that
>> loads and stores to different arrays don't alias, and loads and stores to
>> different offsets of the same array don't alias.
> 
> I still don't understand what this has to do with whether GEP is
> lowered in codegenprep though.
> 
>>> Also, the analysis should have already been done
>>> and cached.
>> 
>> BasicAA has a cache internally, but as far as I can tell, only to guard
>> against recursion (and it is emptied after each query). Am I missing something?
> 
> It's not clear to me how AA is used in codegen. I understand some
> information is transferred to memoperands during the LLVM IR to SDISel conversion.
> Is AA actually being recomputed using LLVM IR during codegen?

In general, when AA is used during codegen, it grabs the IR value from the
machine memoperands, then runs normal IR-level alias analysis. The IR needs to
stay around and be immutable. That’s why anything that changes aliasing of
IR-level memory ops should be run before CodeGen. For example, stack coloring
needs to conservatively mutilate the machine memoperands to work around this
problem.

We need to sink address computation to expose addressing modes to ISEL, but I’m
not sure why we need to lower to ptrtoint. That doesn’t seem good for AA at all.

-Andy
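
For readers following the FROM/TO example from the start of the thread, the two address forms compute the same location; only the way the address is expressed differs. The Python sketch below illustrates that with `ctypes`. The layout of `SS` is an assumption made up for this example (the thread never shows the real `%struct.SS`); it is chosen only so that the field lands at byte offset 272, the constant that appears in the sunk `add`.

```python
import ctypes


class SS(ctypes.Structure):
    # Hypothetical layout: 34 doubles (34 * 8 = 272 bytes), then the field.
    _fields_ = [("head", ctypes.c_double * 34), ("zzz", ctypes.c_double)]


s = SS()
s.zzz = 3.5

# GEP-like form: name the field and let the type layout supply the offset.
gep_offset = SS.zzz.offset

# ptrtoint / add / inttoptr form: plain integer arithmetic on the address.
base_int = ctypes.addressof(s)                           # ptrtoint
sunk_addr = base_int + 272                               # add
loaded = ctypes.c_double.from_address(sunk_addr).value   # inttoptr + load

print(gep_offset, loaded)  # 272 3.5
```

Both forms reach the same double, but only the first keeps the link between the address and the structure it points into, which is the information the alias-analysis discussion above is concerned with.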