On Oct 21, 2011, at 12:15 AM, James Molloy wrote:
> Hi Andy,
>
> Could you describe how this would be done? In the current ARM itineraries
> (say C-A9 for example), the superscalar issue stage is modelled as taking 1
> cycle. If it were to take 2 cycles instead, as far as I can tell the hazard
> analyser would stall because both FU's would be acquired.
>
> I would like to model both issue width and pipeline depth. To save myself
> explaining a possibly incorrect assumption again, could you please briefly
> say how you expect that to be modelled and I can respond to that? Say for
> example a simple M-wide, N-deep pipeline.
>
> Cheers,
>
> James
>
Hi James,

I'll try to describe how the itinerary works. It's nonintuitive.

The itinerary has two lists: a list of pipeline stages and a list of
operand latencies. The latency of an instruction is captured by the
latency of its "definition" operands, so latency does not need to be
modeled in the pipeline stages at all.

A 2 wide, 1 deep pipeline (2x1) would be:

  [InstrStage<1, [Pipe0, Pipe1]>]

A 2 wide, 4 deep pipeline (2x4) would be:

  [InstrStage<1, [Pipe0, Pipe1]>]

Surprise. There is no difference in the pipeline description, because
the units are fully pipelined and we don't need to express latency
here. (I'm only showing the pipeline stages, not the operand latency
list.)
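
To make the point concrete, here is a toy Python sketch (not LLVM code)
in which the 2x1 and 2x4 pipelines share one stage list and differ only
in operand latency. The dictionary layout, the field names "stages" and
"def_latency", and the latency values 1 and 4 are assumptions made up
for illustration.

```python
# Toy descriptions of the two pipelines. Field names and latency values are
# hypothetical; a real itinerary is written in TableGen.
PIPE_2X1 = {"stages": [(1, ("Pipe0", "Pipe1"))], "def_latency": 1}
PIPE_2X4 = {"stages": [(1, ("Pipe0", "Pipe1"))], "def_latency": 4}


def earliest_consumer_cycle(producer_issue_cycle, pipe):
    """Cycle at which a dependent instruction can first issue.

    Only the operand latency matters; the stage list plays no part.
    """
    return producer_issue_cycle + pipe["def_latency"]


same_stages = PIPE_2X1["stages"] == PIPE_2X4["stages"]
shallow = earliest_consumer_cycle(0, PIPE_2X1)
deep = earliest_consumer_cycle(0, PIPE_2X4)
print(same_stages, shallow, deep)
```

The stage lists compare equal, while the dependent instruction is
delayed by the operand latency alone.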

Let's say you want to treat each stage of a pipeline as a separate
type of unit:

  stage0: Decode
  stage1: Exec
  stage2: Write

  [InstrStage<1, [Decode0, Decode1], 0>,
   InstrStage<1, [Exec0, Exec1], 0>,
   InstrStage<1, [Write0, Write1], 0>]

Now when the first instruction is scheduled, it fills in the current
row of the reservation table with Decode0, Exec0, Write0. This is
counterintuitive because the instruction does not execute on all units
in the same cycle, but it results in a more compact reservation table
and still models hazards sufficiently.
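
The row-filling behaviour can be sketched with a small Python model.
This is a simplification written for illustration, not the hazard
recognizer's actual code: it assumes every stage lasts one cycle and has
a TimeInc of 0, so a single set of busy units stands in for one row of
the reservation table. The function name try_issue is hypothetical.

```python
# Simplified model of one reservation-table row (one issue cycle).
# Each stage is a tuple of alternative units; every stage is assumed to have
# TimeInc 0, so all stages of an instruction land in the same row.
STAGES = [("Decode0", "Decode1"), ("Exec0", "Exec1"), ("Write0", "Write1")]


def try_issue(row, stages):
    """Claim the first free unit of every stage, or return None on a hazard."""
    claimed = []
    for choices in stages:
        free = [u for u in choices if u not in row and u not in claimed]
        if not free:
            return None  # structural hazard: the instruction must wait
        claimed.append(free[0])
    row.update(claimed)
    return claimed


row = set()                      # units reserved in the current cycle
first = try_issue(row, STAGES)   # takes Decode0, Exec0, Write0
second = try_issue(row, STAGES)  # takes Decode1, Exec1, Write1
third = try_issue(row, STAGES)   # no unit of any kind is left in this row
next_row = set()                 # a fresh row for the following cycle
retry = try_issue(next_row, STAGES)
```

Two instructions fit in the row and the third has to move to the next
one, which is the 2-wide issue limit expressed purely through unit
reservations.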

Things only get more complicated if you have functional units that are
not fully pipelined, or you have instructions that use the same functional
units at different pipeline stages.

If I have an instruction that consumes a functional unit for 2 cycles,
during which no other instruction may be issued to that unit, then I
need to do this:

  [InstrStage<2, [NonPipelinedUnit]>]

If I have an instruction that splits into two dependent microops that
use the same type of functional unit, but at different times, then I need
to do this:

  [InstrStage<1, [ALU0, ALU1], 1>,
   InstrStage<1, [ALU0, ALU1]>]
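
Both cases can be tried out in a rough Python model of a multi-cycle
reservation table. It is a sketch under stated assumptions, not LLVM's
implementation: the names next_cycles and try_issue are hypothetical,
stages are plain (cycles, units, timeinc) tuples, and one unit is
required to stay free for a whole stage. The only rule taken from
TargetSchedule.td is that TimeInc defaults to Cycles when it is not
given.

```python
from collections import defaultdict


def next_cycles(cycles, timeinc=-1):
    """Offset from this stage's start to the next stage's start."""
    return timeinc if timeinc >= 0 else cycles


def try_issue(table, start, stages):
    """Try to reserve every stage of one instruction, beginning at `start`.

    `table` maps cycle -> set of busy units. `stages` is a list of
    (cycles, units, timeinc) tuples. Returns the (cycle, unit) reservations,
    or None on a structural hazard (nothing is reserved in that case).
    """
    plan = []
    cycle = start
    for cycles, units, timeinc in stages:
        span = range(cycle, cycle + cycles)
        # Simplification: the same unit must be free for the whole stage.
        unit = next(
            (u for u in units
             if all(u not in table[c] and (c, u) not in plan for c in span)),
            None,
        )
        if unit is None:
            return None
        plan.extend((c, unit) for c in span)
        cycle += next_cycles(cycles, timeinc)
    for c, u in plan:
        table[c].add(u)
    return plan


# A unit held for 2 cycles: a second instruction cannot start at cycle 1.
NON_PIPELINED = [(2, ("NonPipelinedUnit",), -1)]
t1 = defaultdict(set)
a = try_issue(t1, 0, NON_PIPELINED)
b = try_issue(t1, 1, NON_PIPELINED)
c = try_issue(t1, 2, NON_PIPELINED)

# Two dependent microops on the same kind of unit, one cycle apart.
TWO_UOPS = [(1, ("ALU0", "ALU1"), 1), (1, ("ALU0", "ALU1"), -1)]
t2 = defaultdict(set)
d = try_issue(t2, 0, TWO_UOPS)
e = try_issue(t2, 0, TWO_UOPS)
f = try_issue(t2, 1, TWO_UOPS)
g = try_issue(t2, 2, TWO_UOPS)
```

In this model the non-pipelined unit rejects an issue at cycle 1 and
accepts one at cycle 2, and two microop pairs fill both ALUs for cycles
0 and 1, so a third pair has to wait until cycle 2.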
-Andy
From TargetSchedule.td:
//===----------------------------------------------------------------------===//
// Instruction stage - These values represent a non-pipelined step in
// the execution of an instruction. Cycles represents the number of
// discrete time slots needed to complete the stage. Units represent
// the choice of functional units that can be used to complete the
// stage. Eg. IntUnit1, IntUnit2. NextCycles indicates how many
// cycles should elapse from the start of this stage to the start of
// the next stage in the itinerary. For example:
//
// A stage is specified in one of two ways:
//
// InstrStage<1, [FU_x, FU_y]> - TimeInc defaults to Cycles
// InstrStage<1, [FU_x, FU_y], 0> - TimeInc explicit
//
class InstrStage<int cycles, list<FuncUnit> units,
int timeinc = -1,
ReservationKind kind = Required> {
int Cycles = cycles; // length of stage in machine cycles
list<FuncUnit> Units = units; // choice of functional units
int TimeInc = timeinc; // cycles till start of next stage
int Kind = kind.Value; // kind of FU reservation
}
> -----Original Message-----
> From: Andrew Trick [mailto:atrick at apple.com]
> Sent: 21 October 2011 02:36
> To: James Molloy
> Cc: Hal Finkel; llvm-commits LLVM; Evan Cheng
> Subject: Re: [llvm-commits] [llvm] r142171 - in /llvm/trunk:
> lib/Target/PowerPC/PPCSchedule440.td
> test/CodeGen/PowerPC/ppc440-fp-basic.ll
> test/CodeGen/PowerPC/ppc440-msync.ll
>
> On Oct 20, 2011, at 3:24 PM, Evan Cheng wrote:
>
>>
>> On Oct 20, 2011, at 12:04 PM, James Molloy wrote:
>>
>>> Evan,
>>>
>>> Regarding this, I wanted to ask - there's currently a hard limit of 32
>>> FunctionalUnits. Functional units cannot be pipelined, so for example to
>>> describe a pipeline for a superscalar machine of issue width N taking M
>>> cycles, one requires N*M functional units.
>>
>> I don't think that's how it works. You can describe a resource being
>> acquired or reserved for M cycles. Perhaps I am not understanding your
>> question.
>>
>> Evan
>>
>
> An N-wide machine can be described with N units, regardless of how deep the
> pipeline is.
>
> Furthermore if you only need to model issue width, then you don't even need
> to describe the pipeline at all. You only need to set the
> InstrItineraryData::IssueWidth field. ARMSubtarget::computeIssueWidth does
> this by assuming something about the convention of ARM itineraries. But you
> could simply embed the issue width constants for your subtargets within the
> target initialization code (in place of computeIssueWidth). I never bothered
> to add tablegen support for an IssueWidth field in the itinerary because we
> didn't need it for x86 and it is redundant with the existing ARM
> itineraries.
>
> -Andy
>
>>>
>>> This can quickly take you over the 32 unit limit. Is there any plan (or
>>> can I implement) pipelined functional units that can accept a new
>>> instruction every cycle but hold instructions for N cycles?
>>>
>>> Cheers,
>>>
>>> James
>>> ________________________________________
>>> From: llvm-commits-bounces at cs.uiuc.edu [llvm-commits-bounces at cs.uiuc.edu]
>>> On Behalf Of Evan Cheng [evan.cheng at apple.com]
>>> Sent: 20 October 2011 18:21
>>> To: Hal Finkel
>>> Cc: llvm-commits at cs.uiuc.edu
>>> Subject: Re: [llvm-commits] [llvm] r142171 - in /llvm/trunk:
>>> lib/Target/PowerPC/PPCSchedule440.td
>>> test/CodeGen/PowerPC/ppc440-fp-basic.ll
>>> test/CodeGen/PowerPC/ppc440-msync.ll
>>>
>>> On Oct 19, 2011, at 7:29 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>>
>>>> Evan,
>>>>
>>>> Thanks for the heads up! Is there a current target that implements the
>>>> scheduling as it will be? And does the bottom-up scheduling also account
>>>
>>> ARM is a good model.
>>>
>>>> for pipeline-conflict hazards?
>>>
>>> Yes, definitely. And it should be doing a much better job of it.
>>>
>>> Evan
>>>
>>>>
>>>> -Hal
>>>>
>>>> On Wed, 2011-10-19 at 16:45 -0700, Evan Cheng wrote:
>>>>> Hi Hal,
>>>>>
>>>>> Heads up. We'll soon abolish the top-down pre-register-allocation
>>>>> scheduler and force every target to bottom-up scheduling. The problem is
>>>>> that the top-down list scheduler does not handle physical register
>>>>> dependencies at all, but that is something required for some upcoming
>>>>> legalizer changes.
>>>>>
>>>>> If you are interested in PPC, you might want to look into switching its
>>>>> scheduler now. The bottom-up register-pressure-aware scheduler should
>>>>> work quite well for PPC.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Evan
>>>>>
>>>>> On Oct 16, 2011, at 9:03 PM, Hal Finkel wrote:
>>>>>
>>>>>> Author: hfinkel
>>>>>> Date: Sun Oct 16 23:03:55 2011
>>>>>> New Revision: 142171
>>>>>>
>>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=142171&view=rev
>>>>>> Log:
>>>>>> Add PPC 440 scheduler and some associated tests (new files)
>>>>>> Added:
>>>>>> llvm/trunk/lib/Target/PowerPC/PPCSchedule440.td
>>>>>> llvm/trunk/test/CodeGen/PowerPC/ppc440-fp-basic.ll
>>>>>> llvm/trunk/test/CodeGen/PowerPC/ppc440-msync.ll
>>>>>>
>>>>>> Added: llvm/trunk/lib/Target/PowerPC/PPCSchedule440.td
>>>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCSchedule440.td?rev=142171&view=auto
>>>>>> ==============================================================================
>>>>>> --- llvm/trunk/lib/Target/PowerPC/PPCSchedule440.td (added)
>>>>>> +++ llvm/trunk/lib/Target/PowerPC/PPCSchedule440.td Sun Oct 16 23:03:55 2011
>>>>>> @@ -0,0 +1,568 @@
>>>>>> +//===- PPCSchedule440.td - PPC 440 Scheduling Definitions ----*- tablegen -*-===//
>>>>>> +//
>>>>>> +// The LLVM Compiler Infrastructure
>>>>>> +//
>>>>>> +// This file is distributed under the University of Illinois Open Source
>>>>>> +// License. See LICENSE.TXT for details.
>>>>>> +//
>>>>>> +//===----------------------------------------------------------------------===//
>>>>>> +
>>>>>> +// Primary reference:
>>>>>> +// PowerPC 440x6 Embedded Processor Core User's Manual.
>>>>>> +// IBM (as updated in) 2010.
>>>>>> +
>>>>>> +// The basic PPC 440 does not include a floating-point unit; the pipeline
>>>>>> +// timings here are constructed to match the FP2 unit shipped with the
>>>>>> +// PPC-440- and PPC-450-based Blue Gene (L and P) supercomputers.
>>>>>> +// References:
>>>>>> +// S. Chatterjee, et al. Design and exploitation of a high-performance
>>>>>> +// SIMD floating-point unit for Blue Gene/L.
>>>>>> +// IBM J. Res. & Dev. 49 (2/3) March/May 2005.
>>>>>> +// also:
>>>>>> +// Carlos Sosa and Brant Knudson. IBM System Blue Gene Solution:
>>>>>> +// Blue Gene/P Application Development.
>>>>>> +// IBM (as updated in) 2009.
>>>>>> +
>>>>>> +//===----------------------------------------------------------------------===//
>>>>>> +// Functional units on the PowerPC 440/450 chip sets
>>>>>> +//
>>>>>> +def IFTH1 : FuncUnit; // Fetch unit 1
>>>>>> +def IFTH2 : FuncUnit; // Fetch unit 2
>>>>>> +def PDCD1 : FuncUnit; // Decode unit 1
>>>>>> +def PDCD2 : FuncUnit; // Decode unit 2
>>>>>> +def DISS1 : FuncUnit; // Issue unit 1
>>>>>> +def DISS2 : FuncUnit; // Issue unit 2
>>>>>> +def LRACC : FuncUnit; // Register access and dispatch for
>>>>>> +                      // the simple integer (J-pipe) and
>>>>>> +                      // load/store (L-pipe) pipelines
>>>>>> +def IRACC : FuncUnit; // Register access and dispatch for
>>>>>> +                      // the complex integer (I-pipe) pipeline
>>>>>> +def FRACC : FuncUnit; // Register access and dispatch for
>>>>>> +                      // the floating-point execution (F-pipe) pipeline
>>>>>> +def IEXE1 : FuncUnit; // Execution stage 1 for the I pipeline
>>>>>> +def IEXE2 : FuncUnit; // Execution stage 2 for the I pipeline
>>>>>> +def IWB   : FuncUnit; // Write-back unit for the I pipeline
>>>>>> +def JEXE1 : FuncUnit; // Execution stage 1 for the J pipeline
>>>>>> +def JEXE2 : FuncUnit; // Execution stage 2 for the J pipeline
>>>>>> +def JWB   : FuncUnit; // Write-back unit for the J pipeline
>>>>>> +def AGEN  : FuncUnit; // Address generation for the L pipeline
>>>>>> +def CRD   : FuncUnit; // D-cache access for the L pipeline
>>>>>> +def LWB   : FuncUnit; // Write-back unit for the L pipeline
>>>>>> +def FEXE1 : FuncUnit; // Execution stage 1 for the F pipeline
>>>>>> +def FEXE2 : FuncUnit; // Execution stage 2 for the F pipeline
>>>>>> +def FEXE3 : FuncUnit; // Execution stage 3 for the F pipeline
>>>>>> +def FEXE4 : FuncUnit; // Execution stage 4 for the F pipeline
>>>>>> +def FEXE5 : FuncUnit; // Execution stage 5 for the F pipeline
>>>>>> +def FEXE6 : FuncUnit; // Execution stage 6 for the F pipeline
>>>>>> +def FWB   : FuncUnit; // Write-back unit for the F pipeline
>>>>>> +
>>>>>> +def LWARX_Hold : FuncUnit; // This is a pseudo-unit which is used
>>>>>> +                           // to make sure that no lwarx/stwcx.
>>>>>> +                           // instructions are issued while another
>>>>>> +                           // lwarx/stwcx. is in the L pipe.
>>>>>> +
>>>>>> +def GPR_Bypass : Bypass; // The bypass for general-purpose regs.
>>>>>> +def FPR_Bypass : Bypass; // The bypass for floating-point regs.
>>>>>> +
>>>>>> +// Notes:
>>>>>> +// Instructions are held in the FRACC, LRACC and IRACC pipeline
>>>>>> +// stages until their source operands become ready. Exceptions:
>>>>>> +// - Store instructions will hold in the AGEN stage
>>>>>> +// - The integer multiply-accumulate instruction will hold in
>>>>>> +// the IEXE1 stage
>>>>>> +//
>>>>>> +// For most I-pipe operations, the result is available at the end of
>>>>>> +// the IEXE1 stage. Operations such as multiply and divide must
>>>>>> +// continue to execute in IEXE2 and IWB. Divide resides in IWB for
>>>>>> +// 33 cycles (multiply also calculates its result in IWB). For all
>>>>>> +// J-pipe instructions, the result is available
>>>>>> +// at the end of the JEXE1 stage. Loads have a 3-cycle latency
>>>>>> +// (data is not available until after the LWB stage).
>>>>>> +//
>>>>>> +// The L1 cache hit latency is four cycles for floating point loads
>>>>>> +// and three cycles for integer loads.
>>>>>> +//
>>>>>> +// The stwcx. instruction requires both the LRACC and the IRACC
>>>>>> +// dispatch stages. It must be issued from DISS0.
>>>>>> +//
>>>>>> +// All lwarx/stwcx. instructions hold in LRACC if another
>>>>>> +// uncommitted lwarx/stwcx. is in AGEN, CRD, or LWB.
>>>>>> +//
>>>>>> +// msync (a.k.a. sync) and mbar will hold in LWB until all load/store
>>>>>> +// resources are empty. AGEN and CRD are held empty until the msync/mbar
>>>>>> +// commits.
>>>>>> +//
>>>>>> +// Most floating-point instructions, computational and move,
>>>>>> +// have a 5-cycle latency. Divide takes longer (30 cycles). Instructions that
>>>>>> +// update the CR take 2 cycles. Stores take 3 cycles and, as mentioned above,
>>>>>> +// loads take 4 cycles (for L1 hit).
>>>>>> +
>>>>>> +//
>>>>>> +// This file defines the itinerary class data for the PPC 440 processor.
>>>>>> +//
>>>>>> +//===----------------------------------------------------------------------===//
>>>>>> +
>>>>>> +
>>>>>> +def PPC440Itineraries : ProcessorItineraries<
>>>>>> + [IFTH1, IFTH2, PDCD1, PDCD2, DISS1, DISS2, FRACC,
>>>>>> + IRACC, IEXE1, IEXE2, IWB, LRACC, JEXE1, JEXE2, JWB,
AGEN, CRD,
> LWB,
>>>>>> + FEXE1, FEXE2, FEXE3, FEXE4, FEXE5, FEXE6, FWB,
LWARX_Hold],
>>>>>> + [GPR_Bypass, FPR_Bypass], [
>>>>>> + InstrItinData<IntGeneral , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC, LRACC]>,
>>>>>> + InstrStage<1,
[IEXE1, JEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2, JEXE2]>,
>>>>>> + InstrStage<1, [IWB,
JWB]>],
>>>>>> + [6, 4, 4],
>>>>>> + [GPR_Bypass, GPR_Bypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<IntCompare , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC, LRACC]>,
>>>>>> + InstrStage<1,
[IEXE1, JEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2, JEXE2]>,
>>>>>> + InstrStage<1, [IWB,
JWB]>],
>>>>>> + [6, 4, 4],
>>>>>> + [NoBypass, GPR_Bypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<IntDivW , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<33,
[IWB]>],
>>>>>> + [40, 4, 4],
>>>>>> + [NoBypass, GPR_Bypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<IntMFFS , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<1,
[IWB]>],
>>>>>> + [7, 4, 4],
>>>>>> + [GPR_Bypass, GPR_Bypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<IntMTFSB0 , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<1,
[IWB]>],
>>>>>> + [7, 4, 4],
>>>>>> + [GPR_Bypass, GPR_Bypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<IntMulHW , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<1,
[IWB]>],
>>>>>> + [8, 4, 4],
>>>>>> + [NoBypass, GPR_Bypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<IntMulHWU , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<1,
[IWB]>],
>>>>>> + [8, 4, 4],
>>>>>> + [NoBypass, GPR_Bypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<IntMulLI , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<1,
[IWB]>],
>>>>>> + [8, 4, 4],
>>>>>> + [NoBypass, GPR_Bypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<IntRotate , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC, LRACC]>,
>>>>>> + InstrStage<1,
[IEXE1, JEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2, JEXE2]>,
>>>>>> + InstrStage<1, [IWB,
JWB]>],
>>>>>> + [6, 4, 4],
>>>>>> + [GPR_Bypass, GPR_Bypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<IntShift , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC, LRACC]>,
>>>>>> + InstrStage<1,
[IEXE1, JEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2, JEXE2]>,
>>>>>> + InstrStage<1, [IWB,
JWB]>],
>>>>>> + [6, 4, 4],
>>>>>> + [GPR_Bypass, GPR_Bypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<IntTrapW , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<1,
[IWB]>],
>>>>>> + [6, 4],
>>>>>> + [GPR_Bypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<BrB , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<1,
[IWB]>],
>>>>>> + [8, 4],
>>>>>> + [NoBypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<BrCR , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<1,
[IWB]>],
>>>>>> + [8, 4, 4],
>>>>>> + [NoBypass, GPR_Bypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<BrMCR , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<1,
[IWB]>],
>>>>>> + [8, 4, 4],
>>>>>> + [NoBypass, GPR_Bypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<BrMCRX , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<1,
[IWB]>],
>>>>>> + [8, 4, 4],
>>>>>> + [NoBypass, GPR_Bypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<LdStDCBA , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[LRACC]>,
>>>>>> + InstrStage<1,
[AGEN]>,
>>>>>> + InstrStage<1,
[CRD]>,
>>>>>> + InstrStage<1,
[LWB]>],
>>>>>> + [8, 5],
>>>>>> + [NoBypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<LdStDCBF , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[LRACC]>,
>>>>>> + InstrStage<1,
[AGEN]>,
>>>>>> + InstrStage<1,
[CRD]>,
>>>>>> + InstrStage<1,
[LWB]>],
>>>>>> + [8, 5],
>>>>>> + [NoBypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<LdStDCBI , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[LRACC]>,
>>>>>> + InstrStage<1,
[AGEN]>,
>>>>>> + InstrStage<1,
[CRD]>,
>>>>>> + InstrStage<1,
[LWB]>],
>>>>>> + [8, 5],
>>>>>> + [NoBypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<LdStGeneral , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[LRACC]>,
>>>>>> + InstrStage<1,
[AGEN]>,
>>>>>> + InstrStage<1,
[CRD]>,
>>>>>> + InstrStage<2,
[LWB]>],
>>>>>> + [9, 5], // FIXME: should
be [9, 5] for
> loads and
>>>>>> + // [8, 5] for
stores.
>>>>>> + [NoBypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<LdStICBI , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[LRACC]>,
>>>>>> + InstrStage<1,
[AGEN]>,
>>>>>> + InstrStage<1,
[CRD]>,
>>>>>> + InstrStage<1,
[LWB]>],
>>>>>> + [8, 5],
>>>>>> + [NoBypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<LdStUX , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[LRACC]>,
>>>>>> + InstrStage<1,
[AGEN]>,
>>>>>> + InstrStage<1,
[CRD]>,
>>>>>> + InstrStage<1,
[LWB]>],
>>>>>> + [8, 5, 5],
>>>>>> + [NoBypass, GPR_Bypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<LdStLFD , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[LRACC]>,
>>>>>> + InstrStage<1,
[AGEN]>,
>>>>>> + InstrStage<1,
[CRD]>,
>>>>>> + InstrStage<2,
[LWB]>],
>>>>>> + [9, 5, 5],
>>>>>> + [NoBypass, GPR_Bypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<LdStLFDU , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[LRACC]>,
>>>>>> + InstrStage<1,
[AGEN]>,
>>>>>> + InstrStage<1,
[CRD]>,
>>>>>> + InstrStage<1,
[LWB]>],
>>>>>> + [9, 5, 5],
>>>>>> + [NoBypass, GPR_Bypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<LdStLHA , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[LRACC]>,
>>>>>> + InstrStage<1,
[AGEN]>,
>>>>>> + InstrStage<1,
[CRD]>,
>>>>>> + InstrStage<1,
[LWB]>],
>>>>>> + [8, 5],
>>>>>> + [NoBypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<LdStLMW , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[LRACC]>,
>>>>>> + InstrStage<1,
[AGEN]>,
>>>>>> + InstrStage<1,
[CRD]>,
>>>>>> + InstrStage<1,
[LWB]>],
>>>>>> + [8, 5],
>>>>>> + [NoBypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<LdStLWARX , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1]>,
>>>>>> + InstrStage<1,
[IRACC], 0>,
>>>>>> + InstrStage<4,
[LWARX_Hold], 0>,
>>>>>> + InstrStage<1,
[LRACC]>,
>>>>>> + InstrStage<1,
[AGEN]>,
>>>>>> + InstrStage<1,
[CRD]>,
>>>>>> + InstrStage<1,
[LWB]>],
>>>>>> + [8, 5],
>>>>>> + [NoBypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<LdStSTWCX , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1]>,
>>>>>> + InstrStage<1,
[IRACC], 0>,
>>>>>> + InstrStage<4,
[LWARX_Hold], 0>,
>>>>>> + InstrStage<1,
[LRACC]>,
>>>>>> + InstrStage<1,
[AGEN]>,
>>>>>> + InstrStage<1,
[CRD]>,
>>>>>> + InstrStage<1,
[LWB]>],
>>>>>> + [8, 5],
>>>>>> + [NoBypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<LdStSync , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[LRACC]>,
>>>>>> + InstrStage<3,
[AGEN], 1>,
>>>>>> + InstrStage<2, [CRD],
1>,
>>>>>> + InstrStage<1,
[LWB]>]>,
>>>>>> + InstrItinData<SprISYNC , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[FRACC], 0>,
>>>>>> + InstrStage<1,
[LRACC], 0>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[FEXE1], 0>,
>>>>>> + InstrStage<1,
[AGEN], 0>,
>>>>>> + InstrStage<1,
[JEXE1], 0>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[FEXE2], 0>,
>>>>>> + InstrStage<1, [CRD],
0>,
>>>>>> + InstrStage<1,
[JEXE2], 0>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<6,
[FEXE3], 0>,
>>>>>> + InstrStage<6, [LWB],
0>,
>>>>>> + InstrStage<6, [JWB],
0>,
>>>>>> + InstrStage<6,
[IWB]>]>,
>>>>>> + InstrItinData<SprMFSR , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<1,
[IWB]>],
>>>>>> + [6, 4],
>>>>>> + [GPR_Bypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<SprMTMSR , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<1,
[IWB]>],
>>>>>> + [6, 4],
>>>>>> + [GPR_Bypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<SprMTSR , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<3,
[IWB]>],
>>>>>> + [9, 4],
>>>>>> + [NoBypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<SprTLBSYNC , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<1,
[IWB]>]>,
>>>>>> + InstrItinData<SprMFCR , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<1,
[IWB]>],
>>>>>> + [8, 4],
>>>>>> + [NoBypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<SprMFMSR , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<1,
[IWB]>],
>>>>>> + [7, 4],
>>>>>> + [GPR_Bypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<SprMFSPR , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<3,
[IWB]>],
>>>>>> + [10, 4],
>>>>>> + [NoBypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<SprMFTB , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<3,
[IWB]>],
>>>>>> + [10, 4],
>>>>>> + [NoBypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<SprMTSPR , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<3,
[IWB]>],
>>>>>> + [10, 4],
>>>>>> + [NoBypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<SprMTSRIN , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<3,
[IWB]>],
>>>>>> + [10, 4],
>>>>>> + [NoBypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<SprRFI , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<1,
[IWB]>],
>>>>>> + [8, 4],
>>>>>> + [NoBypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<SprSC , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[IRACC]>,
>>>>>> + InstrStage<1,
[IEXE1]>,
>>>>>> + InstrStage<1,
[IEXE2]>,
>>>>>> + InstrStage<1,
[IWB]>],
>>>>>> + [8, 4],
>>>>>> + [NoBypass,
GPR_Bypass]>,
>>>>>> + InstrItinData<FPGeneral , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[FRACC]>,
>>>>>> + InstrStage<1,
[FEXE1]>,
>>>>>> + InstrStage<1,
[FEXE2]>,
>>>>>> + InstrStage<1,
[FEXE3]>,
>>>>>> + InstrStage<1,
[FEXE4]>,
>>>>>> + InstrStage<1,
[FEXE5]>,
>>>>>> + InstrStage<1,
[FEXE6]>,
>>>>>> + InstrStage<1,
[FWB]>],
>>>>>> + [10, 4, 4],
>>>>>> + [FPR_Bypass, FPR_Bypass,
FPR_Bypass]>,
>>>>>> + InstrItinData<FPCompare , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[FRACC]>,
>>>>>> + InstrStage<1,
[FEXE1]>,
>>>>>> + InstrStage<1,
[FEXE2]>,
>>>>>> + InstrStage<1,
[FEXE3]>,
>>>>>> + InstrStage<1,
[FEXE4]>,
>>>>>> + InstrStage<1,
[FEXE5]>,
>>>>>> + InstrStage<1,
[FEXE6]>,
>>>>>> + InstrStage<1,
[FWB]>],
>>>>>> + [10, 4, 4],
>>>>>> + [FPR_Bypass, FPR_Bypass,
FPR_Bypass]>,
>>>>>> + InstrItinData<FPDivD , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[FRACC]>,
>>>>>> + InstrStage<1,
[FEXE1]>,
>>>>>> + InstrStage<1,
[FEXE2]>,
>>>>>> + InstrStage<1,
[FEXE3]>,
>>>>>> + InstrStage<1,
[FEXE4]>,
>>>>>> + InstrStage<1,
[FEXE5]>,
>>>>>> + InstrStage<1,
[FEXE6]>,
>>>>>> + InstrStage<25,
[FWB]>],
>>>>>> + [35, 4, 4],
>>>>>> + [NoBypass, FPR_Bypass,
FPR_Bypass]>,
>>>>>> + InstrItinData<FPDivS , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[FRACC]>,
>>>>>> + InstrStage<1,
[FEXE1]>,
>>>>>> + InstrStage<1,
[FEXE2]>,
>>>>>> + InstrStage<1,
[FEXE3]>,
>>>>>> + InstrStage<1,
[FEXE4]>,
>>>>>> + InstrStage<1,
[FEXE5]>,
>>>>>> + InstrStage<1,
[FEXE6]>,
>>>>>> + InstrStage<13,
[FWB]>],
>>>>>> + [23, 4, 4],
>>>>>> + [NoBypass, FPR_Bypass,
FPR_Bypass]>,
>>>>>> + InstrItinData<FPFused , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[FRACC]>,
>>>>>> + InstrStage<1,
[FEXE1]>,
>>>>>> + InstrStage<1,
[FEXE2]>,
>>>>>> + InstrStage<1,
[FEXE3]>,
>>>>>> + InstrStage<1,
[FEXE4]>,
>>>>>> + InstrStage<1,
[FEXE5]>,
>>>>>> + InstrStage<1,
[FEXE6]>,
>>>>>> + InstrStage<1,
[FWB]>],
>>>>>> + [10, 4, 4, 4],
>>>>>> + [FPR_Bypass, FPR_Bypass,
FPR_Bypass,
> FPR_Bypass]>,
>>>>>> + InstrItinData<FPRes , [InstrStage<1,
[IFTH1, IFTH2]>,
>>>>>> + InstrStage<1,
[PDCD1, PDCD2]>,
>>>>>> + InstrStage<1,
[DISS1, DISS2]>,
>>>>>> + InstrStage<1,
[FRACC]>,
>>>>>> + InstrStage<1,
[FEXE1]>,
>>>>>> + InstrStage<1,
[FEXE2]>,
>>>>>> + InstrStage<1,
[FEXE3]>,
>>>>>> + InstrStage<1,
[FEXE4]>,
>>>>>> + InstrStage<1,
[FEXE5]>,
>>>>>> + InstrStage<1,
[FEXE6]>,
>>>>>> + InstrStage<1,
[FWB]>],
>>>>>> + [10, 4],
>>>>>> + [FPR_Bypass,
FPR_Bypass]>
>>>>>> +]>;
>>>>>>
>>>>>> Added: llvm/trunk/test/CodeGen/PowerPC/ppc440-fp-basic.ll
>>>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/ppc440-fp-basic.ll?rev=142171&view=auto
>>>>>> ==============================================================================
>>>>>> --- llvm/trunk/test/CodeGen/PowerPC/ppc440-fp-basic.ll (added)
>>>>>> +++ llvm/trunk/test/CodeGen/PowerPC/ppc440-fp-basic.ll Sun Oct 16 23:03:55 2011
>>>>>> @@ -0,0 +1,32 @@
>>>>>> +; RUN: llc < %s -march=ppc32 -mcpu=440 | grep fmadd
>>>>>> +
>>>>>> +%0 = type { double, double }
>>>>>> +
>>>>>> +define void @maybe_an_fma(%0* sret %agg.result, %0* byval %a, %0* byval %b, %0* byval %c) nounwind {
>>>>>> +entry:
>>>>>> + %a.realp = getelementptr inbounds %0* %a, i32 0, i32 0
>>>>>> + %a.real = load double* %a.realp
>>>>>> + %a.imagp = getelementptr inbounds %0* %a, i32 0, i32 1
>>>>>> + %a.imag = load double* %a.imagp
>>>>>> + %b.realp = getelementptr inbounds %0* %b, i32 0, i32 0
>>>>>> + %b.real = load double* %b.realp
>>>>>> + %b.imagp = getelementptr inbounds %0* %b, i32 0, i32 1
>>>>>> + %b.imag = load double* %b.imagp
>>>>>> + %mul.rl = fmul double %a.real, %b.real
>>>>>> + %mul.rr = fmul double %a.imag, %b.imag
>>>>>> + %mul.r = fsub double %mul.rl, %mul.rr
>>>>>> + %mul.il = fmul double %a.imag, %b.real
>>>>>> + %mul.ir = fmul double %a.real, %b.imag
>>>>>> + %mul.i = fadd double %mul.il, %mul.ir
>>>>>> + %c.realp = getelementptr inbounds %0* %c, i32 0, i32 0
>>>>>> + %c.real = load double* %c.realp
>>>>>> + %c.imagp = getelementptr inbounds %0* %c, i32 0, i32 1
>>>>>> + %c.imag = load double* %c.imagp
>>>>>> + %add.r = fadd double %mul.r, %c.real
>>>>>> + %add.i = fadd double %mul.i, %c.imag
>>>>>> + %real = getelementptr inbounds %0* %agg.result, i32 0, i32 0
>>>>>> + %imag = getelementptr inbounds %0* %agg.result, i32 0, i32 1
>>>>>> + store double %add.r, double* %real
>>>>>> + store double %add.i, double* %imag
>>>>>> + ret void
>>>>>> +}
>>>>>>
>>>>>> Added: llvm/trunk/test/CodeGen/PowerPC/ppc440-msync.ll
>>>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/ppc440-msync.ll?rev=142171&view=auto
>>>>>> ==============================================================================
>>>>>> --- llvm/trunk/test/CodeGen/PowerPC/ppc440-msync.ll (added)
>>>>>> +++ llvm/trunk/test/CodeGen/PowerPC/ppc440-msync.ll Sun Oct 16 23:03:55 2011
>>>>>> @@ -0,0 +1,23 @@
>>>>>> +; RUN: llc < %s -march=ppc32 -o %t
>>>>>> +; RUN: grep sync %t
>>>>>> +; RUN: not grep msync %t
>>>>>> +; RUN: llc < %s -march=ppc32 -mcpu=440 | grep msync
>>>>>> +
>>>>>> +define i32 @has_a_fence(i32 %a, i32 %b) nounwind {
>>>>>> +entry:
>>>>>> + fence acquire
>>>>>> + %cond = icmp eq i32 %a, %b
>>>>>> + br i1 %cond, label %IfEqual, label %IfUnequal
>>>>>> +
>>>>>> +IfEqual:
>>>>>> + fence release
>>>>>> + br label %end
>>>>>> +
>>>>>> +IfUnequal:
>>>>>> + fence release
>>>>>> + ret i32 0
>>>>>> +
>>>>>> +end:
>>>>>> + ret i32 1
>>>>>> +}
>>>>>> +
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> llvm-commits mailing list
>>>>>> llvm-commits at cs.uiuc.edu
>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>
>>>>
>>>> --
>>>> Hal Finkel
>>>> Postdoctoral Appointee
>>>> Leadership Computing Facility
>>>> Argonne National Laboratory
>>>> 1-630-252-0023
>>>> hfinkel at anl.gov
>>>>
>>>