Garfee Guan via llvm-dev

2018-Nov-19 06:03 UTC

[llvm-dev] Per-write cycle count with ReadAdvance - Do I really need that?

It does not work. I tried the latest master today, but tblgen still gives me
an error like this:

error: Resources are defined for both SchedRead and its alias on processor
MyArchModel

def : ReadAdvance<MyReadVector, 3, [MyWriteAddVector]>;
^

It only works if I change "MyReadVector" to a different read, such as
"MyReadVector1". Debugging into tblgen, I found no path that handles multiple
latencies for the same Read...

Anyway, as you suggested, I am searching the other targets and looking into
Pierre's change (I finally noticed that he already attached a patch to the
thread :-) If it is feasible, I will try to upstream any suitable change.)

-Garfee
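
For reference, the forwarding numbers quoted in this thread follow from simple
arithmetic: a ReadAdvance reduces the latency of the producing write by its
cycle count, so a write latency of 6 with advances of 5, 3, 1 and 0 yields 1,
3, 5 and 6 cycles. The sketch below is Python, not TableGen, and only
illustrates that arithmetic. The helper name effective_latency and the table
layout are hypothetical, and clamping at zero is an assumption about how an
advance larger than the latency would be treated.

```python
# Write latencies from the thread: every vector write has Latency = 6.
WRITE_LATENCY = {
    "WriteVector": 6,
    "MyWriteAddVector": 6,
    "MyWriteMulVector": 6,
}

# ReadAdvance cycles keyed by (read, write) pair, i.e. the per-write
# cycle count the thread asks for. The layout is hypothetical.
READ_ADVANCE = {
    ("MyReadVector", "WriteVector"): 5,
    ("MyReadVector", "MyWriteAddVector"): 3,
    ("MyReadVector", "MyWriteMulVector"): 1,
    ("MyReadStoreVector", "WriteVector"): 0,
    ("MyReadStoreVector", "MyWriteAddVector"): 0,
    ("MyReadStoreVector", "MyWriteMulVector"): 0,
}


def effective_latency(read: str, write: str) -> int:
    """Latency seen by `read` when it consumes a value produced by `write`."""
    advance = READ_ADVANCE.get((read, write), 0)  # no entry: no forwarding
    # Assumption: the result is never negative.
    return max(WRITE_LATENCY[write] - advance, 0)


if __name__ == "__main__":
    for (read, write) in READ_ADVANCE:
        print(read, write, effective_latency(read, write))
```

With these tables the non-store reads see 1, 3 and 5 cycles, while the store
reads keep the full 6 cycles.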

On Sat, Nov 17, 2018, 10:42 AM Andrew Trick <atrick at apple.com> wrote:
>
>
> On Nov 16, 2018, at 6:31 PM, Garfee Guan <garfee.guan at gmail.com> wrote:
>
> Thanks Andrew. I have tried with a recent tblgen, and ReadAdvance does not
> work for multiple latencies. Maybe I should improve tblgen myself if
> Pierre-Andre does not have the change anymore.
>
> However, I am a little curious about my situation. Hardware forwarding may
> fail for different reasons, so different register reads may have different
> latencies, depending on both the register reader and the writer. I am new
> to tblgen, so I wonder whether any other target already has another way to
> describe that.
>
>
> Does this work for you?
>
> // Forward from a vector op (normal, add, mul) to a non-store.
> def : ReadAdvance<MyReadVector, 5, [WriteVector]>;
> def : ReadAdvance<MyReadVector, 3, [MyWriteAddVector]>;
> def : ReadAdvance<MyReadVector, 1, [MyWriteMulVector]>;
>
> A ReadAdvance is associated with a pair of write resource -> read
> resource. You can specify as many variants of read/write resources as you
> want, even using arbitrary C++ code inside a predicate. So, in theory I
> think that should be flexible enough.
>
> You can search the in-tree targets to see where ReadAdvance definitions
> are used. Sorry, I’m not familiar with anything beyond that, but maybe
> someone else on the list has dealt with the same problem.
>
> -Andy
>
> On Fri, Nov 16, 2018, 8:00 AM Andrew Trick <atrick at apple.com> wrote:
>
>>
>>
>> On Nov 14, 2018, at 10:52 PM, Garfee Guan via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>> Hi list,
>> I happened to read the thread below (written 3 years ago). I think I may
>> need this ReadAdvance feature for my ARCH.
>>
>> It is about the scheduler info that describes reading my ARCH's vector
>> registers. There are different latencies because of forwarding/bypass. I
>> give an example below:
>>
>> def : WriteRes<WriteVector,    [MyArchVALU]>  { let Latency = 6; }
>> ...
>> def MyWriteAddVector : SchedWriteRes<[MyArchVALU]> { let Latency = 6; }
>> def MyWriteMulVector : SchedWriteRes<[MyArchVALU]> { let Latency = 6; }
>> ...
>>
>> Here I defined 3 different Writes with the same latency. The forwarding
>> is shown below.
>>
>> def : ReadAdvance<MyReadVector, 5, [WriteVector]>;
>> def : ReadAdvance<MyReadVector, 3, [MyWriteAddVector_3cycles]>;
>> def : ReadAdvance<MyReadVector, 1, [MyWriteMulVector_5cycles]>;
>> ...
>> def : ReadAdvance<MyReadStoreVector, 0, [WriteVector]>;
>> def : ReadAdvance<MyReadStoreVector, 0, [MyWriteAddVector_3cycles]>;
>> def : ReadAdvance<MyReadStoreVector, 0, [MyWriteMulVector_5cycles]>;
>> ...
>>
>> Basically, my intention is to model that any non-store instruction that
>> reads a vector register sees a forwarded vector write with a latency of
>> normally 1 cycle, 3 cycles for my ADD, and 5 cycles for my MUL. But a
>> store instruction that takes a vector register as a source cannot use
>> forwarding, so its latency stays at 6.
>>
>> Unfortunately, the above code cannot be compiled by tblgen. I am not sure
>> whether I really need a per-write cycle count with ReadAdvance, or whether
>> there is an existing method that meets my requirement. Anyway, the
>> latencies here seem to be determined by both
>>
>> a) 3 kinds of Write,
>> b) 2 kinds of Read.
>>
>> Therefore I suspect it cannot be modeled with the current tblgen
>> implementation.
>>
>>
>> I’m not sure if the TableGen bug mentioned below was ever fixed.
>>
>> It looks to me like this should work, but I haven’t tried it:
>>
>> def : WriteRes<WriteVector,    [MyArchVALU]>  { let Latency = 6; }
>> def MyWriteAddVector : SchedWriteRes<[MyArchVALU]> { let Latency = 6; }
>> def MyWriteMulVector : SchedWriteRes<[MyArchVALU]> { let Latency = 6; }
>>
>> // Forward from a vector op (normal, add, mul) to a non-store.
>> def : ReadAdvance<MyReadVector, 5, [WriteVector]>;
>> def : ReadAdvance<MyReadVector, 3, [MyWriteAddVector]>;
>> def : ReadAdvance<MyReadVector, 1, [MyWriteMulVector]>;
>>
>> Additionally, you could do this but I don’t think it would have any
>> effect at all:
>>
>> // Forward from a vector op (normal, add, mul) to a store.
>> def : ReadAdvance<MyReadStoreVector, 0,
>> [WriteVector, MyWriteAddVector, MyWriteMulVector]>;
>>
>> -Andy
>>
>> --
>> Garfee Guan,
>> LLVM Compiler Backend Engineer
>> Enflame Technology Co.
>> Website: http://www.enflame-tech.com/
>>
>> --------------------------------------------------------------------
>> [llvm-dev] Per-write cycle count with ReadAdvance
>> *Pierre-Andre Saulais via llvm-dev* llvm-dev at lists.llvm.org
>> *Mon Nov 30 04:22:49 PST 2015*
>>
>> ------------------------------
>>
>> Hi all,
>>
>> I am working on a backend that uses the ProcResource scheduling model
>> and one limitation I found is that while it is possible to specify
>> multiple SchedWrites in a ReadAdvance record, each write uses the same
>> cycle count. I tried writing multiple ReadAdvance records for the same
>> SchedRead, but tablegen does not seem to allow that.
>>
>> It would be useful to have a per-write cycle count to model different
>> pipeline bypasses, where the cycle count depends on the (read, write)
>> pair and not just on the read.
>>
>> Two possible solutions are: 1) changing the 'Cycles' field in
>> (Proc)ReadAdvance to be a list of int and 2) changing tablegen to allow
>> multiple (Proc)ReadAdvance records with the same read resource.
>>
>> The former solution doesn't seem ideal as it requires repeating the
>> cycle count many times for targets that use long SchedWriteRes lists:
>>
>> -def : ReadAdvance<ReadIM, 1, [WriteImm,WriteI,
>> +def: ReadAdvance<ReadIM, [1, 1, 1, 1, 1, 1, 1, 1], [WriteImm,WriteI,
>>                                 WriteISReg, WriteIEReg,WriteIS,
>>                                 WriteID32,WriteID64,
>>                                 WriteIM32,WriteIM64]>;
>>
>> The latter is a bit more verbose when per-write cycle count is used, but
>> requires no change to existing targets. It is also easier to visually
>> match cycle counts to write types:
>>
>> def : ReadAdvance<ReadFoo, 2, [WriteType1]>;
>> def : ReadAdvance<ReadFoo, 4, [WriteType2]>;
>> def : ReadAdvance<ReadFoo, 3, [WriteType3]>;
>>
>> I have a patch for the second solution. Would that benefit any in-tree
>> target?
>>
>> Thanks,
>> Pierre-Andre
>>
>> --
>> Pierre-Andre Saulais
>> Principal Software Engineer, Compilers
>> Codeplay Software Ltd
>> Level C, Argyle House
>> 3 Lady Lawson St,
>> Edinburgh EH3 9DR
>> Tel: 0131 466 0503
>> Fax: 0131 557 6600
>> Website: http://www.codeplay.com
>> Twitter: https://twitter.com/codeplaysoft
>>
>> -------------- next part --------------
>> A non-text attachment was scrubbed...
>> Name: multiple_readadvance.patch
>> Type: text/x-patch
>> Size: 6336 bytes
>> Desc: not available
>> URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20151130/08d3acbf/attachment.bin>

Andrew Trick via llvm-dev

2018-Nov-19 16:36 UTC

I see what you mean. I thought the problem was with multiple latencies
associated with a single definition: ReadAdvance<Read1, #, [Write1,
Write2]>. There definitely should be some way to make this work. If you can
upstream the patch that would be fantastic.
-Andy
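
The distinction drawn here can be illustrated outside of TableGen. In the
Python sketch below, one record that lists several writes gives every listed
write the same cycle count, whereas per-write counts need several records for
the same read. It models only the semantics being asked for, not how tblgen is
implemented; the expand helper, the tuple layout and the count 2 (standing in
for the unspecified "#") are all hypothetical.

```python
def expand(records):
    """Flatten (read, cycles, writes) records into a {(read, write): cycles} table."""
    table = {}
    for read, cycles, writes in records:
        for write in writes:
            key = (read, write)
            if key in table:
                # The same pair listed twice would be ambiguous.
                raise ValueError(f"duplicate ReadAdvance for {key}")
            table[key] = cycles
    return table


# One definition, several writes: both writes share the single count.
single = [("Read1", 2, ["Write1", "Write2"])]

# Several definitions for the same read: one count per write.
multiple = [
    ("MyReadVector", 5, ["WriteVector"]),
    ("MyReadVector", 3, ["MyWriteAddVector"]),
    ("MyReadVector", 1, ["MyWriteMulVector"]),
]

if __name__ == "__main__":
    print(expand(single))
    print(expand(multiple))
```

In this model the second form is unambiguous as long as no (read, write) pair
appears in more than one record.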

Garfee Guan via llvm-dev

2018-Nov-20 02:19 UTC

Thanks for clarifying Andy. I will try to upstream when the patch is ready.

-Garfee
