Garfee Guan via llvm-dev
2018-Nov-15 06:52 UTC
[llvm-dev] Per-write cycle count with ReadAdvance - Do I really need that?
Hi list,

I came across the thread below (written three years ago), and I think I may need this ReadAdvance feature for my architecture.

It concerns the scheduling info that describes reads of my architecture's vector registers. Latencies differ because of forwarding/bypassing. Here is an example:

    def : WriteRes<WriteVector, [MyArchVALU]> { let Latency = 6; }
    ...
    def MyWriteAddVector : SchedWriteRes<[MyArchVALU]> { let Latency = 6; }
    def MyWriteMulVector : SchedWriteRes<[MyArchVALU]> { let Latency = 6; }
    ...

Here I define three different writes with the same latency. The forwarding is described below:

    def : ReadAdvance<MyReadVector, 5, [WriteVector]>;
    def : ReadAdvance<MyReadVector, 3, [MyWriteAddVector_3cycles]>;
    def : ReadAdvance<MyReadVector, 1, [MyWriteMulVector_5cycles]>;
    ...
    def : ReadAdvance<MyReadStoreVector, 0, [WriteVector]>;
    def : ReadAdvance<MyReadStoreVector, 0, [MyWriteAddVector_3cycles]>;
    def : ReadAdvance<MyReadStoreVector, 0, [MyWriteMulVector_5cycles]>;
    ...

My intention is to model the following: for any non-store instruction that reads a vector register, the vector write is forwarded, giving an effective latency of 1 cycle normally, 3 cycles for my ADD, and 5 cycles for my MUL. A store instruction that takes a vector register as a source cannot use forwarding, so its latency stays at 6.

Unfortunately, tblgen cannot compile the code above. I am not sure whether I really need a per-write cycle count in ReadAdvance, or whether an existing method already meets my requirement. The latencies here appear to depend on both

a) 3 kinds of write,
b) 2 kinds of read.

I therefore doubt that this can be modeled with the current tblgen implementation.

Can you comment and help?

--
Garfee Guan,
LLVM Compiler Backend Engineer
Enflame Technology Co.
Website: http://www.enflame-tech.com/

--------------------------------------------------------------------
[llvm-dev] Per-write cycle count with ReadAdvance
Pierre-Andre Saulais via llvm-dev llvm-dev at lists.llvm.org
Mon Nov 30 04:22:49 PST 2015

Hi all,

I am working on a backend that uses the ProcResource scheduling model and one limitation I found is that while it is possible to specify multiple SchedWrites in a ReadAdvance record, each write uses the same cycle count. I tried writing multiple ReadAdvance records for the same SchedRead, but tablegen does not seem to allow that.

It would be useful to have a per-write cycle count to model different pipeline bypasses, where the cycle count depends on the (read, write) pair and not just on the read.

Two possible solutions are: 1) changing the 'Cycles' field in (Proc)ReadAdvance to be a list of int and 2) changing tablegen to allow multiple (Proc)ReadAdvance records with the same read resource.

The former solution doesn't seem ideal as it requires repeating the cycle count many times for targets that use long SchedWriteRes lists:

    -def : ReadAdvance<ReadIM, 1, [WriteImm,WriteI,
    +def: ReadAdvance<ReadIM, [1, 1, 1, 1, 1, 1, 1, 1], [WriteImm, WriteI,
                            WriteISReg, WriteIEReg,WriteIS,
                            WriteID32,WriteID64,
                            WriteIM32,WriteIM64]>;

The latter is a bit more verbose when per-write cycle count is used, but requires no change to existing targets. It is also easier to visually match cycle counts to write types:

    def : ReadAdvance<ReadFoo, 2, [WriteType1]>;
    def : ReadAdvance<ReadFoo, 4, [WriteType2]>;
    def : ReadAdvance<ReadFoo, 3, [WriteType3]>;

I have a patch for the second solution. Would that benefit any in-tree target?

Thanks,
Pierre-Andre

--
Pierre-Andre Saulais
Principal Software Engineer, Compilers
Codeplay Software Ltd
Website: http://www.codeplay.com

-------------- next part --------------
A non-text attachment was scrubbed...
Name: multiple_readadvance.patch
Type: text/x-patch
Size: 6336 bytes
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20151130/08d3acbf/attachment.bin>
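
As a rough illustration of the arithmetic in the example above (not part of the original messages): assuming the scheduler derives operand latency as the write's latency minus the ReadAdvance cycles for that (read, write) pair, clamped at zero, and treats a missing entry as an advance of 0, the intended latencies can be modelled as below. The table and function names are hypothetical and only mirror the names used in the thread.

```python
# Hypothetical model of per-(read, write) forwarding; not LLVM code.
WRITE_LATENCY = {
    "WriteVector": 6,
    "MyWriteAddVector": 6,
    "MyWriteMulVector": 6,
}

# Cycles by which a read is "advanced", keyed by the (read, write) pair.
READ_ADVANCE = {
    ("MyReadVector", "WriteVector"): 5,
    ("MyReadVector", "MyWriteAddVector"): 3,
    ("MyReadVector", "MyWriteMulVector"): 1,
}


def operand_latency(read: str, write: str) -> int:
    """Write latency minus the advance for this pair, never below zero."""
    latency = WRITE_LATENCY[write]
    advance = READ_ADVANCE.get((read, write), 0)  # assumed default: no forwarding
    return max(0, latency - advance)


if __name__ == "__main__":
    for read in ("MyReadVector", "MyReadStoreVector"):
        for write in WRITE_LATENCY:
            print(read, write, operand_latency(read, write))
```

Under these assumptions the non-store read sees 1, 3, and 5 cycles for the normal, ADD, and MUL writes, while the store read sees the full 6 cycles for all three.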
Andrew Trick via llvm-dev
2018-Nov-16 00:00 UTC
[llvm-dev] Per-write cycle count with ReadAdvance - Do I really need that?
> On Nov 14, 2018, at 10:52 PM, Garfee Guan via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> [...]
>
> I therefore doubt that this can be modeled with the current tblgen implementation.

I’m not sure if the TableGen bug mentioned below was ever fixed.

It looks to me like this should work, but I haven’t tried it:

    def : WriteRes<WriteVector, [MyArchVALU]> { let Latency = 6; }
    def MyWriteAddVector : SchedWriteRes<[MyArchVALU]> { let Latency = 6; }
    def MyWriteMulVector : SchedWriteRes<[MyArchVALU]> { let Latency = 6; }

    // Forward from a vector op (normal, add, mul) to a non-store.
    def : ReadAdvance<MyReadVector, 5, [WriteVector]>;
    def : ReadAdvance<MyReadVector, 3, [MyWriteAddVector]>;
    def : ReadAdvance<MyReadVector, 1, [MyWriteMulVector]>;

Additionally, you could do this but I don’t think it would have any effect at all:

    // Forward from a vector op (normal, add, mul) to a store.
    def : ReadAdvance<MyReadStoreVector, 0,
                      [WriteVector, MyWriteAddVector, MyWriteMulVector]>;

-Andy

> --------------------------------------------------------------------
> [llvm-dev] Per-write cycle count with ReadAdvance
> Pierre-Andre Saulais via llvm-dev llvm-dev at lists.llvm.org
> Mon Nov 30 04:22:49 PST 2015
>
> [...] I tried writing multiple ReadAdvance records for the same
> SchedRead, but tablegen does not seem to allow that.
>
> [...]
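
To illustrate why the zero-cycle record for stores is expected to have no effect (an illustrative sketch, not part of the original message): assuming operand latency is the write latency minus the advance, and a read with no ReadAdvance entry is treated as an advance of 0, listing the writes with 0 cycles and omitting them give the same result. The helper below is hypothetical.

```python
# Hypothetical model: a missing ReadAdvance entry is treated as 0 cycles.
WRITES = ["WriteVector", "MyWriteAddVector", "MyWriteMulVector"]
WRITE_LATENCY = 6  # all three writes use Latency = 6 in the thread


def store_latency(read_advance: dict) -> dict:
    """Latency seen by MyReadStoreVector for each write under a given table."""
    return {w: max(0, WRITE_LATENCY - read_advance.get(w, 0)) for w in WRITES}


# Explicit zero-cycle entries versus no entries at all.
with_zero_entries = store_latency({w: 0 for w in WRITES})
without_entries = store_latency({})

if __name__ == "__main__":
    print(with_zero_entries == without_entries)
```

Either way the store keeps the full 6-cycle latency under this model, so the explicit record would only serve as documentation.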
Garfee Guan via llvm-dev
2018-Nov-17 02:31 UTC
[llvm-dev] Per-write cycle count with ReadAdvance - Do I really need that?
Thanks Andrew.

I have tried a recent tblgen, and ReadAdvance still does not work with multiple latencies for the same read. Maybe I should improve tblgen myself if Pierre-Andre no longer has the change.

However, I am a little curious about the situation I have run into. Hardware forwarding may fail for different reasons, so different register reads may have different latencies, depending on both the register reader and the writer. I am new to tblgen, so I wonder whether any other target already has another way to describe this.

On Fri, Nov 16, 2018, 8:00 AM Andrew Trick <atrick at apple.com> wrote:

> I’m not sure if the TableGen bug mentioned below was ever fixed.
>
> It looks to me like this should work, but I haven’t tried it:
>
> def : WriteRes<WriteVector, [MyArchVALU]> { let Latency = 6; }
> def MyWriteAddVector : SchedWriteRes<[MyArchVALU]> { let Latency = 6; }
> def MyWriteMulVector : SchedWriteRes<[MyArchVALU]> { let Latency = 6; }
>
> // Forward from a vector op (normal, add, mul) to a non-store.
> def : ReadAdvance<MyReadVector, 5, [WriteVector]>;
> def : ReadAdvance<MyReadVector, 3, [MyWriteAddVector]>;
> def : ReadAdvance<MyReadVector, 1, [MyWriteMulVector]>;
>
> [...]
>
> -Andy
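
For readers weighing the two encodings proposed in the 2015 message, here is a small sketch (hypothetical helpers, not tblgen's implementation and not the attached patch) showing that a per-write cycle list and multiple records for the same read can both be expanded into one (read, write) -> cycles table. It assumes conflicting duplicates and mismatched list lengths should be rejected.

```python
# Hypothetical helpers; they only model the two proposed encodings.

def expand_records(records):
    """Expand (read, cycles, [writes]) records, one per ReadAdvance def."""
    table = {}
    for read, cycles, writes in records:
        for write in writes:
            key = (read, write)
            if key in table and table[key] != cycles:
                raise ValueError(f"conflicting ReadAdvance for {key}")
            table[key] = cycles
    return table


def expand_cycle_list(read, cycles_list, writes):
    """Expand one record whose cycles are a list parallel to the writes."""
    if len(cycles_list) != len(writes):
        raise ValueError("cycle list and write list must have the same length")
    return {(read, w): c for w, c in zip(writes, cycles_list)}


if __name__ == "__main__":
    multi = expand_records([
        ("ReadFoo", 2, ["WriteType1"]),
        ("ReadFoo", 4, ["WriteType2"]),
        ("ReadFoo", 3, ["WriteType3"]),
    ])
    single = expand_cycle_list(
        "ReadFoo", [2, 4, 3], ["WriteType1", "WriteType2", "WriteType3"]
    )
    print(multi == single)
```

The length check in the list form is the kind of bookkeeping the multi-record form avoids, which matches the readability argument made in the 2015 message.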