On Oct 6, 2011, at 12:42 PM, David A. Greene wrote:> Jakob Stoklund Olesen <jolesen at apple.com> writes: > >> On Oct 6, 2011, at 7:59 AM, David A. Greene wrote: >> >>> For example, I want to be able to do this: >>> >>> defm MOVH : >>> vs1x_fps_binary_vv_node_rmonly< >>> 0x16, "movh", undef, 0, >>> // rr >>> [(undef)], >>> // rm >>> [(set DSTREGCLASS:$dst, >>> (DSTTYPE (movlhps SRCREGCLASS:$src1, >>> (DSTTYPE (bitconvert >>> (v2f64 (scalar_to_vector >>> (loadf64 addr:$src2))))))))], >>> // rr Pat >>> [], >>> // rm Pat >>> [[(DSTTYPE (movlhps SRCREGCLASS:$src1, (load addr:$src2))), >>> (MNEMONIC SRCREGCLASS:$src1, addr:$src2)], >>> [(INTDSTTYPE (movlhps SRCREGCLASS:$src1, (load addr:$src2))), >>> (MNEMONIC SRCREGCLASS:$src1, addr:$src2)]]>; >> >> This kind of thing is very hard to read and understand. > > What's hard about it? I'm not trying to be agitational here. I'm truly > wondering what I can do to make this more understandable.If you didn't write these patterns yourself, or if you wrote them six months ago, it is nearly impossible to figure out where a specific pattern came from, or where a specific instruction is defined. It is hard enough mentally executing the current multiclasses. Injecting patterns into multi defms like this makes it much harder still.> I am certainly happy to make things more readable and welcome > lots of feedback in that area. But the ability to quickly and easily > extend the ISA for new vector lengths is critical to us.This is where our priorities differ. Readability and maintainability are key. After all, we need to fix isel and codegen bugs more often than Intel and AMD add ISA extensions. /jakob
My purpose is to eliminate copy-paste style of programming in td files as much as possible, but only to a point that the new language constructs do not create too much overhead/readability-downgrade. In other words, I am targeting those low-hanging fruit of copy-paste programmings in td files that are eliminated by a simple for-loop syntax. The repetitive patterns I observed in PTX backend (and probably some other backends) are: * Members of a multiclass, such as those in PTXInstrInfo.td. * Consecutive similar defs, such as register declarations. [Why for-loop?] * All I need is a simple iteration language construct. I believe there is no simpler constructs (in terms of readability) than a for-loop, but I am happy to be convinced otherwise. * It is sufficient to eliminate the most common copy-paste programming that I observed. * It is simple enough to understand and maintain, at least I believe so. [Why preprocessor?] I admit that a preprocessor is probably not the best solution. And we can implement for-loop without a preprocessor. The only reason I chose a preprocessor is because (I believe) this would add least changes to the TGParser.cpp. The TGParser.cpp as its current form parses and emits the results in one-pass. That means it would emit the for-loop body even before we are done parsing the entire for-loop. So I believe a non-preprocessor approach would require 2 passes. The first pass parses the input and generates a simple syntax tree, and the second pass evaluate the syntax tree and emits output records (In fact, this is how I implemented the current preprocessor). And I believe that changing TGParser.cpp to accommodate 2 passes is quite a lot, and so I chose a preprocessor. But if you think we should really rewrite TGParser.cpp to parse and evaluate for-loops correctly, I am glad that we could get away with a preprocessor. [Why NOT while-loop?] * A while-loop requires evaluating an loop-condition expression; this is complexity that I would like to avoid. [Why NO if-else?] * It requires at least evaluating a Boolean expression, too. * If a piece of td codes is complicated enough that we need an if-else to eliminate its duplication, I think it is worthy of the duplication. [Why NO abstractions (like `define foo(a, b, c)`)?] * Abstractions is probably worthy of, but I am not sure yet. I think we could wait until it is clear that we really need abstractions. Hi Dave and Jakob, Thanks for comments. I try my best to respond any comments you wrote about. If I missed any comments, as they are really a lot, please let me know. [string vs token] The preprocessor (as its current form) has tokens by default, and it only converts a series of tokens and white spaces into a string if explicitly required (by a special escape #"#, see example below). ---------------------------------------- #for i = sequence(0, 127) def P#i# : PTXReg<#"#p#i##"#>; #end ---------------------------------------- * Anything between #"# are quoted, including white spaces and non-tokens. E.g., #"#hello world#"# --> "hello world" * Macro variable needs a '#' character at both front and back. This looks like the multiclass's #NAME# substitution, and so I think is more consistent than prepending a single '#' at front. * So my current idea is very similar to Dave's, except that I replace string with tokens (i.e., having both iterators as tokens and paste "operator" results as tokens). What do you think? Which one is more readable to you? !case<> or #"# or ... ? [Can the for-loop proopsal be a preprocessing phase?] I guess the example Dave gave (see below) cannot be handled in a (even extended) preprocessor. I am not keen on implementing for-loop in a preprocessor. I chose a preprocessor because I think it would cause least impact to the codebase and, to be honest, I didn't address of the pattern that Dave gave in his example in my design. I was trying to avoid variable-length lists because I think that is too complicated to users. But I could be wrong. ---------------------------------------- multiclass blah<list<int> Values> { for v = Values { def DEF#v : base_class<v>; } } ---------------------------------------- No preprocessor seems to have another syntactical benefits --- we could remove extra '#' characters. To be honest, those '#' are not very nice looking. And Dave's example looks cleaner than my excess-'#' style. Hi Dave, I am not sure what you want to play around with, but you are not disrupting anything so far. Regards, Che-Liang On Fri, Oct 7, 2011 at 5:37 AM, Jakob Stoklund Olesen <jolesen at apple.com> wrote:> > On Oct 6, 2011, at 12:42 PM, David A. Greene wrote: > >> Jakob Stoklund Olesen <jolesen at apple.com> writes: >> >>> On Oct 6, 2011, at 7:59 AM, David A. Greene wrote: >>> >>>> For example, I want to be able to do this: >>>> >>>> defm MOVH : >>>> vs1x_fps_binary_vv_node_rmonly< >>>> 0x16, "movh", undef, 0, >>>> // rr >>>> [(undef)], >>>> // rm >>>> [(set DSTREGCLASS:$dst, >>>> (DSTTYPE (movlhps SRCREGCLASS:$src1, >>>> (DSTTYPE (bitconvert >>>> (v2f64 (scalar_to_vector >>>> (loadf64 addr:$src2))))))))], >>>> // rr Pat >>>> [], >>>> // rm Pat >>>> [[(DSTTYPE (movlhps SRCREGCLASS:$src1, (load addr:$src2))), >>>> (MNEMONIC SRCREGCLASS:$src1, addr:$src2)], >>>> [(INTDSTTYPE (movlhps SRCREGCLASS:$src1, (load addr:$src2))), >>>> (MNEMONIC SRCREGCLASS:$src1, addr:$src2)]]>; >>> >>> This kind of thing is very hard to read and understand. >> >> What's hard about it? I'm not trying to be agitational here. I'm truly >> wondering what I can do to make this more understandable. > > If you didn't write these patterns yourself, or if you wrote them six months ago, it is nearly impossible to figure out where a specific pattern came from, or where a specific instruction is defined. > > It is hard enough mentally executing the current multiclasses. Injecting patterns into multi defms like this makes it much harder still. > >> I am certainly happy to make things more readable and welcome >> lots of feedback in that area. But the ability to quickly and easily >> extend the ISA for new vector lengths is critical to us. > > This is where our priorities differ. > > Readability and maintainability are key. > > After all, we need to fix isel and codegen bugs more often than Intel and AMD add ISA extensions. > > /jakob > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >
Jakob Stoklund Olesen <jolesen at apple.com> writes:>>> This kind of thing is very hard to read and understand. >> >> What's hard about it? I'm not trying to be agitational here. I'm truly >> wondering what I can do to make this more understandable. > > If you didn't write these patterns yourself, or if you wrote them six > months ago, it is nearly impossible to figure out where a specific > pattern came from, or where a specific instruction is defined.I haven't had that problem. Sure, I wrote the pattern but I'm really not getting _what's_ hard about it. It looks just like any other pattern with the types replaced by parameters. In this example, all of the MOVH[PS][SD] patterns are defined right there, in one place. IMHO this is much better than having the various patterns scattered all over the .td file.> It is hard enough mentally executing the current multiclasses. > Injecting patterns into multi defms like this makes it much harder > still.So this:>>>> // rm Pat >>>> [[(DSTTYPE (movlhps SRCREGCLASS:$src1, (load addr:$src2))), >>>> (MNEMONIC SRCREGCLASS:$src1, addr:$src2)], >>>> [(INTDSTTYPE (movlhps SRCREGCLASS:$src1, (load addr:$src2))), >>>> (MNEMONIC SRCREGCLASS:$src1, addr:$src2)]]>;is exactly the same patterns you see in some Pat<> defs today, except those patterns are often completely separated from the "primary" pattern defined by the defm (MOVH[PS][SD] in this case). I find it easier to keep things in one place.>> I am certainly happy to make things more readable and welcome lots >> of feedback in that area. But the ability to quickly and easily >> extend the ISA for new vector lengths is critical to us. > > This is where our priorities differ. > > Readability and maintainability are key.Yes, they are very important.> After all, we need to fix isel and codegen bugs more often than Intel > and AMD add ISA extensions.We need to do both. Less code generally implies easier maintenance. I've maintained way too much cut-n-paste code in my career and it's always a waste of time. I am very open to reworking things to be more readable and maintainable. I think that the for loop extension along with Bruno's enhacements to allow multiclass defs to inherit from multiclasses will go a long way to making that happen. These are both things that were on my TableGen wishlist. Give me a chance to try some things out and report back. Fair? -Dave
Che-Liang Chiou <clchiou at gmail.com> writes:> My purpose is to eliminate copy-paste style of programming in td files > as much as possible, but only to a point that the new language > constructs do not create too much overhead/readability-downgrade.Yes!> In other words, I am targeting those low-hanging fruit of copy-paste > programmings in td files that are eliminated by a simple for-loop > syntax. The repetitive patterns I observed in PTX backend (and > probably some other backends) are: > * Members of a multiclass, such as those in PTXInstrInfo.td. > * Consecutive similar defs, such as register declarations.Yep.> [Why for-loop?] > * All I need is a simple iteration language construct. I believe there > is no simpler constructs (in terms of readability) than a for-loop, > but I am happy to be convinced otherwise. > * It is sufficient to eliminate the most common copy-paste programming > that I observed. > * It is simple enough to understand and maintain, at least I believe so.I mostly agree with these. One other thing I've found useful is the ability to abstract out the type information. For example, in x86 land, an SSE/AVX add is always: (set (type regclass:reg), (type (add (type regclass:reg), (type regclass:reg)))) Similarly, a sub is: (set (type regclass:reg), (type (sub (type regclass:reg), (type regclass:reg)))) In fact most binary operations are: (set (type regclass:reg), (type (op (type regclass:reg), (type regclass:reg)))) So why write hundreds of patterns to express this? Using the for-loop syntax: // WARNING: Pseudo-code, many details elided for presentation purposes. multiclass binop<opcode> : sse_binop<opcode>, avx_binop<opcode>; multiclass sse_binop<opcode> { for type = [f32, f64, v4f32, v2f64] regclass = [FP32, FP64, VR128, VR128] suffix = [ss, sd, ps, pd] { def !toupper(suffix)#rr : Instr< [(set (type regclass:$dst), (type (opcode (type regclass:$src1), (type regclass:$src2))))]>; def !toupper(suffix)#rm : Instr< [(set (type regclass:$dst), (type (opcode (type regclass:$src1), (type addr:$src2))))]>; } } multiclass avx_binop<opcode> { for type = [f32, f64, v4f32, v2f64, v8f32, v4f64] regclass = [FP32, FP64, VR128, VR128, VR256, VR256] prefix = [x, x, x, x, y, y] suffix = [ss, sd, ps, pd] { def V#prefix#NAME#!toupper(suffix)#rr : Instr< [(set (type regclass:$dst), (type (opcode (type regclass:$src1), (type regclass:$src2))))]>; def V#prefix#NAME#!toupper(suffix)#rm : Instr< [(set (type regclass:$dst), (type (opcode (type regclass:$src1), (type addr:$src2))))]>; } } def ADD : binop<add>; def SUB : binop<add>; def MUL : binop<add>; def DIV : binop<add>; [...] Here I am treating "#" as an infix !strconcat. This makes things much easier to read than both !strconcat() and a double-# notation, IMHO. Now each binary pattern is only specified twice and even that duplication can be eliminated with a little more abstraction. Perhaps that's not worth it, however. I can live with this level of duplication. I would also like to replace #NAME# with a "real" Record field that is automatically added to def-type Records. This removes a lot of the hackiness of #NAME#.> [Why preprocessor?] > > The TGParser.cpp as its current form parses and emits the results in > one-pass. That means it would emit the for-loop body even before we > are done parsing the entire for-loop.It doesn't have to. I'm working on a for loop that is not preprocessor based. It uses the same technique as the multiclass: remember the most inner for loop seen and instantiate everything after the entire for loop body is parsed.> So I believe a non-preprocessor approach would require 2 passes. The > first pass parses the input and generates a simple syntax tree, and > the second pass evaluate the syntax tree and emits output records (In > fact, this is how I implemented the current preprocessor). And I > believe that changing TGParser.cpp to accommodate 2 passes is quite a > lot, and so I chose a preprocessor.In the long run, a two-pass parser would probably be more maintainable but I agree it's a big job. That's why I went with a compromise of treating for like a sort of multiclass, at least in the instantiation mechanics. I'm still working on it so I haven't got all the details together yet.> But if you think we should really rewrite TGParser.cpp to parse and > evaluate for-loops correctly, I am glad that we could get away with a > preprocessor.I think we can do it without completely rewriting the parser.> [Why NOT while-loop?] > * A while-loop requires evaluating an loop-condition expression; this > is complexity that I would like to avoid.I agree. It's not needed. Simple iteration over a list is enoguh.> [Why NO if-else?] > * It requires at least evaluating a Boolean expression, too. > * If a piece of td codes is complicated enough that we need an if-else > to eliminate its duplication, I think it is worthy of the duplication.Perhaps. I'm sort of on the fence on this. I don't like the !if stuff I introduced for readability reasons. An imperitive-style if might make things easier but I think I could live with the duplication.> [Why NO abstractions (like `define foo(a, b, c)`)?] > * Abstractions is probably worthy of, but I am not sure yet. I think > we could wait until it is clear that we really need abstractions.I'm not sure what you mean by this abstraction. Can you elaborate?> [string vs token] > > The preprocessor (as its current form) has tokens by default, and it > only converts a series of tokens and white spaces into a string if > explicitly required (by a special escape #"#, see example below).On further thought, I think this is correct. My plan with the parser-integrated for is to allow the user to declare the type of the iterator. It solves a number of problems, most importantly TableGen's declaration before use requirement.> ---------------------------------------- > #for i = sequence(0, 127) > def P#i# : PTXReg<#"#p#i##"#>; > #end > ---------------------------------------- > * Anything between #"# are quoted, including white spaces and > non-tokens. E.g., #"#hello world#"# --> "hello world"As mentioned above, I've been thinking about making "#" an infix equivalent to !strconcat(). This would require the parser to implicitly cast values to string if necessary but that's not hard to do.> * Macro variable needs a '#' character at both front and back. This > looks like the multiclass's #NAME# substitution, and so I think is > more consistent than prepending a single '#' at front.To handle existing #NAME# constructs, I was planing to define a trailing # as a !strconcat(string, ""). This keeps consistency and provides more flexibility to the # operator.> What do you think? Which one is more readable to you? !case<> or #"# or ... ?I like an infix #. It's consistent with at least some other languages.> [Can the for-loop proopsal be a preprocessing phase?] > > I guess the example Dave gave (see below) cannot be handled in a (even > extended) preprocessor. I am not keen on implementing for-loop in a > preprocessor. I chose a preprocessor because I think it would cause > least impact to the codebase and, to be honest, I didn't address of > the pattern that Dave gave in his example in my design. I was trying > to avoid variable-length lists because I think that is too complicated > to users. But I could be wrong.I think it could be if not done carefully. I am already getting feedback that some of my proposals are too complicated. I am really taking that to heart and trying to find a good balance among maintainability, clarity and redundancy. I think things can be better than they are now and better than what I've proposed so far. No one gets the right answer in a vacuum. :) -Dave