Evan Cheng <evan.cheng at apple.com> writes:> David, we cannot accept the 'multidef' keyword. Please revert it.Working on it now.> We appreciate you thinking ahead about MIC, but we are against the > massive refactoring and complicated abstraction scheme. We'll never > accept those patches.How about a less massive and complicated scheme? I think we can make some good improvements to the current spec that will help with the MIC work.> Your proposed new TableGen functionalities are interesting but it is > clearly not where the code owners want it to go.Jakob at least seems interested in the for loop stuff. Am I reading you correctly, Jakob? Having that feature would make a huge difference. Frankly, the way things are with the x86 SIMD stuff is not scalable. I'd like to clean some of that up, in an incremental way, of course. -Dave
On Oct 7, 2011, at 11:23 AM, David A. Greene wrote:> Evan Cheng <evan.cheng at apple.com> writes: > >> David, we cannot accept the 'multidef' keyword. Please revert it. > > Working on it now. > >> We appreciate you thinking ahead about MIC, but we are against the >> massive refactoring and complicated abstraction scheme. We'll never >> accept those patches. > > How about a less massive and complicated scheme? I think we can > make some good improvements to the current spec that will help > with the MIC work.What do you have in mind? I don't think anyone is arguing that tblgen as-is is some paragon of virtue or anything. There's definitely room for improvements.>> Your proposed new TableGen functionalities are interesting but it is >> clearly not where the code owners want it to go. > > Jakob at least seems interested in the for loop stuff. Am I reading you > correctly, Jakob? Having that feature would make a huge difference. > > Frankly, the way things are with the x86 SIMD stuff is not scalable. > I'd like to clean some of that up, in an incremental way, of course. > > -Dave > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
On Oct 7, 2011, at 11:23 AM, David A. Greene wrote:> Evan Cheng <evan.cheng at apple.com> writes: > >> David, we cannot accept the 'multidef' keyword. Please revert it. > > Working on it now.Thanks.> >> We appreciate you thinking ahead about MIC, but we are against the >> massive refactoring and complicated abstraction scheme. We'll never >> accept those patches. > > How about a less massive and complicated scheme? I think we can > make some good improvements to the current spec that will help > with the MIC work.I'd like to defer MIC work until it's finalized. But yes, incremental refinement is always welcome.> >> Your proposed new TableGen functionalities are interesting but it is >> clearly not where the code owners want it to go. > > Jakob at least seems interested in the for loop stuff. Am I reading you > correctly, Jakob? Having that feature would make a huge difference. > > Frankly, the way things are with the x86 SIMD stuff is not scalable. > I'd like to clean some of that up, in an incremental way, of course.It's not like we are against improvements that make td files better. But there is fine balance we should strive for. Readability, maintainability, and ease of debugging are very important. Thanks, Evan> > -Dave
Evan Cheng <evan.cheng at apple.com> writes:> On Oct 7, 2011, at 11:23 AM, David A. Greene wrote: > >> Evan Cheng <evan.cheng at apple.com> writes: >> >>> David, we cannot accept the 'multidef' keyword. Please revert it. >> >> Working on it now. > > Thanks.No problem. I agree that multidef was sort of a half-baked idea. I did it for practical purposes here but Che-Liang developed a much better, mroe general solution to the problem.>> How about a less massive and complicated scheme? I think we can >> make some good improvements to the current spec that will help >> with the MIC work. > > I'd like to defer MIC work until it's finalized. But yes, incremental > refinement is always welcome.Ok. I'd like to do some of this incremental work to prepare for MIC. I will do this in small batches and please do keep telling me when I'm heading off track. :)>> Frankly, the way things are with the x86 SIMD stuff is not scalable. >> I'd like to clean some of that up, in an incremental way, of course. > > It's not like we are against improvements that make td files > better. But there is fine balance we should strive for. Readability, > maintainability, and ease of debugging are very important.Absolutely. I don't think I've ever claimed that what I did for AVX here should go whole hog into upstream. I certainly never had that view in my head. On the contrary, I learned a lot doing it and know all the mistakes and skeletons quite well. :) I'd would like to use that experience to inform how we might structure the x86 specification a bit better to make maintenance, readability and debuggability better. I plan to continue to work on a non-preprocessor for loop because I see immediate value to it, it is much clearer that what I have currently and it allows incremental improvements to existing code. My overarching goal is to get our trees in sync before heavy-duty AVX2 or MIC work begins. -Dave
On Oct 7, 2011, at 11:23 AM, David A. Greene wrote:> Evan Cheng <evan.cheng at apple.com> writes: > >> Your proposed new TableGen functionalities are interesting but it is >> clearly not where the code owners want it to go. > > Jakob at least seems interested in the for loop stuff. Am I reading you > correctly, Jakob? Having that feature would make a huge difference.I think the for loops have merit, but not the way you want to use them. Some target descriptions have many sequential definitions, for example PowerPC/PPCRegisterInfo.td: // Vector registers def V0 : VR< 0, "v0">, DwarfRegNum<[77, 77]>; def V1 : VR< 1, "v1">, DwarfRegNum<[78, 78]>; ... def V30 : VR<30, "v30">, DwarfRegNum<[107, 107]>; def V31 : VR<31, "v31">, DwarfRegNum<[108, 108]>; Such 'embarrassingly sequential' definitions have no value, and I would welcome a for-loop construct that can create them. I think this was the primary motivation behind Che-Liang's proposal. I do NOT want to factor out redundancy from instruction definitions like this: multiclass PTX_FLOAT_3OP<string opcstr> { def rr32 : InstPTX<(outs RegF32:$d), (ins RndMode:$r, RegF32:$a, RegF32:$b), !strconcat(opcstr, "$r.f32\t$d, $a, $b"), []>; def ri32 : InstPTX<(outs RegF32:$d), (ins RndMode:$r, RegF32:$a, f32imm:$b), !strconcat(opcstr, "$r.f32\t$d, $a, $b"), []>; def rr64 : InstPTX<(outs RegF64:$d), (ins RndMode:$r, RegF64:$a, RegF64:$b), !strconcat(opcstr, "$r.f64\t$d, $a, $b"), []>; def ri64 : InstPTX<(outs RegF64:$d), (ins RndMode:$r, RegF64:$a, f64imm:$b), !strconcat(opcstr, "$r.f64\t$d, $a, $b"), []>; } As repeated many times on this thread, the most common operation that a .td file must support is looking up an instruction and figuring out what its properties are and where they came from. Adding further abstractions to factor out redundancy works against that. Instruction definitions are hard enough to read as is. The direction you are going would make it much worse. /jakob
Jim Grosbach <grosbach at apple.com> writes:>>> We appreciate you thinking ahead about MIC, but we are against the >>> massive refactoring and complicated abstraction scheme. We'll never >>> accept those patches. >> >> How about a less massive and complicated scheme? I think we can >> make some good improvements to the current spec that will help >> with the MIC work. > > What do you have in mind? I don't think anyone is arguing that tblgen > as-is is some paragon of virtue or anything. There's definitely room > for improvements.Two things immediately: 1. Make '#' (paste) a first-class citizen. Make it an infix version of !strconcat(). This helps #2 by making use of the iterated values much simpler. I also think it's good in its own right as it eliminates a special case from the TableGen parser. 2. Add the for loop capability integrated into the Parser. Getting to #1: The first step will be to modify the parser to get rid of the #NAME# hack and turn paste into a first-class operator. To transition things, I propose that we temporarily add a built-in field in each Record created by a defm called "NAME" with the value being the string parsed as the defm ID. Then parser lookup will find the right value for everywhere #NAME# gets used. Since NAME has a somewhat large potential for name conflict with some user-created code, I would as a second step transition uses of #NAME# to something like #@DEFM#. The @ sign makes it unlikely to clash with a user name and DEFM is more precise than NAME. The built-in value NAME will be named "@DEFM." I am totally open to a different naming convention. Getting to #2: This is pretty straightforward for for loops in multiclasses. I will steal the instantiation work I did for multidef as the mechanics are exactly the same. Some parser enhancements are all that's needed to implement for loops in multiclasses. I am pretty far along in this already. I'm going to design the parsing of for loops so that they can occur almost anywhere and operate basically the same way everywhere. I think all that will be needed is to carry a bit more state around in the ForLoop object created (field names seen in a class body, for example). I think it's an incremental change for each kind of thing we'd like to iteratively produce. Those are the TableGen changes that I would find more immediately useful. If it works out like I'm envisioning, we can also get rid of the more complicated operators (!foreach(), etc.). Longer Term: Beyond that, what I'd like to do is take the x86 SSE/AVX specification as it is now and try to remove some of the cut-n-paste aspect without abstracting things too much so that meaning is obscured. I'm going to rely on the community to check those changes and make suggestions if they see problems. My primary goal here is to stop the exponential growth of the x86 SIMD .td files as Intel/AMD keep adding longer vector lengths. Ideally, they would not grow much at all as new vector types are added. I've come pretty close to that here and the for loop stuff will make it even easier to accomplish and with more clarity. So that's the 10,000' summary. HTH -Dave
Jakob Stoklund Olesen <jolesen at apple.com> writes:> I think the for loops have merit, but not the way you want to use them. > > Some target descriptions have many sequential definitions, for example PowerPC/PPCRegisterInfo.td: > > // Vector registers > def V0 : VR< 0, "v0">, DwarfRegNum<[77, 77]>; > def V1 : VR< 1, "v1">, DwarfRegNum<[78, 78]>; > ... > def V30 : VR<30, "v30">, DwarfRegNum<[107, 107]>; > def V31 : VR<31, "v31">, DwarfRegNum<[108, 108]>; > > Such 'embarrassingly sequential' definitions have no value, and I > would welcome a for-loop construct that can create them. I think this > was the primary motivation behind Che-Liang's proposal.Ok. We agree here.> I do NOT want to factor out redundancy from instruction definitions like this: > > multiclass PTX_FLOAT_3OP<string opcstr> { > def rr32 : InstPTX<(outs RegF32:$d), > (ins RndMode:$r, RegF32:$a, RegF32:$b), > !strconcat(opcstr, "$r.f32\t$d, $a, $b"), []>; > def ri32 : InstPTX<(outs RegF32:$d), > (ins RndMode:$r, RegF32:$a, f32imm:$b), > !strconcat(opcstr, "$r.f32\t$d, $a, $b"), []>; > def rr64 : InstPTX<(outs RegF64:$d), > (ins RndMode:$r, RegF64:$a, RegF64:$b), > !strconcat(opcstr, "$r.f64\t$d, $a, $b"), []>; > def ri64 : InstPTX<(outs RegF64:$d), > (ins RndMode:$r, RegF64:$a, f64imm:$b), > !strconcat(opcstr, "$r.f64\t$d, $a, $b"), []>; > } > > As repeated many times on this thread, the most common operation that > a .td file must support is looking up an instruction and figuring out > what its properties are and where they came from.Ok. What properties are most important to look up? I have found it really easy to just run "tblgen X86.td" and look at the output. That's really the canonical definition, right? Yes, I know it's an extra step but if one really can't puzzle out where an instruction came from, it's an easy one. To me, this: !strconcat(opcstr, "$r.f32\t$d, $a, $b"), []>; !strconcat(opcstr, "$r.f32\t$d, $a, $b"), []>; !strconcat(opcstr, "$r.f64\t$d, $a, $b"), []>; !strconcat(opcstr, "$r.f64\t$d, $a, $b"), []>; is dangerously redundant. It's the patterns that are hardest to write so I'd rather get them right once or a few times rather than get them wrong in lots of places and have to go hunt and fix them. This could be mitigated somewhat by doing something like this: class binary_pattern<string opcstr, string type> { string pattern = !strconcat(opcstr, "$r."#type#"\t$d, $a, $b"); } multiclass PTX_FLOAT_3OP<string opcstr> { def rr32 : InstPTX<(outs RegF32:$d), (ins RndMode:$r, RegF32:$a, RegF32:$b), binary_pattern<opcstrm "f32">.pattern, []>; def ri32 : InstPTX<(outs RegF32:$d), (ins RndMode:$r, RegF32:$a, f32imm:$b), binary_pattern<opcstrm "f32">.pattern, []>; def rr64 : InstPTX<(outs RegF64:$d), (ins RndMode:$r, RegF64:$a, RegF64:$b), binary_pattern<opcstrm "f64">.pattern, []>; def ri64 : InstPTX<(outs RegF64:$d), (ins RndMode:$r, RegF64:$a, f64imm:$b), binary_pattern<opcstrm "f64">.pattern, []>; } Would something like that be acceptable? This is even more important for many of the Pat<> patterns which tend to be quite complex. I have found several cases where Pat<> patterns existed for some types of, say, add operation but not others, for no apparent reason other than someone added one and forgot to add the others. Pat<> patterns is the whole reason multidefs got developed in the first place. I wanted to define Pat<> patterns right in the same place the "ordinary" patterns got defined. There's much less chance of missing something or writing inconsistent patterns that way. For x86, I also find the encoding details at the top level very distracting. I was confused for months about what they meant and I didn't care. I just wanted to generate assembly code. SSE/AVX has a very regular encoding and I think it would be valuable to express that a bit more concisely. Some of that is done now but I think it could be better. I have a less well-defined idea of what that looks like as it's one of the things I don't particularly like about what I developed here. -Dave