On May 4, 2011, at 8:23 AM, Evan Cheng wrote:

>> I don't know how realistic it is to model the loop buffer in the register allocator, but this would be a very interesting thing to try to optimize for in a later pass. If an inner loop "almost" fits, then it would probably be worth heroic effort to try to reduce its size to shave off a few bytes.
>
> Jakob and I have talked about this briefly. The idea is to insert a copy from the larger register to the smaller register in the loop preheader and change the references inside the loop. However, the reality is that this is a lot harder than it sounds.
>
> 1. When is this profitable? We can model the size of the loop buffer, but this also depends on loop alignment. We may have to align small loops on a 32-byte boundary to get this right.
>
> 2. When should we do this? I was thinking of a late pass, but it's harder to reason about dependencies once register allocation is done. Perhaps it should be done before the virtual registers are rewritten?
>
> 3. How restrictive is the optimization? What if the register is updated inside the loop? We have to be careful about instructions requiring specific register inputs. What if the register is updated and live out of the loop? We would need to copy it back at loop exits.
>
> We should start by tackling the restrictive forms of the problem. That should fix a number of cases where linear scan got lucky. I am sure there are a lot more interesting cases, though.

Yep, doing this as a late pass, and having it only handle the "easy and obvious" cases, would make a lot of sense. I bet that some non-trivial number of the major speedups that Jakob is seeing are due to "greedy getting lucky". The bad thing about this is that relying on luck makes performance analysis extremely difficult. It is much better to actively control these things where possible, so that we both get good performance all the time and have more useful and comparable perf analysis. I know you already know this :)

-Chris
On Apr 13, 2011, at 21:43 CDT, John McCall wrote:

> And it's okay to have limited goals! I personally don't; I think we
> should aim to get the IR design good enough to support crazy resumptive
> languages with crazy custom unwinding schemes. But I need to know what
> range of problems we're willing to consider solving before I can usefully
> weigh different solutions.

I will dare a comment on a topic well over my head, so my answer will probably only reflect my ignorance of the depth of the subject; in any case, I hope to get some clarification of my concepts.

Let me say more before I get to the idea of this post: I've used coroutines a lot in my C++ programming, and certainly when you work with a hammer for a long time, everything starts looking like a nail.

However, it has always seemed to me that exception handling (at least in C++) is just a particular syntax for a subset of coroutine semantics:

    void f() { throw 1; }
    void h();
    void g() {
      try {
        f();
      } catch (...) {
        h();
      }
    }

Assuming a hypothetical Coroutine object that supports manual cleanup of stack-allocated objects, the code above is semantically equivalent to:

    struct CoroutineStack {
      Coroutine* context;
      CoroutineStack* next;
    };

    Coroutine* currentContext;
    CoroutineStack* nextCatchContextLink;

    void f() {
      currentContext->yieldTo(nextCatchContextLink->context);
    }
    void h();
    void g() {
      Coroutine tryContext([&] { f(); });
      Coroutine catchContext([&] { h(); });

      // Define a new link to store the current catch context,
      // which points to outer catch contexts in this same stack.
      CoroutineStack catchContextLink;
      catchContextLink.context = &catchContext;
      catchContextLink.next = nextCatchContextLink;
      nextCatchContextLink = &catchContextLink;

      tryContext();
    }

If a rethrow happens in the catch context, the code in the catch needs to change slightly:

    Coroutine catchContext([&] {
      h();
      nextCatchContextLink = nextCatchContextLink->next;
    });

I don't know much about DWARF and other EH schemes, and I really can't say what a personality function or a landing pad can do that you can't express with coroutine semantics, so forgive me if the above makes large oversights or is just plain silly.

Does this make sense?
On May 5, 2011, at 10:50 PM, Charllls Alquarra wrote:

> On Apr 13, 2011, at 21:43 CDT, John McCall wrote:
>> And it's okay to have limited goals! I personally don't; I think we
>> should aim to get the IR design good enough to support crazy resumptive
>> languages with crazy custom unwinding schemes. But I need to know what
>> range of problems we're willing to consider solving before I can usefully
>> weigh different solutions.
>
> I will dare a comment on a topic well over my head, so my answer will probably only reflect my ignorance of the depth of the subject; in any case, I hope to get some clarification of my concepts.
>
> Let me say more before I get to the idea of this post: I've used coroutines a lot in my C++ programming, and certainly when you work with a hammer for a long time, everything starts looking like a nail.
>
> However, it has always seemed to me that exception handling (at least in C++) is just a particular syntax for a subset of coroutine semantics:

This is certainly one way of implementing exceptions. Unfortunately, it would be prohibitively expensive to create a new coroutine every time a new EH scope is entered, much worse than other explicit scope-tracking implementations like SEH or builtin_sj/lj.

That said, lowering IR from a "conventional" style, where cleanups appear as normal code and control dependencies are expressed directly with normal-ish control flow, to the form required by an SEH-like implementation would indeed require operations not fundamentally dissimilar from coroutine extraction.

John.