tfpt review "/shelveset:Loops20;REDMOND\tomat"

Implements adaptive loop compilation. This feature needed major changes to local variable handling and to the control flow implementation in the interpreter.

Local variables

Replaces the list of local variables with a LocalsVariable structure that encapsulates a dictionary. It doesn't support variable shadowing yet, but it at least detects shadowing and throws NotSupportedException. Previously we silently used wrong indices into the variable array.

Control flow

Reimplements interpreter goto instructions and exception handling.

Goto instructions used to encode all the information describing the jump (a list of finally blocks to be executed and the target stack depth). The loop compiler needs to find all GotoExpressions within the loop that jump out of the loop and associate them with the corresponding Goto instructions. This cannot be done in the presence of reducible nodes, as they don't preserve node identity. Therefore we need to move the jump information from the goto instruction to the target label and track the current try and finally blocks.

GotoInstruction, EnterTryFinallyInstruction and LeaveExceptionHandlerInstruction now derive from IndexedBranchInstruction. Whereas OffsetInstruction holds a relative offset, these instructions hold the index of the target label in the table of RuntimeLabels. The RuntimeLabel struct consists of the target instruction index, the target stack depth and the target continuation stack depth. That is all that is needed for a jump to be executed. Jumps via label index are a little slower than jumps to a relative offset, since they need to look up the target index in the label table. Also, the label table has only as many entries as there are gotos and try-catch/try-finally blocks in the lambda. We can easily convert other branch instructions into IndexedBranchInstructions if we find that works better.

Using indexed branch instructions moves the target stack depth to the label. We also need to move the finally list out of the goto instruction.
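To make the difference between the two kinds of branch concrete, here is a minimal Python model. It is not the interpreter's C# code. The RuntimeLabel fields follow the description above; Frame, offset_branch and indexed_branch are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RuntimeLabel:
    index: int                     # target instruction index
    stack_depth: int               # data stack depth at the target
    continuation_stack_depth: int  # continuation stack depth at the target


@dataclass
class Frame:  # hypothetical stand-in for InterpretedFrame
    instruction_index: int = 0
    data: list = field(default_factory=list)


def offset_branch(frame: Frame, offset: int) -> None:
    """Relative jump: the instruction itself carries the offset."""
    frame.instruction_index += offset


def indexed_branch(frame: Frame, labels: list, label_index: int) -> None:
    """Indexed jump: one extra table lookup, then the label supplies the rest."""
    label = labels[label_index]
    del frame.data[label.stack_depth:]   # unwind the data stack to the target depth
    frame.instruction_index = label.index


# One label shared by any number of gotos that target it.
labels = [RuntimeLabel(index=7, stack_depth=1, continuation_stack_depth=0)]
frame = Frame(instruction_index=3, data=[10, 20, 30])
indexed_branch(frame, labels, 0)
```

Because the label rather than the instruction owns the target data, a goto only needs to carry an integer, which is what lets the jump information survive node reduction.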
Since a single label might be the target of multiple goto instructions/expressions, and these could be nested in different try-finally blocks, we need to track the stack of finally blocks that we enter and leave as we execute instructions.

EnterTryFinallyInstruction is added at the beginning of every try-finally block. This instruction pushes a local continuation onto the stack of continuations stored on InterpretedFrame. The top item of this stack is the current continuation. A continuation is implemented as an integer index into the label table. The continuation pushed by EnterTryFinally points to the finally clause.

GotoInstruction sets the current pending continuation and the pending value (if it transfers a value) and jumps to the current continuation if there is one. A GotoInstruction is emitted at the end of the try-finally body. This goto's target is the end of the entire try expression.

EnterFinallyInstruction is emitted at the beginning of the finally clause. It removes the current continuation from the continuation stack, pushes the pending continuation and value onto the data stack and invalidates them. If any exception is thrown but not caught during execution of the finally clause, the current pending continuation is canceled (and forgotten) and a new one is set.

LeaveFinallyInstruction is emitted at the end of the finally clause. It pops the pending continuation (and pending value) from the data stack and yields to it. The YieldToPendingContinuation operation compares the continuation stack depth of the current continuation with the continuation stack depth of the pending one. It jumps to the pending one only if its depth is less, i.e. when there is no continuation (finally clause) to be executed before we can jump to the target block. Otherwise it jumps to the current continuation.

Whenever an exception occurs we catch it in the Interpreter.Run method. We look for the exception handler that should be executed.
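The try-finally protocol described above (EnterTryFinally, Goto, EnterFinally, LeaveFinally) can be modeled in a few lines of Python. This is a sketch under assumptions, not the actual implementation: the class and method names are hypothetical, pushing the transferred value at the final target is not modeled, and the depth test encodes one reading of the rule, namely that a finally clause must run first whenever more continuations are active than the pending target expects.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RuntimeLabel:
    index: int
    stack_depth: int
    continuation_stack_depth: int


NO_CONTINUATION = -1  # hypothetical marker for "nothing pending"


@dataclass
class Frame:  # hypothetical stand-in for InterpretedFrame
    labels: list
    data: list = field(default_factory=list)
    continuations: list = field(default_factory=list)
    pending_continuation: int = NO_CONTINUATION
    pending_value: object = None

    def yield_to_pending_continuation(self) -> int:
        pending = self.labels[self.pending_continuation]
        if pending.continuation_stack_depth < len(self.continuations):
            # Assumed rule: an enclosing finally clause has to run first.
            target = self.labels[self.continuations[-1]]
        else:
            target = pending
        del self.data[target.stack_depth:]
        return target.index  # next instruction index

    def enter_try_finally(self, finally_label: int) -> None:
        self.continuations.append(finally_label)

    def goto(self, label_index: int, value=None) -> int:
        self.pending_continuation = label_index
        self.pending_value = value
        return self.yield_to_pending_continuation()

    def enter_finally(self) -> None:
        self.continuations.pop()
        self.data.append(self.pending_continuation)
        self.data.append(self.pending_value)
        self.pending_continuation = NO_CONTINUATION
        self.pending_value = None

    def leave_finally(self) -> int:
        self.pending_value = self.data.pop()
        self.pending_continuation = self.data.pop()
        return self.yield_to_pending_continuation()


# Labels: 0 = outer finally, 1 = inner finally, 2 = target outside both trys.
frame = Frame(labels=[RuntimeLabel(100, 0, 0),
                      RuntimeLabel(50, 0, 1),
                      RuntimeLabel(200, 0, 0)])
frame.enter_try_finally(0)
frame.enter_try_finally(1)
trace = [frame.goto(2)]             # diverted to the inner finally
frame.enter_finally()
trace.append(frame.leave_finally())  # diverted to the outer finally
frame.enter_finally()
trace.append(frame.leave_finally())  # finally reaches the real target
```

In this model a goto out of two nested try blocks visits the inner finally, then the outer finally, and only then the target, without the goto itself carrying a list of finally blocks.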
If we find a handler, we perform the same steps as if we had just executed a GotoInstruction targeting the exception handler: we set the current pending continuation to the label that points to the handler and set the pending value to the exception object. Finally, we jump to the current continuation. If there is no catch or fault handler, we do the same as if there were one with instruction index Int32.MaxValue. That emulates a jump to the end of the instruction sequence. If this jump is not interrupted by another exception raised from some finally/fault block, or by a goto jumping out of a finally block, we finish instruction execution and return from the Run method with the current InstructionIndex set to the special value Int32.MaxValue. That indicates that we should rethrow the exception, and so we do.

Moves InterpretedFrame chaining from IronRuby to the interpreter. The frames are linked into a stack by the Interpreter.Run method so that each CLR frame of this method corresponds to an interpreted stack frame in the interpreted stack. The two traces can be combined into one. A static ThreadLocal<InterpretedFrame> variable is updated upon entry to and exit from the Run method.

Loop compiler

Adds a new EnterLoopInstruction that is injected at the beginning of a loop generated from a LoopExpression. This instruction has a counter that is incremented each time the instruction is executed. If the counter reaches CompilationThreshold, a compilation is started on a background thread. The instruction holds the LoopExpression to compile.

The loop needs to be massaged before we can compile it to a lambda. The lambda we produce looks like:

    int lambda(InterpretedFrame frame) {
        T$1 loc$1 = (T$1)frame.Data[$index1];
        ...
        T$n loc$n = (T$n)frame.Data[$indexN];

        StrongBox<object> closure_loc$1 = frame.Closure[$index1];
        ...
        StrongBox<object> closure_loc$M = frame.Closure[$indexM];

        try {
            ...
            loc$1 = value
            ...
            ...
            closure_loc$1.Value = (object)value;
            ...
            return frame.Goto(labelIndex, value);  // for each goto label (value), where label is outside the loop
        } finally {
            // write back
            frame.Data[$index1] = (object)loc$1;
        }
        return $breakOffset;
    }

When the lambda is ready, the EnterLoopInstruction is replaced by a CompiledLoopInstruction that holds a delegate to the compiled lambda and calls it upon execution.

Perf impact

With compilation disabled, interpreter throughput on Pystone is about 5% worse with this change. About 1% is due to tracking the interpreted stack chain; the rest is probably due to the more expensive try-finally blocks (the continuation stack is allocated, continuations are pushed/popped on entry to and exit from try and finally blocks, etc.).

-X:NoAdaptiveCompilation is now better than adaptive compilation by only 4-7% (for compilation thresholds 2 and 32, respectively); it used to be about 4 times better.

Misc

Special-cases adaptive compilation for CompilationThreshold 0 and 1. In both cases the compilation is synchronous. This allows us to easily test and debug the loop compiler and the lambda compiler.

Implements an instruction provider for FinallyFlowControlExpression - the interpreter handles jumps from finally directly, so we don't need to rewrite the tree.

FlowControlRewriter should reduce all extensible nodes within the tree. It might miss some goto expressions or finally clauses otherwise (e.g. { label: try { REDUCIBLE } finally { REDUCIBLE; } }, where any of the REDUCIBLEs reduces to "goto label").

Ruby, Python:

CatchBlock defines a scope for its exception variable, which wasn't taken into account in the Python and Ruby AST generators and rewriters. They declared the variable in the containing block, duplicating the variable definition and depending on variable shadowing. Removes the duplicate declarations.

Removes the "compileLoops" argument passed to LightCompile. All loops are adaptively compiled now.
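As a rough model of the counter-and-replace behaviour described in the loop compiler and Misc sections, the following Python sketch may help. It is an illustration under assumptions, not the C# implementation: the constructor arguments, the compile_fn callback, the worker attribute and the returned offsets are all hypothetical, and thresholds 0 and 1 are treated as "compile synchronously on first execution".

```python
import threading


class CompiledLoopInstruction:
    """Holds the compiled loop (a callable here) and calls it when executed."""

    def __init__(self, compiled):
        self.compiled = compiled

    def run(self, frame):
        return self.compiled(frame)


class EnterLoopInstruction:
    """Counts executions and swaps itself for a compiled loop at the threshold."""

    def __init__(self, instructions, index, loop, compile_fn, threshold):
        self.instructions = instructions  # the instruction array we live in
        self.index = index                # our own slot in that array
        self.loop = loop                  # stands in for the LoopExpression
        self.compile_fn = compile_fn      # hypothetical loop compiler entry point
        self.threshold = threshold
        self.counter = 0
        self.worker = None

    def _compile(self):
        compiled = self.compile_fn(self.loop)
        # Replace ourselves in the instruction array.
        self.instructions[self.index] = CompiledLoopInstruction(compiled)

    def run(self, frame):
        if self.threshold <= 1:
            # Thresholds 0 and 1: compile synchronously, then run compiled code.
            self._compile()
            return self.instructions[self.index].run(frame)
        self.counter += 1
        if self.counter == self.threshold:
            self.worker = threading.Thread(target=self._compile, daemon=True)
            self.worker.start()
        return 1  # keep interpreting: fall through to the loop body


def fake_compile(loop):
    # Hypothetical compiler: the "lambda" just returns a break offset of 42.
    return lambda frame: 42


instructions = [None]
enter = EnterLoopInstruction(instructions, 0, "loop", fake_compile, threshold=3)
instructions[0] = enter
interpreted = [instructions[0].run(None) for _ in range(3)]
enter.worker.join()                  # wait for the background compilation
compiled_result = instructions[0].run(None)
```

The interpreter keeps executing the interpreted loop while the background compilation runs, and only later executions pick up the compiled instruction.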
Python:

Adds missing debug info around for-loop initialization (see test_traceback.py run:test_throw_while_yield).

Increases the test_memory limit to 18k since the loop is adaptively compiled now. We might want to disable adaptive compilation during this test.

Disables test_dict.py run:test_container_iterator. Filed bug: http://ironpython.codeplex.com/WorkItem/View.aspx?WorkItemId=25419

Disables test_traceback.py run:test_throw_while_yield. Filed bug: http://ironpython.codeplex.com/WorkItem/View.aspx?WorkItemId=25428

Ruby:

Fixes mangling of the "me" name. Disables one test case in core/kernel/caller_spec.rb. The behavior that made this test accidentally pass was incorrect.

Tomas

Attachment: Loops20.diff (application/octet-stream, 85345 bytes)
<http://rubyforge.org/pipermail/ironruby-core/attachments/20091124/f7a6223c/attachment-0001.obj>