This is a simple SSA code generation 101 question.

If I follow the IR code generation techniques in the Dragon book, the statement

    x = y + z

would translate into something like this in SSA/LLVM:

    %0 = add %y, %z
    %x = %0

Obviously "copy instructions" like %foo = %bar are senseless in SSA, since %foo and %bar are immutably fixed to the same value and there is no need for two aliases for the same thing (this is my own observation; please tell me if my thinking is off).

What are the general code generation techniques to avoid "copy instructions"?

For example, the simple code generation methods that yield the translation above might look like the following:

    Value *AddExpression::codeGen() {
        Value *l = left->codeGen();
        Value *r = right->codeGen();
        Value *result = new TempValue;  // get unique temporary
        emit(result->str() + " = add " + l->str() + ", " + r->str());
        return result;
    }

    Value *assignExpression::codeGen() {
        Value *rval = rvalue->codeGen();
        Value *lval = new NameValue(ident);
        emit(lval->str() + " = " + rval->str());  // emit (silly) copy instruction
        return lval;
    }

What I have suggested to my students is to omit the (non-existent) copy instruction and use the "rval" above as a replacement for all future occurrences of "ident", i.e., something like the following:

    Value *assignExpression::codeGen() {
        Value *rval = rvalue->codeGen();
        // update symbol table so that all future references to "ident"
        // are replaced with rval
        return rval;
    }

Using this scheme, the following

    x = y + z
    u = x * y + foo(x)

would be translated into

    %0 = add %y, %z
    %1 = mul %0, %y
    %2 = call foo(%0)
    %3 = add %1, %2

Is there a more obvious approach to avoiding "copy instructions"?

--w

Wayne O. Cochran
Clinical Assistant Professor, Computer Science
Washington State University Vancouver
wcochran at vancouver.wsu.edu
http://ezekiel.vancouver.wsu.edu/~wayne
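
[Editor's sketch] The symbol-table scheme described in the question can be made concrete with a small string-based generator. This is only an illustration under stated assumptions: `CodeGen`, `use`, `binop`, `call`, `assign`, and `demo` are hypothetical names that do not come from the thread or from LLVM, and SSA values are represented as plain strings rather than `Value` objects.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical mini code generator: an SSA value is just its name as a string.
struct CodeGen {
    std::vector<std::string> out;               // emitted instructions, in order
    std::map<std::string, std::string> symtab;  // source variable -> current SSA value
    int nextTemp = 0;

    std::string fresh() { return "%" + std::to_string(nextTemp++); }

    // Reading a variable yields whatever SSA value it is currently bound to.
    std::string use(const std::string &ident) {
        auto it = symtab.find(ident);
        assert(it != symtab.end() && "use of undefined variable");
        return it->second;
    }

    // A binary operation defines a fresh temporary.
    std::string binop(const std::string &op, const std::string &l,
                      const std::string &r) {
        std::string t = fresh();
        out.push_back(t + " = " + op + " " + l + ", " + r);
        return t;
    }

    // A one-argument call also defines a fresh temporary.
    std::string call(const std::string &fn, const std::string &arg) {
        std::string t = fresh();
        out.push_back(t + " = call " + fn + "(" + arg + ")");
        return t;
    }

    // Assignment emits nothing: it only rebinds the name in the symbol table.
    std::string assign(const std::string &ident, const std::string &rval) {
        symtab[ident] = rval;
        return rval;
    }
};

// Generates code for:  x = y + z;  u = x * y + foo(x)
std::vector<std::string> demo() {
    CodeGen g;
    g.symtab["y"] = "%y";  // assumed incoming values
    g.symtab["z"] = "%z";
    g.assign("x", g.binop("add", g.use("y"), g.use("z")));
    std::string prod = g.binop("mul", g.use("x"), g.use("y"));
    std::string res = g.call("foo", g.use("x"));
    g.assign("u", g.binop("add", prod, res));
    return g.out;
}
```

Calling `demo()` yields the same four instructions listed in the question, with no copy. Note that this rebinding trick is sufficient only for straight-line code; once a variable is assigned on different control-flow paths, the merged value needs a phi node (or the memory-based approach) at the join point.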
On Fri, Apr 22, 2011 at 10:40 AM, Wayne Cochran <wcochran at vancouver.wsu.edu> wrote:
> Is there a more obvious approach to avoiding "copy instructions"?

The recommended approach to generating LLVM IR is simply not to try to generate code in SSA form; see http://llvm.org/docs/tutorial/LangImpl7.html#memory .

-Eli
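
[Editor's sketch] The memory-based approach that the linked tutorial chapter describes gives each mutable variable a stack slot, turns every assignment into a store and every read into a load, and leaves it to LLVM's mem2reg pass to promote the slots back into SSA registers. The sketch below shows the shape of the front end's output only. `MemCodeGen`, `declare`, `load`, `store`, and `demoMem` are hypothetical names, and the emitted text is simplified, untyped pseudo-IR in the style used in this thread, not valid LLVM assembly.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical emitter for the memory-based scheme (simplified, untyped IR text).
struct MemCodeGen {
    std::vector<std::string> out;  // emitted instructions, in order
    std::set<std::string> slots;   // variables that already have a stack slot
    int nextTemp = 0;

    std::string fresh() { return "%" + std::to_string(nextTemp++); }

    static std::string addr(const std::string &ident) {
        return "%" + ident + ".addr";
    }

    // Reserve one stack slot per variable, the first time it is declared.
    void declare(const std::string &ident) {
        if (slots.insert(ident).second)
            out.push_back(addr(ident) + " = alloca");
    }

    // Every read of a variable becomes a load from its slot.
    std::string load(const std::string &ident) {
        assert(slots.count(ident) && "load from undeclared variable");
        std::string t = fresh();
        out.push_back(t + " = load " + addr(ident));
        return t;
    }

    // Every assignment becomes a store; no SSA bookkeeping is needed.
    void store(const std::string &ident, const std::string &value) {
        assert(slots.count(ident) && "store to undeclared variable");
        out.push_back("store " + value + ", " + addr(ident));
    }

    std::string binop(const std::string &op, const std::string &l,
                      const std::string &r) {
        std::string t = fresh();
        out.push_back(t + " = " + op + " " + l + ", " + r);
        return t;
    }
};

// Generates code for:  x = y + z;  u = x * y   (with %y and %z assumed given)
std::vector<std::string> demoMem() {
    MemCodeGen g;
    g.declare("x");
    g.declare("u");
    g.store("x", g.binop("add", "%y", "%z"));
    std::string x = g.load("x");
    g.store("u", g.binop("mul", x, "%y"));
    return g.out;
}
```

The appeal of this design is that the front end never has to track which SSA value a variable currently holds or where phi nodes belong. In a real LLVM front end the allocas would normally be placed in the function's entry block so that mem2reg can promote them.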
Wayne,

The short answer is that "copy instructions" (of which LLVM has none, outside of ADDs with 0, ORs with 0, etc.) never show up in SSA form. You would have to specifically force your compiler to generate such instructions. I've never read the Dragon book, but I expect such instruction sequences are there only for instructional purposes, not as anything a real compiler would ever generate. The only place copy instructions should be inserted is in lowering PHI instructions, which is often heavily coupled with register operations.

It looks as if you are giving exactly the right advice, as far as replacing the entry in the symbol table. As far as I know, this is not just the most obvious way to do it, but the *only* way when generating SSA form. The only time an assignment expression should involve generating a copy (move) is if you are directly emitting a linear IR (or machine code). Java and .NET both do this.

Hope this helps.

-Joshua
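
[Editor's sketch] To illustrate the remark about PHI lowering: when a phi node is taken out of SSA form, each incoming value is copied into the phi's destination at the end of the corresponding predecessor block. The following is a deliberately naive sketch. `Phi`, `lowerPhi`, and the `copy` mnemonic are hypothetical, and it ignores the ordering hazards (such as the "lost copy" and "swap" problems) and critical edges that a production implementation has to handle.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical phi representation: dest = phi [value, predecessor block], ...
struct Phi {
    std::string dest;
    std::vector<std::pair<std::string, std::string>> incoming;  // (value, block)
};

// Returns, for each predecessor block, the copies to append before its
// terminator so that the phi itself can be deleted.
std::map<std::string, std::vector<std::string>> lowerPhi(const Phi &phi) {
    std::map<std::string, std::vector<std::string>> copies;
    for (const auto &in : phi.incoming)
        copies[in.second].push_back(phi.dest + " = copy " + in.first);
    return copies;
}
```

For a phi such as `%x = phi [%a, then], [%b, else]`, this places `%x = copy %a` in block `then` and `%x = copy %b` in block `else`. This is the one point where copies reappear, after SSA construction rather than during it.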
It is my understanding that the alloca memory routines are used for forcing variables to be allocated on the stack frame, which you would want for source-level debugging. When SSA registers are used, LLVM will decide what goes into registers and what will spill over to the stack frame. I want the latter.

--w

Wayne O. Cochran
Assistant Professor, Computer Science
wcochran at vancouver.wsu.edu

-----Original Message-----
From: Eli Friedman [mailto:eli.friedman at gmail.com]
Sent: Fri 4/22/2011 5:53 PM
To: Cochran, Wayne Owen
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] copy instructions

> The recommended approach to generating LLVM IR is simply not to try to
> generate code in SSA form; see
> http://llvm.org/docs/tutorial/LangImpl7.html#memory .
>
> -Eli