Hi everybody,

I'm currently looking at LLVM as a possible back-end to a dynamic programming system (in the tradition of Smalltalk) we are developing. I have read most of the llvmdev archives, and I'm aware that some things are 'planned' but not implemented yet. We are willing to contribute the code we'll need for our project, but before I can start coding I'll have to submit to my boss a very concrete proposal on which changes I'll make and how long they're going to take.

So before I can present a concrete proposal, I have some doubts on the design of LLVM and on how some particular constructs should be mapped onto its bytecode representation. Please understand that these questions are not intended to criticize LLVM, but instead to better my understanding of it.

1. Opcodes and intrinsics

Which are the differences between opcodes and intrinsics? How is it determined, for an operation, to implement it as an opcode or as an intrinsic function? As I understand it, compilation passes can both lower intrinsics into opcodes and also replace opcode sequences, so in the end some of them are interchangeable. For example, why is there a store opcode and a llvm_gcwrite intrinsic? Couldn't the front-end just produce stores/volatile stores and then a compilation pass transform them into a write-barrier if necessary?

A possible view of intrinsics could be "operations that don't depend on the target architecture, but instead on the language runtime". But then wouldn't malloc/free be intrinsics?

2. Stack and registers

As the LLVM instruction set has a potentially infinite number of registers which are mapped onto target registers or the stack by the register allocator, why is there a separate stack? I would understand it, if the stack was more accessible, as a way to implement closures, but it has been repeated here that the correct way to do that is to use heap-allocated structures, as functions can't access other functions' stacks. Is it to signal locations that need to be changed in-place?

3. Control transfer

Why are the control transfer operations so high level when compared to actual processors? Usually processors have instructions to jump to a concrete location and everything else is managed by the compiler (saving into the stack, getting result parameters, etc.) depending on the language's calling conventions. In LLVM there's just one way to transfer control, and the only proposal I've seen (http://nondot.org/sabre/LLVMNotes/CustomCallingConventions.txt) keeps this high level. What are the difficulties in having low level transfer control operations, with explicitly managed arguments, saving registers, etc?

Well, that's all for now. Thanks in advance,

Marc Ordinas i Llopis | Tragnarion Studios
On Fri, Oct 22, 2004 at 03:18:00PM +0200, Marc Ordinas i Llopis wrote:
> I'm currently looking at LLVM as a possible back-end to a dynamic
> programming system (in the tradition of Smalltalk) we are developing.

Neat!

> I have read most of the llvmdev archives, and I'm aware that some
> things are 'planned' but not implemented yet. We are willing to
> contribute the code we'll need for our project, but before I can start
> coding I'll have to submit to my boss a very concrete proposal on
> which changes I'll make and how long they're going to take.

Sounds good!

> 1. Opcodes and intrinsics
>
> Which are the differences between opcodes and intrinsics? How is it
> determined, for an operation, to implement it as an opcode or as an
> intrinsic function?

Opcodes are operations that are usually implemented in one or just a few machine instructions, i.e. things that have a direct correlation to a machine ISA, such as 'add', 'mul', 'div', etc. Intrinsics are operations that provide a more complex interaction with the processor (or the language runtime), where the implementation would be so different from one platform to the next that it would be more than just a "few instructions". An example would be the llvm.gc* intrinsics that you mention, which are for garbage-collected languages. Other intrinsics include debugging support, etc.

> As I understand it, compilation passes can both lower intrinsics into
> opcodes and also replace opcode sequences, so in the end some of them
> are interchangeable.

That's not really correct. Intrinsics such as llvm.frameaddress and llvm.returnaddress have no equivalents in LLVM opcodes -- the meaning of these intrinsics is specifically machine-dependent, and LLVM (and its opcodes) is machine-independent, so there is no valid interchange of these intrinsics with any combination of LLVM opcodes.
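To make the distinction concrete, here is a minimal sketch in 2004-era LLVM assembly (pre-2.0 syntax with '%' names, 'sbyte*', and 'uint'; exact details approximate, and the function name is invented for illustration). The intrinsic's result depends on the machine's call-frame layout, so no sequence of target-independent opcodes can compute it:

```llvm
; Sketch only: llvm.returnaddress yields a machine-level value (the
; caller's return address) that ordinary, machine-independent LLVM
; opcodes have no way to express.
declare sbyte* %llvm.returnaddress(uint)

sbyte* %who_called_me() {
    %ra = call sbyte* %llvm.returnaddress(uint 0)
    ret sbyte* %ra
}
```

By contrast, an 'add' opcode has the same meaning on every target; only its encoding differs.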
The llvm.memcpy intrinsic can be lowered into an explicit LLVM loop, but the intrinsic provides a succinct representation of a memory copy and so stays as such. Plus, the reverse analysis -- proving that a loop is essentially a memcpy() -- is harder to do.

> For example, why is there a store opcode and a llvm_gcwrite intrinsic?

The 'store' opcode is for generic writes; gcwrite is specifically for garbage collection. I think someone else will be able to give you more information about the GC intrinsics than I can, however.

> Couldn't the front-end just produce stores/volatile stores and then a
> compilation pass transform them into a write-barrier if necessary?

Again, see above: gcwrite is not a "generic store" in the way that the 'store' opcode is, nor is it a barrier.

> A possible view of intrinsics could be "operations that don't depend
> on the target architecture, but instead on the language runtime". But
> then wouldn't malloc/free be intrinsics?

Good question. Due to the amount of pointer/data analysis in LLVM, it is often necessary to consider memory allocation instructions to see where memory is allocated and deallocated. As such, the malloc/free instructions provide a higher-level construct that makes it easier to manipulate memory-operating instructions, since that is done so often. Consider: with an llvm.malloc intrinsic, one would have to say something like this every time we wanted to analyze memory usage:

  if (CallInst *CI = dyn_cast<CallInst>(I))
    if (CI->getCalledFunction()->getName() == "malloc") {
      // ...
    }

whereas with the malloc instruction raising/lowering pass, you would say:

  if (MallocInst *MI = dyn_cast<MallocInst>(I)) {
    // ...
  }

Given the prevalence of such code, the second way is more efficient and cleaner to write. Note that LLVM will work just fine if you use direct calls to malloc() and never create malloc instructions.

> 2. Stack and registers
>
> As the LLVM instruction set has a potentially infinite number of
> registers which are mapped onto target registers or the stack by the
> register allocator, why is there a separate stack?

The stack is used for two things: storing structures allocated with alloca(), and automatic variables declared to be function-local. The spilling of registers is done automatically (transparently), but the allocation of DATA is explicit in LLVM, in order to analyze pointer aliases.

> I would understand it, if the stack was more accessible, as a way to
> implement closures, but it has been repeated here that the correct way
> to do that is to use heap-allocated structures, as functions can't
> access other functions' stacks. Is it to signal locations that need to
> be changed in-place?

I'm sorry, I don't quite understand this question.

> 3. Control transfer
>
> Why are the control transfer operations so high level when compared to
> actual processors?

Because LLVM is target-independent. It is the job of the backend code generator (see llvm/lib/Target/*) to convert the high-level control transfer 'call' instruction (see below) to the stack-pop/push or register-passing method of the target. These details are largely different from one machine to the next and have no relevance to LLVM analyses.

> Usually processors have instructions to jump to a concrete location
> and everything else is managed by the compiler (saving into the stack,
> getting result parameters, etc.) depending on the language's calling
> conventions.

LLVM has the 'call' instruction that abstracts the target machine's calling convention. See http://llvm.cs.uiuc.edu/docs/LangRef.html#i_call for more information.

> In LLVM there's just one way to transfer control, and
> the only proposal I've seen
> (http://nondot.org/sabre/LLVMNotes/CustomCallingConventions.txt) keeps
> this high level.

This is to have more options for calling conventions that may be specified for a target that are non-standard. Mostly these are optimizations (e.g., some registers do not need to be saved if calling a particular library routine, etc.).

> What are the difficulties in having low level transfer control
> operations, with explicitly managed arguments, saving registers, etc?

Choosing any lower-level mechanism would make LLVM no longer target-independent. Choosing to have the union of all machines is nearly impossible, overly complicated, and unnecessary for any LLVM-to-LLVM analyses and transformations. What is it that you are looking to express that isn't captured by the 'call' instruction?

Hope that helps,

--
Misha Brukman :: http://misha.brukman.net :: http://llvm.cs.uiuc.edu
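The 'call' abstraction described above can be seen in a minimal fragment, written here in 2004-era LLVM syntax (details approximate; function names invented for illustration). Whether the arguments travel in registers or on the stack is decided entirely by the target's code generator; the IR producer writes one 'call' form for every platform:

```llvm
; Sketch only: the same 'call' instruction works unchanged whether the
; target passes %a and %b in registers, on the stack, or otherwise.
int %callee(int %a, int %b) {
    %sum = add int %a, %b
    ret int %sum
}

int %caller() {
    %r = call int %callee(int 10, int 32)
    ret int %r
}
```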
On Fri, 2004-10-22 at 06:18, Marc Ordinas i Llopis wrote:
> Hi everybody,

Hi Marc

> I'm currently looking at LLVM as a possible back-end to a dynamic
> programming system (in the tradition of Smalltalk) we are developing.

Great!

> I have read most of the llvmdev archives, and I'm aware that some things
> are 'planned' but not implemented yet. We are willing to contribute the
> code we'll need for our project, but before I can start coding I'll have
> to submit to my boss a very concrete proposal on which changes I'll make
> and how long they're going to take.
>
> So before I can present a concrete proposal, I have some doubts on the
> design of LLVM and on how some particular constructs should be mapped
> onto its bytecode representation. Please understand that these questions
> are not intended to criticize LLVM, but instead to better my
> understanding of it.

Okay, I'll take a crack at it. Others will probably want to give you better answers than mine :)

> 1. Opcodes and intrinsics
>
> Which are the differences between opcodes and intrinsics? How is it
> determined, for an operation, to implement it as an opcode or as an
> intrinsic function?

The opcodes are generally fixed, as they represent the LLVM mid-level IR. Changing the opcode set can have wide-reaching impact on all the analysis, transform, and codegen passes in LLVM, so it's not a change taken lightly. However, when it makes sense, opcodes are added occasionally. For example, we recently added the "unreachable" instruction, which allows a front-end compiler to identify code locations that should not be reached. This can help some of the passes.

As for intrinsics, these are basically function calls that LLVM knows about. For example, things like memset and memcpy could be implemented as a function but could also be implemented with direct code if the processor supports it. Intrinsics are placeholders for either code generation or invocation of a runtime library function. This is probably where you'd want to extend.

> As I understand it, compilation passes can both lower intrinsics into
> opcodes and also replace opcode sequences, so in the end some of them
> are interchangeable. For example, why is there a store opcode and a
> llvm_gcwrite intrinsic? Couldn't the front-end just produce
> stores/volatile stores and then a compilation pass transform them into a
> write-barrier if necessary?

I believe the llvm_gcwrite intrinsic is for garbage collection, which is entirely optional. If all stores were to have write-barrier semantics, there could be significant performance penalties.

> A possible view of intrinsics could be "operations that don't depend on
> the target architecture, but instead on the language runtime". But then
> wouldn't malloc/free be intrinsics?

Intrinsics are intended to be replaceable by the target's code generation, if possible/necessary, or emulated with a function call if not. Language runtimes should be just that: calls to functions located in libraries.

> 2. Stack and registers
>
> As the LLVM instruction set has a potentially infinite number of
> registers which are mapped onto target registers or the stack by the
> register allocator, why is there a separate stack? I would understand
> it, if the stack was more accessible, as a way to implement closures,
> but it has been repeated here that the correct way to do that is to use
> heap-allocated structures, as functions can't access other functions'
> stacks. Is it to signal locations that need to be changed in-place?

I'm not sure what you mean by a "separate stack". Register allocation can spill registers to THE stack. I'll let someone more knowledgeable about this answer.

> 3. Control transfer
>
> Why are the control transfer operations so high level when compared to
> actual processors? Usually processors have instructions to jump to a
> concrete location and everything else is managed by the compiler (saving
> into the stack, getting result parameters, etc.) depending on the
> language's calling conventions. In LLVM there's just one way to transfer
> control, and the only proposal I've seen
> (http://nondot.org/sabre/LLVMNotes/CustomCallingConventions.txt) keeps
> this high level. What are the difficulties in having low level transfer
> control operations, with explicitly managed arguments, saving registers,
> etc?

I don't think there are any difficulties in having low-level control transfer operations; I think it's more that we don't want them. The point behind the mid-level SSA IR is to make it simple to generate code from a compiler front end and to not restrict choices that could be made by analysis, transform, and codegen passes. The fact that we have a relatively high level for control transfer operations goes right along with the rest of the LLVM IR (e.g. gep, call, typed operators, malloc, free). The point is to make a simple, consistent, small IR with the following goals:

* front ends can easily generate code for it, because the number of instructions is small and the model is simple;
* analyses and transforms are simplified, because they don't have to reason over the mind-numbing complexity of lower-level instruction sets;
* code generation has significant freedom in the code it generates.

> Well, that's all for now. Thanks in advance,
>
> Marc Ordinas i Llopis | Tragnarion Studios

Thanks for the interesting questions!

Reid.
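Reid's example of the recently added "unreachable" instruction can be pictured with a small fragment in 2004-era LLVM syntax (details approximate; the function is invented for illustration). A front end marks a path it knows can never execute, which later passes may exploit or delete:

```llvm
; Sketch only: 'unreachable' tells the optimizers that control can
; never arrive at this point, without requiring a dummy return.
void %check(bool %ok) {
    br bool %ok, label %fine, label %impossible
fine:
    ret void
impossible:
    unreachable
}
```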
On Fri, 22 Oct 2004, Marc Ordinas i Llopis wrote:
> Hi everybody,

Hi!

> I'm currently looking at LLVM as a possible back-end to a dynamic
> programming system (in the tradition of Smalltalk) we are developing.

Very cool!

> I have read most of the llvmdev archives, and I'm aware that some things
> are 'planned' but not implemented yet. We are willing to contribute the
> code we'll need for our project, but before I can start coding I'll have
> to submit to my boss a very concrete proposal on which changes I'll make
> and how long they're going to take.

Ok.

> So before I can present a concrete proposal, I have some doubts on the
> design of LLVM and on how some particular constructs should be mapped
> onto its bytecode representation. Please understand that these questions
> are not intended to criticize LLVM, but instead to better my
> understanding of it.

Sure. Before we begin, let me point out that IR design is really a black art of balancing various forces. Adding stuff to the IR makes it more powerful but also more complex to deal with and process (making it more likely that bugs occur). However, we need to add things when there are features or capabilities that cannot be represented in LLVM. Finally, LLVM is not immutable: though it's quite stable now, we are always interested in feedback and ideas. :)

> 1. Opcodes and intrinsics
>
> Which are the differences between opcodes and intrinsics? How is it
> determined, for an operation, to implement it as an opcode or as an
> intrinsic function?

This is a hard thing to really decide, as there frankly isn't much difference. Adding intrinsics is far easier than adding instructions, but intrinsics cannot do some things (for example, they cannot be terminator instructions). I tend to prefer making only very simple and orthogonal operations instructions, relegating the rest to intrinsics. In practice, the biggest difference might be in the bytecode encoding: instructions are encoded more densely than intrinsics.

> As I understand it, compilation passes can both lower intrinsics into
> opcodes and also replace opcode sequences, so in the end some of them
> are interchangeable. For example, why is there a store opcode and a
> llvm_gcwrite intrinsic?

As pointed out by others, these are completely different operations. In particular, the gcwrite intrinsic is used by the front-end to indicate that a write barrier may be needed. Depending on the implementation of the garbage collector, this may expand to code or to just a normal store instruction.

> Couldn't the front-end just produce stores/volatile stores and then a
> compilation pass transform them into a write-barrier if necessary?

Sort of. The problem with this is that (without gcwrite) there is no way to identify the stores that should be turned into write barriers. In particular, the heap may have multiple parts to it, some of which are GC'd and some of which are not. For example, the .NET framework has different pointer types for managed and unmanaged pointers: without the ability to represent both in LLVM, we could not support it.

> A possible view of intrinsics could be "operations that don't depend on
> the target architecture, but instead on the language runtime".

There are at least these categories of intrinsics:

1. Target-specific intrinsics: these are things like llvm.returnaddress and the varargs stuff that are specific to code generators: there is simply no way to express this in LLVM without an intrinsic. These could be made into LLVM instructions, but doing so would provide no real advantage.

2. GC intrinsics: these are their own category because they require support from the LLVM code generator and the GC runtime library to interpret.

3. Extended language intrinsics: these include llvm.memcpy and friends. Note that the intrinsics are different (more powerful) than the libc equivalents, because they allow alignment info to be expressed.

> But then wouldn't malloc/free be intrinsics?

Heh, good question. This is largely historical, but you're right, they probably should have been made intrinsics. Perhaps in the future they will be converted to intrinsics, but for now they remain instructions.

> 2. Stack and registers
>
> As the LLVM instruction set has a potentially infinite number of
> registers which are mapped onto target registers or the stack by the
> register allocator,

Yes.

> why is there a separate stack? I would understand it, if the stack was
> more accessible, as a way to implement closures, but it has been
> repeated here that the correct way to do that is to use heap-allocated
> structures, as functions can't access other functions' stacks. Is it to
> signal locations that need to be changed in-place?

I'm not sure what you mean. In particular, the alloca instruction is used to explicitly allocate stack space. Because it is not possible to take the address of LLVM registers, this is the mechanism we use to allocate stack space. Certain languages do not need a stack, and thus do not need to use alloca; other languages (e.g. C) do. If you clarify your question I'll be able to give a more satisfactory answer.

> 3. Control transfer
>
> Why are the control transfer operations so high level when compared to
> actual processors? Usually processors have instructions to jump to a
> concrete location and everything else is managed by the compiler (saving
> into the stack, getting result parameters, etc.) depending on the
> language's calling conventions.

The idea is to make the producer of the LLVM code as independent from the target as possible. In particular, a front-end for a type-safe language should be able to produce a type-safe module that works on all targets.

> In LLVM there's just one way to transfer control, and the only proposal
> I've seen
> (http://nondot.org/sabre/LLVMNotes/CustomCallingConventions.txt) keeps
> this high level. What are the difficulties in having low level transfer
> control operations, with explicitly managed arguments, saving registers,
> etc?

I'm not sure what it is that you are trying to do. The abstraction provided by the call instruction is good for the optimizers and analyses (because they see exactly what they need, not extra details), good for compilation speed, and good for target independence. Custom calling conventions (link above) are specifically designed to give the code generator the flexibility to pick the most efficient calling convention for a particular call/return pair.

Given all of the above (efficient compiled code, easy-to-write analyses/xforms, happy front-ends, fast compile times), I'm not sure what your approach would give. Care to elaborate?

-Chris

--
http://llvm.org/
http://nondot.org/sabre/
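Chris's point about alloca can be illustrated with a fragment in 2004-era LLVM syntax (details approximate; the function names are invented). LLVM registers have no address, so a local whose address escapes must live in an alloca'd stack slot:

```llvm
; Sketch only: %slot is an addressable stack location whose pointer can
; be handed to another function -- something a plain SSA register
; cannot do.
declare void %init(int*)

int %local_addr() {
    %slot = alloca int            ; reserve one int on this frame's stack
    call void %init(int* %slot)   ; pass the slot's address away
    %v = load int* %slot
    ret int %v
}
```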
On Sat, 23 Oct 2004, Misha Brukman wrote:
> > A possible view of intrinsics could be "operations that don't depend
> > on the target architecture, but instead on the language runtime". But
> > then wouldn't malloc/free be intrinsics?
>
> Good question. Due to the amount of pointer/data analysis in LLVM, it
> is often necessary to consider memory allocation instructions to see
> where memory is allocated and deallocated. As such, the malloc/free
> instructions provide that higher-level construct that make it easier to
> manipulate memory-operating instructions since that is done so often.

This information could be provided by an intrinsic just as well.

> Consider: for an llvm.malloc intrinsic, one would have to say something
> like this every time we wanted to analyze memory usage:
>
>   if (CallInst *CI = dyn_cast<CallInst>(I))
>     if (CI->getCalledFunction()->getName() == "malloc") {
>       // ...
>     }
>
> whereas with the malloc instruction raising/lowering pass, you would say
>
>   if (MallocInst *MI = dyn_cast<MallocInst>(I)) {

Not true at all. Consider the "llvm/IntrinsicInst.h" header, which lets us write stuff like this, today:

  if (MemCpyInst *MCI = dyn_cast<MemCpyInst>(I)) {
    MCI->getSource();
    MCI->getLength();
    ...
  }

The one advantage that MallocInst has over using an intrinsic is that instructions can have different return types in various parts of the program (e.g., you can write 'malloc int' instead of '(int*)malloc(4)').

-Chris

--
http://llvm.org/
http://nondot.org/sabre/
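Chris's last point -- 'malloc int' versus '(int*)malloc(4)' -- looks roughly like this in 2004-era LLVM syntax (details approximate; function names invented for illustration):

```llvm
; Sketch only: the malloc *instruction* is typed, while a call to libc
; malloc returns an untyped byte pointer that must be cast.
declare sbyte* %malloc(uint)

int* %typed_alloc() {
    %p = malloc int                   ; %p has type int*, size implied
    ret int* %p
}

int* %untyped_alloc() {
    %q = call sbyte* %malloc(uint 4)  ; %q has type sbyte*
    %qi = cast sbyte* %q to int*      ; explicit cast needed
    ret int* %qi
}
```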
Thanks all for the fast answers, I'm certainly understanding LLVM better. Some more comments below:

Chris Lattner wrote:
> > Couldn't the front-end just produce stores/volatile stores and then a
> > compilation pass transform them into a write-barrier if necessary?
>
> Sort of. The problem with this is that (without gcwrite) there is no way
> to identify the stores that should be turned into write barriers. In
> particular, the heap may have multiple parts to it, some of which are GC'd
> and some are not. For example, the .NET framework has different pointer
> types for managed and unmanaged pointers: without the ability to represent
> both in LLVM, we could not support it.

Ok, I understand this now. Would allocation to different heaps be represented using different intrinsics, then?

Also, some answers were provided by Reid Spencer, so I'm merging the two threads here:

> Intrinsics are intended to be replaceable by the target's code
> generation, if possible/necessary, or emulated with a function call if
> not. Language runtimes should be just that: calls to functions located
> in libraries.

But sometimes it's more efficient to have knowledge of the runtime in the compiler, e.g. inlining method lookup. Would that be implemented using a lowering pass to substitute the function call with the runtime code?

> > 2. Stack and registers
>
> I'm not sure what you mean. In particular, the alloca instruction is used
> to explicitly allocate stack space. Because it is not possible to take
> the address of LLVM registers, this is the mechanism we use to allocate
> stack space. Certain languages do not need a stack, and thus do not need
> to use alloca; other languages (e.g. C) do. If you clarify your question
> I'll be able to give a more satisfactory answer.

So would it be possible to pass a pointer to a structure allocated on the stack to a called function? As to what I'm trying to do, more below.

Reid Spencer wrote:
> I'm not sure what you mean by a "separate stack". Register allocation
> can spill registers to THE stack. I'll let someone more knowledgeable
> about this answer.

I meant a stack separate from the infinite registers. I see now that we need a stack for taking the address of objects, thanks.

> > 3. Control transfer
>
> I'm not sure what it is that you are trying to do. The abstraction
> provided by the call instruction is good for the optimizers and analyses
> (because they see exactly what they need, not extra details), good for
> compilation speed, and good for target independence. Custom calling
> conventions (link above) are specifically designed to give the code
> generator the flexibility to pick the most efficient calling convention
> for a particular call/return pair.
>
> Given all of the above (efficient compiled code, easy-to-write
> analyses/xforms, happy front-ends, fast compile times), I'm not sure what
> your approach would give. Care to elaborate?

Right now, I'm just trying to think about how I'm going to map our language features onto LLVM, to see if we can use it as our back-end with as little modification as possible.

Thanks everyone,

Marc Ordinas i Llopis | Tragnarion Studios
Misha Brukman wrote:
> > 1. Opcodes and intrinsics
>
> That's not really correct. Intrinsics such as llvm.frameaddress and
> llvm.returnaddress have no equivalents in LLVM opcodes -- the meaning of
> these intrinsics is specifically machine-dependent, and LLVM (and its
> opcodes) is machine-independent, so there is no valid interchange of
> these intrinsics with any combination of LLVM opcodes.

Yes, I understand that those intrinsics are mapped differently on different machines, but isn't 'add' mapped differently too? It seems to me that those intrinsics are there to support the GNU C extensions, making them a 'language feature' of sorts. That's why they are intrinsics and not opcodes.

> > 3. Control transfer
>
> LLVM has the 'call' instruction that abstracts the target machine's
> calling convention. See http://llvm.cs.uiuc.edu/docs/LangRef.html#i_call
> for more information.

Yes, it abstracts the target's C calling convention. But I'm trying to see if other types of languages can be implemented using LLVM.

> What is it that you are looking to express that isn't captured by the
> 'call' instruction?

Tail calls, closures, continuations, lazy allocation... Basically I'm trying to see how I am going to implement high-level functions on LLVM.

> Hope that helps,

Certainly, thanks!

Marc Ordinas i Llopis | Tragnarion Studios