thr3ads.net - llvm dev - [LLVMdev] Inline Assembly (unique arch string for llvm) [Sep 2004]

Reid Spencer

2004-Sep-13 15:22 UTC

[LLVMdev] Inline Assembly

In order to get to the next stage with LLVM (like compiling a kernel) we
need to allow "pass through" of inline assembly so things like device
drivers, interrupt vectors, etc. can be written. While this feature
breaks the "pure" LLVM IR, I don't see any way around it. 

So, I thought I'd bring it up here so we can discuss potential
implementations.  I think we should take the "shoot yourself in the foot
approach". That is, we add an instruction type to LLVM that simply
encapsulates an assembly language statement. This instruction type is
just simply ignored (but retained) by all the optimization passes. When
code generation happens, the inline assembly is just blindly put out and
if the programmer has shot himself in the foot, so be it.

One other thing we can do that *might* be useful. If a function contains
only inline assembly instructions, we could circumvent the usual calling
conventions for that function.

Thoughts?

Reid.
John Criswell

2004-Sep-13 16:40 UTC

[LLVMdev] Inline Assembly

Reid Spencer wrote:
> In order to get to the next stage with LLVM (like compiling a kernel) we
> need to allow "pass through" of inline assembly so things like device
> drivers, interrupt vectors, etc. can be written. While this feature
> breaks the "pure" LLVM IR, I don't see any way around it.
<shameless plug>
Actually, there should be a way around it.  I'm currently working on 
extensions to LLVM for operating system support.  You wouldn't be able 
to take the stock i386 Linux kernel and compile it, but you could write 
an operating system that would be completely compilable by LLVM (once I 
finish, that is).

Currently, I'm modifying the Linux kernel to use LLVM intrinsics instead 
of inline asm.  Currently, the intrinsics are simply library routines 
linked into the kernel, but someday (if all goes according to plan) they 
will become LLVM intrinsics.
</shameless plug>

<technical aside>
The difficult part of an OS is not actually all the funky hardware 
stuff.  The intrinsics for those are actually very straightforward and 
easy to implement.  I/O, for example, is really volatile loads and 
stores with MEMBAR's.  Registering interrupt handlers takes some very 
straightforward intrinsics.  The I/O intrinsics are already implemented 
for LLVM in the x86 code generator (minus the FENCE/MEMBAR instructions).
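
A minimal sketch of the "volatile loads and stores with MEMBARs" idea in plain C, not the LLVM intrinsics themselves. The names io_write32, io_read32 and fake_device_reg are hypothetical, and the "device register" is an ordinary variable standing in for a real memory-mapped address. The barrier is the GCC/Clang builtin __sync_synchronize().

```c
#include <stdint.h>

/* Stand-in for a memory-mapped device register (hypothetical; a real
   driver would use a mapped hardware address instead). */
static volatile uint32_t fake_device_reg;

/* Store to a device register, then issue a full memory barrier so the
   write is ordered before any later memory access. */
static void io_write32(volatile uint32_t *reg, uint32_t value)
{
    *reg = value;            /* volatile store: may not be elided or merged */
    __sync_synchronize();    /* full memory barrier (GCC/Clang builtin) */
}

/* Load from a device register, then issue a full memory barrier so the
   read completes before any later memory access. */
static uint32_t io_read32(volatile uint32_t *reg)
{
    uint32_t value = *reg;   /* volatile load: performed exactly once */
    __sync_synchronize();
    return value;
}
```

Nothing in this sketch is target-specific, which is the portability argument being made for intrinsics over inline asm.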

The difficult part is the code of the OS that changes native hardware 
state.  The kernel's code for changing the program counter to execute a 
signal handler, or the code in fork() that sets up the new process to 
return zero when it begins running for the first time: these are the 
hard parts, because native i386 state is visible in LLVM programs (more 
accurately, for our research, we don't want it visible).
</technical aside>
> 
> So, I thought I'd bring it up here so we can discuss potential
> implementations.  I think we should take the "shoot yourself in the foot
> approach". That is, we add an instruction type to LLVM that simply
> encapsulates an assembly language statement. This instruction type is
> just simply ignored (but retained) by all the optimization passes. When
> code generation happens, the inline assembly is just blindly put out and
> if the programmer has shot himself in the foot, so be it.
Question: Do you want inline asm to be able to compile programs out of 
the box?  Or do you want it so that we can use native hardware features 
that we can't use now?

For the former, we need inline i386/sparc/whatever support.  For the 
latter, LLVM intrinsics should do the trick, and do it rather portably.

The approach you suggest might work, although the code generator will 
need to know not to tromp on your registers, I guess.

The bigger problem is GCC.  GCC provides extended inline asm stuff that 
will probably be painful to pass from GCC to LLVM (and Linux, BTW, uses 
this feature a lot).

Another thought:

My impression is that inline assembly bites us a lot not because it's 
used a lot but because the LLVM compiler enables #defines for the i386 
platform that we don't support.

I think a lot of code has the following:

#ifdef _i386
inline asm
#else
slow C code
#endif

The LLVM GCC compiler still defines _i386 (or its equivalent), so 
configure and llvm-gcc end up trying to compile inline assembly code 
when they don't really need to.
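
A concrete version of that pattern might look like the sketch below. It tests __i386__, which is the macro GCC defines on that target (the "_i386" above is shorthand). NO_INLINE_ASM is a hypothetical opt-out macro invented for this example, not something llvm-gcc defines; the point is only that the portable branch is always available if the asm branch is not selected.

```c
#include <stdint.h>

/* Byte-swap a 32-bit value. The asm path is compiled only when the
   compiler claims to target i386 and the build has not opted out via
   the hypothetical NO_INLINE_ASM macro. */
static uint32_t swap32(uint32_t x)
{
#if defined(__i386__) && !defined(NO_INLINE_ASM)
    __asm__("bswap %0" : "+r"(x));   /* fast path: one instruction */
    return x;
#else
    /* "slow C code": portable shifts and masks */
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
#endif
}
```

A compiler that does not define __i386__ takes the C branch without ever seeing the asm.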

I have to admit that this is an impression and not something I know for 
sure, but it seems reasonable that many application programs use i386 
assembly because i386 is the most common platform, and speedups on it 
are good.

Changing llvm-gcc to disable the _i386-like macros might make 
compilation of userspace programs easier.

So, summary:

o If you just want access to native hardware, the intrinsics I'm 
developing will be much cleaner than inline asm support (and portable too).

o If you want inline asm to compile programs out of the box, it'll be 
more painful than what you've described.

o Changing llvm-gcc so that it doesn't look like an i386 compiler might 
make it easier to compile applications with optional inline asm.

Sorry if this is a bit rantish; my thoughts on the matter are not well 
organized.
> 
> One other thing we can do that *might* be useful. If a function contains
> only inline assembly instructions, we could circumvent the usual calling
> conventions for that function.
> 
> Thoughts?
> 
> Reid.
-- John T.

-- 
*********************************************************************
* John T. Criswell                         Email: criswell at uiuc.edu *
* Research Programmer                                               *
* University of Illinois at Urbana-Champaign                        *
*                                                                   *
* "It's today!" said Piglet. "My favorite day," said Pooh.          *
*********************************************************************

Jeff Cohen

2004-Sep-13 16:50 UTC

[LLVMdev] Inline Assembly

On Mon, 13 Sep 2004 11:40:48 -0500
John Criswell <criswell at cs.uiuc.edu> wrote:
> Reid Spencer wrote:
> > In order to get to the next stage with LLVM (like compiling a kernel) we
> > need to allow "pass through" of inline assembly so things like device
> > drivers, interrupt vectors, etc. can be written. While this feature
> > breaks the "pure" LLVM IR, I don't see any way around it.
> 
> The approach you suggest might work, although the code generator will 
> need to know not to tromp on your registers, I guess.
It's worse than just knowing what registers are used by inlined
assembler.  You want the inline assembler to be able to reference local
and global variables and function arguments.  Plus, you have to be able
to handle transfers of control inside the inlined assembler, such as a
return, a branch to a label defined outside of the inlined assembler, or
even calls to other functions (to properly handle inter-procedural
optimization).  It can get quite messy.  It will be a lot of work to do
it as well as gcc or Microsoft's compiler.

Chris Lattner

2004-Sep-13 18:54 UTC

[LLVMdev] Inline Assembly

On Mon, 13 Sep 2004, John Criswell wrote:
> Actually, there should be a way around it.  I'm currently working on
> extensions to LLVM for operating system support.  You wouldn't be able
> to take the stock i386 Linux kernel and compile it, but you could write
> an operating system that would be completely compilable by LLVM (once I
> finish, that is).
Being able to use intrinsics is definitely good, but it's not sufficient.
There will always be things we don't cover, and inline asm will be
required.  In any case, compiling programs off the shelf certainly does
require inline asm support, so we do need it regardless of what intrinsics
we have.
> The difficult part is the code of the OS that changes native hardware
> state.  The kernel's code for changing the program counter to execute a
> signal handler, or the code in fork() that sets up the new process to
> return zero when it begins running for the first time: these are the
> hard parts, because native i386 state is visible in LLVM programs (more
> accurately, for our research, we don't want it visible).
Some things really do want to be written in inline asm, and those things
are obviously non-portable.  This is not a problem, the goal of LLVM isn't
to turn every non-portable program into a portable one :)
> The bigger problem is GCC.  GCC provides extended inline asm stuff that
> will probably be painful to pass from GCC to LLVM (and Linux, BTW, uses
> this feature a lot).
Actually, the inline asm support provided by GCC is quite well thought out
and makes a lot of sense (inline asms are required to define their side
effects in target-independent terms).  The big complaint that I have is
its incredibly baroque syntax.  Eventually we should also support other
forms of inline asm by translating them into the LLVM inline asm format,
but keeping the inline asm format semantically equivalent to the GCC
format is basically what we want.
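
For readers unfamiliar with the GCC format, here is a small sketch of an extended asm statement. It shows GCC's existing syntax only, not any proposed LLVM representation. The function name add_u32 is made up for the example, and the asm path is guarded so that non-x86 targets fall back to plain C.

```c
#include <stdint.h>

/* Add two 32-bit values. On x86 the asm statement declares everything
   the compiler needs to know: which C variables it reads and writes,
   what kind of location each must live in, and what else it clobbers. */
static uint32_t add_u32(uint32_t a, uint32_t b)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__("addl %1, %0"
            : "+r"(a)      /* output: a is read and written, in a register */
            : "r"(b)       /* input: b, in any general register */
            : "cc");       /* clobber: the condition codes are modified */
    return a;
#else
    return a + b;          /* portable fallback for other targets */
#endif
}
```

The compiler never parses the instruction text itself. It relies on the operand and clobber lists, which is why the register allocator can work around the asm without knowing what "addl" does.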
> My impression is that inline assembly bites us a lot not because it's
> used a lot but because the LLVM compiler enables #defines for the i386
> platform that we don't support.
We should aspire to be as compatible with GCC as reasonable, and including
inline asm support is a big piece of that.

In terms of implementation, adding inline asm support is just a "small
matter of implementation": it shouldn't cause any fundamental problems
with the llvm design.  In particular, LLVM should get an "asm"
Instruction, which takes a blob of text and some arguments.  The big
missing feature in LLVM is multiple return value support, which is
required by asms that define multiple registers.  My notes on multiple ret
values are here if anyone is interested:
http://nondot.org/sabre/LLVMNotes/MultipleReturnValues.txt
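
As an illustration of an asm that defines more than one register, the x86 divl instruction leaves the quotient in EAX and the remainder in EDX. The sketch below is written in GCC's syntax, with a hypothetical helper name divmod_u32 and a plain C fallback on other targets. It has two output operands, which is the case a single-result instruction cannot express directly.

```c
#include <stdint.h>

/* Compute n / d and n % d in one step. The caller must ensure d != 0. */
static void divmod_u32(uint32_t n, uint32_t d, uint32_t *q, uint32_t *r)
{
#if defined(__i386__) || defined(__x86_64__)
    uint32_t quot, rem;
    __asm__("divl %4"
            : "=a"(quot), "=d"(rem)     /* two outputs: EAX and EDX */
            : "0"(n), "1"(0u), "r"(d)   /* EDX:EAX = 0:n, divisor in a register */
            : "cc");                    /* flags are left undefined by divl */
    *q = quot;
    *r = rem;
#else
    *q = n / d;                         /* portable fallback */
    *r = n % d;
#endif
}
```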

-Chris

-- 
http://llvm.org/
http://nondot.org/sabre/

Andrew Lenharth

2004-Sep-17 04:24 UTC

[LLVMdev] Inline Assembly (unique arch string for llvm)

On Mon, 2004-09-13 at 11:40, John Criswell wrote:
> My impression is that inline assembly bites us a lot not because it's 
> used a lot but because the LLVM compiler enables #defines for the i386 
> platform that we don't support.
> 
> I think a lot of code has the following:
> 
> #ifdef _i386
> inline asm
> #else
> slow C code
> #endif
> 
> The LLVM GCC compiler still defines _i386 (or its equivalent), so 
> configure and llvm-gcc end up trying to compile inline assembly code 
> when they don't really need to.
> 
> I have to admit that this is an impression and not something I know for 
> sure, but it seems reasonable that many application programs use i386 
> assembly because i386 is the most common platform, and speedups on it 
> are good.
> 
> Changing llvm-gcc to disable the _i386-like macros might make 
> compilation of userspace programs easier.
When I was working on porting glibc (currently being held up by a C99
support bug) the most straightforward approach was to define a new
architecture string and implement a new target in glibc based on that
machine string.

So I propose that llvm-gcc not consider itself any type of x86-linux (or
whatever platform it was compiled on), but rather create a new
architecture, say llvm (or perhaps 2, one each for big and little
endian).  Thus llvm-gcc -dumpmachine would return llvm-os.

This would make system library (and OS kernel!) ports easier to maintain
since arch llvm would be supported by adding stuff rather than changing
stuff, and all the inline asm for known archs would go away and the C
version would be used.  In most cases the config scripts should consider
compiling with llvm on a host as a cross compile from host arch to arch
llvm.

Andrew