thr3ads.net - llvm dev - [LLVMdev] Compiler Driver [high-level comments] [Aug 2004]

Reid Spencer

2004-Jul-28 16:33 UTC

[LLVMdev] Compiler Driver Requirements & Design (Comments Solicited!)

LLVMers,

As part of my work on bug 353: Create Front End Framework And Compiler
Driver (http://llvm.cs.uiuc.edu/PR353), I'm starting a discussion on the
design and requirements of the compiler driver. If you have comments on
this, by all means PLEASE chime in. This is by no means cast in stone.
The results of the ensuing discussion will be documented in PR353 (and
elsewhere) and I'll use it as my guide in implementing the compiler
driver.

If your comments are limited in scope, please place the section number
and title in the subject line so we can have independent lines of
discussion on sub-topics. Thanks.

CONTENTS:
=========
 1. What it is
 2. Mode of operation
 3. Naming 
 4. Similar options as GCC.
 5. Basic/Standard compilation tasks.
 6. Recognize file types by their extensions
 7. Input/Output Flexibility
 8. Configurable for a variety of tasks/languages
 9. Source language/tool agnostic
10. Standard levels of optimization
11. Pipes or temporary files.
12. Automatic linkage
13. Automatic runtime library support
14. Integration with front end framework.
15. Next steps


1. WHAT IT IS
=============
The compiler driver is a program that will execute a variety of
compiler, linking and optimization tools. The basic idea is that it
provides an engine for transformation between files of different types.
The compiler driver offers a standard set of command line options for
specifying the transformations needed and the ability to invoke other
programs (tools) to perform those transformations. The driver itself
doesn't do anything with the files; it's just a master invoker of other
programs.


2. MODE OF OPERATION
====================
The driver will simply read its command line arguments, read its
configuration data, and invoke the compilation, linking, and
optimization tools necessary to complete the user's request. Its basic
function is somewhat like a SQL query optimizer in that it tries to find
the optimal strategy for executing the user's request given the current
situation. It is given a high level description of what to do (the
command line arguments, akin to the SQL statement), and a detailed
description of the tools that can be invoked to accomplish the request
(configuration files, akin to the database layout/parameters). From
these two inputs, it generates a (hopefully optimal) strategy for
accomplishing the request with as little program invocation and I/O as
possible. It then executes that strategy and terminates.


3. NAMING
=========
We want to have a really great name for this tool since it will
(eventually) be the main touch point for users of LLVM. The name has not
been settled, but a few have been suggested:

9iron - as in a golf driver
lcd - LLVM Compiler Driver
ccd - Configurable Compiler Driver
ngc - Next Generation Compiler
llvm - essentially, the gateway to LLVM based tools.
myc - My Compiler

These were contributed by Reid, Chris and Misha. If you have thoughts on
this, please voice them!


4. SIMILAR OPTIONS AS GCC
=========================
Certain common GCC options should be supported in order to make the
driver appear familiar to users of GCC. In particular, the following
options are important to preserve:

-c Compile and assemble file to object
-S Compile a file to assembly
-O The optimization family of options
-x Specify the language of a source input
-v Show what the driver is doing as it's doing it
-g Include debugging info in the output (passed to tools)
-f Support for optimization/language tweaking (passed to tools)
-m Specify the machine to generate code for (passed to tools)
-W Pass arbitrary other options to tools.
-X Pass argument to assembler, compiler, or linker

Additionally, we should have options to:
* generate analysis reports ala the LLVM analyze tool
* have a "no op" mode like -v where it just reports what it would do
* have a language specific help utility based on suffixes. For example,
  --help ll would list the options applicable to *.ll input files. This
  would extend to source languages too (e.g. --help c for C help or
  --help f for FORTRAN help). The generated help info would be specific
  for the given language, after the config files have been read thus
  allowing the output to vary depending on the driver's configuration.
* Support the -- option to terminate command line options and indicate
  the remaining options are files to be processed. This 
* Support command line configuration (override config files on the
  command line) either by specifying a config file or using special
  configuration options.
* each option should have short (-X) and long (--language) variants
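
A minimal sketch of how such a command line might be parsed, using
Python's argparse purely for illustration. The program name "driver",
the dest names, and the choice to give long variants only to the
options whose long names appear in this document (--compile,
--assemble, --language) are assumptions, not part of the proposal.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical subset of the driver's options."""
    parser = argparse.ArgumentParser(prog="driver")
    # Goal-selecting options, with the long names suggested in this document.
    parser.add_argument("-c", "--compile", action="store_true",
                        help="compile and assemble to object")
    parser.add_argument("-S", "--assemble", action="store_true",
                        help="compile to assembly")
    # -O takes its level attached, e.g. -O2 or -On (stored as "2" or "n").
    parser.add_argument("-O", dest="opt_level", default=None,
                        help="optimization level")
    parser.add_argument("-x", "--language", default=None,
                        help="force the language of the inputs")
    parser.add_argument("-v", dest="verbose", action="store_true",
                        help="show what the driver is doing")
    parser.add_argument("-o", dest="output", default=None,
                        help="output file name")
    # Everything else, including anything after "--", is an input file.
    parser.add_argument("files", nargs="*")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["-c", "-O2", "-o", "myprog.o", "--", "myprog.ss"])
    print(args.compile, args.opt_level, args.output, args.files)
```

The "--" terminator comes for free with argparse; a real driver would
also need the pass-through families (-f, -m, -W) that a fixed option
table like this one does not cover.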


5. BASIC/STANDARD COMPILATION TASKS
===================================
The driver will perform basic tasks such as compilation, optimization,
and linking. The following definitions are suggested, but more could be
supported.

-c|--compile 
  Goal: Compile source to object
  Inputs: Source language (e.g.: .c,.st,.cpp,.f,.p,.java,.ll)
  Outputs: Objects (e.g.: .bc, .o, .c)

-S|--assemble
  Goal: Compile source to assembly
  Inputs: Source language (e.g.: .c,.st,.cpp,.f,.p,.java,.ll)
  Outputs: Assembly (e.g.: .s, .ll)

--link
  Goal: Create executable program
  Inputs: Source, Assembly, Object, Library, Bytecode
  Outputs: Native executable or lli wrapper 

-z|--analyze
  Goal: Analyze program
  Inputs: Source/Assembly/Bytecode
  Outputs: various (loadable) reports on the inputs

In particular, these options specify goals to be satisfied. The driver
should compensate for a given tool's lack of features in order to
satisfy the goal. For example, suppose a Scheme front end simply
generated .ll files but the command line was:

driver -c -o myprog.o myprog.ss

This tells the driver to compile myprog.ss (scheme input) to a native
object file, myprog.o. The driver would "backfill" the tool by
running:

1. scheme front end (.ss -> .ll)
2. llvm-as (.ll -> .bc)
3. llc (.bc -> .s)
4. gas (.s -> .o)

Or some optimization of the above sequence.
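
The backfilling above can be sketched as a shortest-path search over
file suffixes. This is only an illustration: the tool table format, the
name "scheme-frontend", and the function plan_chain are hypothetical,
and a real driver would read the table from its configuration files.

```python
from collections import deque

# Hypothetical tool table: (tool name, input suffix, output suffix).
# The four entries mirror the Scheme example above.
TOOLS = [
    ("scheme-frontend", ".ss", ".ll"),
    ("llvm-as", ".ll", ".bc"),
    ("llc", ".bc", ".s"),
    ("gas", ".s", ".o"),
]


def plan_chain(src_suffix, dst_suffix, tools=TOOLS):
    """Return the shortest list of tool names that turns src_suffix into
    dst_suffix, or None if no chain exists (breadth-first search)."""
    queue = deque([(src_suffix, [])])
    seen = {src_suffix}
    while queue:
        suffix, chain = queue.popleft()
        if suffix == dst_suffix:
            return chain
        for name, tool_in, tool_out in tools:
            if tool_in == suffix and tool_out not in seen:
                seen.add(tool_out)
                queue.append((tool_out, chain + [name]))
    return None


if __name__ == "__main__":
    # driver -c -o myprog.o myprog.ss
    print(plan_chain(".ss", ".o"))
```

Because the search is breadth-first, a front end that later learns to
emit .bc or .o directly would shorten the chain automatically once its
table entry is added.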


6. RECOGNIZE FILE TYPES BY EXTENSIONS
=====================================
In general, the driver will classify its input files (command line
options not preceded by -) by their extensions. This will generally
indicate the transformations necessary to be applied to the file in
order for the task to be completed. Additionally, the user may use the
-x option to force a given file to be classified differently than the
default derived from its file extension.
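
As a rough sketch, classification with a user override might look like
the following. The suffix table and the class names are made up for the
example; only the suffixes themselves come from this document.

```python
import os

# Hypothetical suffix table; the class names are illustrative only.
SUFFIX_CLASS = {
    ".c": "c-source",
    ".cpp": "c++-source",
    ".ll": "llvm-assembly",
    ".bc": "llvm-bytecode",
    ".s": "native-assembly",
    ".o": "object",
}


def classify(path, forced=None):
    """Classify an input file by its extension, unless the user forced
    a classification on the command line."""
    if forced is not None:
        return forced
    suffix = os.path.splitext(path)[1]
    return SUFFIX_CLASS.get(suffix, "unknown")


if __name__ == "__main__":
    print(classify("myprog.ll"))
    print(classify("myprog.txt", forced="c-source"))
```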


7. INPUT/OUTPUT FLEXIBILITY
===========================
Front end compiler tools (those that translate a given source language
into something the driver can work with) will come in a variety of
flavors and perform a variety of tasks. Indeed, there must be no
requirements placed on the front end compiler tools by the driver. The
*only* requirement is that the tool be invokable with command line
arguments.

The driver tool is not expected to do anything but invoke other tools,
so it needs to understand how to invoke a tool, what optimizations the
tool supports, and what the output of that tool is.  Let's take stkrc as
an example. stkrc generates verbose, unoptimized byte code. It cannot
generate LLVM assembly, native assembly, or native object files.
Consequently, the driver would make up for its shortcomings by passing
the .bc files to opt or llc in order to get optimizations done and to
generate assembly, CBE or native code. More aggressive front ends (such
as the C front end) should be able to optimize their results both
specifically for the source language (e.g. directly at the AST level)
and with the help of LLVM (its various passes). They should also
directly support compilation to a variety of output formats (CBE, BC,
native .o, etc.). 

This approach provides a high level of flexibility while retaining
performance where it is needed. Simple front ends (like stkrc) can be
coded quickly and the driver can "back fill" the necessary optimization,
code generation and linking capabilities. As a front end matures and
takes on the burden of optimizing and various output formats,
performance will increase because fewer llvm tools will need to be
invoked in order to complete a given task. 

Flexibility should be supported on the output side as well. Regardless
of the output of a given tool, the driver should support generation of
LLVM assembly (.ll), LLVM byte code (.bc), native assembly (.s), and
native object files (.o) when compiling. When linking, it should be able
to generate native executables (.exe, a.out, ELF, whatever's supported),
lli wrapper scripts, and C Back End.


8. CONFIGURATION FILES
======================
In order to support the flexibility described above, it must be possible
for an existing compiler tool to be invoked by the driver without
changing either the tool or the driver. This is a firm requirement to
increase the driver's flexibility.

Consequently, a set of configuration data is needed by the driver in
order to know how to invoke the tools, what they do when invoked, and
what kind of output they create. A simple textual format is envisioned
that describes this information. The configuration files should be read
from standard locations (e.g. /etc/llvm/*), installation locations (e.g.
/usr/local/mycompiler/llvm/*) and user-specific locations (e.g. 
~/.llvm/*.conf).

Configuration files will play a large part in defining what the driver
does. Configuration files will form a cascade of definitions much like a
unix shell does: files in standard locations, installation locations,
and user-specific locations are read successively in a well-defined
order. Files read later override definitions in files read earlier. The
driver will also have built in configuration information for the LLVM
tool set it is based on so that info needed in every environment doesn't
need to be read from configuration files (something akin to make's
default rules). Each source language can provide a configuration file
for the front end compiler for that language. Users can override any/all
definitions to make the driver do what they want it to.
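
A minimal sketch of the cascade, assuming each configuration source has
already been parsed into a flat dictionary. The keys and values below
are invented for the example, since the file format is still undecided.

```python
def merge_cascade(layers):
    """Merge configuration layers in the order they are read.
    Definitions in later layers override those in earlier ones."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


if __name__ == "__main__":
    # Hypothetical layers, listed in reading order.
    builtin = {"assembler": "llvm-as", "default_opt": "O1"}
    standard = {"default_opt": "O2"}   # e.g. from /etc/llvm/*
    user = {"assembler": "my-as"}      # e.g. from ~/.llvm/*.conf
    print(merge_cascade([builtin, standard, user]))
```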

It is unclear what form the configuration files should take. The SPEC
format used in gcc.c is unintelligible and will be avoided. Some of the
ideas so far are:

* XML based (pro: well-structured, con: verbose)
* Java properties style (pro: familiar, con: not structured)
* Windows .ini style (pro: familiar, con: not well-structured)
* Special Language (pro: perfect fit, con: new language)

Things to include in the configuration files are:

* command line options supported by a compilation tool and their
   meanings for invocation by the driver.
* language specific command line options that should be supported 
  on the driver's command line but simply passed through to the 
  compilation tool.
* file suffixes supported as input by a compilation tool
* file suffixes supported as output by a compilation tool
* for each input suffix, a description of what the tool expects as
  input.
* for each output suffix, a description of what the tool produces as
  output, how much optimization it does, etc.
* tool chain definitions required for implementing a compiler. 
  For example, stkrc generates non-optimized bytecode files. Its tool
  chain might look like: stkrc | opt | llvm-link | llc | gcc (this
  is obviously a gross oversimplification).
* Runtime libraries needed by a front end (might vary with compilation
  options, e.g. thread support or not).

An optimization of the config files would cache the config data for a
given user in their ~/.llvm directory for faster reading of the config
files. Only if the config files have a time stamp later than the cache
file will the config data be re-parsed. It's not expected that this
optimization would appear in the first version of the driver.
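
The time stamp check could be as simple as the sketch below. The
function name and the decision to ignore config files that do not exist
are assumptions made for the example.

```python
import os


def cache_is_fresh(cache_path, config_paths):
    """Return True if the cache file exists and no existing config file
    has a modification time later than the cache's."""
    if not os.path.exists(cache_path):
        return False
    cache_mtime = os.path.getmtime(cache_path)
    return all(os.path.getmtime(path) <= cache_mtime
               for path in config_paths
               if os.path.exists(path))
```

When this returns False, the driver would re-parse the config files and
rewrite the cache.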


9. SOURCE LANGUAGE/TOOL AGNOSTIC
================================
The driver must be agnostic towards source languages and their
compilation tools. It is expected that a myriad of source languages will
be constructed using LLVM tools; however, that shouldn't be a
constraint. A given source language compiler might be written in Scheme,
Haskell, ML, or assembler and use none of the LLVM libraries or tools.
At best it might generate LLVM Assembly (.ll). The LLVM driver shouldn't
care. As long as the compiler is invokable via command line arguments,
it should be supported. Configuration files will detail what arguments
to use, and what is produced by the compiler.

Furthermore, it must be possible to invoke native compilers (like gcc or
Intel C++ compiler, or Visual C++) from the driver and incorporate their
results into the linkage of LLVM based programs.


10. STANDARD LEVELS OF OPTIMIZATION
===================================
The -O family of options to the driver should be standardized by the
driver across all languages so that common levels of optimization can be
expected when using the driver.  The following definitions for the
various -O options are currently suggested:

-On - do no optimizations except, perhaps, mem2reg
-O0 - do simple, quickly executing optimizations including mem2reg,
        simplifycfg, instcombine
-O1 - More aggressive optimizations, including gcse, sccp, scalarrepl
-O2 - Loop optimizations, IPO at compile-time, etc.
-O3 - Link-time optimization, aggressive analysis
-O4 - Run-time, profile guided optimizations

To extend this list, we might want to have "basic" and "aggressive"
optimizations at various levels: functions, globals, modules, link-time
(IPO), run-time.  A certain amount of thought needs to go into this in
order to get the correct set of definitions. Ideas welcome.
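
One way to picture the lower levels is as a table of pass lists. The
pass names below are the ones listed above; treating each level as a
superset of the previous one is an assumption, and -O2 and higher are
left out because no specific passes are named for them.

```python
# Ordered from least to most optimization. Each entry lists the passes
# a level adds on top of the levels before it (cumulative by assumption).
LEVEL_PASSES = [
    ("-On", []),
    ("-O0", ["mem2reg", "simplifycfg", "instcombine"]),
    ("-O1", ["gcse", "sccp", "scalarrepl"]),
]


def passes_for(level):
    """Return the cumulative list of passes for an -O level."""
    passes = []
    for name, extra in LEVEL_PASSES:
        passes.extend(extra)
        if name == level:
            return passes
    raise ValueError(f"unknown optimization level: {level}")


if __name__ == "__main__":
    print(passes_for("-O1"))
```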


11. PIPES OR TEMPORARY FILES
============================
The user should have the option of passing output between the
compilation tools via either pipes or temporary files. Depending on the
system and hardware, one or the other should provide for fast execution
of the tool chain.


12. AUTOMATIC LINKAGE
=====================
Byte code files were recently enhanced to encode their dependent
libraries. When the driver is linking a program, it should use the 
dependent library information in the .bc files to build the link 
command line so that users never have to worry about getting the correct
set of libraries to link with. This should work equally well for byte
code libraries as well as user and system native libraries.
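
A sketch of the bookkeeping involved, assuming the dependent library
names have already been read out of each .bc file (how that is done is
not shown here, and the function name is hypothetical):

```python
def collect_libraries(dependent_libs):
    """Given a mapping from input file to its list of dependent library
    names, return each library once, in first-seen order."""
    ordered = []
    for libs in dependent_libs.values():
        for lib in libs:
            if lib not in ordered:
                ordered.append(lib)
    return ordered


if __name__ == "__main__":
    # Hypothetical inputs; the library names are placeholders.
    deps = {
        "x.bc": ["c", "m"],
        "y.bc": ["m", "pthread"],
    }
    print(collect_libraries(deps))
```

The resulting list is what the driver would turn into library arguments
on the link command line.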

This feature implies some intelligence in the front ends. Front ends
written specifically for LLVM (generating byte code or LLVM assembly)
should support the dependent libraries feature. Other compilers (native)
will not have this feature. A pre-processor might be able to derive the
dependencies or we just let the link fail. Ideas on how to support this
feature for non-LLVM tools would be welcome.


13. RUNTIME LIBRARIES
=====================
Runtime libraries will come in various forms: native system libs,
native user libs, LLVM standard runtime libraries (e.g. crtend.a), LLVM
language component libraries (e.g. GC/threads), language specific
runtime libs (e.g. libg++, libstkr_runtime). The driver must support
linking against all these different types of libraries. In particular,
it must be possible to configure the driver so that it knows where to
find the runtime libraries needed for a given program.


14. INTEGRATION WITH FRONT END FRAMEWORK
========================================
Eventually, LLVM will provide a framework for front end compilers. This
will essentially be a toolkit of code to make it easier to implement a
front end. While the design of this framework is only loosely sketched,
we can currently make the requirement that the driver should support
close integration with front ends based on the framework. 

The initial release of the FE Framework is already envisioned. It will
support the command line options needed to support invocation by the
driver and take care of all the "back end" details such as pass
invocation and generation of code (in various forms). 

Subsequent releases of the FE Framework will possibly include:
* AST construction helpers
* support for garbage collection
* support for threading
* support for LLVM debugging
* providing a compiler as a loadable module so that fork/exec isn't
  needed by driver (with very minimal interface between them).

Whatever it turns out to be, the driver needs to integrate with it.


15. NEXT STEPS
==============
1. Gather feedback from this email.
2. Document driver tool command line as a .pod file in the
    llvm/docs/CommandGuide directory. Submit for review and incorporate
    feedback.
3. Document driver tool requirements, design, config file language, and
    other tools based on input in a .html file in llvm/docs. Submit for
    review and incorporate feedback.
4. Incorporate design (by reference) to bug 353.
5. Code driver to specifications previously documented.
6. Generate test cases for the driver and test it.
7. Submit driver code and tests for review and incorporate feedback.
8. Commit driver to LLVM CVS
9. Write an initial "front end framework" for making it easier to write
    driver compatible front ends. This would basically support all the
    back end plumbing necessary to recognize optimizations (-On) and
    different kinds of output (.s, .ll, .bc, .o)
10. Retrofit stkrc to use the initial front end framework so it can be
    less brain dead and actually generate optimized code. Perhaps do the
    same for BF.


Chris Lattner

2004-Jul-28 18:26 UTC

[LLVMdev] Compiler Driver [high-level comments]

On Wed, 28 Jul 2004, Reid Spencer wrote:
> 2. MODE OF OPERATION
> ====================
> The driver will simply read its command line arguments, read its
> configuration data, and invoke the compilation, linking, and
> optimization tools necessary to complete the user's request. Its basic

I'm not sure that I agree with this.  Compilers need to be extremely
predictable and simple.  In particular, saying:

llvmgcc x.c y.c z.c

should invoke exactly the same tools as:

llvmgcc x.c -c
llvmgcc y.c -c
llvmgcc z.c -c
llvmgcc x.o y.o z.o

I don't necessarily think that you're contradicting this, I just wanted to
make sure we're on the same page.

> 4. SIMILAR OPTIONS AS GCC
> =========================
> Certain common GCC options should be supported in order to make the
> driver appear familiar to users of GCC. In particular, the following
> options are important to preserve:

Very important, I agree.

> Additionally, we should have options to:
> * generate analysis reports ala the LLVM analyze tool

I'm not certain how useful this would be.  It would add complexity to the
driver that is of arguable use.  If anything I would make this the last
priority: the people who use 'analyze' are compiler developers, not end
users.

> * have a "no op" mode like -v where it just reports what it would do
> * have a language specific help utility based on suffixes. For example,
>   --help ll would list the options applicable to *.ll input files. This
>   would extend to source languages too (e.g. --help c for C help or
>   --help f for FORTRAN help). The generated help info would be specific
>   for the given language, after the config files have been read thus
>   allowing the output to vary depending on the driver's configuration.
> * Support the -- option to terminate command line options and indicate
>   the remaining options are files to be processed. This
> * Support command line configuration (override config files on the
>   command line) either by specifying a config file or using special
>   configuration options.
> * each option should have short (-X) and long (--language) variants

Sure.

> 5. BASIC/STANDARD COMPILATION TASKS
> ===================================
> The driver will perform basic tasks such as compilation, optimization,
> and linking. The following definitions are suggested, but more could be
> supported.

There has been a lot of discussion/confusion on IRC relating to what
actually will go into .s or .o files.  In particular, some people were
arguing that if we output a .o file, that it should only contain native
code.  This means that these two commands would do very different things:

llvmgcc x.c -o x.o     # compile to native .o
llvmgcc x.c -o x.bc    # compile to bytecode

I have to say that I *strenuously* object to this behavior.  In
particular, this would require all users to change their makefiles to get
IPO/lifelong optzn support from LLVM, violating one of the main goals of
the system.

There are a couple of things that people brought up (including wrapping
.bc files in ELF sections, generating .o files containing native
code+.bc), but here is the proposal that I like best:  :)

I don't think that anything should change w.r.t. the contents of .o files.
In particular, .o files should contain LLVM bytecode without wrappers or
anything fancy around them.  The big problem with this is compiler
interoperability, in particular, mixing .o files from various compilers
(e.g. a native GCC) will not work (e.g. 'ld' will barf when it hits an
LLVM .o file).

Personally I don't see a problem with this.  We already have "llvm aware"
replacements for many system tools, including ld, nm, and a start for ar.
These tools could be made 'native aware', so that 'llvm-ld x.o b.o' would
do the right thing for mixed native and llvm .o files.  Imagine an
llvm-objdump tool that either runs the native objdump program or llvm-dis
depending on the file type.

The one major thing that I want to fix is the current kludge of using
llvmgcc -S or llvmgcc -c to control whether the compile-time optimizer is
run.  The only reason we did this was because it was easy, and a new
compiler driver is exactly what we need to fix this.  In particular, I
would really like to see something like this:

llvmgcc X.c -S     # compiles, runs gccas, emits an *optimized* .ll file
llvmgcc X.c -c     # Same as -S, but now in .bc form instead of .ll form
llvmgcc X.c -On -S # "no" optimization, emit a 'raw' .ll file
llvmgcc X.c -On -c # "no" optimization, emit a 'raw' .bc file

Basically, today's equivalents to these are:

llvmgcc X.c -c -o - | llvm-dis > X.s
llvmgcc X.c -c
llvmgcc X.c -S
llvmgcc X.c -S -o - | llvm-as > X.o

The ability to capture the raw output of a front-end is very useful and
important, but it should be controlled with -O options, not -S/-c.  Also,
llvmgcc -O0 is not necessarily the same as -On, because some optimizations
actually speed up compilation (e.g., dead code elim).

Anyway, these are just some high-level ideas.

-Chris

-- 
http://llvm.cs.uiuc.edu/
http://nondot.org/sabre/

Reid Spencer

2004-Jul-29 04:43 UTC

[LLVMdev] Compiler Driver [high-level comments]

On Wed, 2004-07-28 at 11:26, Chris Lattner wrote:
> On Wed, 28 Jul 2004, Reid Spencer wrote:
> > 2. MODE OF OPERATION
> > ====================
> > The driver will simply read its command line arguments, read its
> > configuration data, and invoke the compilation, linking, and
> > optimization tools necessary to complete the user's request. Its basic
> 
> I'm not sure that I agree with this.  Compilers need to be extremely
> predictable and simple.  In particular, saying:
> 
> llvmgcc x.c y.c z.c
> 
> should invoke exactly the same tools as:
> 
> llvmgcc x.c -c
> llvmgcc y.c -c
> llvmgcc z.c -c
> llvmgcc x.o y.o z.o
> 
> I don't necessarily think that you're contradicting this, I just wanted to
> make sure we're on the same page.

I'm not contradicting anything here. The driver will select a completely
deterministic, simple, and direct sequence of commands in a well defined
order. My analogy to the SQL query optimizer was just that: an analogy.
It's not going to look for the "best" solution; it'll just be coded with
the best strategies built in and completely predictable from there.
> 
> > 4. SIMILAR OPTIONS AS GCC
> > =========================
> > Certain common GCC options should be supported in order to make the
> > driver appear familiar to users of GCC. In particular, the following
> > options are important to preserve:
> 
> Very important, I agree.
> 
> > Additionally, we should have options to:
> > * generate analysis reports ala the LLVM analyze tool
> 
> I'm not certain how useful this would be.  It would add complexity to the
> driver that is of arguable use.  If anything I would make this the last
> priority: the people who use 'analyze' are compiler developers, not end
> users.

True, I'll drop it.

> > 5. BASIC/STANDARD COMPILATION TASKS
> > ===================================
> > The driver will perform basic tasks such as compilation, optimization,
> > and linking. The following definitions are suggested, but more could be
> > supported.
> 
> There has been a lot of discussion/confusion on IRC relating to what
> actually will go into .s or .o files.  In particular, some people were
> arguing that if we output a .o file, that it should only contain native
> code.  This means that these two commands would do very different things:
> 
> llvmgcc x.c -o x.o     # compile to native .o
> llvmgcc x.c -o x.bc    # compile to bytecode
> 
> I have to say that I *strenuously* object to this behavior.  In
> particular, this would require all users to change their makefiles to get
> IPO/lifelong optzn support from LLVM, violating one of the main goals of
> the system.
> 
> There are a couple of things that people brought up (including wrapping
> .bc files in ELF sections, generating .o files containing native
> code+.bc), but here is the proposal that I like best:  :)
> 
> I don't think that anything should change w.r.t. the contents of .o files.
> In particular, .o files should contain LLVM bytecode without wrappers or
> anything fancy around them.  The big problem with this is compiler
> interoperability, in particular, mixing .o files from various compilers
> (e.g. a native GCC) will not work (e.g. 'ld' will barf when it hits an
> LLVM .o file).
> 
> Personally I don't see a problem with this.  We already have "llvm aware"
> replacements for many system tools, including ld, nm, and a start for ar.
> These tools could be made 'native aware', so that 'llvm-ld x.o b.o' would
> do the right thing for mixed native and llvm .o files.  Imagine an
> llvm-objdump tool that either runs the native objdump program or llvm-dis
> depending on the file type.

Okay, above is agreed.

> The one major thing that I want to fix is the current kludge of using
> llvmgcc -S or llvmgcc -c to control whether the compile-time optimizer is
> run.  The only reason we did this was because it was easy, and a new
> compiler driver is exactly what we need to fix this.  In particular, I
> would really like to see something like this:
> 
> llvmgcc X.c -S     # compiles, runs gccas, emits an *optimized* .ll file
> llvmgcc X.c -c     # Same as -S, but now in .bc form instead of .ll form

Okay, but what's the default -Ox option that gets applied? -O2? -O1?
It's not clear from this what the default is. To mimic GCC, such a
command line would produce very little, if any, optimization.

> llvmgcc X.c -On -S # "no" optimization, emit a 'raw' .ll file
> llvmgcc X.c -On -c # "no" optimization, emit a 'raw' .bc file

That's fine; -On, I suppose, is basically "absolutely no optimization
passes", but what is -O0 (oh zero)? A synonym for -On? Some minimal
optimization?

> Basically, today's equivalents to these are:
> 
> llvmgcc X.c -c -o - | llvm-dis > X.s
> llvmgcc X.c -c
> llvmgcc X.c -S
> llvmgcc X.c -S -o - | llvm-as > X.o

Are these supposed to match the four above? The use of llvmgcc is
confusing me here. In future discussion, when you mean the future
driver, please write as "driver" (or the actual name if it's decided by
then).

So one problem with this is that there's no way to emit a native .o
file? I thought one of the goals you wanted for the driver was to allow
an invoked compiler tool to generate as much as possible, including
native object file (.o) such as ELF. This would imply from the last
example that:

driver X.c -On -c would produce:

llvmgcc X.c -S -o - | llvm-as | llc | gas > X.o

But, your scheme doesn't seem to permit this?
> 
> The ability to capture the raw output of a front-end is very useful and
> important, but it should be controlled with -O options, not -S/-c.  Also,
> llvmgcc -O0 is not necessary the same as -On, because some optimizations
> actually speed up compilation (e.g., dead code elim).

Okay, you answered my question above. Perhaps you can define the
specific passes that should be included in -O0.

As for capturing the raw output of a front-end, GCC has the -E option
(well, at least for the pre-processor). Do we want to do that?

> 
> Anyway, these are just some high-level ideas.

Your thoughts, if any, on the other topics would be very much
appreciated.


Thanks,

Reid.

Vikram Adve

2004-Aug-03 21:12 UTC

[LLVMdev] Compiler Driver [high-level comments]

I just had a chance to read some of the follow-up comments on Reid's 
initial document.  I agree with Chris's discussion below of what is 
needed for users to get IPO/lifelong opt'n via LLVM without extensive 
changes to Makefiles, and about what .o files should contain.  This is 
in perfect agreement with what I just said about how users should view 
LLVM.

--Vikram
http://www.cs.uiuc.edu/~vadve
http://llvm.cs.uiuc.edu/

On Jul 28, 2004, at 1:26 PM, Chris Lattner wrote:
> On Wed, 28 Jul 2004, Reid Spencer wrote:
>> 2. MODE OF OPERATION
>> ====================
>> The driver will simply read its command line arguments, read its
>> configuration data, and invoke the compilation, linking, and
>> optimization tools necessary to complete the user's request. Its basic
>
> I'm not sure that I agree with this.  Compilers need to be extremely
> predictable and simple.  In particular, saying:
>
> llvmgcc x.c y.c z.c
>
> should invoke exactly the same tools as:
>
> llvmgcc x.c -c
> llvmgcc y.c -c
> llvmgcc z.c -c
> llvmgcc x.o y.o z.o
>
> I don't necessarily think that you're contradicting this, I just 
> wanted to
> make sure we're on the same page.
>
>> 4. SIMILAR OPTIONS AS GCC
>> =========================
>> Certain common GCC options should be supported in order to make the
>> driver appear familiar to users of GCC. In particular, the following
>> options are important to preserve:
>
> Very important, I agree.
>
>> Additionally, we should have options to:
>> * generate analysis reports ala the LLVM analyze tool
>
> I'm not certain how useful this would be.  It would add complexity to the
> driver that is of arguable use.  If anything I would make this the last
> priority: the people who use 'analyze' are compiler developers, not end
> users.
>
>> * have a "no op" mode like -v where it just reports what it would do
>> * have a language specific help utility based on suffixes. For example,
>>   --help ll would list the options applicable to *.ll input files. This
>>   would extend to source languages too (e.g. --help c for C help or
>>   --help f for FORTRAN help). The generated help info would be specific
>>   for the given language, after the config files have been read thus
>>   allowing the output to vary depending on the driver's configuration.
>> * Support the -- option to terminate command line options and indicate
>>   the remaining options are files to be processed. This
>> * Support command line configuration (override config files on the
>>   command line) either by specifying a config file or using special
>>   configuration options.
>> * each option should have short (-X) and long (--language) variants
>
> Sure.
>
>> 5. BASIC/STANDARD COMPILATION TASKS
>> ==================================
>> The driver will perform basic tasks such as compilation, optimization,
>> and linking. The following definitions are suggested, but more could be
>> supported.
>
> There has been a lot of discussion/confusion on IRC relating to what
> actually will go into .s or .o files.  In particular, some people were
> arguing that if we output a .o file, that it should only contain native
> code.  This means that these two commands would do very different things:
>
> llvmgcc x.c -o x.o     # compile to native .o
> llvmgcc x.c -o x.bc    # compile to bytecode
>
> I have to say that I *strenuously* object to this behavior.  In
> particular, this would require all users to change their makefiles to get
> IPO/lifelong optimization support from LLVM, violating one of the main
> goals of the system.
>
> There are a couple of things that people brought up (including wrapping
> .bc files in ELF sections, generating .o files containing native
> code+.bc), but here is the proposal that I like best:  :)
>
> I don't think that anything should change w.r.t. the contents of .o files.
> In particular, .o files should contain LLVM bytecode without wrappers or
> anything fancy around them.  The big problem with this is compiler
> interoperability: in particular, mixing .o files from various compilers
> (e.g. a native GCC) will not work (e.g. 'ld' will barf when it hits an
> LLVM .o file).
>
> Personally I don't see a problem with this.  We already have "llvm aware"
> replacements for many system tools, including ld, nm, and a start for ar.
> These tools could be made 'native aware', so that 'llvm-ld x.o b.o' would
> do the right thing for mixed native and llvm .o files.  Imagine an
> llvm-objdump tool that either runs the native objdump program or llvm-dis
> depending on the file type.
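
A minimal sketch of the file-type dispatch imagined above, assuming the file kind can be told from its first few bytes. The ELF magic number is standard; the bytecode signature used here (`LLVM_MAGIC`) is an assumption made for illustration, and the function names `classify` and `objdump_command` are hypothetical. The sketch only builds the command line; it does not run anything.

```python
ELF_MAGIC = b"\x7fELF"   # standard ELF header magic
LLVM_MAGIC = b"llvm"     # ASSUMPTION: signature of an LLVM bytecode file


def classify(header: bytes) -> str:
    """Guess the kind of object file from its leading bytes."""
    if header.startswith(ELF_MAGIC):
        return "native"
    if header.startswith(LLVM_MAGIC):
        return "llvm"
    return "unknown"


def objdump_command(path: str, header: bytes) -> list:
    """Pick the disassembler to run for `path` (hypothetical wrapper logic)."""
    kind = classify(header)
    if kind == "llvm":
        return ["llvm-dis", path]        # LLVM bytecode disassembler
    if kind == "native":
        return ["objdump", "-d", path]   # native disassembler
    raise ValueError("unrecognized object file: " + path)


# Usage with in-memory headers instead of real files:
print(objdump_command("x.o", b"llvm\x00\x00"))
print(objdump_command("b.o", b"\x7fELF\x01\x01"))
```

A real wrapper would read the header from disk and would need to handle other native formats as well; the same classification step could let a 'native aware' llvm-ld sort its inputs.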
>
>
> The one major thing that I want to fix is the current kludge of using
> llvmgcc -S or llvmgcc -c to control whether the compile-time optimizer is
> run.  The only reason we did this was because it was easy, and a new
> compiler driver is exactly what we need to fix this.  In particular, I
> would really like to see something like this:
>
> llvmgcc X.c -S     # compiles, runs gccas, emits an *optimized* .ll file
> llvmgcc X.c -c     # Same as -S, but now in .bc form instead of .ll form
> llvmgcc X.c -On -S # "no" optimization, emit a 'raw' .ll file
> llvmgcc X.c -On -c # "no" optimization, emit a 'raw' .bc file
>
> Basically, today's equivalents to these are:
>
> llvmgcc X.c -c -o - | llvm-dis > X.s
> llvmgcc X.c -c
> llvmgcc X.c -S
> llvmgcc X.c -S -o - | llvm-as > X.o
>
> The ability to capture the raw output of a front-end is very useful and
> important, but it should be controlled with -O options, not -S/-c.  Also,
> llvmgcc -O0 is not necessarily the same as -On, because some optimizations
> actually speed up compilation (e.g., dead code elim).
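
The proposed separation can be sketched as two independent decisions: -S/-c picks only the output form, and -On picks only whether the compile-time optimizer runs. This is an illustrative sketch, not the driver's design; the function `plan_compile` and the stage name "front-end" are hypothetical, and "gccas" stands for the compile-time optimizer step mentioned above.

```python
def plan_compile(flags):
    """Map proposed flags to (stages, output form).

    Hypothetical sketch: -S/-c choose the output form only, while -On
    alone decides whether the compile-time optimizer (gccas) runs.
    """
    if "-S" in flags:
        form = "ll"      # textual LLVM assembly
    elif "-c" in flags:
        form = "bc"      # LLVM bytecode
    else:
        raise ValueError("sketch only models -S and -c")
    stages = ["front-end"]
    if "-On" not in flags:
        stages.append("gccas")   # optimizer runs unless -On is given
    return stages, form


# The four proposed invocations from the message:
for flags in (["-S"], ["-c"], ["-On", "-S"], ["-On", "-c"]):
    print(flags, "->", plan_compile(flags))
```

Keeping the two choices orthogonal is the point of the proposal: changing the output form never changes which optimizations run.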
>
> Anyway, these are just some high-level ideas.
>
> -Chris
>
> -- 
> http://llvm.cs.uiuc.edu/
> http://nondot.org/sabre/
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://mail.cs.uiuc.edu/mailman/listinfo/llvmdev