thr3ads.net - llvm dev - [LLVMdev] Emscripten: LLVM => JavaScript [Dec 2011]

Alon Zakai

2011-Dec-16 00:10 UTC

[LLVMdev] Emscripten: LLVM => JavaScript

Hi everyone,

I wanted to mention a project using LLVM: Emscripten. Emscripten
is an open source LLVM to JavaScript compiler,

http://emscripten.org
https://github.com/kripken/emscripten/

There are various demos linked to on the wiki (the first link),
of various large C/C++ codebases compiled to JS and running
on the web, like Python, Bullet, Poppler, etc.

Emscripten is not a 'conventional' LLVM backend. It is itself
written in JavaScript, mainly in order to quickly prototype
various code generation modes, and in general to make writing
a compiler that targets JavaScript easier. Instead of hooking
into LLVM itself like a normal backend, Emscripten takes LLVM
assembly as input and compiles that into JavaScript. So the
build process is to run Clang with -emit-llvm, then to run
Emscripten on the output.

The first goal of this post is just to mention another
project that uses LLVM, and that would have been much much
harder without it. Having a great open source toolchain
that generates a clear and well-documented IR is a very
useful thing. Thanks to everyone involved!

On that topic, I see there is an LLVM users page,

http://llvm.org/Users.html

What is the procedure for suggesting a project to be added
there?

The second issue I want to raise in this post is regarding
upstreaming. Emscripten is fairly mature now (a 2.0 release
is coming up soon). Is there any interest in LLVM in including
an unconventional LLVM backend of this nature? I would love
for this to happen, and am happy to do the relevant work.
Licensing is not a problem: Emscripten is dual-licensed under
MIT and the LLVM license (precisely for this reason). From
a technical perspective, though, as mentioned before, Emscripten
is not a conventional C++ backend. I hope that isn't a problem?

The third issue I want to raise is regarding closer
integration with LLVM. Right now, Emscripten uses unmodified
LLVM and Clang, parsing their normal output. There are
however some reasons for integrating more closely, in
particular Emscripten has a problem when all LLVM
optimizations are run. This is not always important for
performance, as a safe subset exists, and we do our own
JS-level optimizations later which overlap somewhat. However,
it would be nice to be able to run all the LLVM optimizations.
The problems we have there are

1. i64s and doubles can be on 32-bit alignment, which is
   a problem for a JavaScript implementation with typed arrays
   with a shared buffer, since unaligned reads/writes there
   are impossible to do in a quick way. This can happen
   without optimizations, but is more common there due to
   the next point.

   I've been told by Rafael Ávila de Espíndola that for this,
   I would need an Emscripten target in LLVM. Would that be
   upstreamable? (With or without Emscripten itself, preferably
   with?)

2. Optimization sometimes generates types like i288, which
   Emscripten currently doesn't handle. From an optimizing
   perspective, it isn't yet clear if it would be faster to
   try to directly implement those, or to just break them up
   into more manageable native (32-bit) sizes. Note that even
   i64 is somewhat challenging to implement in a fast way
   on JavaScript, since that environment is really a 32-bit
   one, so it would be best to never do things like combine
   two 32-bit writes into one 64-bit write. It would be nice
   to have an option in LLVM to process the IR/bitcode back
   into having only target-native types, is that possible?
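
To illustrate point 1, here is a minimal TypeScript sketch of a heap
built from typed arrays that share one buffer. The names (HEAPU32,
HEAPF64, loadDoubleAligned, and so on) are assumptions made for this
example and are not taken from Emscripten's generated code. It shows
that the fast indexed read only works for 8-byte-aligned addresses,
while a 4-byte-aligned double has to be reassembled from two 32-bit
words.

```typescript
// Hypothetical 64-byte heap; both views alias the same ArrayBuffer.
const buffer = new ArrayBuffer(64);
const HEAPU32 = new Uint32Array(buffer);
const HEAPF64 = new Float64Array(buffer);

// Scratch space used to reassemble a double from two 32-bit words.
const scratch = new ArrayBuffer(8);
const scratchU32 = new Uint32Array(scratch);
const scratchF64 = new Float64Array(scratch);

// Fast path: a single indexed read, valid only when addr is a multiple of 8.
function loadDoubleAligned(addr: number): number {
  return HEAPF64[addr >> 3];
}

// Slow path: works for any 4-byte-aligned addr by copying two words.
function loadDoubleUnaligned(addr: number): number {
  scratchU32[0] = HEAPU32[addr >> 2];
  scratchU32[1] = HEAPU32[(addr >> 2) + 1];
  return scratchF64[0];
}

// Matching store for any 4-byte-aligned addr.
function storeDoubleUnaligned(addr: number, value: number): void {
  scratchF64[0] = value;
  HEAPU32[addr >> 2] = scratchU32[0];
  HEAPU32[(addr >> 2) + 1] = scratchU32[1];
}

// Address 12 is 4-byte aligned but not 8-byte aligned.
storeDoubleUnaligned(12, 3.5);
// Address 16 is 8-byte aligned, so the fast path can read it back.
storeDoubleUnaligned(16, 2.25);
```

The slow path costs two loads and two stores per access, which is why
avoiding under-aligned i64 and double values in the first place is
preferable.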

I have begun to investigate both issues in the LLVM
codebase, but I am new to it - any advice and/or help
would be welcome.

Best,
  Alon Zakai

Eli Friedman

2011-Dec-16 03:02 UTC

[LLVMdev] Emscripten: LLVM => JavaScript

On Thu, Dec 15, 2011 at 4:10 PM, Alon Zakai <azakai at mozilla.com> wrote:
> On that topic, I see there is an LLVM users page,
>
> http://llvm.org/Users.html
>
> - what is the procedure for suggesting adding a project to
> there?

Send a patch to llvm-commits.

> The third issue I want to raise is regarding closer
> integration with LLVM. Right now, Emscripten uses unmodified
> LLVM and Clang, parsing their normal output. There are
> however some reasons for integrating more closely, in
> particular Emscripten has a problem when all LLVM
> optimizations are run. This is not always important for
> performance, as a safe subset exists, and we do our own
> JS-level optimizations later which overlap somewhat. However,
> it would be nice to be able to run all the LLVM optimizations.
> The problems we have there are
>
> 1. i64s and doubles can be on 32-bit alignment, which is
>   a problem for a JavaScript implementation with typed arrays
>   with a shared buffer, since unaligned reads/writes there
>   are impossible to do in a quick way. This can happen
>   without optimizations, but is more common there due to
>   the next point.
>
>   I've been told by Rafael Ávila de Espíndola that for this,
>   I would need an Emscripten target in LLVM. Would that be
>   upstreamable? (With or without Emscripten itself, preferably
>   with?)

Adding an Emscripten target to clang would be fine.  Note that clang
might generate unaligned loads anyway, but specifying an appropriate
target will ensure it doesn't use such loads unless they are
necessary.

> 2. Optimization sometimes generates types like i288, which
>   Emscripten currently doesn't handle. From an optimizing
>   perspective, it isn't yet clear if it would be faster to
>   try to directly implement those, or to just break them up
>   into more manageable native (32-bit) sizes. Note that even
>   i64 is somewhat challenging to implement in a fast way
>   on JavaScript, since that environment is really a 32-bit
>   one, so it would be best to never do things like combine
>   two 32-bit writes into one 64-bit write. It would be nice
>   to have an option in LLVM to process the IR/bitcode back
>   into having only target-native types, is that possible?

All the LLVM targets which use the common code generation
infrastructure have access to the legalizer, which handles that sort
of thing.  It would in theory be possible to write an equivalent that
does most of that work on IR, but it's a substantial amount of work
without any obvious benefit for existing targets.
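
As a rough illustration of what breaking a wide integer into
target-native pieces involves, here is a TypeScript sketch that adds
two integers stored as arrays of 32-bit limbs. The representation and
the names limbCount and addWide are assumptions made for this example;
this is not how LLVM's legalizer or Emscripten is implemented. An i64
becomes 2 limbs and an i288 becomes 9.

```typescript
// Number of 32-bit limbs needed to hold an integer of the given bit width.
function limbCount(bits: number): number {
  return Math.ceil(bits / 32);
}

// Add two wide integers stored as little-endian arrays of unsigned 32-bit
// limbs (index 0 is the least significant). The result wraps modulo
// 2^(32 * limbs), which matches a wrapping add when the bit width is a
// multiple of 32.
function addWide(a: Uint32Array, b: Uint32Array): Uint32Array {
  if (a.length !== b.length) {
    throw new Error("operand widths differ");
  }
  const out = new Uint32Array(a.length);
  let carry = 0;
  for (let i = 0; i < a.length; i++) {
    // At most 2^33 - 1, so the sum is exact in a JavaScript number.
    const sum = a[i] + b[i] + carry;
    out[i] = sum >>> 0;                 // keep the low 32 bits
    carry = sum > 0xFFFFFFFF ? 1 : 0;   // propagate overflow to the next limb
  }
  return out;
}

// i64 example: 0x00000000FFFFFFFF + 1 carries into the high limb.
const sum64 = addWide(new Uint32Array([0xFFFFFFFF, 0]), new Uint32Array([1, 0]));
```

Even this simple operation turns one IR instruction into a loop of
32-bit operations with carry handling, which gives a sense of why the
original post prefers that two 32-bit writes never be merged into one
64-bit write.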

-Eli

Alon Zakai

2011-Dec-17 03:14 UTC

[LLVMdev] Emscripten: LLVM => JavaScript

----- Original Message -----
> From: "Eli Friedman" <eli.friedman at gmail.com>
> To: "Alon Zakai" <azakai at mozilla.com>
> Cc: llvmdev at cs.uiuc.edu
> Sent: Thursday, December 15, 2011 7:02:34 PM
> Subject: Re: [LLVMdev] Emscripten: LLVM => JavaScript
> On Thu, Dec 15, 2011 at 4:10 PM, Alon Zakai <azakai at mozilla.com>
> wrote:
> > On that topic, I see there is an LLVM users page,
> >
> > http://llvm.org/Users.html
> >
> > - what is the procedure for suggesting adding a project to
> > there?
> 
> Send a patch to llvm-commits.

Thanks, I'll do that.

> 
> > The third issue I want to raise is regarding closer
> > integration with LLVM. Right now, Emscripten uses unmodified
> > LLVM and Clang, parsing their normal output. There are
> > however some reasons for integrating more closely, in
> > particular Emscripten has a problem when all LLVM
> > optimizations are run. This is not always important for
> > performance, as a safe subset exists, and we do our own
> > JS-level optimizations later which overlap somewhat. However,
> > it would be nice to be able to run all the LLVM optimizations.
> > The problems we have there are
> >
> > 1. i64s and doubles can be on 32-bit alignment, which is
> >   a problem for a JavaScript implementation with typed arrays
> >   with a shared buffer, since unaligned reads/writes there
> >   are impossible to do in a quick way. This can happen
> >   without optimizations, but is more common there due to
> >   the next point.
> >
> >   I've been told by Rafael Ávila de Espíndola that for this,
> >   I would need an Emscripten target in LLVM. Would that be
> >   upstreamable? (With or without Emscripten itself, preferably
> >   with?)
> 
> Adding a Emscripten target to clang would be fine. Note that clang
> might generate unaligned loads anyway, but specifying an appropriate
> target will ensure it doesn't use such loads unless they are
> necessary.

In what situation would unaligned loads be necessary? I was
hoping that, unless the code literally did something crazy like
loading an 8-byte value from a hardcoded 4-byte-aligned
address (like 0x4), "normal" C/C++ code would always end up
aligned. Is that correct?

> 
> > 2. Optimization sometimes generates types like i288, which
> >   Emscripten currently doesn't handle. From an optimizing
> >   perspective, it isn't yet clear if it would be faster to
> >   try to directly implement those, or to just break them up
> >   into more manageable native (32-bit) sizes. Note that even
> >   i64 is somewhat challenging to implement in a fast way
> >   on JavaScript, since that environment is really a 32-bit
> >   one, so it would be best to never do things like combine
> >   two 32-bit writes into one 64-bit write. It would be nice
> >   to have an option in LLVM to process the IR/bitcode back
> >   into having only target-native types, is that possible?
> 
> All the LLVM targets which use the common code generation
> infrastructure have access to the legalizer, which handles that sort
> of thing. It would in theory be possible to write an equivalent that
> does most of that work on IR, but it's a substantial amount of work
> without any obvious benefit for existing targets.
> 
Ok, I guess that means I'll need to implement a legalizer. The
simplest thing would probably be for me to do it in Emscripten,
because the Emscripten IR is a simpler subset of LLVM IR (and
I'm already familiar with the codebase). But if it would be
useful for LLVM to have an IR pass that does legalization,
I'd consider doing it in LLVM. Thoughts?

Best,
  Alon Zakai
