On 3 Nov., 10:06, Chris Lattner <clatt... at apple.com> wrote:
> On Nov 2, 2008, at 2:20 PM, Jonathan Brandmeyer wrote:
>
> > I am interested in making my LLVM front-end multi-threaded in a way
> > similar to the GCC compiler server proposal and was wondering about
> > the extent that the LLVM passes support it.
>
> Do you have a link for this? I'm not familiar with any parallelism
> proposed by that project. My understanding was that it was mostly
> about sharing across invocations of the compiler.
>
> > Expression-at-a-time parallel construction:
> > If function definitions are built purely depth-first, such that the
> > parent pointers are not provided as they are created, what will break?
> > I noted that the function and module verifiers aren't complaining, at
> > least not yet. Is there a generic "fixup upward-pointing parent
> > pointers" pass that can be run afterwards? If not, do I need to
> > implement and perform that pass? I suspect that emitting code for
> > individual expressions in parallel will probably end up being too
> > fine-grained, which leads me to...
>
> Are you talking about building your AST or about building LLVM IR?
> The rules for constructing your AST are pretty much defined by you.
> The rules for constructing LLVM IR are a bit more tricky. The most
> significant issue right now is that certain objects in LLVM IR are
> uniqued (like constants) and these have use/def chains. Since use/def
> chain updating is not atomic or locked, this means that you can't
> create llvm ir on multiple threads. This is something that I'm very
> much interested in solving someday, but no one is working on it at
> this time (that I'm aware of).

What about "inventing" pseudo-constants (which point to the right thing) and build the piece of IR with them. When done, grab mutex and RAUW it in. Alternatively, submit to a privileged thread that performs the RAUW.
The trick is to prepare the def/use chain(s) to a degree that the mutex is only held a minimal time. If only IR-builder threads are running concurrently there is no danger that a real constant vanishes, leaving behind a stale reference from a pseudo-constant.

Any major headaches I have ignored?

Cheers,

Gabor

>
> > Function-at-a-time parallel construction:
> > Which (if any) LLVM objects support the object-level thread safety
> > guarantee? If I construct two separate function pass managers in
> > separate threads and use them to optimize and emit object code for
> > separate llvm::Function definitions in the program, will this work?
> > Same question for llvm::Modules.
>
> Unfortunately, for the above reason... basically none. The LLVM code
> generators are actually very close to being able to run in parallel.
> The major issue is that they run a few llvm IR level passes first (LSR
> and codegen prepare) that hack on LLVM IR before the code generators
> run. Because of this, they inherit the limitations of LLVM IR
> passes. Very long term, I'd really like to make the code generator
> not affect the LLVM IR being put into them, but this is not likely to
> happen anytime in the near future.
>
> If you're interested in this, tackling the use/def atomicity issues
> would be a great place to start.
>
> -Chris
>
> _______________________________________________
> LLVM Developers mailing list
> LLVM... at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
On Nov 3, 2008, at 3:55 PM, heisenbug wrote:
> What about "inventing" pseudo-constants (which point to the right
> thing) and build the piece of IR with them. When done, grab mutex and
> RAUW it in. Alternatively, submit to a privileged thread that performs
> the RAUW.
> The trick is to prepare the def/use chain(s) to a degree that the
> mutex is only held a minimal time. If only IR-builder threads are
> running concurrently there is no danger that a real constant vanishes,
> leaving behind a stale reference from a pseudo-constant.

That could work. It would also have to be done for global values as well, and inline asm objects etc. However, I don't see any show-stoppers. The implementation could be tricky, but a nice property of your approach is that the single threaded case could be made particularly fast (instead of doing atomic ops or locking always).

-Chris
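For readers following along, the pseudo-constant idea discussed above can be sketched as a toy model. RAUW is shorthand for "replace all uses with" (LLVM's `replaceAllUsesWith`). Everything below is a hypothetical stand-in written in Python for brevity: `Value`, `Instruction`, `ConstantTable`, and `build_and_commit` are illustrative names, not LLVM classes, and the sketch covers only plain constants, not global values or inline asm objects. Each builder thread constructs its IR against thread-private placeholders without locking, then takes the mutex only for the RAUW step.

```python
import threading


class Value:
    """Toy IR value with a use list (a stand-in, not LLVM's llvm::Value)."""

    def __init__(self, name):
        self.name = name
        self.users = []  # one entry per use, so a user may appear twice

    def replace_all_uses_with(self, new):
        """RAUW: redirect every use of self to `new`."""
        for user in self.users:
            user.operands = [new if op is self else op for op in user.operands]
            new.users.append(user)
        self.users = []


class Instruction:
    def __init__(self, opcode, operands):
        self.opcode = opcode
        self.operands = list(operands)
        for op in self.operands:
            op.users.append(self)


class ConstantTable:
    """Shared uniquing table; the lock is taken only while committing."""

    def __init__(self):
        self.lock = threading.Lock()
        self.constants = {}

    def commit(self, placeholders):
        with self.lock:
            for literal, pseudo in placeholders.items():
                real = self.constants.get(literal)
                if real is None:
                    real = self.constants[literal] = Value(f"const({literal})")
                pseudo.replace_all_uses_with(real)


def build_and_commit(table, literals, out):
    # Phase 1 (no lock): build IR against thread-private pseudo-constants.
    placeholders = {}
    insts = []
    for literal in literals:
        if literal not in placeholders:
            placeholders[literal] = Value(f"pseudo({literal})")
        insts.append(Instruction("use", [placeholders[literal]]))
    # Phase 2 (short critical section): RAUW the placeholders away.
    table.commit(placeholders)
    out.append(insts)


def demo():
    table, out = ConstantTable(), []
    jobs = [[1, 2, 1], [2, 3]]
    threads = [threading.Thread(target=build_and_commit, args=(table, j, out))
               for j in jobs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return table, out
```

Calling `demo()` runs two builder threads that both reference the literal 2; after the commits, both end up pointing at the same shared object. In a single-threaded build the placeholders and the lock could presumably be skipped entirely, which is the fast path Chris mentions.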
On Mon, 2008-11-03 at 23:59 -0800, Chris Lattner wrote:
> On Nov 3, 2008, at 3:55 PM, heisenbug wrote:
> > What about "inventing" pseudo-constants (which point to the right
> > thing) and build the piece of IR with them. When done, grab mutex and
> > RAUW it in. Alternatively, submit to a privileged thread that performs
> > the RAUW.
> > The trick is to prepare the def/use chain(s) to a degree that the
> > mutex is only held a minimal time. If only IR-builder threads are
> > running concurrently there is no danger that a real constant vanishes,
> > leaving behind a stale reference from a pseudo-constant.
>
> That could work. It would also have to be done for global values as
> well, and inline asm objects etc. However, I don't see any
> show-stoppers. The implementation could be tricky, but a nice property of
> your approach is that the single threaded case could be made
> particularly fast (instead of doing atomic ops or locking always).

It might work for the IR construction phase, but not for optimization and emitting object code. The locking issue is going to be severe because it will be nearly (completely?) impossible to guarantee a globally consistent lock order for any given def/use chain. Therefore, such a solution would require a kind of high-level contention manager akin to software transactional memory (STM). Even the fastest STMs in research are much slower than locking.

I think that there is a better way. I would like to propose a different solution: lift all internalized objects to be unique at the Module level instead of globally. This will require an initial pass to be performed (called IRFinalize?), and equality of Type objects by pointer comparison will not be valid until after this pass is complete. In most cases the Module is already the unit of compilation once LLVM IR has been initially emitted, and it should be straightforward to structure the pass such that it can work on single functions if the user is compiling at that level.
The IRFinalize pass can allocate the bookkeeping storage for identifying duplicate Constants and Types and then release it, so long as none of the optimization and analysis passes 1) emit new Types or 2) are broken by duplicate Constants.

-Jonathan

PS: What is RAUW? I'll volunteer the clerical work of adding it to the Lexicon if you'd be kind enough to hand me a small dose of clue :)

> -Chris
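The module-level uniquing proposal above could look roughly like the following toy model. This is only a sketch under stated assumptions: `Type`, `Function`, `Module`, and `ir_finalize` are hypothetical Python stand-ins, not LLVM classes, and "IRFinalize" is the tentative name from the message rather than an existing pass. Identity comparison (`is`) plays the role of pointer comparison. Before the pass, structurally equal types built independently are distinct objects; the pass canonicalizes them per module and drops its bookkeeping table when it returns.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False keeps identity semantics, like pointer equality
class Type:
    key: tuple  # structural description, e.g. ("int", 32)


@dataclass(eq=False)
class Function:
    name: str
    types: list  # Type objects referenced by the function body


@dataclass(eq=False)
class Module:
    functions: list = field(default_factory=list)


def ir_finalize(module):
    """Hypothetical 'IRFinalize': make Types unique within one module.

    Returns the number of distinct types found. The `canonical` table is
    the bookkeeping storage; it is released when the function returns.
    """
    canonical = {}
    for fn in module.functions:
        # Keep the first object seen for each structural key.
        fn.types = [canonical.setdefault(t.key, t) for t in fn.types]
    return len(canonical)
```

For example, two functions built separately may each hold their own `Type(("int", 32))`; those compare unequal by identity until `ir_finalize` has run on the module that contains both. A pass that creates a new `Type` afterwards would break the invariant, which is the first caveat listed in the message.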