Hi all, I was looking at the profile for a tool I’m working on, and noticed that it is spending 10% of its time doing locking related stuff. The structure of the tool is that it reading in a ton of stuff (e.g. one moderate example I’m working with is 40M of input) into MLIR, then uses its multithreaded pass manager to do transformations. As it happens, the structure of this is that the parsing pass is single threaded, because it is parsing through a linear file (the parser is simple and fast, so this is bound by IR construction). This means that none of the locking during IR construction is useful. Historically, LLVM had a design where you could dynamically enable and disable multithreading support in a tool, which would be perfect for this use case, but it got removed by this patch <https://github.com/llvm/llvm-project/commit/9c9710eaf4c1f01b8b518bdba89aba059ab14175#diff-bb772a7e6d4ebf2b46c6d42c95f40ddf>: (xref https://reviews.llvm.org/D4216 <https://reviews.llvm.org/D4216>). The rationale in the patch doesn’t make sense to me - this mode had nothing to do with the old LLVM global lock, this had to do with whether llvm::llvm_is_multithreaded() returned true or false … which all the locking stuff is guarded on. Would it make sense to re-enable this, or am I missing something? -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200411/65dbc058/attachment.html>
Hey Chris, On Sat, Apr 11, 2020 at 5:15 PM Chris Lattner via llvm-dev < llvm-dev at lists.llvm.org> wrote:> Hi all, > > I was looking at the profile for a tool I’m working on, and noticed that > it is spending 10% of its time doing locking related stuff. The structure > of the tool is that it reading in a ton of stuff (e.g. one moderate example > I’m working with is 40M of input) into MLIR, then uses its multithreaded > pass manager to do transformations. > > As it happens, the structure of this is that the parsing pass is single > threaded, because it is parsing through a linear file (the parser is simple > and fast, so this is bound by IR construction). This means that none of > the locking during IR construction is useful. >I'm curious which are the places that show up on the profile? Do you have a few stacktraces to share? Historically, LLVM had a design where you could dynamically enable and> disable multithreading support in a tool, which would be perfect for this > use case, but it got removed by this patch > <https://github.com/llvm/llvm-project/commit/9c9710eaf4c1f01b8b518bdba89aba059ab14175#diff-bb772a7e6d4ebf2b46c6d42c95f40ddf>: > (xref https://reviews.llvm.org/D4216). The rationale in the patch > doesn’t make sense to me - this mode had nothing to do with the old LLVM > global lock, this had to do with whether llvm::llvm_is_multithreaded() > returned true or false … which all the locking stuff is guarded on. >It seems that at the time the assumption was that this flag was there to alleviate the cost of the global lock only and removing the lock removed the motivation for the feature? Looks like you proved this wrong :) +Zach, David, and Reid to make sure they don't miss this.> Would it make sense to re-enable this, or am I missing something? >Finding a way to re-enable it seems interesting. I wonder how much it'll interact with the places inside the compiler that are threaded now, maybe it isn't much more than tracking and auditing the uses of LLVM_ENABLE_THREADS (like lib/Support/ThreadPool.cpp for example). Have you already looked into it? -- Mehdi -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200412/bbbc1afb/attachment.html>
> On Apr 12, 2020, at 11:27 AM, Mehdi AMINI <joker.eph at gmail.com> wrote: > > Hey Chris, > > > On Sat, Apr 11, 2020 at 5:15 PM Chris Lattner via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > Hi all, > > I was looking at the profile for a tool I’m working on, and noticed that it is spending 10% of its time doing locking related stuff. The structure of the tool is that it reading in a ton of stuff (e.g. one moderate example I’m working with is 40M of input) into MLIR, then uses its multithreaded pass manager to do transformations. > > As it happens, the structure of this is that the parsing pass is single threaded, because it is parsing through a linear file (the parser is simple and fast, so this is bound by IR construction). This means that none of the locking during IR construction is useful. > > I'm curious which are the places that show up on the profile? Do you have a few stacktraces to share?In my case, it is all MLIR attribute/type uniquification stuff which is guarded by a RWMutex.> Historically, LLVM had a design where you could dynamically enable and disable multithreading support in a tool, which would be perfect for this use case, but it got removed by this patch <https://github.com/llvm/llvm-project/commit/9c9710eaf4c1f01b8b518bdba89aba059ab14175#diff-bb772a7e6d4ebf2b46c6d42c95f40ddf>: (xref https://reviews.llvm.org/D4216 <https://reviews.llvm.org/D4216>). The rationale in the patch doesn’t make sense to me - this mode had nothing to do with the old LLVM global lock, this had to do with whether llvm::llvm_is_multithreaded() returned true or false … which all the locking stuff is guarded on. > > It seems that at the time the assumption was that this flag was there to alleviate the cost of the global lock only and removing the lock removed the motivation for the feature? Looks like you proved this wrong :) > > +Zach, David, and Reid to make sure they don't miss this.Yeah, it was about not paying the cost for synchronization when it wasn’t worthwhile.> Would it make sense to re-enable this, or am I missing something? > > Finding a way to re-enable it seems interesting. I wonder how much it'll interact with the places inside the compiler that are threaded now, maybe it isn't much more than tracking and auditing the uses of LLVM_ENABLE_THREADS (like lib/Support/ThreadPool.cpp for example). Have you already looked into it?It is super-easy to reenable, because the entire codebase is still calling llvm::llvm_is_multithreaded(). We just need to add the global back, along with the methods to set and clear the global, and change llvm::llvm_is_multithreaded() to something like: bool llvm::llvm_is_multithreaded() { #if LLVM_ENABLE_THREADS != 0 return someGlobal; #else return false; #endif } -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200412/dc3aca64/attachment.html>
Yes, the llvm::Smart* family of locks still exist. But very few places are using them outside of MLIR; it’s more common to just use plain std::mutex. That said, I don’t think it’s really a good idea to use them, even if they were fixed to work as designed. It’s not composable: the boolean “enabled” bit is process-wide, not local to whatever data structure you’re trying to build. So your single-threaded tool gets some benefit, but the benefit goes away as soon as the process starts using multiple threads, even if there still only one thread using the MLIR context in question. So probably I’d recommend two things: 1. If locking uncontended locks is showing up on profiles as a performance bottleneck, it’s probably worth looking into ways to reduce that overhead in both single-threaded and multi-threaded contexts. (Reducing the number of locks taken in frequently called code, or using a better lock implementation). 2. If you want some mechanism to disable MLIR locking, it should probably be a boolean attached to the MLIR context in question, not a global variable. -Eli From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Chris Lattner via llvm-dev Sent: Saturday, April 11, 2020 5:16 PM To: LLVM Developers' List <llvm-dev at lists.llvm.org> Subject: [EXT] [llvm-dev] LLVM multithreading support Hi all, I was looking at the profile for a tool I’m working on, and noticed that it is spending 10% of its time doing locking related stuff. The structure of the tool is that it reading in a ton of stuff (e.g. one moderate example I’m working with is 40M of input) into MLIR, then uses its multithreaded pass manager to do transformations. As it happens, the structure of this is that the parsing pass is single threaded, because it is parsing through a linear file (the parser is simple and fast, so this is bound by IR construction). This means that none of the locking during IR construction is useful. Historically, LLVM had a design where you could dynamically enable and disable multithreading support in a tool, which would be perfect for this use case, but it got removed by this patch<https://github.com/llvm/llvm-project/commit/9c9710eaf4c1f01b8b518bdba89aba059ab14175#diff-bb772a7e6d4ebf2b46c6d42c95f40ddf>: (xref https://reviews.llvm.org/D4216). The rationale in the patch doesn’t make sense to me - this mode had nothing to do with the old LLVM global lock, this had to do with whether llvm::llvm_is_multithreaded() returned true or false … which all the locking stuff is guarded on. Would it make sense to re-enable this, or am I missing something? -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200412/78fc1c58/attachment.html>
On Apr 12, 2020, at 2:23 PM, Eli Friedman <efriedma at quicinc.com> wrote:> > Yes, the llvm::Smart* family of locks still exist. But very few places are using them outside of MLIR; it’s more common to just use plain std::mutex. > > That said, I don’t think it’s really a good idea to use them, even if they were fixed to work as designed. It’s not composable: the boolean “enabled” bit is process-wide, not local to whatever data structure you’re trying to build. So your single-threaded tool gets some benefit, but the benefit goes away as soon as the process starts using multiple threads, even if there still only one thread using the MLIR context in question.Yes, I agree, this is similar to Mehdi’s point. I think it is clear that “enable” and “disable” multithreaded mode should only be called from applications, not libraries. Calling one of them from a library breaks composability.> So probably I’d recommend two things: > If locking uncontended locks is showing up on profiles as a performance bottleneck, it’s probably worth looking into ways to reduce that overhead in both single-threaded and multi-threaded contexts. (Reducing the number of locks taken in frequently called code, or using a better lock implementation). > If you want some mechanism to disable MLIR locking, it should probably be a boolean attached to the MLIR context in question, not a global variable. >Ok, but let me argue the other way. We currently have a cmake flag that sets LLVM_ENABLE_THREADS, and that flag enables an across the board speedup. That cmake flag is the *worst* possible thing for library composability. :-). Are you suggesting that we remove it? -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200412/1541072a/attachment.html>
On Sat, Apr 11, 2020 at 5:15 PM Chris Lattner <clattner at nondot.org> wrote:> > Historically, LLVM had a design where you could dynamically enable and > disable multithreading support in a tool, which would be perfect for this > use case, but it got removed by this patch > <https://github.com/llvm/llvm-project/commit/9c9710eaf4c1f01b8b518bdba89aba059ab14175#diff-bb772a7e6d4ebf2b46c6d42c95f40ddf>: > (xref https://reviews.llvm.org/D4216). The rationale in the patch > doesn’t make sense to me - this mode had nothing to do with the old LLVM > global lock, this had to do with whether llvm::llvm_is_multithreaded() > returned true or false … which all the locking stuff is guarded on. > > Would it make sense to re-enable this, or am I missing something? > > -Chris >It’s been a while, but also see https://reviews.llvm.org/D4142 for more context. This was probably one of my first patches ever to LLVM, but it sounds like the thought process was something like: 1) Stop using llvm_start_multithreaded in two places where it wasn’t needed. 2) Realize that it’s never called anymore *anywhere* 3) Delete dead code. Given that it’s been 6 years with nobody else needing this, I’m skeptical that its of broad enough utility to re-introduce. On the other hand, I’m basically inactive at this point so I have no horse in the race. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200413/ae4ec76e/attachment.html>
My only input on this is that, if we do add this back, threading should be enabled by default, and then apps can disable it if they know it is safe to do so. As Eli mentioned, C++11 threading primitives have proliferated in LLVM, and I would hesitate to add wrappers for them all. Either way I don't feel super strongly about it. On Sat, Apr 11, 2020, 5:15 PM Chris Lattner <clattner at nondot.org> wrote:> Hi all, > > I was looking at the profile for a tool I’m working on, and noticed that > it is spending 10% of its time doing locking related stuff. The structure > of the tool is that it reading in a ton of stuff (e.g. one moderate example > I’m working with is 40M of input) into MLIR, then uses its multithreaded > pass manager to do transformations. > > As it happens, the structure of this is that the parsing pass is single > threaded, because it is parsing through a linear file (the parser is simple > and fast, so this is bound by IR construction). This means that none of > the locking during IR construction is useful. > > Historically, LLVM had a design where you could dynamically enable and > disable multithreading support in a tool, which would be perfect for this > use case, but it got removed by this patch > <https://github.com/llvm/llvm-project/commit/9c9710eaf4c1f01b8b518bdba89aba059ab14175#diff-bb772a7e6d4ebf2b46c6d42c95f40ddf>: > (xref https://reviews.llvm.org/D4216). The rationale in the patch > doesn’t make sense to me - this mode had nothing to do with the old LLVM > global lock, this had to do with whether llvm::llvm_is_multithreaded() > returned true or false … which all the locking stuff is guarded on. > > Would it make sense to re-enable this, or am I missing something? > > -Chris >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200414/9f32de7a/attachment.html>
I think that per the discussion up-thread that we should keep things as they are - downstream clients that don’t want the overhead should conditionalize themselves using local techniques. If we keep LLVM_ENABLE_THREADS, then I think it would make sense to move that inline into the Threading.h header file, so everything can be constant propagated without LTO. Right now we have this in the .cpp file: bool llvm::llvm_is_multithreaded() { #if LLVM_ENABLE_THREADS != 0 return true; #else return false; #endif } In any case, thank you all for the discussion, this all makes sense to me! -Chris> On Apr 14, 2020, at 4:34 PM, Reid Kleckner <rnk at google.com> wrote: > > My only input on this is that, if we do add this back, threading should be enabled by default, and then apps can disable it if they know it is safe to do so. > > As Eli mentioned, C++11 threading primitives have proliferated in LLVM, and I would hesitate to add wrappers for them all. Either way I don't feel super strongly about it. > > On Sat, Apr 11, 2020, 5:15 PM Chris Lattner <clattner at nondot.org <mailto:clattner at nondot.org>> wrote: > Hi all, > > I was looking at the profile for a tool I’m working on, and noticed that it is spending 10% of its time doing locking related stuff. The structure of the tool is that it reading in a ton of stuff (e.g. one moderate example I’m working with is 40M of input) into MLIR, then uses its multithreaded pass manager to do transformations. > > As it happens, the structure of this is that the parsing pass is single threaded, because it is parsing through a linear file (the parser is simple and fast, so this is bound by IR construction). This means that none of the locking during IR construction is useful. > > Historically, LLVM had a design where you could dynamically enable and disable multithreading support in a tool, which would be perfect for this use case, but it got removed by this patch <https://github.com/llvm/llvm-project/commit/9c9710eaf4c1f01b8b518bdba89aba059ab14175#diff-bb772a7e6d4ebf2b46c6d42c95f40ddf>: (xref https://reviews.llvm.org/D4216 <https://reviews.llvm.org/D4216>). The rationale in the patch doesn’t make sense to me - this mode had nothing to do with the old LLVM global lock, this had to do with whether llvm::llvm_is_multithreaded() returned true or false … which all the locking stuff is guarded on. > > Would it make sense to re-enable this, or am I missing something? > > -Chris-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200416/4d0c31f7/attachment.html>