Evgeny Leviant via llvm-dev
2017-Oct-06 17:09 UTC
[llvm-dev] Newly added ThreadPoolExecutor causes deadlock in lld
Recently added ThreadPoolExecutor limits the number of worker threads to the number of logical processors.

This can cause a deadlock when calls to parallel_for_each are nested, like this:

    void Bar() { ... }

    void Foo() {
      parallel_for_each(Begin, End, Bar);
    }

    int main() {
      parallel_for_each(Begin, End, Foo);
    }

This happens because both parallel_for_each and parallel_for_each_n wait for their task group to finish, and that may never happen when they are executed from worker threads. In that case the worker thread is blocked in the TaskGroup destructor, and once every worker is blocked this way, no thread is left to run the queued inner tasks.

This does happen in lld when it writes output sections, because there is a nested call to parallel_for_each_n to write the inputs of each output section.
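The mechanism can be reproduced outside of LLVM with a small stand-alone sketch. Everything below is hypothetical and written for illustration: `FixedPool`, `Latch` and `countStalledOuterTasks` are not LLVM classes, and the real ThreadPoolExecutor may schedule tasks differently. To keep the demonstration from hanging, each outer task waits for its inner task with a timeout instead of blocking forever, and the outer tasks are queued before the workers start so that the ordering is predictable.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Countdown latch with an optional timeout (hypothetical helper).
class Latch {
public:
  explicit Latch(int N) : Count(N) {}
  void dec() {
    std::lock_guard<std::mutex> Lock(M);
    if (--Count == 0)
      CV.notify_all();
  }
  void wait() {
    std::unique_lock<std::mutex> Lock(M);
    CV.wait(Lock, [this] { return Count == 0; });
  }
  // Returns false if the count did not reach zero within Timeout.
  bool waitFor(std::chrono::milliseconds Timeout) {
    std::unique_lock<std::mutex> Lock(M);
    return CV.wait_for(Lock, Timeout, [this] { return Count == 0; });
  }

private:
  std::mutex M;
  std::condition_variable CV;
  int Count;
};

// Fixed-size FIFO pool (hypothetical; not LLVM's ThreadPoolExecutor).
class FixedPool {
public:
  void add(std::function<void()> Task) {
    {
      std::lock_guard<std::mutex> Lock(M);
      Tasks.push(std::move(Task));
    }
    CV.notify_one();
  }
  void start(std::size_t NumWorkers) {
    assert(Workers.empty() && "start() may only be called once");
    for (std::size_t I = 0; I < NumWorkers; ++I)
      Workers.emplace_back([this] { run(); });
  }
  ~FixedPool() {
    {
      std::lock_guard<std::mutex> Lock(M);
      Stop = true;
    }
    CV.notify_all();
    for (std::thread &T : Workers)
      T.join();
  }

private:
  void run() {
    for (;;) {
      std::function<void()> Task;
      {
        std::unique_lock<std::mutex> Lock(M);
        CV.wait(Lock, [this] { return Stop || !Tasks.empty(); });
        if (Tasks.empty())
          return; // Stop was requested and the queue is drained.
        Task = std::move(Tasks.front());
        Tasks.pop();
      }
      Task();
    }
  }
  std::mutex M;
  std::condition_variable CV;
  std::queue<std::function<void()>> Tasks;
  bool Stop = false;
  std::vector<std::thread> Workers;
};

// Each outer task queues one inner task and waits for it, as a nested
// parallel_for_each would. Returns how many outer tasks gave up waiting.
int countStalledOuterTasks(std::size_t NumWorkers, int NumOuter,
                           std::chrono::milliseconds Timeout) {
  std::atomic<int> Stalled{0};
  auto OuterDone = std::make_shared<Latch>(NumOuter);
  FixedPool Pool;
  for (int I = 0; I < NumOuter; ++I)
    Pool.add([&Pool, &Stalled, OuterDone, Timeout] {
      auto InnerDone = std::make_shared<Latch>(1);
      Pool.add([InnerDone] { InnerDone->dec(); });
      if (!InnerDone->waitFor(Timeout))
        ++Stalled; // Without the timeout this wait would never return.
      OuterDone->dec();
    });
  Pool.start(NumWorkers);
  OuterDone->wait();
  return Stalled.load();
}
```

With as many outer tasks as workers, for example `countStalledOuterTasks(2, 2, std::chrono::milliseconds(100))`, every worker ends up waiting on an inner task that only another worker could run, so at least one wait times out. With a spare worker, as in `countStalledOuterTasks(3, 2, ...)`, the inner tasks get a thread and nothing stalls.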
Rui Ueyama via llvm-dev
2017-Oct-10 01:28 UTC
[llvm-dev] Newly added ThreadPoolExecutor causes deadlock in lld
I reverted my change so that there are no nested parallel_for_each calls in lld. We should fix ThreadPoolExecutor so that it can be called from nested loops.

Quote from https://bugs.llvm.org/show_bug.cgi?id=34806#c10:

> So, orthogonal to what is the best way to achieve a maximum performance in lld, I think ThreadPoolExecutor should be able to handle nested calls. It's because something that is not composable is hard to use. In general, when we call some function, we don't know/care about whether the function uses ThreadPoolExecutor or not. That belongs to an internal design of the function and shouldn't leak.

On Fri, Oct 6, 2017 at 10:09 AM, Evgeny Leviant <eleviant at accesssoftek.com> wrote:

> Recently added ThreadPoolExecutor limits number of worker threads to number of logical processors.
>
> This might cause deadlock in case one's doing nested calls to parallel_for_each, like this:
>
> void Bar() { ... }
>
> void Foo() {
>   parallel_for_each(Begin, End, Bar);
> }
>
> void main() {
>   parallel_for_each(Begin, End, Foo);
> }
>
> This happens because both parallel_for_each and parallel_for_each_n wait for task group to finish and this may actually never happen in case they're executed from worker threads. In such case worker thread is blocked in TaskGroup destructor. This does happen in lld, when it writes output sections as there is a nested call to parallel_for_each_n to write each output section inputs.
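To make the composability point in the reply concrete, here is one possible way a parallel loop could tolerate nesting: a call made from a pool worker runs its body inline instead of queuing tasks and blocking. This is only a sketch under assumptions, not the fix adopted in LLVM. `SimplePool`, `IsPoolWorker`, `parallelForEach` and `nestedSum` are hypothetical names, and the pool is not LLVM's ThreadPoolExecutor.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// True only on threads owned by a pool (hypothetical marker, not an LLVM API).
thread_local bool IsPoolWorker = false;

class SimplePool {
public:
  explicit SimplePool(std::size_t NumWorkers) {
    assert(NumWorkers > 0 && "the pool needs at least one worker");
    for (std::size_t I = 0; I < NumWorkers; ++I)
      Workers.emplace_back([this] {
        IsPoolWorker = true;
        run();
      });
  }
  ~SimplePool() {
    {
      std::lock_guard<std::mutex> Lock(M);
      Stop = true;
    }
    CV.notify_all();
    for (std::thread &T : Workers)
      T.join();
  }
  void add(std::function<void()> Task) {
    {
      std::lock_guard<std::mutex> Lock(M);
      Tasks.push(std::move(Task));
    }
    CV.notify_one();
  }

private:
  void run() {
    for (;;) {
      std::function<void()> Task;
      {
        std::unique_lock<std::mutex> Lock(M);
        CV.wait(Lock, [this] { return Stop || !Tasks.empty(); });
        if (Tasks.empty())
          return; // Stop was requested and the queue is drained.
        Task = std::move(Tasks.front());
        Tasks.pop();
      }
      Task();
    }
  }
  std::mutex M;
  std::condition_variable CV;
  std::queue<std::function<void()>> Tasks;
  bool Stop = false;
  std::vector<std::thread> Workers;
};

template <class Iter, class Fn>
void parallelForEach(SimplePool &Pool, Iter Begin, Iter End, Fn F) {
  // Nested call: we already occupy a worker, so do the work inline rather
  // than blocking this worker on tasks that may never get a thread.
  if (IsPoolWorker) {
    for (; Begin != End; ++Begin)
      F(*Begin);
    return;
  }
  std::mutex M;
  std::condition_variable CV;
  std::size_t Pending = 0;
  for (Iter I = Begin; I != End; ++I) {
    {
      std::lock_guard<std::mutex> Lock(M);
      ++Pending;
    }
    Pool.add([&, I] {
      F(*I);
      std::lock_guard<std::mutex> Lock(M);
      if (--Pending == 0)
        CV.notify_all();
    });
  }
  std::unique_lock<std::mutex> Lock(M);
  CV.wait(Lock, [&] { return Pending == 0; });
}

// Sums a small matrix with a nested loop, mirroring Foo/Bar in the report.
int nestedSum(std::size_t NumWorkers) {
  std::vector<std::vector<int>> Rows = {
      {1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {10, 11, 12}};
  std::atomic<int> Total{0};
  SimplePool Pool(NumWorkers);
  parallelForEach(Pool, Rows.begin(), Rows.end(),
                  [&](const std::vector<int> &Row) {
                    parallelForEach(Pool, Row.begin(), Row.end(),
                                    [&](int V) { Total += V; });
                  });
  return Total.load();
}
```

Calling `nestedSum(1)` finishes even with a single worker, because the inner loop never waits on the pool. The trade-off is that inner loops lose their parallelism. An alternative design would have a waiting thread pick up and run queued tasks itself until its own group completes.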