Valentin Churavy via llvm-dev
2021-Aug-31 15:30 UTC
[llvm-dev] Thread migration during function execution, semantics of thread local variables
Hi LLVM-dev,

I am working on a runtime system that has task migration, i.e. a task can be migrated between different threads. So a function can start executing on one thread, call a function (that might call into the function), and then continue executing on a different thread. This poses a problem with thread local variables. As an example, take the program below. After the call to `callee` we might have switched threads, and thus we need to recalculate the location of the thread local variable.

```
@var = available_externally thread_local global i32 0, align 4

declare void @callee()

define signext i32 @main() nounwind {
entry:
  %0 = load i32, i32* @var, align 4
  call void @callee()
  %1 = load i32, i32* @var, align 4
  %2 = icmp eq i32 %0, %1
  %3 = zext i1 %2 to i32
  ret i32 %3
}
```

As far as I can tell there is currently no mechanism to inform LLVM that thread migration might occur, and the behaviour you get depends on the backend.

As an example, compiling with `x86-64-unknown-linux-gnu`, we get:

```
movq  var@GOTTPOFF(%rip), %rbx
movl  %fs:(%rbx), %ebp
callq callee@PLT
xorl  %eax, %eax
cmpl  %fs:(%rbx), %ebp
```

Which happens to be correct. On Darwin, on the other hand:

```
movq  _var@TLVP(%rip), %rdi
callq *(%rdi)
movq  %rax, %rbx
movl  (%rax), %ebp
callq _callee
xorl  %eax, %eax
cmpl  (%rbx), %ebp
```

the address of the TLS variable gets CSE'd, and thus the second load could be incorrect.

Has there been any prior work on supporting thread migration + thread local storage?

Kind regards,
Valentin
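To make the hazard concrete outside of LLVM IR, here is a minimal C++ sketch. It does not use a real task-migrating runtime: a second `std::thread` merely stands in for the thread a task resumes on, and `resume_on_other_thread` and `Observation` are hypothetical names chosen for illustration. It shows that an address of a thread local variable computed on the starting thread still refers to the starting thread's copy when it is used from the other thread, which is the same situation as the CSE'd address in the Darwin output.

```cpp
#include <thread>

// Stand-in for @var from the IR example (one copy per thread).
thread_local int var = 0;

// What the "resumed" code observes after the simulated migration.
struct Observation {
    bool same_address;       // does the cached address match a fresh lookup?
    int via_cached_address;  // value read through the address computed earlier
    int via_fresh_lookup;    // value read after recomputing the TLS address
};

// Hypothetical helper: computes &var on the calling thread, then lets a
// second thread (the assumed migration target) use that cached address.
Observation resume_on_other_thread() {
    var = 1;             // write to the starting thread's copy
    int* cached = &var;  // address computed before the "migration"

    Observation obs{};
    std::thread other([&obs, cached] {
        var = 2;  // write to the resuming thread's own copy
        obs.same_address = (cached == &var);
        obs.via_cached_address = *cached;  // still the starting thread's copy
        obs.via_fresh_lookup = var;        // the resuming thread's copy
    });
    other.join();  // the starting thread stays alive until here
    return obs;
}
```

Because both threads are alive at the same time, their copies of `var` occupy distinct storage, so the cached address and the fresh lookup disagree. Code that reuses the cached address after a migration would therefore read the copy of the thread it started on.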
James Y Knight via llvm-dev
2021-Aug-31 21:30 UTC
[llvm-dev] Thread migration during function execution, semantics of thread local variables
There was a discussion on a very similar topic with regard to C++20 coroutines back in November/December 2020, entitled "[RFC] Coroutine and pthread_self". It discusses exactly the same issues you will run into, although for coroutines the issue only occurs in early optimization passes, because eventually the coroutine with yield-points gets transformed into a "normal" function.

Note that TLS access is not the only problem you have: the removal of redundant function calls across a thread switch will also be a problem, e.g. as enabled by LLVM IR's "readnone" attribute (which is generated from C's __attribute__((const)), present e.g. on pthread_self).

See the thread starting here:
https://lists.llvm.org/pipermail/llvm-dev/2020-November/146766.html
and then into the next month here:
https://lists.llvm.org/pipermail/llvm-dev/2020-December/147012.html

The work in this area has not yet been completed.
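The redundant-call problem can be sketched in the same spirit. The C++ below is only an illustration under assumptions: a second `std::thread` stands in for the thread a task resumes on, and `ids_across_switch` and `IdObservation` are hypothetical names. It queries the current thread's identity before and after the simulated switch; if a compiler treated the query as free of side effects and independent of memory, and merged the two calls, the second result would silently be replaced by the first. Whether that merging actually happens depends on the attributes on the callee and on the optimization level.

```cpp
#include <thread>

// Thread identity as seen before and after the simulated thread switch.
struct IdObservation {
    std::thread::id before_switch;
    std::thread::id after_switch;
};

// Hypothetical helper: asks "which thread am I on?" once on the calling
// thread and once on a second thread that stands in for the migration target.
IdObservation ids_across_switch() {
    IdObservation obs;
    obs.before_switch = std::this_thread::get_id();

    std::thread other([&obs] {
        // The same query gives a different answer on the other thread,
        // so the two calls are not actually redundant.
        obs.after_switch = std::this_thread::get_id();
    });
    other.join();
    return obs;
}
```

The two identifiers differ because both threads exist at the same time, so reusing the first result in place of the second call would be wrong once a task can change threads between the two calls.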