Hi, I'm experimenting with LLVM coroutines and have run into a case where a seemingly irrelevant IR change prevents the coroutine elision optimization. I'm using LLVM 7.0.1.

The code I'm working with is roughly equivalent to the following Python example:

def my_coro(n: int):
    yield n

my_var = <some extern global>
if my_var > 0:
    for a in my_coro(my_var):
        print(a)

Here's my_coro in LLVM IR (note that there is an initial suspend, then a suspend to yield the value, then the final suspend):

define private i8* @my_coro(i64) {
entry:
  %promise = alloca i64, i64 1
  %1 = bitcast i64* %promise to i8*
  %id = call token @llvm.coro.id(i32 0, i8* %1, i8* null, i8* null)
  %2 = alloca i64, i64 1
  store i64 %0, i64* %2
  %3 = call i1 @llvm.coro.alloc(token %id)
  br i1 %3, label %alloc, label %begin

alloc:                                            ; preds = %entry
  %4 = call i64 @llvm.coro.size.i64()
  %5 = call i8* @my_alloc(i64 %4)
  br label %begin

begin:                                            ; preds = %entry, %alloc
  %6 = phi i8* [ null, %entry ], [ %5, %alloc ]
  %hdl = call i8* @llvm.coro.begin(token %id, i8* %6)
  %7 = call i8 @llvm.coro.suspend(token none, i1 false)
  switch i8 %7, label %suspend [
    i8 0, label %9
    i8 1, label %cleanup
  ]

final:                                            ; preds = %12
  %8 = call i8 @llvm.coro.suspend(token none, i1 true)
  switch i8 %8, label %suspend [
    i8 0, label %13
    i8 1, label %cleanup
  ]

; <label>:9:                                      ; preds = %begin
  %10 = load i64, i64* %2
  store i64 %10, i64* %promise
  %11 = call i8 @llvm.coro.suspend(token none, i1 false)
  switch i8 %11, label %suspend [
    i8 0, label %12
    i8 1, label %cleanup
  ]

; <label>:12:                                     ; preds = %9
  br label %final

; <label>:13:                                     ; preds = %final
  unreachable

cleanup:                                          ; preds = %final, %9, %entry
  %14 = call i8* @llvm.coro.free(token %id, i8* %hdl)
  br label %suspend

suspend:                                          ; preds = %final, %9, %entry, %cleanup
  %15 = call i1 @llvm.coro.end(i8* %hdl, i1 false)
  ret i8* %hdl
}

And here is how it's called (i.e. the for loop above):

define external void @main() {
entry:
  %0 = load i64, i64* @my.var
  %1 = icmp sgt i64 %0, 0
  br i1 %1, label %if, label %exit

if:                                               ; preds = %entry
  %2 = load i64, i64* @my.var
  %3 = call i8* @my_coro(i64 %2)
  br label %for

for:                                              ; preds = %body, %for_cont, %if
  call void @llvm.coro.resume(i8* %3)
  %4 = call i1 @llvm.coro.done(i8* %3)
  br i1 %4, label %cleanup, label %body

body:                                             ; preds = %for
  %5 = call i8* @llvm.coro.promise(i8* %3, i32 8, i1 false)
  %6 = bitcast i8* %5 to i64*
  %7 = load i64, i64* %6
  call void @my_print(i64 %7)
  br label %for

cleanup:                                          ; preds = %for
  call void @llvm.coro.destroy(i8* %3)
  br label %exit

exit:                                             ; preds = %entry, %cleanup
  ret void
}

Now if I optimize this with "opt -S -enable-coroutines -O3", the coroutine allocation is _not_ elided. But if I remove the if-statement (e.g. change the condition to 1 > 0, giving the first branch in main() an i1 true condition; see the P.S. below for a sketch of that change), then elision does take place. However, I see no reason why elision can't be applied to the first version -- why does the presence of a branch outside the blocks where the coroutine is used change anything? Am I perhaps using opt incorrectly?

Any insight would be greatly appreciated, and thanks in advance.

Ariya
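
P.S. For reference, this is roughly what the variant where elision does take place looks like -- a hand-written sketch of the modification described above, not a verbatim paste. Only the entry block of main() differs; everything else is exactly as posted:

entry:
  %0 = load i64, i64* @my.var
  ; condition changed from "%0 > 0" to "1 > 0", which constant-folds to
  ; i1 true, so the branch into %if is effectively always taken
  %1 = icmp sgt i64 1, 0
  br i1 %1, label %if, label %exit

The %if, %for, %body, %cleanup and %exit blocks are identical to the original main() above; with that one change, the same opt invocation elides the coroutine allocation.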