Dániel Mihályi via llvm-dev
2017-Apr-05 09:27 UTC
[llvm-dev] Deopt operand bundle behavior
Hi!

We have started to use the deopt operand bundle to make our native stack trace deoptimizable and garbage collectable. We stumbled upon an issue, and we don't know whether it is a problem on our side or a problem within LLVM.

For example, for this input:

declare { i8*, i8* } @getCode()

define void @testFunc() {
entry:
  %0 = call { i8*, i8* } @getCode()
  %1 = extractvalue { i8*, i8* } %0, 1
  %2 = bitcast i8* %1 to void ()*
  call void %2() [ "deopt"() ]
  ret void
}

We get this output machine code for x86_64:

_testFunc:                              ## @testFunc
        .cfi_startproc
## BB#0:                                ## %entry
        pushq   %rax
Lcfi0:
        .cfi_def_cfa_offset 16
        callq   _getCode
        callq   *%rax
Ltmp0:
        popq    %rax
        retq

Without the deopt operand bundle:

_testFunc:                              ## @testFunc
        .cfi_startproc
## BB#0:                                ## %entry
        pushq   %rax
Lcfi0:
        .cfi_def_cfa_offset 16
        callq   _getCode
        callq   *%rdx
        popq    %rax
        retq

With the deopt operand bundle, the wrong register is used for the second half of the value returned by getCode, namely %rax instead of %rdx.

Is there something about this feature that I am not aware of?

Thanks in advance for your time,
Daniel Mihalyi
Hi,

Are you seeing this issue in general, or only with aggregate return values?

If the latter, then I suspect this is a bug specifically around
lowering aggregate return values from calls with deopt bundles. We
(Azul) do not use aggregate types in function boundaries, so that area
is definitely not well tested.

If you want to debug this, I'd suggest starting to look at
SelectionDAGBuilder::LowerAsSTATEPOINT and
SelectionDAGBuilder::LowerCallSiteWithDeoptBundleImpl. It is probably
just an oversight, and not a fundamental issue.

Given that you have a tiny reproducer I can take a look at it too, but
I cannot guarantee a timely response -- I'm fairly time constrained at
this point.

I'm also very interested in hearing about new uses of deopt operand
bundles. If you can share some details on what you're doing with it,
that'll be great! Note that if you're working with a *relocating*
collector (i.e. your GC copies objects to new addresses) then deopt
operand bundles is not sufficient for GC (though it will still let you
deoptimize) -- you'll need to use gc.statepoint to get proper
semantics.

Thanks,
-- Sanjoy
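The question of whether the problem is specific to aggregate return values could be checked with a scalar-return variant of the reproducer. The following is only a sketch: @getCodeScalar and @testFuncScalar are hypothetical names that do not appear in the original report, and the typed-pointer syntax matches the LLVM version used in this thread (newer releases spell the pointer type as ptr).

```llvm
; Hypothetical scalar-return variant of the reproducer (not from the thread).
; @getCodeScalar is an assumed declaration that returns the callee pointer
; directly instead of inside a { i8*, i8* } aggregate.
declare i8* @getCodeScalar()

define void @testFuncScalar() {
entry:
  ; The callee arrives as a single pointer, so no extractvalue is needed.
  %0 = call i8* @getCodeScalar()
  %1 = bitcast i8* %0 to void ()*
  ; Same deopt-bundle call as in the original @testFunc.
  call void %1() [ "deopt"() ]
  ret void
}
```

On x86_64 a single pointer return value arrives in %rax, so an indirect call through %rax would be the expected result for this variant. If this variant compiles correctly while the aggregate version does not, that would point at the aggregate-return lowering path.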
Dániel Mihályi via llvm-dev
2017-Apr-06 10:32 UTC
[llvm-dev] Deopt operand bundle behavior
Hi!

Thank you for your insight. This is the only case I have encountered so far. Also, if I switch from -O2 to -O0, the correct register is used for the callee.

Btw., we intend to use this feature together with libunwind to retrieve info for instrumentation and (for now) non-moving garbage collectors.

Daniel Mihalyi
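The -O0 versus -O2 observation can be reproduced side by side from a single file. The sketch below assumes that llc is the driver, that the file is saved under the hypothetical name deopt-repro.ll, and that the target triple is x86_64-apple-darwin (inferred from the Mach-O style labels in the listings); none of these details are stated in the thread.

```llvm
; Assumed invocations (file name and triple are illustrative):
;   llc -O0 -mtriple=x86_64-apple-darwin deopt-repro.ll -o -
;   llc -O2 -mtriple=x86_64-apple-darwin deopt-repro.ll -o -
; Compare the register used by the indirect callq in the two outputs.

declare { i8*, i8* } @getCode()

define void @testFunc() {
entry:
  %0 = call { i8*, i8* } @getCode()
  ; Field 1 of the aggregate is the code pointer to call.
  %1 = extractvalue { i8*, i8* } %0, 1
  %2 = bitcast i8* %1 to void ()*
  call void %2() [ "deopt"() ]
  ret void
}
```

According to the report above, the -O2 output calls through %rax while the -O0 output uses the correct register.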