thr3ads.net - llvm dev - [llvm-dev] Maintaining musttail call contract [Nov 2020]

If this information is useful, please help other people find it:
Share via:

Reid Kleckner via llvm-dev

2020-Nov-02 19:03 UTC

[llvm-dev] Maintaining musttail call contract

IIRC, we do already have a verifier rule for this. At least, I'm pretty
sure I added one.

I was going to say, passes are supposed to handle this by checking
ReturnInst::isPrecededByMustTailCall, but apparently that helper doesn't
exist yet! I just imagined it. :) Let's add it. I think if we had such a
predicate, one that scans backwards skipping debug intrinsics and casts
looking for a musttail calls or any other non-cast, non-musttail call
instruction, it would make it easier for passes to get this right.

You are right, though, this construct is designed in a way that makes it
hard for instrumentation passes to be correct by default, and it's not
their fault that they are all "broken" according to the LangRef. The
only
other implementation strategy I can think of for this feature is to have a
special "tail call" instruction that is both a call and return, since
that's really what we are expressing with two separate instructions. That
has a different set of tradeoffs, not all beneficial.

On Sun, Nov 1, 2020 at 12:53 PM Philip Reames via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> I just skimmed your message, so I might be missing something, but first
> impression would be that this sounds like a missing verifier rule.  Any
> reason the verifier can't check for this legality restriction?  If it
> can, then the existing verify after all functionality should be
> sufficient to catch the bugs you're referencing?
>
> Philip
>
> On 10/30/20 7:47 PM, Xun Li via llvm-dev wrote:
> > Hi,
> >
> > A musttail call must be preceded by an optional bitcast instruction
> > and a return instruction. It seems that there isn't a mechanism to
> > make sure the musttail call contract isn't broken by any
instruction
> > insertion in the IR.
> > As far as I can tell, all IR transformations don't really look at
> > what's before a ReturnInst when inserting instructions. For
example,
> > all Instrumentation passes ignored them, and would insert
> > instrumentation instructions in between musttail call and ReturnInst,
> > which would break the contract.
> > I fixed it for ASAN in https://reviews.llvm.org/D87777 and fixed it
> > for TSAN in https://reviews.llvm.org/D87620.
> > But lately I also found that SafeStack pass has the same problem. For
> > instance, running "opt --safe-stack" on the following IR
would crash
> > Clang:
> >
> > target triple = "x86_64-unknown-linux-gnu"
> >
> > declare i32 @foo(i32* %p)
> > declare void @alloca_test_use([10 x i8]*)
> >
> > define i32 @call_foo(i32* %a) safestack {
> >    %x = alloca [10 x i8], align 1
> >    call void @alloca_test_use([10 x i8]* %x)
> >    %r = musttail call i32 @foo(i32* %a)
> >    ret i32 %r
> > }
> >
> > I am pretty sure that SafeStack isn't the last pass that has this
> > problem and there will be more like this in the future.
> > I am trying to see if there are good systematic ways to solve this
> > problem without having to require every pass author to be careful when
> > inserting instructions before ReturnInst.
> >
> > One solution I can think of is to update
> > IRBuilderBase::SetInsertPoint(Instruction *I) to always check if
"I"
> > is a ReturnInst followed by a musttail call contract, and if so, move
> > the insertion point to before the call. Since this approach adds
> > overhead to every insertion point creation, I am not sure if it's
the
> > best solution.
> >
> > Advice and feedback much appreciated!
> >
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20201102/5b41b5af/attachment.html>

Xun Li via llvm-dev

2020-Nov-03 04:13 UTC

head link

[llvm-dev] Maintaining musttail call contract

Thanks for the feedback.
Yes this is covered by the verifier. So there won't be incorrect
runtime program behavior as long as verifier runs.
I was concerned however that we don't know how many bugs are there and
it would be whoever uses musttail call start to discover more and more
bugs.
ReturnInst::isPrecededByMustTailCall seems a good first step, I can add that.
Is a special "tail call" instruction that is both a call and return
practical to add? It would need to inherit from both call and return
inst, otherwise all the places where it's checking against either of
them will need to remember to check this special instruction as well,
which also sounds error-prone to me.

On Mon, Nov 2, 2020 at 11:03 AM Reid Kleckner <rnk at google.com>
wrote:>
> IIRC, we do already have a verifier rule for this. At least, I'm pretty
sure I added one.
>
> I was going to say, passes are supposed to handle this by checking
ReturnInst::isPrecededByMustTailCall, but apparently that helper doesn't
exist yet! I just imagined it. :) Let's add it. I think if we had such a
predicate, one that scans backwards skipping debug intrinsics and casts looking
for a musttail calls or any other non-cast, non-musttail call instruction, it
would make it easier for passes to get this right.
>
> You are right, though, this construct is designed in a way that makes it
hard for instrumentation passes to be correct by default, and it's not their
fault that they are all "broken" according to the LangRef. The only
other implementation strategy I can think of for this feature is to have a
special "tail call" instruction that is both a call and return, since
that's really what we are expressing with two separate instructions. That
has a different set of tradeoffs, not all beneficial.
>
> On Sun, Nov 1, 2020 at 12:53 PM Philip Reames via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
>>
>> I just skimmed your message, so I might be missing something, but first
>> impression would be that this sounds like a missing verifier rule.  Any
>> reason the verifier can't check for this legality restriction?  If
it
>> can, then the existing verify after all functionality should be
>> sufficient to catch the bugs you're referencing?
>>
>> Philip
>>
>> On 10/30/20 7:47 PM, Xun Li via llvm-dev wrote:
>> > Hi,
>> >
>> > A musttail call must be preceded by an optional bitcast
instruction
>> > and a return instruction. It seems that there isn't a
mechanism to
>> > make sure the musttail call contract isn't broken by any
instruction
>> > insertion in the IR.
>> > As far as I can tell, all IR transformations don't really look
at
>> > what's before a ReturnInst when inserting instructions. For
example,
>> > all Instrumentation passes ignored them, and would insert
>> > instrumentation instructions in between musttail call and
ReturnInst,
>> > which would break the contract.
>> > I fixed it for ASAN in https://reviews.llvm.org/D87777 and fixed
it
>> > for TSAN in https://reviews.llvm.org/D87620.
>> > But lately I also found that SafeStack pass has the same problem.
For
>> > instance, running "opt --safe-stack" on the following IR
would crash
>> > Clang:
>> >
>> > target triple = "x86_64-unknown-linux-gnu"
>> >
>> > declare i32 @foo(i32* %p)
>> > declare void @alloca_test_use([10 x i8]*)
>> >
>> > define i32 @call_foo(i32* %a) safestack {
>> >    %x = alloca [10 x i8], align 1
>> >    call void @alloca_test_use([10 x i8]* %x)
>> >    %r = musttail call i32 @foo(i32* %a)
>> >    ret i32 %r
>> > }
>> >
>> > I am pretty sure that SafeStack isn't the last pass that has
this
>> > problem and there will be more like this in the future.
>> > I am trying to see if there are good systematic ways to solve this
>> > problem without having to require every pass author to be careful
when
>> > inserting instructions before ReturnInst.
>> >
>> > One solution I can think of is to update
>> > IRBuilderBase::SetInsertPoint(Instruction *I) to always check if
"I"
>> > is a ReturnInst followed by a musttail call contract, and if so,
move
>> > the insertion point to before the call. Since this approach adds
>> > overhead to every insertion point creation, I am not sure if
it's the
>> > best solution.
>> >
>> > Advice and feedback much appreciated!
>> >
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


-- 
Xun

Philip Reames via llvm-dev

2020-Nov-03 17:19 UTC

head link

[llvm-dev] Maintaining musttail call contract

On 11/2/2020 8:13 PM, Xun Li wrote:> Thanks for the feedback.
> Yes this is covered by the verifier. So there won't be incorrect
> runtime program behavior as long as verifier runs.
> I was concerned however that we don't know how many bugs are there and
> it would be whoever uses musttail call start to discover more and more
> bugs.
 From experience, fuzzing cases like this tends to get extremely good 
coverage.  Once you have the verifier rule for the oracle of 
correctness, you're just looking to exercise paths through the 
instrumentation code, and fuzzers are quite good at that.  Instead of 
focusing on an API change, I'd suggest standing up a fuzzer and run it 
for a few days.

(To be clear, the program being fuzzed must include the verifiier 
check.  That's critical as it gives the fuzzer coverage information 
about the error cases.)
> ReturnInst::isPrecededByMustTailCall seems a good first step, I can add
that.
> Is a special "tail call" instruction that is both a call and
return
> practical to add? It would need to inherit from both call and return
> inst, otherwise all the places where it's checking against either of
> them will need to remember to check this special instruction as well,
> which also sounds error-prone to me.
>
> On Mon, Nov 2, 2020 at 11:03 AM Reid Kleckner <rnk at google.com>
wrote:
>> IIRC, we do already have a verifier rule for this. At least, I'm
pretty sure I added one.
>>
>> I was going to say, passes are supposed to handle this by checking
ReturnInst::isPrecededByMustTailCall, but apparently that helper doesn't
exist yet! I just imagined it. :) Let's add it. I think if we had such a
predicate, one that scans backwards skipping debug intrinsics and casts looking
for a musttail calls or any other non-cast, non-musttail call instruction, it
would make it easier for passes to get this right.
>>
>> You are right, though, this construct is designed in a way that makes
it hard for instrumentation passes to be correct by default, and it's not
their fault that they are all "broken" according to the LangRef. The
only other implementation strategy I can think of for this feature is to have a
special "tail call" instruction that is both a call and return, since
that's really what we are expressing with two separate instructions. That
has a different set of tradeoffs, not all beneficial.
>>
>> On Sun, Nov 1, 2020 at 12:53 PM Philip Reames via llvm-dev <llvm-dev
at lists.llvm.org> wrote:
>>> I just skimmed your message, so I might be missing something, but
first
>>> impression would be that this sounds like a missing verifier rule. 
Any
>>> reason the verifier can't check for this legality restriction? 
If it
>>> can, then the existing verify after all functionality should be
>>> sufficient to catch the bugs you're referencing?
>>>
>>> Philip
>>>
>>> On 10/30/20 7:47 PM, Xun Li via llvm-dev wrote:
>>>> Hi,
>>>>
>>>> A musttail call must be preceded by an optional bitcast
instruction
>>>> and a return instruction. It seems that there isn't a
mechanism to
>>>> make sure the musttail call contract isn't broken by any
instruction
>>>> insertion in the IR.
>>>> As far as I can tell, all IR transformations don't really
look at
>>>> what's before a ReturnInst when inserting instructions. For
example,
>>>> all Instrumentation passes ignored them, and would insert
>>>> instrumentation instructions in between musttail call and
ReturnInst,
>>>> which would break the contract.
>>>> I fixed it for ASAN in https://reviews.llvm.org/D87777 and
fixed it
>>>> for TSAN in https://reviews.llvm.org/D87620.
>>>> But lately I also found that SafeStack pass has the same
problem. For
>>>> instance, running "opt --safe-stack" on the following
IR would crash
>>>> Clang:
>>>>
>>>> target triple = "x86_64-unknown-linux-gnu"
>>>>
>>>> declare i32 @foo(i32* %p)
>>>> declare void @alloca_test_use([10 x i8]*)
>>>>
>>>> define i32 @call_foo(i32* %a) safestack {
>>>>     %x = alloca [10 x i8], align 1
>>>>     call void @alloca_test_use([10 x i8]* %x)
>>>>     %r = musttail call i32 @foo(i32* %a)
>>>>     ret i32 %r
>>>> }
>>>>
>>>> I am pretty sure that SafeStack isn't the last pass that
has this
>>>> problem and there will be more like this in the future.
>>>> I am trying to see if there are good systematic ways to solve
this
>>>> problem without having to require every pass author to be
careful when
>>>> inserting instructions before ReturnInst.
>>>>
>>>> One solution I can think of is to update
>>>> IRBuilderBase::SetInsertPoint(Instruction *I) to always check
if "I"
>>>> is a ReturnInst followed by a musttail call contract, and if
so, move
>>>> the insertion point to before the call. Since this approach
adds
>>>> overhead to every insertion point creation, I am not sure if
it's the
>>>> best solution.
>>>>
>>>> Advice and feedback much appreciated!
>>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>

Reid Kleckner via llvm-dev

2020-Nov-03 17:42 UTC

head link

[llvm-dev] Maintaining musttail call contract

On Mon, Nov 2, 2020 at 8:13 PM Xun Li <lxfind at gmail.com> wrote:
> Is a special "tail call" instruction that is both a call and
return
> practical to add? It would need to inherit from both call and return
> inst, otherwise all the places where it's checking against either of
> them will need to remember to check this special instruction as well,
> which also sounds error-prone to me.
>
My sense is that the current design is better for LLVM as it is today, but
at the end of the day, these are design tradeoffs. I think the current
design makes IPO easier, but makes instrumentation harder. Most scalar
optimization passes don't run into problems because they generally remove
or replace instructions rather than inserting new instructions, and when
they insert new instructions, they tend to be casts, which are permitted.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20201103/1437d049/attachment.html>

llvm dev - Nov 2020 - Maintaining musttail call contract

[llvm-dev] Maintaining musttail call contract

[llvm-dev] Maintaining musttail call contract

[llvm-dev] Maintaining musttail call contract

[llvm-dev] Maintaining musttail call contract