thr3ads.net - llvm dev - [llvm-dev] [cfe-dev] Zero length function pointer equality [Mar 2021]

If this information is useful, please help other people find it:
Share via:

James Y Knight via llvm-dev

2021-Mar-19 22:00 UTC

[llvm-dev] [cfe-dev] Zero length function pointer equality

I'm not sure whether we'd want *every* unreachable to emit a trap, but I
do
think we should try not to let code fall out of one function and into a
completely unrelated one. That is: I'd propose that the last basic-block in
every function should get a trap instruction added unless it already ends
in a control transfer instruction (jmp, ret, or branch -- call doesn't
count, since it may return).

On Fri, Mar 19, 2021 at 5:12 PM David Blaikie via cfe-dev <
cfe-dev at lists.llvm.org> wrote:
> Just writing it down in this thread - this issue's been discussed a bit
in
> this bug: https://bugs.llvm.org/show_bug.cgi?id=49599
>
> And yeah, I'm considering adopting MachO's default (TrapUnreachable
+
> NoTrapOnNoreturn) as the default for LLVM (will require some design
> discussion, no doubt) since it seems to capture most of the functionality
> desired. Maybe there are some cases where we have extra unreachables that
> could've otherwise been optimized away/elided, but hopefully nothing
> drastic.
>
> (some platforms still need the trap-on-noreturn - Windows+AArch64 and
> maybe Sony, etc - happy for some folks to opt into that). I wonder whether
> TrapUnreachable shouldn't even be an option anymore though, if it
becomes
> load bearing for correctness - or should it become a fallback option -
"no
> trap unreachable" maybe means nop instead of trap, in case your target
> can't handle a trap sometimes (I came across an issue with AMDGPU not
being
> able to add traps to functions that it isn't expecting - the function
needs
> some special attribute to have a trap in it - but I guess it can be updated
> to add that attribute if the function has an unreachable in it (though then
> it has to recreate the no-trap-on-noreturn handling too when deciding
> whether to add the attribute... ))
>
> On Mon, Jul 27, 2020 at 9:20 AM Robinson, Paul <paul.robinson at
sony.com>
> wrote:
>
>>
>>
>> > -----Original Message-----
>> > From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On
Behalf Of Hans
>> > Wennborg via llvm-dev
>> > Sent: Monday, July 27, 2020 9:11 AM
>> > To: David Blaikie <dblaikie at gmail.com>
>> > Cc: llvm-dev <llvm-dev at lists.llvm.org>; Clang Dev <
>> cfe-dev at lists.llvm.org>
>> > Subject: Re: [llvm-dev] [cfe-dev] Zero length function pointer
equality
>> >
>> > On Sat, Jul 25, 2020 at 3:40 AM David Blaikie <dblaikie at
gmail.com>
>> wrote:
>> > >
>> > > Looks perfect to me!
>> > >
>> > > well, a couple of questions: Why a noop, rather than
int3/ud2/etc?
>> >
>> > Probably no reason.
>>
>> FTR there is TargetOptions.TrapUnreachable, which some targets turn on
>> (for X86 it's on for MachO and PS4), this turns
'unreachable' into ud2.
>> Clearly it covers more than "empty" functions but is probably
the kind
>> of thing you're looking for.
>> --paulr
>>
>> >
>> > > Might be worth using the existing code that places such an
instruction
>> > > when building at -O0?
>> >
>> > I wasn't aware of that. Does it happen for all functions (e.g.
I think
>> > I got pulled into this due to functions with the naked attribute)?
>> >
>> > > & you mention that this causes problems on Windows - but
ICF done by
>> > > the Windows linker does not cause such problems? (I'd
have thought
>> > > they'd result in the same situation - two functions
described as being
>> > > at the same address?) is there a quick summary of why those
two cases
>> > > turn out differently?
>> >
>> > The case that we hit was that the Control Flow Guard table of
>> > addresses in the binary ended up listing the same address twice,
which
>> > the loader didn't expect. It may be that the linker took care
to avoid
>> > that for ICF (if two ICF'd functions got the same address,
only list
>> > it once in the CFG table) but still didn't handle the
"empty function"
>> > problem.
>> >
>> > > On Fri, Jul 24, 2020 at 6:17 AM Hans Wennborg <hans at
chromium.org>
>> wrote:
>> > > >
>> > > > Maybe we can just expand this to always apply:
>> > https://reviews.llvm.org/D32330
>> > > >
>> > > > On Fri, Jul 24, 2020 at 2:46 AM David Blaikie via
cfe-dev
>> > > > <cfe-dev at lists.llvm.org> wrote:
>> > > > >
>> > > > > LLVM can produce zero length functions from cases
like this (when
>> > > > > optimizations are enabled):
>> > > > >
>> > > > > void f1() { __builtin_unreachable(); }
>> > > > > int f2() { /* missing return statement */ }
>> > > > >
>> > > > > This code is valid, so long as the functions are
never called.
>> > > > >
>> > > > > I believe C++ requires that all functions have a
distinct address
>> > (ie:
>> > > > > &f1 != &f2) and LLVM optimizes code on this
basis (assert(f1 =>> f2)
>> > > > > gets optimized into an unconditional assertion
failure)
>> > > > >
>> > > > > But these zero length functions can end up with
identical
>> addresses.
>> > > > >
>> > > > > I'm unaware of anything in the C++ spec (or the
LLVM langref) that
>> > > > > would indicate that would allow distinct functions
to have
>> identical
>> > > > > addresses - so should we do something about this in
the LLVM
>> > backend?
>> > > > > add a little padding? a nop instruction? (if
we're adding an
>> > > > > instruction anyway, perhaps we might as well make
it an int3?)
>> > > > >
>> > > > > (I came across this due to DWARF issues with zero
length
>> functions &
>> > > > > thinking about if/how this should be supported)
>> > > > > _______________________________________________
>> > > > > cfe-dev mailing list
>> > > > > cfe-dev at lists.llvm.org
>> > > > >
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>> > _______________________________________________
>> > LLVM Developers mailing list
>> > llvm-dev at lists.llvm.org
>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210319/55b3ad9a/attachment.html>

David Blaikie via llvm-dev

2021-Mar-19 22:31 UTC

head link

[llvm-dev] [cfe-dev] Zero length function pointer equality

On Fri, Mar 19, 2021 at 3:00 PM James Y Knight <jyknight at google.com>
wrote:
> I'm not sure whether we'd want *every* unreachable to emit a trap,
but I
> do think we should try not to let code fall out of one function and into a
> completely unrelated one.
>
Yeah, that was my gut reaction too - though TrapUnreachable doesn't inhibit
SimplifyCFG from eliminating unreachable blocks (can still leave
unreachable in a block after a call - because the call may or may not
return).

> That is: I'd propose that the last basic-block in every function should
> get a trap instruction added unless it already ends in a control transfer
> instruction
>
That was my initial theory, but given MachO defaults to TrapUnreachable and
it doesn't inhibit SimplifyCFG, so most unreachable blocks do get
eliminated - I'm sort of leaning towards that being sufficiently correct.

> (jmp, ret, or branch -- call doesn't count, since it may return).
>
I'm inclined to omit the trap after a call to a noreturn function for
brevity - even though it does leave the possibility of violating the
noreturn contract leading to that fallthrough UB.

>
> On Fri, Mar 19, 2021 at 5:12 PM David Blaikie via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
>> Just writing it down in this thread - this issue's been discussed a
bit
>> in this bug: https://bugs.llvm.org/show_bug.cgi?id=49599
>>
>> And yeah, I'm considering adopting MachO's default
(TrapUnreachable +
>> NoTrapOnNoreturn) as the default for LLVM (will require some design
>> discussion, no doubt) since it seems to capture most of the
functionality
>> desired. Maybe there are some cases where we have extra unreachables
that
>> could've otherwise been optimized away/elided, but hopefully
nothing
>> drastic.
>>
>> (some platforms still need the trap-on-noreturn - Windows+AArch64 and
>> maybe Sony, etc - happy for some folks to opt into that). I wonder
whether
>> TrapUnreachable shouldn't even be an option anymore though, if it
becomes
>> load bearing for correctness - or should it become a fallback option -
"no
>> trap unreachable" maybe means nop instead of trap, in case your
target
>> can't handle a trap sometimes (I came across an issue with AMDGPU
not being
>> able to add traps to functions that it isn't expecting - the
function needs
>> some special attribute to have a trap in it - but I guess it can be
updated
>> to add that attribute if the function has an unreachable in it (though
then
>> it has to recreate the no-trap-on-noreturn handling too when deciding
>> whether to add the attribute... ))
>>
>> On Mon, Jul 27, 2020 at 9:20 AM Robinson, Paul <paul.robinson at
sony.com>
>> wrote:
>>
>>>
>>>
>>> > -----Original Message-----
>>> > From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On
Behalf Of Hans
>>> > Wennborg via llvm-dev
>>> > Sent: Monday, July 27, 2020 9:11 AM
>>> > To: David Blaikie <dblaikie at gmail.com>
>>> > Cc: llvm-dev <llvm-dev at lists.llvm.org>; Clang Dev
<
>>> cfe-dev at lists.llvm.org>
>>> > Subject: Re: [llvm-dev] [cfe-dev] Zero length function pointer
equality
>>> >
>>> > On Sat, Jul 25, 2020 at 3:40 AM David Blaikie <dblaikie at
gmail.com>
>>> wrote:
>>> > >
>>> > > Looks perfect to me!
>>> > >
>>> > > well, a couple of questions: Why a noop, rather than
int3/ud2/etc?
>>> >
>>> > Probably no reason.
>>>
>>> FTR there is TargetOptions.TrapUnreachable, which some targets turn
on
>>> (for X86 it's on for MachO and PS4), this turns
'unreachable' into ud2.
>>> Clearly it covers more than "empty" functions but is
probably the kind
>>> of thing you're looking for.
>>> --paulr
>>>
>>> >
>>> > > Might be worth using the existing code that places such
an
>>> instruction
>>> > > when building at -O0?
>>> >
>>> > I wasn't aware of that. Does it happen for all functions
(e.g. I think
>>> > I got pulled into this due to functions with the naked
attribute)?
>>> >
>>> > > & you mention that this causes problems on Windows -
but ICF done by
>>> > > the Windows linker does not cause such problems? (I'd
have thought
>>> > > they'd result in the same situation - two functions
described as
>>> being
>>> > > at the same address?) is there a quick summary of why
those two cases
>>> > > turn out differently?
>>> >
>>> > The case that we hit was that the Control Flow Guard table of
>>> > addresses in the binary ended up listing the same address
twice, which
>>> > the loader didn't expect. It may be that the linker took
care to avoid
>>> > that for ICF (if two ICF'd functions got the same address,
only list
>>> > it once in the CFG table) but still didn't handle the
"empty function"
>>> > problem.
>>> >
>>> > > On Fri, Jul 24, 2020 at 6:17 AM Hans Wennborg <hans at
chromium.org>
>>> wrote:
>>> > > >
>>> > > > Maybe we can just expand this to always apply:
>>> > https://reviews.llvm.org/D32330
>>> > > >
>>> > > > On Fri, Jul 24, 2020 at 2:46 AM David Blaikie via
cfe-dev
>>> > > > <cfe-dev at lists.llvm.org> wrote:
>>> > > > >
>>> > > > > LLVM can produce zero length functions from
cases like this (when
>>> > > > > optimizations are enabled):
>>> > > > >
>>> > > > > void f1() { __builtin_unreachable(); }
>>> > > > > int f2() { /* missing return statement */ }
>>> > > > >
>>> > > > > This code is valid, so long as the functions
are never called.
>>> > > > >
>>> > > > > I believe C++ requires that all functions have
a distinct address
>>> > (ie:
>>> > > > > &f1 != &f2) and LLVM optimizes code on
this basis (assert(f1 =>>> f2)
>>> > > > > gets optimized into an unconditional assertion
failure)
>>> > > > >
>>> > > > > But these zero length functions can end up with
identical
>>> addresses.
>>> > > > >
>>> > > > > I'm unaware of anything in the C++ spec (or
the LLVM langref)
>>> that
>>> > > > > would indicate that would allow distinct
functions to have
>>> identical
>>> > > > > addresses - so should we do something about
this in the LLVM
>>> > backend?
>>> > > > > add a little padding? a nop instruction? (if
we're adding an
>>> > > > > instruction anyway, perhaps we might as well
make it an int3?)
>>> > > > >
>>> > > > > (I came across this due to DWARF issues with
zero length
>>> functions &
>>> > > > > thinking about if/how this should be supported)
>>> > > > > _______________________________________________
>>> > > > > cfe-dev mailing list
>>> > > > > cfe-dev at lists.llvm.org
>>> > > > >
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>> > _______________________________________________
>>> > LLVM Developers mailing list
>>> > llvm-dev at lists.llvm.org
>>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210319/51d64da2/attachment.html>

llvm dev - Mar 2021 - [cfe-dev] Zero length function pointer equality

[llvm-dev] [cfe-dev] Zero length function pointer equality

[llvm-dev] [cfe-dev] Zero length function pointer equality