thr3ads.net - llvm dev - [llvm-dev] [RFC] Setting the current debug loc when the insertion point changes [Nov 2017]

If this information is useful, please help other people find it:
Share via:

Vedant Kumar via llvm-dev

2017-Nov-06 22:19 UTC

[llvm-dev] [RFC] Setting the current debug loc when the insertion point changes

Hi everybody,

I'm investigating some correctness issues with debug line table information
in optimized code. I've noticed something problematic in IRBuilder. Setting
the insertion point of an IRBuilder sets the builder’s current debug loc to the
one attached to the insertion point. IMO this isn't always a correct or
convenient thing to do, and it shouldn't be the default behavior.

# Correctness

The IRBuilder shouldn't assume that the debug loc at its insertion point
applies to any instructions it emits. It doesn't have enough information to
make that connection. This assumption leads to bugs like llvm.org/PR25630
<http://llvm.org/PR25630>. I.e it creates situations in which we silently
push around incorrect DILocations without realizing it, because the debug locs
in the IR look nice and contiguous.

To fix the issue, clients of IRBuilder should be required to either:

1. Explicitly set the correct debug location before inserting instructions, or
2. Opt-in to the current behavior of propagating the debug loc from the
insertion point.

# Convenience

The current behavior of IRBuilder can be convenient when it's correct. But
it can be *really* inconvenient when it's incorrect.

If a client sets an IRBuilder's debug loc correctly and makes a call that
changes the builder's insertion point, all generated instructions will pick
up a bogus debug loc from the IP. E.g this is what happens with
SCEVExpander::expandCodeFor(). There's no way to fix this bug without
changing IRBuilder's behavior.

# Solutions

1. Add a method called SetInsertPointAndDebugLoc() which behaves identically to
the current SetInsertPoint(). Change SetInsertPoint() so that it doesn't
change the current debug loc.
2. Add a boolean argument to SetInsertPoint() called UpdateDbgLoc which defaults
to false. Only change the current debug loc if UpdateDbgLoc is true.

I have a mild preference for idea #1 because it's easier to read, IMO.

Whichever solution we pick, my plan is to first migrate all in-tree users of
SetInsertPoint() to the explicit API, and to then gradually fix passes which
rely on the current incorrect behavior.

I'd appreciate your thoughts on this!

thanks,
vedant
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20171106/f40d2bb6/attachment.html>

Adrian Prantl via llvm-dev

2017-Nov-06 22:32 UTC

head link

[llvm-dev] [RFC] Setting the current debug loc when the insertion point changes

> On Nov 6, 2017, at 2:19 PM, Vedant Kumar <vsk at apple.com> wrote:
> 
> Hi everybody,
> 
> I'm investigating some correctness issues with debug line table
information in optimized code. I've noticed something problematic in
IRBuilder. Setting the insertion point of an IRBuilder sets the builder’s
current debug loc to the one attached to the insertion point. IMO this isn't
always a correct or convenient thing to do, and it shouldn't be the default
behavior.
> 
> # Correctness
> 
> The IRBuilder shouldn't assume that the debug loc at its insertion
point applies to any instructions it emits. It doesn't have enough
information to make that connection. This assumption leads to bugs like
llvm.org/PR25630. I.e it creates situations in which we silently push around
incorrect DILocations without realizing it, because the debug locs in the IR
look nice and contiguous.
I agree with your analysis.
> 
> To fix the issue, clients of IRBuilder should be required to either:
> 
> 1. Explicitly set the correct debug location before inserting instructions,
or
> 2. Opt-in to the current behavior of propagating the debug loc from the
insertion point.
> 
> # Convenience
> 
> The current behavior of IRBuilder can be convenient when it's correct.
But it can be *really* inconvenient when it's incorrect.
> 
> If a client sets an IRBuilder's debug loc correctly and makes a call
that changes the builder's insertion point, all generated instructions will
pick up a bogus debug loc from the IP. E.g this is what happens with
SCEVExpander::expandCodeFor(). There's no way to fix this bug without
changing IRBuilder's behavior.
> 
> # Solutions
> 
> 1. Add a method called SetInsertPointAndDebugLoc() which behaves
identically to the current SetInsertPoint(). Change SetInsertPoint() so that it
doesn't change the current debug loc.
> 2. Add a boolean argument to SetInsertPoint() called UpdateDbgLoc which
defaults to false. Only change the current debug loc if UpdateDbgLoc is true.
> 
> I have a mild preference for idea #1 because it's easier to read, IMO.
> 
> Whichever solution we pick, my plan is to first migrate all in-tree users
of SetInsertPoint() to the explicit API, and to then gradually fix passes which
rely on the current incorrect behavior.
Is the behavior of inheriting the debug location of the insertion point ever
what we want? I guess I can see how it can be useful when expanding an existing
instruction into a sequence, but I wonder if any clients actually correctly use
it like that today. If there aren't many uses, we could also add a
(DILocation *) argument to SetInsertPoint to force every client to think about
how to properly set the location in their use-case.

I also like the idea of renaming the method to make it really obvious what
happens here. Could we *gasp* even fix the capitalization?

We should also think about how to map this to the C API.

-- adrian
> 
> I'd appreciate your thoughts on this!
> 
> thanks,
> vedant

Davide Italiano via llvm-dev

2017-Nov-06 22:42 UTC

head link

[llvm-dev] [RFC] Setting the current debug loc when the insertion point changes

On Mon, Nov 6, 2017 at 2:19 PM, Vedant Kumar <vsk at apple.com>
wrote:> Hi everybody,
>
> I'm investigating some correctness issues with debug line table
information
> in optimized code. I've noticed something problematic in IRBuilder.
Setting
> the insertion point of an IRBuilder sets the builder’s current debug loc to
> the one attached to the insertion point. IMO this isn't always a
correct or
> convenient thing to do, and it shouldn't be the default behavior.
>
> # Correctness
>
> The IRBuilder shouldn't assume that the debug loc at its insertion
point
> applies to any instructions it emits. It doesn't have enough
information to
> make that connection. This assumption leads to bugs like llvm.org/PR25630.
> I.e it creates situations in which we silently push around incorrect
> DILocations without realizing it, because the debug locs in the IR look
nice
> and contiguous.
>
I remember that one, and I got to your same conclusion at the time
when analyzing the bug, so thanks for writing this down :)
> To fix the issue, clients of IRBuilder should be required to either:
>
> 1. Explicitly set the correct debug location before inserting instructions,
> or
> 2. Opt-in to the current behavior of propagating the debug loc from the
> insertion point.
>
> # Convenience
>
> The current behavior of IRBuilder can be convenient when it's correct.
But
> it can be *really* inconvenient when it's incorrect.
>
> If a client sets an IRBuilder's debug loc correctly and makes a call
that
> changes the builder's insertion point, all generated instructions will
pick
> up a bogus debug loc from the IP. E.g this is what happens with
> SCEVExpander::expandCodeFor(). There's no way to fix this bug without
> changing IRBuilder's behavior.
>
> # Solutions
>
> 1. Add a method called SetInsertPointAndDebugLoc() which behaves
identically
> to the current SetInsertPoint(). Change SetInsertPoint() so that it
doesn't
> change the current debug loc.
> 2. Add a boolean argument to SetInsertPoint() called UpdateDbgLoc which
> defaults to false. Only change the current debug loc if UpdateDbgLoc is
> true.
>
> I have a mild preference for idea #1 because it's easier to read, IMO.
>
> Whichever solution we pick, my plan is to first migrate all in-tree users
of
> SetInsertPoint() to the explicit API, and to then gradually fix passes
which
> rely on the current incorrect behavior.
>
My preference is as well for 1. I kind of dislike adding bool
parameters to API as it becomes easier to get them wrong.
Your plan makes sense to me, regardless. I'll be happy to review the
patches.

Thanks!

--
Davide

Vedant Kumar via llvm-dev

2017-Nov-06 22:50 UTC

head link

[llvm-dev] [RFC] Setting the current debug loc when the insertion point changes

> On Nov 6, 2017, at 2:32 PM, Adrian Prantl <aprantl at apple.com>
wrote:
> 
> 
>> On Nov 6, 2017, at 2:19 PM, Vedant Kumar <vsk at apple.com>
wrote:
>> 
>> Hi everybody,
>> 
>> I'm investigating some correctness issues with debug line table
information in optimized code. I've noticed something problematic in
IRBuilder. Setting the insertion point of an IRBuilder sets the builder’s
current debug loc to the one attached to the insertion point. IMO this isn't
always a correct or convenient thing to do, and it shouldn't be the default
behavior.
>> 
>> # Correctness
>> 
>> The IRBuilder shouldn't assume that the debug loc at its insertion
point applies to any instructions it emits. It doesn't have enough
information to make that connection. This assumption leads to bugs like
llvm.org/PR25630. I.e it creates situations in which we silently push around
incorrect DILocations without realizing it, because the debug locs in the IR
look nice and contiguous.
> 
> I agree with your analysis.
> 
>> 
>> To fix the issue, clients of IRBuilder should be required to either:
>> 
>> 1. Explicitly set the correct debug location before inserting
instructions, or
>> 2. Opt-in to the current behavior of propagating the debug loc from the
insertion point.
>> 
>> # Convenience
>> 
>> The current behavior of IRBuilder can be convenient when it's
correct. But it can be *really* inconvenient when it's incorrect.
>> 
>> If a client sets an IRBuilder's debug loc correctly and makes a
call that changes the builder's insertion point, all generated instructions
will pick up a bogus debug loc from the IP. E.g this is what happens with
SCEVExpander::expandCodeFor(). There's no way to fix this bug without
changing IRBuilder's behavior.
>> 
>> # Solutions
>> 
>> 1. Add a method called SetInsertPointAndDebugLoc() which behaves
identically to the current SetInsertPoint(). Change SetInsertPoint() so that it
doesn't change the current debug loc.
>> 2. Add a boolean argument to SetInsertPoint() called UpdateDbgLoc which
defaults to false. Only change the current debug loc if UpdateDbgLoc is true.
>> 
>> I have a mild preference for idea #1 because it's easier to read,
IMO.
>> 
>> Whichever solution we pick, my plan is to first migrate all in-tree
users of SetInsertPoint() to the explicit API, and to then gradually fix passes
which rely on the current incorrect behavior.
> 
> Is the behavior of inheriting the debug location of the insertion point
ever what we want? I guess I can see how it can be useful when expanding an
existing instruction into a sequence, but I wonder if any clients actually
correctly use it like that today. If there aren't many uses, we could also
add a (DILocation *) argument to SetInsertPoint to force every client to think
about how to properly set the location in their use-case.
There's at least one correct use of the current SetInsertPoint() API in
speculateSelectInstLoads() :). As you predicted, it expands an existing
instruction into a sequence.

I'm not sure how many correct uses of the API we have. There are over 300
uses in just {llvm,clang}/lib. I think that prevents us from adding a mandatory
argument to SetInsertPoint(): we wouldn't be able to promptly fix each use
of the API.

OTOH, if we added a method like SetInsertPointAndDebugLoc(), we could
incrementally eliminate as many uses of that method as possible.

> I also like the idea of renaming the method to make it really obvious what
happens here. Could we *gasp* even fix the capitalization?
Sure :).

> We should also think about how to map this to the C API.
Wdyt of exposing SetInsertPointAndDebugLoc() via the C API, with a note
explaining that it will eventually be deprecated?

vedant

> 
> -- adrian
> 
>> 
>> I'd appreciate your thoughts on this!
>> 
>> thanks,
>> vedant
>

Possibly Parallel Threads

Search for more seemingly similar threads

llvm dev - Nov 2017 - [RFC] Setting the current debug loc when the insertion point changes

[llvm-dev] [RFC] Setting the current debug loc when the insertion point changes

[llvm-dev] [RFC] Setting the current debug loc when the insertion point changes

[llvm-dev] [RFC] Setting the current debug loc when the insertion point changes

[llvm-dev] [RFC] Setting the current debug loc when the insertion point changes

Possibly Parallel Threads