Chrulski, Christopher M via llvm-dev
2020-Jan-10 19:25 UTC
[llvm-dev] Incorrect code generation when using -fprofile-generate on code which contains exception handling (Windows target)
Hi,
I've run into a bug with the LLVM backend that causes incorrect code
generation to happen when using -fprofile-generate on programs that contain C++
exception handling when building for Windows.
The problem occurs when value profiling instrumentation inserts function calls
into exception handling blocks. The instrumentation inserts value profiling intrinsic
calls, and these are subsequently lowered into target library calls. However,
these library calls do not get a funclet operand bundle associated with them.
This causes the Windows Exception Handling Preparation Pass to drop all the
instructions within the exception handler starting from the PGO instrumentation
call, and replace them with 'unreachable'. This is being done by the
function removeImplausibleInstructions (WinEHPrepare.cpp).
The following simple reproducer leads to incorrect code for the method
test::run(). In this example, the virtual function call inside the exception
handler triggers the bug when using -fprofile-generate.
#include <stdexcept>
#include <iostream>

extern void may_throw(int);

class base {
public:
  base() : x(0) {}
  int get_x() const { return x; }
  virtual void update() { x++; }
  int x;
};

class derived : public base {
public:
  derived() {}
  virtual void update() { x--; }
};

class test {
public:
  void run(base* b, int count) {
    try {
      for (int i = 0; i < count; ++i)
        may_throw(i);
    }
    catch (std::exception& e) {
      // Virtual function call in exception handler for value profiling.
      b->update();
    }
  }
};

void run_test() {
  test tester;
  base *obj = new derived;
  tester.run(obj, 100);
  std::cout << "Value in obj (should be -1): " <<
    obj->get_x() << "\n";
  if (obj->get_x() == -1)
    std::cout << "test passed\n";
  else
    std::cout << "test failed\n";
}

int main() {
  // Without PGO, test runs and prints result.
  // With -fprofile-generate, program seg-faults without printing.
  run_test();
  return 0;
}

__attribute__((noinline))
void may_throw(int x) {
  if (x > 10)
    throw std::range_error("value out of range");
}
On Windows, build with: clang -O2 -fprofile-generate test.cpp
When profiling is enabled the program will seg fault without printing anything.
Without the -fprofile-generate flag, the program will run successfully.
The compiler problem is as follows: Prior to the Windows Exception Handling
Preparation Pass, the IR for the function "test::run" contains the
following:
19:                                               ; preds = %17
  %20 = catchpad within %18 [%rtti.TypeDescriptor19* @"??_R0?AVexception@std@@@8", i32 8, %"class.std::exception"** %6]
  %21 = load i64, i64* getelementptr inbounds ([3 x i64], [3 x i64]* @"__profc_?run@test@@QEAAXPEAVbase@@H@Z", i64 0, i64 2), align 8
  %22 = add i64 %21, 1
  store i64 %22, i64* getelementptr inbounds ([3 x i64], [3 x i64]* @"__profc_?run@test@@QEAAXPEAVbase@@H@Z", i64 0, i64 2), align 8
  %23 = bitcast %class.base* %1 to void (%class.base*)***
  %24 = load void (%class.base*)**, void (%class.base*)*** %23, align 8, !tbaa !9
  %25 = load void (%class.base*)*, void (%class.base*)** %24, align 8
  %26 = ptrtoint void (%class.base*)* %25 to i64
  call void @__llvm_profile_instrument_target(i64 %26, i8* bitcast ({ i64, i64, i64*, i8*, i8*, i32, [2 x i16] }* @"__profd_?run@test@@QEAAXPEAVbase@@H@Z" to i8*), i32 0)
  call void %25(%class.base* %1) [ "funclet"(token %20) ]
  call void @_CxxThrowException(i8* null, %eh.ThrowInfo* null) #15 [ "funclet"(token %20) ]
  unreachable
After this pass, the IR has been replaced with the following, which breaks the
original program. This occurs because the instrumentation function call,
"__llvm_profile_instrument_target", is not marked with the funclet operand
bundle [ "funclet"(token %20) ].
19:                                               ; preds = %17
  %20 = catchpad within %18 [%rtti.TypeDescriptor19* @"??_R0?AVexception@std@@@8", i32 8, %"class.std::exception"** %6]
  %21 = load i64, i64* getelementptr inbounds ([3 x i64], [3 x i64]* @"__profc_?run@test@@QEAAXPEAVbase@@H@Z", i64 0, i64 2), align 8
  %22 = add i64 %21, 1
  store i64 %22, i64* getelementptr inbounds ([3 x i64], [3 x i64]* @"__profc_?run@test@@QEAAXPEAVbase@@H@Z", i64 0, i64 2), align 8
  unreachable
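To make the pruning rule concrete, here is a minimal toy model of the behavior
described above. It is only a sketch: ToyInst, makeInst, makeCall, pruneFunclet,
exampleHandler and checkExample are hypothetical names invented for
illustration, not LLVM's data structures or the actual code of
removeImplausibleInstructions, and the model does not perform any later cleanup
of instructions that become unused.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Toy stand-in for an IR instruction (hypothetical, not an LLVM class).
struct ToyInst {
    std::string text;                 // printable form of the instruction
    bool isCall = false;              // true for call instructions
    bool nounwindIntrinsic = false;   // calls tolerated without a bundle
    std::optional<int> funcletToken;  // pad id named in the "funclet" bundle
};

ToyInst makeInst(const std::string& text) {
    ToyInst inst;
    inst.text = text;
    return inst;
}

ToyInst makeCall(const std::string& text, std::optional<int> token) {
    ToyInst inst = makeInst(text);
    inst.isCall = true;
    inst.funcletToken = token;
    return inst;
}

// Inside the funclet identified by `pad`, the first call whose funclet token
// does not match the pad (and which is not a nounwind intrinsic) is replaced
// by `unreachable`, and every instruction after it is dropped.
std::vector<ToyInst> pruneFunclet(const std::vector<ToyInst>& block, int pad) {
    std::vector<ToyInst> out;
    for (const ToyInst& inst : block) {
        bool plausible = !inst.isCall || inst.nounwindIntrinsic ||
                         (inst.funcletToken && *inst.funcletToken == pad);
        if (!plausible) {
            out.push_back(makeInst("unreachable"));
            return out;
        }
        out.push_back(inst);
    }
    return out;
}

// The catch handler from the IR above, reduced to a few instructions.
std::vector<ToyInst> exampleHandler() {
    return {
        makeInst("%20 = catchpad within %18 [...]"),
        makeInst("store i64 %22, ... (counter update)"),
        makeInst("%25 = load ... (virtual function pointer)"),
        makeCall("call void @__llvm_profile_instrument_target(...)", std::nullopt),
        makeCall("call void %25(%class.base* %1)", 20),
        makeInst("unreachable"),
    };
}

// The instrumentation call has no funclet token, so it and the virtual call
// after it are lost; only the instructions before it survive.
void checkExample() {
    std::vector<ToyInst> pruned = pruneFunclet(exampleHandler(), 20);
    assert(pruned.size() == 4);
    assert(pruned.back().text == "unreachable");
}
```

In this model the call to b->update() disappears even though it carries the
correct funclet token, simply because it follows the untagged instrumentation
call.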
Possible solutions:
1) Avoid value profiling of calls within exception handling blocks
Pros: Solves the problem
Cons: Could lose some cases of value profiling, but since the exception code
is not supposed to be the primary execution path, this should not be a
significant performance issue.
2) Propagate the funclet information onto the value profiling intrinsics
created. And then also propagate this info to the library routines these
intrinsics get lowered into.
For indirect function calls, the funclet information can be copied from
the original function call.
However, MemIntrinsic calls subject to operand value profiling do not have
funclet operand bundles attached to them by the front-end. (I am not sure
whether that is possible, because the interfaces used to create them do not take
operand bundles.) Therefore, PGO would need to use colorEHFunclets to identify
the funclet operand bundle to attach to the instrumentation calls.
Unfortunately, because a basic block can be associated with multiple funclets,
or with both a funclet and code outside any funclet, this may also require
cloning some of the basic blocks, similar to the WinEHPrepare.cpp routine
cloneCommonBlocks(), prior to computing the instrumentation.
Pros: does not disable value profiling opportunities.
Cons: complex to implement due to the need to determine the appropriate
funclet to place on the memory operand value profiling calls. The same cloning
would also have to be done in the PGO use compilation.
3) Teach the Windows Exception Preparation Pass about the value profiling
library functions. Currently this pass will ignore llvm intrinsic functions that
are marked with the 'does not throw' attribute, but the value profiling
intrinsic calls have been lowered from being intrinsic calls into runtime
library target specific functions before reaching this point.
Pros: does not disable value profiling opportunities
Cons: requires exposing function names from InstrProf.h to the
WinEHPrepare.cpp file, or requires a new attribute on the function calls to
identify them as instrumentation library calls. Also, the IR would not
accurately reflect the funclet operand bundle information for the PGO-inserted
function calls.
For options 2 or 3 to work, the PGO indirect call promotion pass used for
-fprofile-use must also maintain the 'funclet' operand bundle on the
specialized direct call that it inserts. Fortunately, that pass clones the
original indirect call, so the 'funclet' operand bundle is maintained on it.
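For the indirect call case in option 2, the idea of copying the funclet
information from the profiled call onto the inserted instrumentation call could
be sketched roughly as follows. This is a toy model under stated assumptions:
ToyCall, instrumentIndirectCalls, allCallsInFunclet and checkCopy are
hypothetical names made up for illustration and do not correspond to LLVM
classes or to the existing instrumentation code.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Toy stand-in for a call instruction (hypothetical, not an LLVM class).
struct ToyCall {
    std::string callee;
    bool indirect = false;
    std::optional<int> funcletToken;  // pad id from the "funclet" bundle, if any
};

// Before every indirect call, insert a value profiling call that inherits
// the funclet token of the call being profiled.
std::vector<ToyCall> instrumentIndirectCalls(const std::vector<ToyCall>& calls) {
    std::vector<ToyCall> out;
    for (const ToyCall& call : calls) {
        if (call.indirect) {
            ToyCall prof;
            prof.callee = "__llvm_profile_instrument_target";
            prof.funcletToken = call.funcletToken;  // copy the bundle
            out.push_back(prof);
        }
        out.push_back(call);
    }
    return out;
}

// True when every call carries the token of the funclet `pad`, i.e. none of
// them would be treated as implausible inside that funclet.
bool allCallsInFunclet(const std::vector<ToyCall>& calls, int pad) {
    for (const ToyCall& call : calls)
        if (!call.funcletToken || *call.funcletToken != pad)
            return false;
    return true;
}

// The virtual call from the reproducer, tagged with funclet token 20.
void checkCopy() {
    ToyCall update;
    update.callee = "%25";
    update.indirect = true;
    update.funcletToken = 20;
    std::vector<ToyCall> out = instrumentIndirectCalls({update});
    assert(out.size() == 2);
    assert(allCallsInFunclet(out, 20));
}
```

The sketch covers only indirect calls, where the original call already carries
a bundle to copy; it says nothing about the MemIntrinsic case, which has no
bundle to inherit.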
Any thoughts on which of these options should be taken, or other suggestions for
resolving this problem?
Chris