I've been trying to get oprofile results for jitted code without success. I built an 3.5.0 llvm with oprofile enabled, and tested it with lli on a small test case. I built the latest oprofile from the git repository. Debugging I can see that lli is registering the listener and making the oprofile calls to the libopagent api to specify the names and address ranges of jit'd routines, and even the line number information. And the oprofile agent is writing infomation to /tmp/.oprofile/jitdump. But although oprofile counts lots of samples but doesnt attribute them to the jit'd code. When execution finishes oprofile runs opjitconv to report the jit counts but doesn't see how to attribute them so drops them on the floor. Working with one of the primary oprofile developers, Maynard Johnson, we've got a couple of theories: 1) oprofile jit code support doesn't actually work with MCJIT yet. Although its supposed to work there are no examples we could find in recent web history. Does anyone have a working example? 2) Using a small test case the jit'd code is on a small region of heap memory. Because it isnt marked as anonymous (as it would be if it came from an mmap/large malloc request) oprofile ignores the samples. I would have though it would attribute those samples to lli which owns its heap. Perhaps that's a bug? My test case is this little C program converted to IR and run with lli: loopy.c: #include <stdio.h> int fib_left(int); int fib_right(int); int fib_left(int i) { if (i < 2) return 1; return fib_left(i-1) + fib_right(i-2); } int fib_right(int i) { if (i < 2) return 1; return fib_left(i-1) + fib_right(i-2); } int fib(int i) { if (i < 2) return 1; return fib_left(i-1) + fib_right(i-2); } int main() { int i = 45; printf("fib(%d) == %d\n", i, fib(i)); } $clang -g -S -emit-llvm -o loopy.ll loopy.c $ operf -Vdebug -e CPU_CLK_UNHALTED:5000000 lli -debug-only="oprofile-jit-event-listener" loopy.ll Using samples dir /home/dad/oprofile_data/samples Kernel profiling is not possible with current system config Set /proc/sys/kernel/kptr_restrict to 0 to collect kernel samples. Exec args are: lli -debug-only=oprofile-jit-event-listener loopy.ll telling child to start app app 47033 is running parent says start app /usr/local/bin/lli Forking read pid going into waitpid on profiled app 47033 Successfully read header info for sample data Converting operf data to oprofile sample data format sample type is 43 operf: Profiler started Connected to OProfile agent. <<< from the jit event listener code Mapping 0x7f5ed6893014 to loopy.c:13 <<< adding debug line information in the listener Mapping 0x7f5ed689301f to loopy.c:13 Mapping 0x7f5ed6893032 to loopy.c:13 Mapping 0x7f5ed6893054 to loopy.c:14 Mapping 0x7f5ed6893079 to loopy.c:9 Mapping 0x7f5ed6893082 to loopy.c:9 Mapping 0x7f5ed689308e to loopy.c:10 Mapping 0x7f5ed68930a1 to loopy.c:10 Mapping 0x7f5ed68930b8 to loopy.c:10 Mapping 0x7f5ed68930d9 to loopy.c:5 Mapping 0x7f5ed68930e2 to loopy.c:5 Mapping 0x7f5ed68930ee to loopy.c:6 Mapping 0x7f5ed6893101 to loopy.c:6 Mapping 0x7f5ed6893118 to loopy.c:6 Mapping 0x7f5ed6893139 to loopy.c:7 Mapping 0x7f5ed6893142 to loopy.c:7 Mapping 0x7f5ed689314e to loopy.c:8 Mapping 0x7f5ed6893161 to loopy.c:8 Mapping 0x7f5ed6893178 to loopy.c:8 fib(45) == 1836311903 profiled app ended normally. operf recording finished. Total bytes recorded from perf events: 226136 operf-record process returned OK operf_read: Total bytes received from operf_record process: 225952 Calling _do_jitdump_convert <<<< oprofile converting jit sample information start time/end time is 1413559198/1413559208 opjitconv: Ending with rc = 2. This code is usually OK, but can be useful for debugging purposes. JIT dump processing complete. operf-read process returned OK Profiling done. $ opreport Using /home/dad/oprofile_data/samples/ for samples directory. CPU: Intel Haswell microarchitecture, speed 3498 MHz (estimated) Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 5000000 CPU_CLK_UNHALT...| samples| %| ------------------ 6949 100.000 lli CPU_CLK_UNHALT...| samples| %| ------------------ 6903 99.3380 ld-2.19.so 23 0.3310 no-vmlinux 10 0.1439 libLLVMCore.so 8 0.1151 libLLVMCodeGen.so 2 0.0288 libpthread-2.19.so 1 0.0144 libLLVMAsmPrinter.so 1 0.0144 libLLVMSelectionDAG.so 1 0.0144 libLLVMTransformUtils.so $ opreport -l `which lli` Using /home/dad/oprofile_data/samples/ for samples directory. CPU: Intel Haswell microarchitecture, speed 3498 MHz (estimated) Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 5000000 samples % image name symbol name 38 57.5758 no-vmlinux /no-vmlinux 2 3.0303 ld-2.19.so _dl_lookup_symbol_x 2 3.0303 libLLVMCore.so llvm::DenseMapBase<llvm::DenseMap<void const*, llvm::PassInfo const*, llvm::DenseMapInfo<void const*> >, void const*, llvm::PassInfo const*, llvm::DenseMapInfo<void const*> >::getHashValue(void const* const&) 2 3.0303 libLLVMCore.so std::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> >::begin() const 1 1.5152 ld-2.19.so _dl_check_map_versions 1 1.5152 ld-2.19.so _dl_fixup 1 1.5152 ld-2.19.so do_lookup_x 1 1.5152 libLLVMAsmParser.so llvm::LLLexer::ReadString(llvm::lltok::Kind) 1 1.5152 libLLVMCodeGen.so llvm::MachineFunction::getName() const 1 1.5152 libLLVMCodeGen.so llvm::MachineOperand::isReg() const ....>From this you can see that lli is calling the oprofile interface. Oprofileis counting 6949 samples, but attributing 99% to ld-2.19.so. They are actually the samples from the jit code execution. Asking opreport to show the samples from lli it only sees the jit compilation samples (<1% of the total). Asking for the samples from ld-2.19.so shows the same info - just the compilation process. Perhaps I'm missing something obvious so I wanted to ask the list if there was anyone actively using oprofile with jit'd code so we can resolve the problem. thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141017/5eb099f8/attachment.html>
The fact that you are getting line number information leads me to believe that whether you intended it or not you are using the old JIT (which is now gone), not MCJIT. The “Mapping 0x7f5ed6893032 to loopy.c:13” lines come from a path called through NotifyFunctionEmitted and as far as I know nothing in MCJIT calls that. At one time oprofile support worked completely with the old JIT. It sounds like you might have a build where the old JIT was partially dismantled. That may or may not be why things aren’t working for you. With MCJIT, I would expect that oprofile support would work, minus line numbers. I’m not sure about that though, as I haven’t tried anything with oprofile and MCJIT for a couple of years and the code has changed a lot recently. I’m not sure this is relevant, but I seem to recall having had to run oprofile for system level profiling. Perhaps the registration mechanism isn’t passing through enough information to associate the addresses it is registering with the process you are profiling. Incidentally, it should be possible to get line number support with oprofile and MCJIT. It just isn’t implemented yet. -Andy From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Maurice Marks Sent: Friday, October 17, 2014 8:37 AM To: llvmdev at cs.uiuc.edu Subject: [LLVMdev] oprofile support? I've been trying to get oprofile results for jitted code without success. I built an 3.5.0 llvm with oprofile enabled, and tested it with lli on a small test case. I built the latest oprofile from the git repository. Debugging I can see that lli is registering the listener and making the oprofile calls to the libopagent api to specify the names and address ranges of jit'd routines, and even the line number information. And the oprofile agent is writing infomation to /tmp/.oprofile/jitdump. But although oprofile counts lots of samples but doesnt attribute them to the jit'd code. When execution finishes oprofile runs opjitconv to report the jit counts but doesn't see how to attribute them so drops them on the floor. Working with one of the primary oprofile developers, Maynard Johnson, we've got a couple of theories: 1) oprofile jit code support doesn't actually work with MCJIT yet. Although its supposed to work there are no examples we could find in recent web history. Does anyone have a working example? 2) Using a small test case the jit'd code is on a small region of heap memory. Because it isnt marked as anonymous (as it would be if it came from an mmap/large malloc request) oprofile ignores the samples. I would have though it would attribute those samples to lli which owns its heap. Perhaps that's a bug? My test case is this little C program converted to IR and run with lli: loopy.c: #include <stdio.h> int fib_left(int); int fib_right(int); int fib_left(int i) { if (i < 2) return 1; return fib_left(i-1) + fib_right(i-2); } int fib_right(int i) { if (i < 2) return 1; return fib_left(i-1) + fib_right(i-2); } int fib(int i) { if (i < 2) return 1; return fib_left(i-1) + fib_right(i-2); } int main() { int i = 45; printf("fib(%d) == %d\n", i, fib(i)); } $clang -g -S -emit-llvm -o loopy.ll loopy.c $ operf -Vdebug -e CPU_CLK_UNHALTED:5000000 lli -debug-only="oprofile-jit-event-listener" loopy.ll Using samples dir /home/dad/oprofile_data/samples Kernel profiling is not possible with current system config Set /proc/sys/kernel/kptr_restrict to 0 to collect kernel samples. Exec args are: lli -debug-only=oprofile-jit-event-listener loopy.ll telling child to start app app 47033 is running parent says start app /usr/local/bin/lli Forking read pid going into waitpid on profiled app 47033 Successfully read header info for sample data Converting operf data to oprofile sample data format sample type is 43 operf: Profiler started Connected to OProfile agent. <<< from the jit event listener code Mapping 0x7f5ed6893014 to loopy.c:13 <<< adding debug line information in the listener Mapping 0x7f5ed689301f to loopy.c:13 Mapping 0x7f5ed6893032 to loopy.c:13 Mapping 0x7f5ed6893054 to loopy.c:14 Mapping 0x7f5ed6893079 to loopy.c:9 Mapping 0x7f5ed6893082 to loopy.c:9 Mapping 0x7f5ed689308e to loopy.c:10 Mapping 0x7f5ed68930a1 to loopy.c:10 Mapping 0x7f5ed68930b8 to loopy.c:10 Mapping 0x7f5ed68930d9 to loopy.c:5 Mapping 0x7f5ed68930e2 to loopy.c:5 Mapping 0x7f5ed68930ee to loopy.c:6 Mapping 0x7f5ed6893101 to loopy.c:6 Mapping 0x7f5ed6893118 to loopy.c:6 Mapping 0x7f5ed6893139 to loopy.c:7 Mapping 0x7f5ed6893142 to loopy.c:7 Mapping 0x7f5ed689314e to loopy.c:8 Mapping 0x7f5ed6893161 to loopy.c:8 Mapping 0x7f5ed6893178 to loopy.c:8 fib(45) == 1836311903 profiled app ended normally. operf recording finished. Total bytes recorded from perf events: 226136 operf-record process returned OK operf_read: Total bytes received from operf_record process: 225952 Calling _do_jitdump_convert <<<< oprofile converting jit sample information start time/end time is 1413559198/1413559208 opjitconv: Ending with rc = 2. This code is usually OK, but can be useful for debugging purposes. JIT dump processing complete. operf-read process returned OK Profiling done. $ opreport Using /home/dad/oprofile_data/samples/ for samples directory. CPU: Intel Haswell microarchitecture, speed 3498 MHz (estimated) Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 5000000 CPU_CLK_UNHALT...| samples| %| ------------------ 6949 100.000 lli CPU_CLK_UNHALT...| samples| %| ------------------ 6903 99.3380 ld-2.19.so<http://ld-2.19.so> 23 0.3310 no-vmlinux 10 0.1439 libLLVMCore.so 8 0.1151 libLLVMCodeGen.so 2 0.0288 libpthread-2.19.so<http://libpthread-2.19.so> 1 0.0144 libLLVMAsmPrinter.so 1 0.0144 libLLVMSelectionDAG.so 1 0.0144 libLLVMTransformUtils.so $ opreport -l `which lli` Using /home/dad/oprofile_data/samples/ for samples directory. CPU: Intel Haswell microarchitecture, speed 3498 MHz (estimated) Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 5000000 samples % image name symbol name 38 57.5758 no-vmlinux /no-vmlinux 2 3.0303 ld-2.19.so<http://ld-2.19.so> _dl_lookup_symbol_x 2 3.0303 libLLVMCore.so llvm::DenseMapBase<llvm::DenseMap<void const*, llvm::PassInfo const*, llvm::DenseMapInfo<void const*> >, void const*, llvm::PassInfo const*, llvm::DenseMapInfo<void const*> >::getHashValue(void const* const&) 2 3.0303 libLLVMCore.so std::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> >::begin() const 1 1.5152 ld-2.19.so<http://ld-2.19.so> _dl_check_map_versions 1 1.5152 ld-2.19.so<http://ld-2.19.so> _dl_fixup 1 1.5152 ld-2.19.so<http://ld-2.19.so> do_lookup_x 1 1.5152 libLLVMAsmParser.so llvm::LLLexer::ReadString(llvm::lltok::Kind) 1 1.5152 libLLVMCodeGen.so llvm::MachineFunction::getName() const 1 1.5152 libLLVMCodeGen.so llvm::MachineOperand::isReg() const .... From this you can see that lli is calling the oprofile interface. Oprofile is counting 6949 samples, but attributing 99% to ld-2.19.so<http://ld-2.19.so>. They are actually the samples from the jit code execution. Asking opreport to show the samples from lli it only sees the jit compilation samples (<1% of the total). Asking for the samples from ld-2.19.so<http://ld-2.19.so> shows the same info - just the compilation process. Perhaps I'm missing something obvious so I wanted to ask the list if there was anyone actively using oprofile with jit'd code so we can resolve the problem. thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141017/a0b82cb2/attachment.html>
I haven't gotten operf to work with JIT'd code, but opcontrol seems to work ok if you can accept its unreliability. We've since switched to using perf, and haven't looked back. On Fri, Oct 17, 2014 at 8:36 AM, Maurice Marks <maurice.marks at gmail.com> wrote:> I've been trying to get oprofile results for jitted code without success. > I built an 3.5.0 llvm with oprofile enabled, and tested it with lli on a > small test case. I built the latest oprofile from the git repository. > > Debugging I can see that lli is registering the listener and making the > oprofile calls to the libopagent api to specify the names and address > ranges of jit'd routines, and even the line number information. And the > oprofile agent is writing infomation to /tmp/.oprofile/jitdump. But > although oprofile counts lots of samples but doesnt attribute them to the > jit'd code. When execution finishes oprofile runs opjitconv to report the > jit counts but doesn't see how to attribute them so drops them on the floor. > Working with one of the primary oprofile developers, Maynard Johnson, > we've got a couple of theories: > > 1) oprofile jit code support doesn't actually work with MCJIT yet. > Although its supposed to work there are no examples we could find in recent > web history. Does anyone have a working example? > 2) Using a small test case the jit'd code is on a small region of heap > memory. Because it isnt marked as anonymous (as it would be if it came from > an mmap/large malloc request) oprofile ignores the samples. I would have > though it would attribute those samples to lli which owns its heap. Perhaps > that's a bug? > > My test case is this little C program converted to IR and run with lli: > > loopy.c: > > #include <stdio.h> > int fib_left(int); > int fib_right(int); > int fib_left(int i) { > if (i < 2) return 1; > return fib_left(i-1) + fib_right(i-2); } > int fib_right(int i) { if (i < 2) return 1; > return fib_left(i-1) + fib_right(i-2); } > int fib(int i) { if (i < 2) return 1; > return fib_left(i-1) + fib_right(i-2); } > > int main() { > int i = 45; printf("fib(%d) == %d\n", i, fib(i)); } > > $clang -g -S -emit-llvm -o loopy.ll loopy.c > > $ operf -Vdebug -e CPU_CLK_UNHALTED:5000000 lli > -debug-only="oprofile-jit-event-listener" loopy.ll > Using samples dir /home/dad/oprofile_data/samples > Kernel profiling is not possible with current system config > Set /proc/sys/kernel/kptr_restrict to 0 to collect kernel samples. > Exec args are: lli -debug-only=oprofile-jit-event-listener loopy.ll > telling child to start app > app 47033 is running > parent says start app /usr/local/bin/lli > Forking read pid > going into waitpid on profiled app 47033 > Successfully read header info for sample data > Converting operf data to oprofile sample data format > sample type is 43 > operf: Profiler started > Connected to OProfile agent. <<< from the jit event listener code > Mapping 0x7f5ed6893014 to loopy.c:13 <<< adding debug line information in > the listener > Mapping 0x7f5ed689301f to loopy.c:13 > Mapping 0x7f5ed6893032 to loopy.c:13 > Mapping 0x7f5ed6893054 to loopy.c:14 > Mapping 0x7f5ed6893079 to loopy.c:9 > Mapping 0x7f5ed6893082 to loopy.c:9 > Mapping 0x7f5ed689308e to loopy.c:10 > Mapping 0x7f5ed68930a1 to loopy.c:10 > Mapping 0x7f5ed68930b8 to loopy.c:10 > Mapping 0x7f5ed68930d9 to loopy.c:5 > Mapping 0x7f5ed68930e2 to loopy.c:5 > Mapping 0x7f5ed68930ee to loopy.c:6 > Mapping 0x7f5ed6893101 to loopy.c:6 > Mapping 0x7f5ed6893118 to loopy.c:6 > Mapping 0x7f5ed6893139 to loopy.c:7 > Mapping 0x7f5ed6893142 to loopy.c:7 > Mapping 0x7f5ed689314e to loopy.c:8 > Mapping 0x7f5ed6893161 to loopy.c:8 > Mapping 0x7f5ed6893178 to loopy.c:8 > fib(45) == 1836311903 > profiled app ended normally. > operf recording finished. > Total bytes recorded from perf events: 226136 > operf-record process returned OK > operf_read: Total bytes received from operf_record process: 225952 > Calling _do_jitdump_convert <<<< oprofile converting jit sample > information > start time/end time is 1413559198/1413559208 > opjitconv: Ending with rc = 2. This code is usually OK, but can be useful > for debugging purposes. > JIT dump processing complete. > operf-read process returned OK > > Profiling done. > > $ opreport > Using /home/dad/oprofile_data/samples/ for samples directory. > CPU: Intel Haswell microarchitecture, speed 3498 MHz (estimated) > Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit > mask of 0x00 (No unit mask) count 5000000 > CPU_CLK_UNHALT...| > samples| %| > ------------------ > 6949 100.000 lli > CPU_CLK_UNHALT...| > samples| %| > ------------------ > 6903 99.3380 ld-2.19.so > 23 0.3310 no-vmlinux > 10 0.1439 libLLVMCore.so > 8 0.1151 libLLVMCodeGen.so > 2 0.0288 libpthread-2.19.so > 1 0.0144 libLLVMAsmPrinter.so > 1 0.0144 libLLVMSelectionDAG.so > 1 0.0144 libLLVMTransformUtils.so > > $ opreport -l `which lli` > Using /home/dad/oprofile_data/samples/ for samples directory. > CPU: Intel Haswell microarchitecture, speed 3498 MHz (estimated) > Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit > mask of 0x00 (No unit mask) count 5000000 > samples % image name symbol name > 38 57.5758 no-vmlinux /no-vmlinux > 2 3.0303 ld-2.19.so _dl_lookup_symbol_x > 2 3.0303 libLLVMCore.so > llvm::DenseMapBase<llvm::DenseMap<void const*, llvm::PassInfo const*, > llvm::DenseMapInfo<void const*> >, void const*, llvm::PassInfo const*, > llvm::DenseMapInfo<void const*> >::getHashValue(void const* const&) > 2 3.0303 libLLVMCore.so std::vector<llvm::PassInfo > const*, std::allocator<llvm::PassInfo const*> >::begin() const > 1 1.5152 ld-2.19.so _dl_check_map_versions > 1 1.5152 ld-2.19.so _dl_fixup > 1 1.5152 ld-2.19.so do_lookup_x > 1 1.5152 libLLVMAsmParser.so > llvm::LLLexer::ReadString(llvm::lltok::Kind) > 1 1.5152 libLLVMCodeGen.so > llvm::MachineFunction::getName() const > 1 1.5152 libLLVMCodeGen.so llvm::MachineOperand::isReg() > const > .... > > From this you can see that lli is calling the oprofile interface. Oprofile > is counting 6949 samples, but attributing 99% to ld-2.19.so. They are > actually the samples from the jit code execution. > > Asking opreport to show the samples from lli it only sees the jit > compilation samples (<1% of the total). Asking for the samples from > ld-2.19.so shows the same info - just the compilation process. > > Perhaps I'm missing something obvious so I wanted to ask the list if there > was anyone actively using oprofile with jit'd code so we can resolve the > problem. > > thanks! > > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141017/d641b03e/attachment.html>
Mi Maurice, A follow up to Andy's comments: MCJIT can find line numbers for ELF files, at least in limited circumstances. I know because I broke the regression test for it while doing some cleanup recently. ;) Looking at lib/ExecutionEngine/OProfileJIT/OProfileJITEventListener.cpp, I see the line: // TODO: support line number info (similar to IntelJITEventListener.cpp)>From a glance at IntelJITEventListener, I think it would be easy to copythe line number support in there over to OProfileJITEventListener. Cheers, Lang. On Fri, Oct 17, 2014 at 3:04 PM, Kaylor, Andrew <andrew.kaylor at intel.com> wrote:> The fact that you are getting line number information leads me to > believe that whether you intended it or not you are using the old JIT > (which is now gone), not MCJIT. The “Mapping 0x7f5ed6893032 to loopy.c:13” > lines come from a path called through NotifyFunctionEmitted and as far as I > know nothing in MCJIT calls that. > > > > At one time oprofile support worked completely with the old JIT. It > sounds like you might have a build where the old JIT was partially > dismantled. That may or may not be why things aren’t working for you. > > > > With MCJIT, I would expect that oprofile support would work, minus line > numbers. I’m not sure about that though, as I haven’t tried anything with > oprofile and MCJIT for a couple of years and the code has changed a lot > recently. > > > > I’m not sure this is relevant, but I seem to recall having had to run > oprofile for system level profiling. Perhaps the registration mechanism > isn’t passing through enough information to associate the addresses it is > registering with the process you are profiling. > > > > Incidentally, it should be possible to get line number support with > oprofile and MCJIT. It just isn’t implemented yet. > > > > -Andy > > > > *From:* llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] *On > Behalf Of *Maurice Marks > *Sent:* Friday, October 17, 2014 8:37 AM > *To:* llvmdev at cs.uiuc.edu > *Subject:* [LLVMdev] oprofile support? > > > > I've been trying to get oprofile results for jitted code without success. > I built an 3.5.0 llvm with oprofile enabled, and tested it with lli on a > small test case. I built the latest oprofile from the git repository. > > Debugging I can see that lli is registering the listener and making the > oprofile calls to the libopagent api to specify the names and address > ranges of jit'd routines, and even the line number information. And the > oprofile agent is writing infomation to /tmp/.oprofile/jitdump. But > although oprofile counts lots of samples but doesnt attribute them to the > jit'd code. When execution finishes oprofile runs opjitconv to report the > jit counts but doesn't see how to attribute them so drops them on the floor. > Working with one of the primary oprofile developers, Maynard Johnson, > we've got a couple of theories: > > 1) oprofile jit code support doesn't actually work with MCJIT yet. > Although its supposed to work there are no examples we could find in recent > web history. Does anyone have a working example? > > 2) Using a small test case the jit'd code is on a small region of heap > memory. Because it isnt marked as anonymous (as it would be if it came from > an mmap/large malloc request) oprofile ignores the samples. I would have > though it would attribute those samples to lli which owns its heap. Perhaps > that's a bug? > > My test case is this little C program converted to IR and run with lli: > > loopy.c: > > #include <stdio.h> > int fib_left(int); > int fib_right(int); > int fib_left(int i) { > if (i < 2) return 1; > return fib_left(i-1) + fib_right(i-2); } > int fib_right(int i) { if (i < 2) return 1; > return fib_left(i-1) + fib_right(i-2); } > int fib(int i) { if (i < 2) return 1; > return fib_left(i-1) + fib_right(i-2); } > > int main() { > int i = 45; printf("fib(%d) == %d\n", i, fib(i)); } > > $clang -g -S -emit-llvm -o loopy.ll loopy.c > > $ operf -Vdebug -e CPU_CLK_UNHALTED:5000000 lli > -debug-only="oprofile-jit-event-listener" loopy.ll > Using samples dir /home/dad/oprofile_data/samples > Kernel profiling is not possible with current system config > Set /proc/sys/kernel/kptr_restrict to 0 to collect kernel samples. > Exec args are: lli -debug-only=oprofile-jit-event-listener loopy.ll > telling child to start app > app 47033 is running > parent says start app /usr/local/bin/lli > Forking read pid > going into waitpid on profiled app 47033 > Successfully read header info for sample data > Converting operf data to oprofile sample data format > sample type is 43 > operf: Profiler started > Connected to OProfile agent. <<< from the jit event listener code > Mapping 0x7f5ed6893014 to loopy.c:13 <<< adding debug line information in > the listener > Mapping 0x7f5ed689301f to loopy.c:13 > Mapping 0x7f5ed6893032 to loopy.c:13 > Mapping 0x7f5ed6893054 to loopy.c:14 > Mapping 0x7f5ed6893079 to loopy.c:9 > Mapping 0x7f5ed6893082 to loopy.c:9 > Mapping 0x7f5ed689308e to loopy.c:10 > Mapping 0x7f5ed68930a1 to loopy.c:10 > Mapping 0x7f5ed68930b8 to loopy.c:10 > Mapping 0x7f5ed68930d9 to loopy.c:5 > Mapping 0x7f5ed68930e2 to loopy.c:5 > Mapping 0x7f5ed68930ee to loopy.c:6 > Mapping 0x7f5ed6893101 to loopy.c:6 > Mapping 0x7f5ed6893118 to loopy.c:6 > Mapping 0x7f5ed6893139 to loopy.c:7 > Mapping 0x7f5ed6893142 to loopy.c:7 > Mapping 0x7f5ed689314e to loopy.c:8 > Mapping 0x7f5ed6893161 to loopy.c:8 > Mapping 0x7f5ed6893178 to loopy.c:8 > fib(45) == 1836311903 > profiled app ended normally. > operf recording finished. > Total bytes recorded from perf events: 226136 > operf-record process returned OK > operf_read: Total bytes received from operf_record process: 225952 > Calling _do_jitdump_convert <<<< oprofile converting jit sample > information > start time/end time is 1413559198/1413559208 > opjitconv: Ending with rc = 2. This code is usually OK, but can be useful > for debugging purposes. > JIT dump processing complete. > operf-read process returned OK > > Profiling done. > > $ opreport > Using /home/dad/oprofile_data/samples/ for samples directory. > CPU: Intel Haswell microarchitecture, speed 3498 MHz (estimated) > Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit > mask of 0x00 (No unit mask) count 5000000 > CPU_CLK_UNHALT...| > samples| %| > ------------------ > 6949 100.000 lli > CPU_CLK_UNHALT...| > samples| %| > ------------------ > 6903 99.3380 ld-2.19.so > 23 0.3310 no-vmlinux > 10 0.1439 libLLVMCore.so > 8 0.1151 libLLVMCodeGen.so > 2 0.0288 libpthread-2.19.so > 1 0.0144 libLLVMAsmPrinter.so > 1 0.0144 libLLVMSelectionDAG.so > 1 0.0144 libLLVMTransformUtils.so > > $ opreport -l `which lli` > Using /home/dad/oprofile_data/samples/ for samples directory. > CPU: Intel Haswell microarchitecture, speed 3498 MHz (estimated) > Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit > mask of 0x00 (No unit mask) count 5000000 > samples % image name symbol name > 38 57.5758 no-vmlinux /no-vmlinux > 2 3.0303 ld-2.19.so _dl_lookup_symbol_x > 2 3.0303 libLLVMCore.so > llvm::DenseMapBase<llvm::DenseMap<void const*, llvm::PassInfo const*, > llvm::DenseMapInfo<void const*> >, void const*, llvm::PassInfo const*, > llvm::DenseMapInfo<void const*> >::getHashValue(void const* const&) > 2 3.0303 libLLVMCore.so std::vector<llvm::PassInfo > const*, std::allocator<llvm::PassInfo const*> >::begin() const > 1 1.5152 ld-2.19.so _dl_check_map_versions > 1 1.5152 ld-2.19.so _dl_fixup > 1 1.5152 ld-2.19.so do_lookup_x > 1 1.5152 libLLVMAsmParser.so > llvm::LLLexer::ReadString(llvm::lltok::Kind) > 1 1.5152 libLLVMCodeGen.so > llvm::MachineFunction::getName() const > 1 1.5152 libLLVMCodeGen.so llvm::MachineOperand::isReg() > const > .... > > From this you can see that lli is calling the oprofile interface. Oprofile > is counting 6949 samples, but attributing 99% to ld-2.19.so. They are > actually the samples from the jit code execution. > > Asking opreport to show the samples from lli it only sees the jit > compilation samples (<1% of the total). Asking for the samples from > ld-2.19.so shows the same info - just the compilation process. > > Perhaps I'm missing something obvious so I wanted to ask the list if there > was anyone actively using oprofile with jit'd code so we can resolve the > problem. > > thanks! > > > > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141128/88a20135/attachment.html>