Dean Michael Berris via llvm-dev
2016-Aug-04 07:57 UTC
[llvm-dev] XRay: Demo on x86_64/Linux almost done; some questions.
> On 4 Aug 2016, at 06:27, Serge Rogatch <serge.rogatch at gmail.com> wrote: > > Hi Dean, > > I have a question about the following piece of code in compiler-rt/trunk/lib/xray/xray_trampoline_x86.S : > movq _ZN6__xray19XRayPatchedFunctionE(%rip), %rax > testq %rax, %rax > je .Ltmp0 > > // assume that %r10d has the function id. > movl %r10d, %edi > xor %esi,%esi > callq *%rax > What happens if someone unsets the handler function (i.e. calls __xray_remove_handler() ) or changes the handler (i.e. calls __xray_set_handler() with a different pointer to function) between "movq _ZN6__xray19XRayPatchedFunctionE(%rip), %rax" and "callq *%rax" ? I understood that the old handler will still be called, but isn't it a problem? What if the code removing the handler starts destructing some objects after that, so that a handler call would result in a crash or other bad things?That would be unfortunate. :) Yes, you're right, it will be called despite having the handler be un-set from the global atomic. Though I'd suggest that log handler implementations really shouldn't be crashing, and should assume that it will be called at any given time (from multiple threads, so it should also be thread-safe). Consider also that we're taking a function pointer by design -- assuming that it will be a pointer to a function that will only deal with globals and the explicitly passed in arguments. An implementer of the log handler should really be responsible for "internal signalling" -- i.e. implementing it so that any required synchronisation happens internally. Does this help? PS. I think more definitive documentation for this is due, let me think about putting something some documentation on how to use XRay as it is currently implemented once we have the naive in-memory logging implementation land (https://reviews.llvm.org/D21982). Cheers -- Dean
Serge Rogatch via llvm-dev
2016-Aug-04 16:29 UTC
[llvm-dev] XRay: Demo on x86_64/Linux almost done; some questions.
Yes, double-checking on the handler side whether its resources are still operational, e.g. by calling weak_ptr::lock() may be a better idea than elimination of spurious handler calls in XRay at a cost of heavy synchronization objects. Ok, let's live with spurious handler calls. Cheers, Serge On 4 August 2016 at 10:57, Dean Michael Berris <dean.berris at gmail.com> wrote:> > > On 4 Aug 2016, at 06:27, Serge Rogatch <serge.rogatch at gmail.com> wrote: > > > > Hi Dean, > > > > I have a question about the following piece of code in > compiler-rt/trunk/lib/xray/xray_trampoline_x86.S : > > movq _ZN6__xray19XRayPatchedFunctionE(%rip), %rax > > testq %rax, %rax > > je .Ltmp0 > > > > // assume that %r10d has the function id. > > movl %r10d, %edi > > xor %esi,%esi > > callq *%rax > > What happens if someone unsets the handler function (i.e. calls > __xray_remove_handler() ) or changes the handler (i.e. calls > __xray_set_handler() with a different pointer to function) between "movq > _ZN6__xray19XRayPatchedFunctionE(%rip), %rax" and "callq *%rax" ? I > understood that the old handler will still be called, but isn't it a > problem? What if the code removing the handler starts destructing some > objects after that, so that a handler call would result in a crash or other > bad things? > > That would be unfortunate. :) > > Yes, you're right, it will be called despite having the handler be un-set > from the global atomic. Though I'd suggest that log handler implementations > really shouldn't be crashing, and should assume that it will be called at > any given time (from multiple threads, so it should also be thread-safe). > Consider also that we're taking a function pointer by design -- assuming > that it will be a pointer to a function that will only deal with globals > and the explicitly passed in arguments. > > An implementer of the log handler should really be responsible for > "internal signalling" -- i.e. implementing it so that any required > synchronisation happens internally. > > Does this help? > > PS. I think more definitive documentation for this is due, let me think > about putting something some documentation on how to use XRay as it is > currently implemented once we have the naive in-memory logging > implementation land (https://reviews.llvm.org/D21982). > > Cheers > > -- Dean > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160804/657ad566/attachment.html>
Serge Rogatch via llvm-dev
2016-Aug-05 18:06 UTC
[llvm-dev] XRay: Demo on x86_64/Linux almost done; some questions.
Hi Dean, I have a question for 32-bit platforms. I see in the code that you used the following in compiler-rt/trunk/lib/xray/xray_interface_internal.h : struct XRaySledEntry { uint64_t Address; uint64_t Function; unsigned char Kind; unsigned char AlwaysInstrument; unsigned char Padding[14]; // Need 32 bytes }; And the peer code in llvm/trunk/lib/Target/X86/X86MCInstLower.cpp : void X86AsmPrinter::EmitXRayTable() { if (Sleds.empty()) return; if (Subtarget->isTargetELF()) { auto *Section = OutContext.getELFSection( "xray_instr_map", ELF::SHT_PROGBITS, ELF::SHF_ALLOC | ELF::SHF_GROUP | ELF::SHF_MERGE, 0, CurrentFnSym->getName()); auto PrevSection = OutStreamer->getCurrentSectionOnly(); OutStreamer->SwitchSection(Section); for (const auto &Sled : Sleds) { OutStreamer->EmitSymbolValue(Sled.Sled, 8); OutStreamer->EmitSymbolValue(CurrentFnSym, 8); auto Kind = static_cast<uint8_t>(Sled.Kind); OutStreamer->EmitBytes( StringRef(reinterpret_cast<const char *>(&Kind), 1)); OutStreamer->EmitBytes( StringRef(reinterpret_cast<const char *>(&Sled.AlwaysInstrument), 1)); OutStreamer->EmitZeros(14); } OutStreamer->SwitchSection(PrevSection); } Sleds.clear(); } So useful part of your entry is 18 bytes, and you add 14 bytes of padding to achieve 32-byte alignment. But for 32-bit CPUs I understood that XRaySledEntry::Address and XRaySledEntry::Function can be 32-bit too, right? So the entry can fit 16 bytes, with 10 useful bytes and 6 bytes of padding. Can I use 16-byte entries or is there some external (OS? ELF? Linker?) requirement that one entry must be 32-bytes, or aligned at 32 bytes, etc.? Thanks, Serge On 4 August 2016 at 19:29, Serge Rogatch <serge.rogatch at gmail.com> wrote:> Yes, double-checking on the handler side whether its resources are still > operational, e.g. by calling weak_ptr::lock() may be a better idea than > elimination of spurious handler calls in XRay at a cost of heavy > synchronization objects. Ok, let's live with spurious handler calls. > > Cheers, > Serge > > On 4 August 2016 at 10:57, Dean Michael Berris <dean.berris at gmail.com> > wrote: > >> >> > On 4 Aug 2016, at 06:27, Serge Rogatch <serge.rogatch at gmail.com> wrote: >> > >> > Hi Dean, >> > >> > I have a question about the following piece of code in >> compiler-rt/trunk/lib/xray/xray_trampoline_x86.S : >> > movq _ZN6__xray19XRayPatchedFunctionE(%rip), %rax >> > testq %rax, %rax >> > je .Ltmp0 >> > >> > // assume that %r10d has the function id. >> > movl %r10d, %edi >> > xor %esi,%esi >> > callq *%rax >> > What happens if someone unsets the handler function (i.e. calls >> __xray_remove_handler() ) or changes the handler (i.e. calls >> __xray_set_handler() with a different pointer to function) between "movq >> _ZN6__xray19XRayPatchedFunctionE(%rip), %rax" and "callq *%rax" ? I >> understood that the old handler will still be called, but isn't it a >> problem? What if the code removing the handler starts destructing some >> objects after that, so that a handler call would result in a crash or other >> bad things? >> >> That would be unfortunate. :) >> >> Yes, you're right, it will be called despite having the handler be un-set >> from the global atomic. Though I'd suggest that log handler implementations >> really shouldn't be crashing, and should assume that it will be called at >> any given time (from multiple threads, so it should also be thread-safe). >> Consider also that we're taking a function pointer by design -- assuming >> that it will be a pointer to a function that will only deal with globals >> and the explicitly passed in arguments. >> >> An implementer of the log handler should really be responsible for >> "internal signalling" -- i.e. implementing it so that any required >> synchronisation happens internally. >> >> Does this help? >> >> PS. I think more definitive documentation for this is due, let me think >> about putting something some documentation on how to use XRay as it is >> currently implemented once we have the naive in-memory logging >> implementation land (https://reviews.llvm.org/D21982). >> >> Cheers >> >> -- Dean >> >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160805/bc3f9f0b/attachment.html>