comic fans via llvm-dev
2017-Nov-16 13:44 UTC
[llvm-dev] question about xray tls data initialization
I'm learning the XRay library and trying to see whether it can be built on Windows. In xray_fdr_logging_impl.h, the comment at line 152 says:

    // Using pthread_once(...) to initialize the thread-local data structures

but the code at lines 175 and 183 reads:

    thread_local pthread_key_t key;

    // Ensure that we only actually ever do the pthread initialization once.
    thread_local bool UNUSED Unused = [] {
      new (&TLSBuffer) ThreadLocalData();
      auto result = pthread_key_create(&key, +[](void *) {
        auto &TLD = *reinterpret_cast<ThreadLocalData *>(&TLSBuffer);

I'm confused: key and Unused are both thread_local variables. Doesn't that mean the lambda will run once in every thread and create a separate pthread_key_t for each thread's TLS data, instead of a single pthread_key_t shared by all threads? Also, what does the '+' before the lambda expression mean? These may be stupid questions, but could somebody kindly help?
Dean Michael Berris via llvm-dev
2017-Nov-21 11:46 UTC
[llvm-dev] question about xray tls data initialization
> On 17 Nov 2017, at 00:44, comic fans via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> I'm learning the xray library and try if it can be built on windows, in
> xray_fdr_logging_impl.h
>
> line 152 , comment written as
> // Using pthread_once(...) to initialize the thread-local data structures
>
> but at line 175, 183, code written as
>
> thread_local pthread_key_t key;
>
> // Ensure that we only actually ever do the pthread initialization once.
> thread_local bool UNUSED Unused = [] {
> new (&TLSBuffer) ThreadLocalData();
> auto result = pthread_key_create(&key, +[](void *) {
> auto &TLD = *reinterpret_cast<ThreadLocalData *>(&TLSBuffer);
>
> I'm confused that pthread_key_t and Unused are both thread_local
> variable, doesn't it mean the following lambda will run for each
> thread , and create one pthread_key_t for only one tls data(instead of
> only one pthread_key_t for all thread) ? also what does the '+' before
> lambda expression mean ? this may be stupid questions, could somebody
> kindly helped ?

Yeah, that comment is out-of-date (and the implementation is buggy) -- which is a shame really. :/

But the good news is that I think we've fixed this now at top-of-trunk with https://reviews.llvm.org/D39526 and https://reviews.llvm.org/D40164.

Out of curiosity, though, how far did your exploration into getting XRay to build on Windows go?

Cheers

-- Dean
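The two language points raised in the question can be shown with a small sketch. This is not the XRay code, and every name in it (InitCount, touchTls, makeDestructor, countInitsAcrossThreads) is made up for illustration. It shows that a thread_local variable with a lambda initializer is initialized once in each thread that reaches it, and that unary '+' applied to a captureless lambda converts it to a plain function pointer, which is the kind of argument pthread_key_create takes for its destructor.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <type_traits>

// Counts how many times the thread_local initializer below has run.
static std::atomic<int> InitCount{0};

// A thread_local initialized by an immediately-invoked lambda, in the same
// style as `Unused` in the quoted snippet. The initializer runs the first
// time each thread passes through this declaration.
int touchTls() {
  thread_local bool Initialized = [] {
    ++InitCount;
    return true;
  }();
  return Initialized ? 1 : 0;
}

// Unary + forces the captureless lambda to convert to a function pointer.
using Destructor = void (*)(void *);
inline Destructor makeDestructor() {
  auto Fn = +[](void *P) { *static_cast<int *>(P) = 0; };
  static_assert(std::is_same<decltype(Fn), void (*)(void *)>::value,
                "+lambda yields a plain function pointer");
  return Fn;
}

// Calls touchTls() twice on each of N fresh threads and returns how many
// times the initializer ran in total.
int countInitsAcrossThreads(int N) {
  InitCount = 0;
  for (int I = 0; I < N; ++I) {
    std::thread T([] {
      touchTls();
      touchTls(); // second call in the same thread does not re-initialize
    });
    T.join();
  }
  return InitCount.load();
}
```

With three threads the initializer runs three times, so an initializer that calls pthread_key_create would create one key per thread, which matches the concern in the question.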
comic fans via llvm-dev
2017-Nov-21 15:32 UTC
[llvm-dev] question about xray tls data initialization
With some dirty hacks I've got the XRay runtime to build on Windows, but unfortunately I don't know enough about the linker and the runtime, and the executable I finally built didn't run. I'd like to share my changes here in the hope that somebody can help me make it run on Windows.

In AsmPrinter, I copied and pasted the XRay section setup for the COFF target:

    InstMap = OutContext.getCOFFSection("xray_instr_map", 0, SectionKind::getReadOnlyWithRel());
    FnSledIndex = OutContext.getCOFFSection("xray_fn_idx", 0, SectionKind::getReadOnlyWithRel());

In XRayArgs, I allowed the Windows platform to use the XRay arguments. With this, the generated code seems to have the sleds and the XRay sections.

In the XRay runtime:

- bool atomic_compare_exchange_strong(volatile atomic_sint32_t *a, s32 *cmp, s32 xchg, memory_order mo) is missing for MSVC; I reused the atomic_uint32_t implementation.
- MSVC 14.1 treats BufferQueue::Buffer::Buffer as a constructor instead of a data member, so I renamed it (Buf.Buffer => Buf.Data).
- FunctionRecord packing: __attribute__((packed)) => #pragma pack(push,1). MSVC also requires bitfields to have the same type in order to pack them together (all types => uint32_t).
- FD: int => HANDLE. Most of the code logic is still valid (-1 as the invalid value); the read/write APIs are replaced with the Windows ones.
- mprotect => VirtualProtect.
- readTSC in xray_x86_64.inc also works on Windows.
- The code that reads TSC data from proc is replaced with QueryPerformanceFrequency.
- MSVC cannot compile code like the following; the function pointer type must be given a typedef first:

      void setupNewBuffer(int (*wall_clock_reader)(clockid_t, struct timespec *));

- XRay uses clock_gettime as the default implementation, which is not friendly to Windows. I created a fake one based on chrono system_clock (ignoring clockid_t).
- For the TLS destructor part, I've just commented it out (but https://www.codeproject.com/Articles/8113/Thread-Local-Storage-The-C-Way describes a thread-exit callback approach for COFF).

The last thing, which I don't understand, is the weak symbols for

    __start_xray_instr_map[]
    __stop_xray_instr_map[]
    __start_xray_fn_idx[]
    __stop_xray_fn_idx[]

I replaced them with __declspec(selectany), but I'm not sure they have the same meaning.
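Two of the runtime changes above, naming the function pointer type before using it as a parameter and faking clock_gettime on top of chrono system_clock, might look roughly like the sketch below. It is only an illustration under stated assumptions: FakeClockId, FakeTimespec, WallClockReader, fakeClockGettime and readWallClock are hypothetical names standing in for clockid_t, struct timespec and the real XRay declarations, so that the snippet compiles on any platform.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical stand-ins for clockid_t and struct timespec.
using FakeClockId = int;
struct FakeTimespec {
  std::int64_t tv_sec; // whole seconds since the system_clock epoch
  long tv_nsec;        // remaining nanoseconds, in [0, 1e9)
};

// Name the function pointer type first, then use the alias in signatures.
using WallClockReader = int (*)(FakeClockId, FakeTimespec *);

// Fake clock_gettime built on std::chrono::system_clock. The clock id is
// ignored, as described in the message. Returns 0 on success, -1 on a null
// output pointer.
int fakeClockGettime(FakeClockId /*ignored*/, FakeTimespec *TS) {
  if (TS == nullptr)
    return -1;
  using namespace std::chrono;
  const auto SinceEpoch = system_clock::now().time_since_epoch();
  const auto Secs = duration_cast<seconds>(SinceEpoch);
  const auto Nanos = duration_cast<nanoseconds>(SinceEpoch - Secs);
  TS->tv_sec = static_cast<std::int64_t>(Secs.count());
  TS->tv_nsec = static_cast<long>(Nanos.count());
  return 0;
}

// A function that takes the reader through the alias instead of spelling
// the pointer type inline in the parameter list.
int readWallClock(WallClockReader Reader, FakeTimespec *TS) {
  return Reader(0, TS);
}
```

A caller would pass the fake in place of clock_gettime, for example readWallClock(fakeClockGettime, &TS). Note that system_clock is a wall clock and can jump, so this only approximates a realtime clock id.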
Some randomly generated code:

        .text
        .intel_syntax noprefix
        .def     call;
        .scl    2;
        .type   32;
        .endef
        .globl  call                    # -- Begin function call
        .p2align 4, 0x90
    call:                               # @call
    .seh_proc call
    # BB#0:                             # %entry
        .p2align 1, 0x90
    .Lxray_sled_0:
        .ascii  "\353\t"
        nop     word ptr [rax + rax + 512]
        sub     rsp, 16
        .seh_stackalloc 16
        .seh_endprologue
        mov     dword ptr [rsp + 12], ecx
        mov     dword ptr [rsp + 8], 0
        mov     dword ptr [rsp + 4], 0
    .LBB0_1:                            # %for.cond
                                        # =>This Inner Loop Header: Depth=1
        mov     eax, dword ptr [rsp + 4]
        cmp     eax, dword ptr [rsp + 12]
        jge     .LBB0_4
    # BB#2:                             # %for.body
                                        # in Loop: Header=BB0_1 Depth=1
        mov     eax, dword ptr [rsp + 4]
        add     eax, dword ptr [rsp + 8]
        mov     dword ptr [rsp + 8], eax
    # BB#3:                             # %for.inc
                                        # in Loop: Header=BB0_1 Depth=1
        mov     eax, dword ptr [rsp + 4]
        add     eax, 1
        mov     dword ptr [rsp + 4], eax
        jmp     .LBB0_1
    .LBB0_4:                            # %for.end
        mov     eax, dword ptr [rsp + 8]
        add     rsp, 16
        .p2align 1, 0x90
    .Lxray_sled_1:
        ret
        nop     word ptr cs:[rax + rax + 512]
        .seh_handlerdata
        .text
        .seh_endproc                    # -- End function
        .section xray_instr_map,"y"
    .Lxray_sleds_start0:
        .quad   .Lxray_sled_0
        .quad   call
        .byte   0x00
        .byte   0x00
        .byte   0x00
        .zero   13
        .quad   .Lxray_sled_1
        .quad   call
        .byte   0x01
        .byte   0x00
        .byte   0x00
        .zero   13
    .Lxray_sleds_end0:
        .section xray_fn_idx,"y"
        .p2align 4, 0x90
        .quad   .Lxray_sleds_start0
        .quad   .Lxray_sleds_end0
        .text

And parts of the object dump:

    SECTION HEADER #5
         /16 name (xray_instr_map)
           0 physical address
           0 virtual address
          40 size of raw data
         198 file pointer to raw data (00000198 to 000001D7)
         1D8 file pointer to relocation table
           0 file pointer to line numbers
           4 number of relocations
           0 number of line numbers
      100000 flags
             1 byte align

    RAW DATA #5
      00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      00000020: 56 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  V...............
      00000030: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    RELOCATIONS #5
                                                    Symbol    Symbol
     Offset    Type              Applied To         Index     Name
     --------  ----------------  -----------------  --------  ------
     00000000  ADDR64            00000000 00000000         0  .text
     00000008  ADDR64            00000000 00000000         E  call
     00000020  ADDR64            00000000 00000056         0  .text
     00000028  ADDR64            00000000 00000000         E  call

    SECTION HEADER #6
          /4 name (xray_fn_idx)
           0 physical address
           0 virtual address
          10 size of raw data
         200 file pointer to raw data (00000200 to 0000020F)
         210 file pointer to relocation table
           0 file pointer to line numbers
           2 number of relocations
           0 number of line numbers
      500000 flags
             16 byte align

    RAW DATA #6
      00000000: 00 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00  ........@.......

    RELOCATIONS #6
                                                    Symbol    Symbol
     Offset    Type              Applied To         Index     Name
     --------  ----------------  -----------------  --------  ------
     00000000  ADDR64            00000000 00000000         8  xray_instr_map
     00000008  ADDR64            00000000 00000040         8  xray_instr_map
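As a way of reading the xray_instr_map data in this message, here is a rough sketch that mirrors the record emitted in the assembly (.quad, .quad, three .byte values, .zero 13, so 32 bytes per sled) and decodes the 0x40 bytes of RAW DATA #5. The struct and field names (SledEntry, Kind, AlwaysInstrument, Version) and the helper readSled are assumptions chosen for illustration, not taken from the thread, and the decoding assumes a little-endian host. The values are the pre-relocation contents of the object file, so the addresses are only relocation addends.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// One record of xray_instr_map as emitted in the listing (names assumed).
struct SledEntry {
  std::uint64_t Address;         // .quad .Lxray_sled_N
  std::uint64_t Function;        // .quad call
  std::uint8_t Kind;             // first .byte (0x00 at entry, 0x01 at ret)
  std::uint8_t AlwaysInstrument; // second .byte
  std::uint8_t Version;          // third .byte
  std::uint8_t Padding[13];      // .zero 13
};
static_assert(sizeof(SledEntry) == 32, "8 + 8 + 3 + 13 bytes per sled");

// RAW DATA #5 from the object dump, before relocations are applied.
static const unsigned char RawInstrMap[0x40] = {
    0x00, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, // 0x00
    0x00, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, // 0x10
    0x56, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, // 0x20
    0x01, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, // 0x30
};

// Copies the Index-th record out of a raw byte buffer (little-endian host).
SledEntry readSled(const unsigned char *Data, std::size_t Index) {
  SledEntry E;
  std::memcpy(&E, Data + Index * sizeof(SledEntry), sizeof(SledEntry));
  return E;
}
```

Decoded this way, the section holds two records: the first has addend 0 and kind 0x00, the second has addend 0x56 and kind 0x01, which lines up with the ADDR64 relocations at offsets 0x00 and 0x20 against .text.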