Sean Silva via llvm-dev
2017-Jun-06 00:18 UTC
[llvm-dev] [newbie] trouble with global variables and CreateLoad/Store in JIT
On Mon, Jun 5, 2017 at 1:34 PM, Nikodemus Siivola <nikodemus at random-state.net> wrote:

> Uh. Turns out that if I hide the pointer to @foo from LLVM by passing it
> through an opaque identity function ... then everything works fine.
>
> Is this a bug in LLVM or is there some magic involving globals I'm
> misunderstanding?

This looks like a bug in the handling of constant GEP's. Specifically the
`getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)`
used to calculate the address of the integer inside the struct. Your
observation "The bizarre thing is that even this looks correct: the
debugInt is called first with @foo, then @foo+4, and the stores seem to be
going to the right addresses as well: @foo and @foo+4!" at the level of the
MachineInstr dump rules out problems before that.

After MachineInstr comes MC to emit the object file, but `foo+4` is one of
the most basic relocation types, so I doubt that there's a bug in the
lowering there or else "everything" would be broken.
Just to verify though, checking assembly of a small example across 32-bit
targets of all 3 object file formats looks fine at a glance (MC is getting
the +4 addend, though you would need to run `llvm-objdump -d -r` to see the
actual relocation in the binary).
https://godbolt.org/g/0Owzf5
https://godbolt.org/g/n0qzmg
https://godbolt.org/g/kAOvkQ

Beyond MC, you already have your static object file. If that is fine, then
in a JIT context you might be running into issues with RuntimeDyld. The
actual GEP's that clang generates are identical to the ones in your code,
further suggesting that this is JIT specific and that static links are
unaffected (if you could verify that, it would help to narrow down the
possibilities).
Maybe look at the output of `llvm-objdump -d -r` on a static .o file
generated from your IR and see where the relocation is handled in
lib/ExecutionEngine/RuntimeDyld (this will depend on your platform;
grepping for the name of the relocation shown by llvm-objdump should find
the right code to look at).

By the way, what platform are you JIT'ing on? I noticed that it is a
32-bit target, and I suspect that the 32-bit support in the JIT
infrastructure isn't as well tested / commonly used as the 64-bit code,
possibly explaining why this sort of bug could sneak through.

-- Sean Silva

>
> define { i8*, i32 } @"__anonToplevel/0"() prefix { i8*, i32 } (i32)* @"XEP:__anonToplevel/0" {
> entry:
>   %0 = call { i8*, i32 }* @identity({ i8*, i32 }* nonnull @foo)
>   %1 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
>   %2 = getelementptr { i8*, i32 }, { i8*, i32 }* %0, i32 0, i32 0
>   %3 = ptrtoint { i8*, i32 }* %0 to i32
>   %4 = call { i8*, i32 } @debugInt(i32 %3)
>   store i8* @FixnumClass, i8** %2, align 4
>   %5 = getelementptr { i8*, i32 }, { i8*, i32 }* %0, i32 0, i32 1
>   %6 = ptrtoint i32* %5 to i32
>   %7 = call { i8*, i32 } @debugInt(i32 %6)
>   store i32 123, i32* %5, align 4
>   %8 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
>   store i8* @FixnumClass, i8** %2, align 4
>   store i32 123, i32* %5, align 4
>   %9 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
>   call void @setGlobal({ i8*, i32 }* %0, { i8*, i32 } { i8* @FixnumClass, i32 123 })
>   %10 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
>   ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
> }
>
> Output, now with correct addresses out of the GEPs, and memory being
> modified as expected:
>
> p = 02F80000
> class: 00000000
> datum: 00000000
> x = 02F80000
> x = 02F80004
> p = 02F80000
> class: 028D3E98
> datum: 0000007B
> p = 02F80000
> class: 028D3E98
> datum: 0000007B
> p = 02F80000
> class: 028D3E98
> datum: 0000007B
>
> Cheers,
>
> -- nikodemus
>
>
> On Mon, Jun 5, 2017 at 10:57 PM, Nikodemus Siivola <nikodemus at random-state.net> wrote:
>
>> Since the getelementptrs were implicitly generated by the
>> CreateStore/Load I'm not sure how to get access to them.
>>
>> So I hacked the assignment to be done thrice: once using a manual
>> decomposition into two GEPs and stores, once using the "big" CreateStore,
>> once via the setGlobal function, printing addresses and memory contents at
>> each point to the degree that I have access to them.
>>
>> It seems the following GEPs compute the same address?! I can buy myself
>> not understanding how GEP works and doing it wrong, but
>> builder.CreateStore() creates what look like identical GEPs implicitly...
>>
>> i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
>> i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
>>
>> The details.
>>
>> This is the relevant part from my codegen:
>>
>> auto ty = val->getType();
>> cout << "val type:" << endl;
>> ty->dump();
>> cout << "ptr type:" << endl;
>> ptr->getType()->dump();
>> // Print memory
>> ctx.EmitCall1("debugPointer", ptr);
>> // Set class pointer
>> auto c = ctx.bld.CreateExtractValue(val, 0, "class");
>> auto cp = ctx.bld.CreateConstGEP2_32(ty, ptr, 0, 0);
>> auto cx = ctx.bld.CreatePtrToInt(cp, ctx.Int32Type());
>> ctx.EmitCall1("debugInt", cx);
>> ctx.bld.CreateStore(c, cp);
>> // Set datum
>> auto d = ctx.bld.CreateExtractValue(val, 1, "datum");
>> auto dp = ctx.bld.CreateConstGEP2_32(ty, ptr, 0, 1);
>> auto dx = ctx.bld.CreatePtrToInt(dp, ctx.Int32Type());
>> ctx.EmitCall1("debugInt", dx);
>> ctx.bld.CreateStore(d, dp);
>> // Print memory
>> ctx.EmitCall1("debugPointer", ptr);
>> // Do the same with a single store
>> ctx.bld.CreateStore(val, ptr);
>> // Print memory
>> ctx.EmitCall1("debugPointer", ptr);
>> // Call out
>> ctx.EmitCall2("setGlobal", ptr, val);
>> // Print memory
>> ctx.EmitCall1("debugPointer", ptr);
>>
>> Here is the compile-time output showing types of the value and the
>> pointer:
>>
>> val type:
>> { i8*, i32 }
>> ptr type:
>> { i8*, i32 }*
>>
>> Here is the IR dump for the function (after a couple of passes), right
>> before it's fed to the JIT:
>>
>> define { i8*, i32 } @"__anonToplevel/0"() prefix { i8*, i32 } (i32)* @"XEP:__anonToplevel/0" {
>> entry:
>>   %0 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
>>   %1 = call { i8*, i32 } @debugInt(i32 ptrtoint ({ i8*, i32 }* @foo to i32))
>>   store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
>>   %2 = call { i8*, i32 } @debugInt(i32 ptrtoint (i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1) to i32))
>>   store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
>>   %3 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
>>   store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
>>   store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
>>   %4 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
>>   call void @setGlobal({ i8*, i32 }* nonnull @foo, { i8*, i32 } { i8* @FixnumClass, i32 123 })
>>   %5 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
>>   ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
>> }
>>
>> Here is the runtime from calling the JITed function, including memory
>> addresses and contents, with my annotations:
>>
>> # Before
>> p = 03C10000
>> class: 00000000
>> datum: 00000000
>> # Should be address of the class slot --> correct
>> x = 03C10000
>> # Should be address of the datum slot, ie address of class slot + 4 --> incorrect
>> x = 03C10000
>> # Yeah, both values want to class slot, so actual class pointer got clobbered
>> p = 03C10000
>> class: 0000007B
>> datum: 00000000
>> # Same result from the single CreateStore
>> p = 03C10000
>> class: 0000007B
>> datum: 00000000
>> # Calling out to setGlobal as in my first email works
>> p = 03C10000
>> class: 039D2E98
>> datum: 0000007B
>>
>> Finally, I didn't manage nice disassembly yet, so here is the last output
>> from --print-after-all for the function. The bizarre thing is that even
>> this looks correct: the debugInt is called first with @foo, then @foo+4,
>> and the stores seem to be going to the right addresses as well: @foo and
>> @foo+4!
>>
>> BB#0: derived from LLVM BB %entry
>>   PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>   CFI_INSTRUCTION <call frame instruction>
>>   CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>   %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
>>   CFI_INSTRUCTION <call frame instruction>
>>   PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>   CFI_INSTRUCTION <call frame instruction>
>>   CALLpcrel32 <ga:@debugInt>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>   %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
>>   CFI_INSTRUCTION <call frame instruction>
>>   MOV32mi %noreg, 1, %noreg, <ga:@foo>, %noreg, <ga:@JazzFixnumClass>; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0)]
>>   PUSHi32 <ga:@foo+4>, %ESP<imp-def>, %ESP<imp-use>
>>   CFI_INSTRUCTION <call frame instruction>
>>   CALLpcrel32 <ga:@debugInt>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>   %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
>>   CFI_INSTRUCTION <call frame instruction>
>>   MOV32mi %noreg, 1, %noreg, <ga:@foo+4>, %noreg, 123; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)]
>>   PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>   CFI_INSTRUCTION <call frame instruction>
>>   CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>   %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
>>   CFI_INSTRUCTION <call frame instruction>
>>   MOV32mi %noreg, 1, %noreg, <ga:@foo>, %noreg, <ga:@JazzFixnumClass>; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0)]
>>   MOV32mi %noreg, 1, %noreg, <ga:@foo+4>, %noreg, 123; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)]
>>   PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>   CFI_INSTRUCTION <call frame instruction>
>>   CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>   %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
>>   CFI_INSTRUCTION <call frame instruction>
>>   PUSH32i8 123, %ESP<imp-def>, %ESP<imp-use>
>>   CFI_INSTRUCTION <call frame instruction>
>>   PUSHi32 <ga:@JazzFixnumClass>, %ESP<imp-def>, %ESP<imp-use>
>>   CFI_INSTRUCTION <call frame instruction>
>>   PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>   CFI_INSTRUCTION <call frame instruction>
>>   CALLpcrel32 <ga:@setGlobal>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>
>>   %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 12, %EFLAGS<imp-def,dead>
>>   CFI_INSTRUCTION <call frame instruction>
>>   PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>   CFI_INSTRUCTION <call frame instruction>
>>   CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>   %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
>>   CFI_INSTRUCTION <call frame instruction>
>>   %EAX<def> = MOV32ri <ga:@JazzFixnumClass>
>>   %EDX<def> = MOV32ri 123
>>   RETL %EAX<kill>, %EDX<kill>
>>
>> Also, I have essentially identical code working perfectly fine when the
>> memory being written to is from @alloca.
>>
>> I am completely clueless. Any suggestions most welcome.
>>
>> Cheers,
>>
>> -- nikodemus
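
For reference, the address that the constant GEP `getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)` is expected to produce on a 32-bit target can be worked out by hand. The following is only a sketch of that arithmetic in plain C++, with no LLVM dependency; `Boxed32` and `gepFieldAddress` are hypothetical names introduced here, and the layout assumes 4-byte pointers with no padding between the two fields.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Models { i8*, i32 } as laid out on a 32-bit target: a 4-byte pointer
// followed by a 4-byte integer. std::uint32_t stands in for the pointer so
// the layout is the same when this is compiled on a 64-bit host.
struct Boxed32 {
  std::uint32_t class_ptr;  // field 0: i8* (4 bytes on i386)
  std::uint32_t datum;      // field 1: i32
};

// Address computed by `getelementptr T, T* base, i32 index, i32 field`
// for T = { i8*, i32 }: step over `index` whole structs, then add the
// byte offset of the selected field. (Hypothetical helper, not LLVM API.)
std::uint32_t gepFieldAddress(std::uint32_t base, std::uint32_t index,
                              unsigned field) {
  assert(field < 2 && "the struct has only two fields");
  const std::size_t fieldOffset =
      field == 0 ? offsetof(Boxed32, class_ptr) : offsetof(Boxed32, datum);
  return static_cast<std::uint32_t>(base + index * sizeof(Boxed32) +
                                    fieldOffset);
}
```

With the base address from the failing run, `gepFieldAddress(0x03C10000, 0, 0)` gives `0x03C10000` and `gepFieldAddress(0x03C10000, 0, 1)` gives `0x03C10004`. The second value is what the second `x =` line should have printed, and it is what the run through the opaque identity function does print (`02F80004` for base `02F80000`).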
Nikodemus Siivola via llvm-dev
2017-Jun-06 08:09 UTC
[llvm-dev] [newbie] trouble with global variables and CreateLoad/Store in JIT
This is on Windows 10: didn't yet manage to get a 64-bit toolchain set up
that agreed on everything necessary.

Dumped bitcode, but when I did that everything landed in the same module
(normally the global is defined in a different module than its uses) -->
the relocations are different... different enough that when I loaded the
bitcode back in and handed the single module to JIT it worked fine.

I'll try to dump a case where the definition is in a different module
tomorrow.

Anyhow, below is what clang-cl turned the bitcode from my IR into --
probably not very useful though as this code does what it should...
$ llvm-objdump.exe -r -d test.o
test.o: file format COFF-i386
Disassembly of section .text:
.text:
0: 00 00 addb %al, (%eax)
00000000: IMAGE_REL_I386_DIR32 _XEP:setfoo
2: 00 00 addb %al, (%eax)
_setfoo:
4: 56 pushl %esi
5: 83 ec 40 subl $64, %esp
8: 89 e0 movl %esp, %eax
a: c7 00 00 00 00 00 movl $0, (%eax)
0000000c: IMAGE_REL_I386_DIR32 _foo
10: e8 00 00 00 00 calll 0 <_setfoo+0x11>
00000011: IMAGE_REL_I386_REL32 _debugPointer
15: 89 e1 movl %esp, %ecx
17: c7 01 00 00 00 00 movl $0, (%ecx)
00000019: IMAGE_REL_I386_DIR32 _foo
1d: 89 44 24 3c movl %eax, 60(%esp)
21: 89 54 24 38 movl %edx, 56(%esp)
25: e8 00 00 00 00 calll 0 <_setfoo+0x26>
00000026: IMAGE_REL_I386_REL32 _debugInt
2a: c7 05 00 00 00 00 00 00 00 00 movl $0, 0
0000002c: IMAGE_REL_I386_DIR32 _foo
00000030: IMAGE_REL_I386_DIR32 _JazzFixnumClass
34: b9 00 00 00 00 movl $0, %ecx
00000035: IMAGE_REL_I386_DIR32 _JazzFixnumClass
39: 89 e6 movl %esp, %esi
3b: c7 06 04 00 00 00 movl $4, (%esi)
0000003d: IMAGE_REL_I386_DIR32 _foo
41: 89 44 24 34 movl %eax, 52(%esp)
45: 89 54 24 30 movl %edx, 48(%esp)
49: 89 4c 24 2c movl %ecx, 44(%esp)
4d: e8 00 00 00 00 calll 0 <_setfoo+0x4E>
0000004e: IMAGE_REL_I386_REL32 _debugInt
52: c7 05 04 00 00 00 d5 00 00 00 movl $213, 4
00000054: IMAGE_REL_I386_DIR32 _foo
5c: 89 e1 movl %esp, %ecx
5e: c7 01 00 00 00 00 movl $0, (%ecx)
00000060: IMAGE_REL_I386_DIR32 _foo
64: 89 44 24 28 movl %eax, 40(%esp)
68: 89 54 24 24 movl %edx, 36(%esp)
6c: e8 00 00 00 00 calll 0 <_setfoo+0x6D>
0000006d: IMAGE_REL_I386_REL32 _debugPointer
71: c7 05 00 00 00 00 00 00 00 00 movl $0, 0
00000073: IMAGE_REL_I386_DIR32 _foo
00000077: IMAGE_REL_I386_DIR32 _JazzFixnumClass
7b: c7 05 04 00 00 00 d5 00 00 00 movl $213, 4
0000007d: IMAGE_REL_I386_DIR32 _foo
85: 89 e1 movl %esp, %ecx
87: c7 01 00 00 00 00 movl $0, (%ecx)
00000089: IMAGE_REL_I386_DIR32 _foo
8d: 89 44 24 20 movl %eax, 32(%esp)
91: 89 54 24 1c movl %edx, 28(%esp)
95: e8 00 00 00 00 calll 0 <_setfoo+0x96>
00000096: IMAGE_REL_I386_REL32 _debugPointer
9a: 89 e1 movl %esp, %ecx
9c: c7 41 08 d5 00 00 00 movl $213, 8(%ecx)
a3: c7 41 04 00 00 00 00 movl $0, 4(%ecx)
000000a6: IMAGE_REL_I386_DIR32 _JazzFixnumClass
aa: c7 01 00 00 00 00 movl $0, (%ecx)
000000ac: IMAGE_REL_I386_DIR32 _foo
b0: 89 44 24 18 movl %eax, 24(%esp)
b4: 89 54 24 14 movl %edx, 20(%esp)
b8: e8 00 00 00 00 calll 0 <_setfoo+0xB9>
000000b9: IMAGE_REL_I386_REL32 _setGlobal
bd: 89 e0 movl %esp, %eax
bf: c7 00 00 00 00 00 movl $0, (%eax)
000000c1: IMAGE_REL_I386_DIR32 _foo
c5: e8 00 00 00 00 calll 0 <_setfoo+0xC6>
000000c6: IMAGE_REL_I386_REL32 _debugPointer
ca: b9 d5 00 00 00 movl $213, %ecx
cf: 8b 74 24 2c movl 44(%esp), %esi
d3: 89 44 24 10 movl %eax, 16(%esp)
d7: 89 f0 movl %esi, %eax
d9: 89 54 24 0c movl %edx, 12(%esp)
dd: 89 ca movl %ecx, %edx
df: 83 c4 40 addl $64, %esp
e2: 5e popl %esi
e3: c3 retl
e4: 66 66 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:(%eax,%eax)
_XEP:setfoo:
f0: 8b 44 24 04 movl 4(%esp), %eax
f4: 83 f8 00 cmpl $0, %eax
f7: 0f 84 05 00 00 00 je 5 <_XEP:setfoo+0x12>
fd: e8 00 00 00 00 calll 0 <_XEP:setfoo+0x12>
000000fe: IMAGE_REL_I386_REL32 _typeError
102: e8 00 00 00 00 calll 0 <_XEP:setfoo+0x17>
00000103: IMAGE_REL_I386_REL32 _setfoo
107: c3 retl
108: 0f 1f 84 00 00 00 00 00 nopl (%eax,%eax)
110: 00 00 addb %al, (%eax)
00000110: IMAGE_REL_I386_DIR32 _XEP:getfoo
112: 00 00 addb %al, (%eax)
_getfoo:
114: 50 pushl %eax
115: 89 e0 movl %esp, %eax
117: c7 00 00 00 00 00 movl $0, (%eax)
00000119: IMAGE_REL_I386_DIR32 _foo
11d: e8 00 00 00 00 calll 0 <_getfoo+0xE>
0000011e: IMAGE_REL_I386_REL32 _getGlobal
122: 59 popl %ecx
123: c3 retl
124: 66 66 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:(%eax,%eax)
_XEP:getfoo:
130: 8b 44 24 04 movl 4(%esp), %eax
134: 83 f8 00 cmpl $0, %eax
137: 0f 84 05 00 00 00 je 5 <_XEP:getfoo+0x12>
13d: e8 00 00 00 00 calll 0 <_XEP:getfoo+0x12>
0000013e: IMAGE_REL_I386_REL32 _typeError
142: e8 00 00 00 00 calll 0 <_XEP:getfoo+0x17>
00000143: IMAGE_REL_I386_REL32 _getfoo
147: c3 retl
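
A note on reading the listing above: COFF relocation records carry no addend field, so for `IMAGE_REL_I386_DIR32` the addend sits in the four bytes at the relocation site. That is why the store to the second field shows up as `movl $213, 4` next to a plain `_foo` relocation. The sketch below decodes the ten bytes of the instruction at offset `0x52`; `readLE32` and `kStoreDatum` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reads a little-endian 32-bit value starting at `offset`.
std::uint32_t readLE32(const std::vector<std::uint8_t>& bytes,
                       std::size_t offset) {
  assert(offset + 4 <= bytes.size() && "read past end of buffer");
  return static_cast<std::uint32_t>(bytes[offset]) |
         static_cast<std::uint32_t>(bytes[offset + 1]) << 8 |
         static_cast<std::uint32_t>(bytes[offset + 2]) << 16 |
         static_cast<std::uint32_t>(bytes[offset + 3]) << 24;
}

// Bytes of the instruction at 0x52 in the listing:
//   c7 05 <disp32> <imm32>   movl $imm32, disp32
// The relocation is reported at 0x54, i.e. 2 bytes into the instruction,
// which is where the 32-bit displacement (the implicit addend) lives.
const std::vector<std::uint8_t> kStoreDatum = {
    0xc7, 0x05, 0x04, 0x00, 0x00, 0x00, 0xd5, 0x00, 0x00, 0x00};
```

Here `readLE32(kStoreDatum, 0x54 - 0x52)` yields `4`, the offset of the datum field, and `readLE32(kStoreDatum, 6)` yields `213` (`0xd5`), the immediate being stored.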
Sean Silva via llvm-dev
2017-Jun-06 21:16 UTC
[llvm-dev] [newbie] trouble with global variables and CreateLoad/Store in JIT
It's useful to know that the static compilation code path works.
Furthermore, as expected from that:
52: c7 05 04 00 00 00 d5 00 00 00 movl $213, 4
00000054: IMAGE_REL_I386_DIR32 _foo
It looks like the offset `4` of the second field of your struct is correct
in the object file, so this does seem to be a problem in the JIT-specific
linking/loading.
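
To make concrete what "applied correctly" would mean here, the following is a rough sketch of resolving a DIR32-style relocation with an implicit addend, alongside a deliberately faulty variant that drops the addend. It is plain C++ and does not reproduce RuntimeDyld's actual code; `applyDir32` and `applyDir32DroppingAddend` are hypothetical names, and the faulty variant is only one way the reported symptom could arise, not a claim about where the real defect is.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reads a little-endian 32-bit value starting at `offset`.
std::uint32_t readLE32(const std::vector<std::uint8_t>& bytes,
                       std::size_t offset) {
  assert(offset + 4 <= bytes.size() && "read past end of buffer");
  return static_cast<std::uint32_t>(bytes[offset]) |
         static_cast<std::uint32_t>(bytes[offset + 1]) << 8 |
         static_cast<std::uint32_t>(bytes[offset + 2]) << 16 |
         static_cast<std::uint32_t>(bytes[offset + 3]) << 24;
}

// Writes `value` as little-endian 32-bit starting at `offset`.
void writeLE32(std::vector<std::uint8_t>& bytes, std::size_t offset,
               std::uint32_t value) {
  assert(offset + 4 <= bytes.size() && "write past end of buffer");
  for (int i = 0; i < 4; ++i)
    bytes[offset + i] = static_cast<std::uint8_t>(value >> (8 * i));
}

// Expected behaviour: the final value is the symbol's address plus the
// addend that was already stored at the relocation site.
void applyDir32(std::vector<std::uint8_t>& section, std::size_t offset,
                std::uint32_t symbolAddress) {
  const std::uint32_t addend = readLE32(section, offset);
  writeLE32(section, offset, symbolAddress + addend);
}

// Hypothetical faulty variant: overwrites the site with the bare symbol
// address, so every field of the global resolves to the same address.
void applyDir32DroppingAddend(std::vector<std::uint8_t>& section,
                              std::size_t offset,
                              std::uint32_t symbolAddress) {
  writeLE32(section, offset, symbolAddress);
}
```

Applied to the bytes `c7 05 04 00 00 00 d5 00 00 00` at offset 2 with a symbol address of `0x03C10000`, `applyDir32` leaves `0x03C10004` at the site, while the addend-dropping variant leaves `0x03C10000`. The latter matches the symptom reported earlier in the thread, where both `x =` lines print the same address.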
Can you try debugging into
lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFI386.h to see if the
relocation is getting applied correctly in the context of your JIT?

You may be able to repro this more easily using `lli`. It has a `-jit-kind`
argument that should get you into the JIT codepath (see
test/ExecutionEngine/{MCJIT,ORCMCJIT}/).
-- Sean Silva
On Tue, Jun 6, 2017 at 1:09 AM, Nikodemus Siivola <
nikodemus at random-state.net> wrote:
> This is on Windows 10: didn't yet manage to get a 64-bit toolchain set up
> that agreed on everything necessary.
>
> Dumped bitcode, but when I did that everything landed in the same module
> (normally the global is defined in a different module than its uses) -->
> the relocations are different... different enough that when I loaded the
> bitcode back in and handed the single module to JIT it worked fine.
>
> I'll try to dump a case where the definition is in a different module
> tomorrow.
>
> Anyhow, below is what clang-cl turned the bitcode from my IR into --
> probably not very useful though as this code does what it should...
>
> $ llvm-objdump.exe -r -d test.o
>
> test.o: file format COFF-i386
>
> Disassembly of section .text:
> .text:
> 0: 00 00 addb %al, (%eax)
> 00000000: IMAGE_REL_I386_DIR32 _XEP:setfoo
> 2: 00 00 addb %al, (%eax)
>
> _setfoo:
> 4: 56 pushl %esi
> 5: 83 ec 40 subl $64, %esp
> 8: 89 e0 movl %esp, %eax
> a: c7 00 00 00 00 00 movl $0, (%eax)
> 0000000c: IMAGE_REL_I386_DIR32 _foo
> 10: e8 00 00 00 00 calll 0 <_setfoo+0x11>
> 00000011: IMAGE_REL_I386_REL32 _debugPointer
> 15: 89 e1 movl %esp, %ecx
> 17: c7 01 00 00 00 00 movl $0, (%ecx)
> 00000019: IMAGE_REL_I386_DIR32 _foo
> 1d: 89 44 24 3c movl %eax, 60(%esp)
> 21: 89 54 24 38 movl %edx, 56(%esp)
> 25: e8 00 00 00 00 calll 0 <_setfoo+0x26>
> 00000026: IMAGE_REL_I386_REL32 _debugInt
> 2a: c7 05 00 00 00 00 00 00 00 00 movl $0, 0
> 0000002c: IMAGE_REL_I386_DIR32 _foo
> 00000030: IMAGE_REL_I386_DIR32 _JazzFixnumClass
> 34: b9 00 00 00 00 movl $0, %ecx
> 00000035: IMAGE_REL_I386_DIR32 _JazzFixnumClass
> 39: 89 e6 movl %esp, %esi
> 3b: c7 06 04 00 00 00 movl $4, (%esi)
> 0000003d: IMAGE_REL_I386_DIR32 _foo
> 41: 89 44 24 34 movl %eax, 52(%esp)
> 45: 89 54 24 30 movl %edx, 48(%esp)
> 49: 89 4c 24 2c movl %ecx, 44(%esp)
> 4d: e8 00 00 00 00 calll 0 <_setfoo+0x4E>
> 0000004e: IMAGE_REL_I386_REL32 _debugInt
> 52: c7 05 04 00 00 00 d5 00 00 00 movl $213, 4
> 00000054: IMAGE_REL_I386_DIR32 _foo
> 5c: 89 e1 movl %esp, %ecx
> 5e: c7 01 00 00 00 00 movl $0, (%ecx)
> 00000060: IMAGE_REL_I386_DIR32 _foo
> 64: 89 44 24 28 movl %eax, 40(%esp)
> 68: 89 54 24 24 movl %edx, 36(%esp)
> 6c: e8 00 00 00 00 calll 0 <_setfoo+0x6D>
> 0000006d: IMAGE_REL_I386_REL32 _debugPointer
> 71: c7 05 00 00 00 00 00 00 00 00 movl $0, 0
> 00000073: IMAGE_REL_I386_DIR32 _foo
> 00000077: IMAGE_REL_I386_DIR32 _JazzFixnumClass
> 7b: c7 05 04 00 00 00 d5 00 00 00 movl $213, 4
> 0000007d: IMAGE_REL_I386_DIR32 _foo
> 85: 89 e1 movl %esp, %ecx
> 87: c7 01 00 00 00 00 movl $0, (%ecx)
> 00000089: IMAGE_REL_I386_DIR32 _foo
> 8d: 89 44 24 20 movl %eax, 32(%esp)
> 91: 89 54 24 1c movl %edx, 28(%esp)
> 95: e8 00 00 00 00 calll 0 <_setfoo+0x96>
> 00000096: IMAGE_REL_I386_REL32 _debugPointer
> 9a: 89 e1 movl %esp, %ecx
> 9c: c7 41 08 d5 00 00 00 movl $213, 8(%ecx)
> a3: c7 41 04 00 00 00 00 movl $0, 4(%ecx)
> 000000a6: IMAGE_REL_I386_DIR32 _JazzFixnumClass
> aa: c7 01 00 00 00 00 movl $0, (%ecx)
> 000000ac: IMAGE_REL_I386_DIR32 _foo
> b0: 89 44 24 18 movl %eax, 24(%esp)
> b4: 89 54 24 14 movl %edx, 20(%esp)
> b8: e8 00 00 00 00 calll 0 <_setfoo+0xB9>
> 000000b9: IMAGE_REL_I386_REL32 _setGlobal
> bd: 89 e0 movl %esp, %eax
> bf: c7 00 00 00 00 00 movl $0, (%eax)
> 000000c1: IMAGE_REL_I386_DIR32 _foo
> c5: e8 00 00 00 00 calll 0 <_setfoo+0xC6>
> 000000c6: IMAGE_REL_I386_REL32 _debugPointer
> ca: b9 d5 00 00 00 movl $213, %ecx
> cf: 8b 74 24 2c movl 44(%esp), %esi
> d3: 89 44 24 10 movl %eax, 16(%esp)
> d7: 89 f0 movl %esi, %eax
> d9: 89 54 24 0c movl %edx, 12(%esp)
> dd: 89 ca movl %ecx, %edx
> df: 83 c4 40 addl $64, %esp
> e2: 5e popl %esi
> e3: c3 retl
> e4: 66 66 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:(%eax,%eax)
>
> _XEP:setfoo:
> f0: 8b 44 24 04 movl 4(%esp), %eax
> f4: 83 f8 00 cmpl $0, %eax
> f7: 0f 84 05 00 00 00 je 5 <_XEP:setfoo+0x12>
> fd: e8 00 00 00 00 calll 0 <_XEP:setfoo+0x12>
> 000000fe: IMAGE_REL_I386_REL32 _typeError
> 102: e8 00 00 00 00 calll 0 <_XEP:setfoo+0x17>
> 00000103: IMAGE_REL_I386_REL32 _setfoo
> 107: c3 retl
> 108: 0f 1f 84 00 00 00 00 00 nopl (%eax,%eax)
> 110: 00 00 addb %al, (%eax)
> 00000110: IMAGE_REL_I386_DIR32 _XEP:getfoo
> 112: 00 00 addb %al, (%eax)
>
> _getfoo:
> 114: 50 pushl %eax
> 115: 89 e0 movl %esp, %eax
> 117: c7 00 00 00 00 00 movl $0, (%eax)
> 00000119: IMAGE_REL_I386_DIR32 _foo
> 11d: e8 00 00 00 00 calll 0 <_getfoo+0xE>
> 0000011e: IMAGE_REL_I386_REL32 _getGlobal
> 122: 59 popl %ecx
> 123: c3 retl
> 124: 66 66 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:(%eax,%eax)
>
> _XEP:getfoo:
> 130: 8b 44 24 04 movl 4(%esp), %eax
> 134: 83 f8 00 cmpl $0, %eax
> 137: 0f 84 05 00 00 00 je 5 <_XEP:getfoo+0x12>
> 13d: e8 00 00 00 00 calll 0 <_XEP:getfoo+0x12>
> 0000013e: IMAGE_REL_I386_REL32 _typeError
> 142: e8 00 00 00 00 calll 0 <_XEP:getfoo+0x17>
> 00000143: IMAGE_REL_I386_REL32 _getfoo
> 147: c3 retl
>
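For reference, here is a minimal sketch of what "applied correctly" would mean for the `IMAGE_REL_I386_DIR32 _foo` entries in the listing above. COFF on i386 keeps the addend in the instruction bytes themselves (the `04 00 00 00` in `movl $4, (%esi)`), so a loader has to add the symbol address to whatever is already at the fixup. The function names below are hypothetical and are not RuntimeDyld's API. The "dropping addend" variant is only an assumption about one way the `@foo` / `@foo+4` symptom could arise, not a confirmed diagnosis. The load address 0x03C10000 is borrowed from the failing run quoted further down, purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Read a little-endian 32-bit value at byte offset `off`.
uint32_t read32le(const Bytes &buf, size_t off) {
  assert(off + 4 <= buf.size() && "fixup must lie inside the buffer");
  return uint32_t(buf[off]) | (uint32_t(buf[off + 1]) << 8) |
         (uint32_t(buf[off + 2]) << 16) | (uint32_t(buf[off + 3]) << 24);
}

// Write a little-endian 32-bit value at byte offset `off`.
void write32le(Bytes &buf, size_t off, uint32_t value) {
  assert(off + 4 <= buf.size() && "fixup must lie inside the buffer");
  for (size_t i = 0; i < 4; ++i)
    buf[off + i] = uint8_t(value >> (8 * i));
}

// Hypothetical loader step for an absolute 32-bit relocation: the bytes
// already at the fixup are the addend, so the symbol address is added to them.
void applyDir32(Bytes &buf, size_t off, uint32_t symbolAddr) {
  write32le(buf, off, symbolAddr + read32le(buf, off));
}

// Hypothetical faulty variant (an assumption, not a confirmed diagnosis):
// it overwrites the fixup and thereby loses the in-place addend.
void applyDir32DroppingAddend(Bytes &buf, size_t off, uint32_t symbolAddr) {
  write32le(buf, off, symbolAddr);
}

// Bytes of "movl $4, (%esi)" as listed at offset 0x3b; the relocation at
// 0x3d therefore sits 2 bytes into the instruction.
Bytes movlFooPlus4() { return {0xc7, 0x06, 0x04, 0x00, 0x00, 0x00}; }
```

With `_foo` at 0x03C10000, `applyDir32` on the fixup at offset 2 leaves 0x03C10004 in the instruction, while the addend-dropping variant leaves 0x03C10000, which is the same value the second `debugInt` call printed in the failing run. Comparing the bytes at the fixup before and after relocation in the debugger should show which of the two the JIT is doing.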
>
> On Tue, Jun 6, 2017 at 3:18 AM, Sean Silva <chisophugis at gmail.com> wrote:
>
>>
>>
>> On Mon, Jun 5, 2017 at 1:34 PM, Nikodemus Siivola <
>> nikodemus at random-state.net> wrote:
>>
>>> Uh. Turns out that if I hide the pointer to @foo from LLVM by passing it
>>> through an opaque identity function ... then everything works fine.
>>>
>>> Is this a bug in LLVM or is there some magic involving globals I'm
>>> misunderstanding?
>>>
>>
>> This looks like a bug in the handling of constant GEP's. Specifically the
>> `getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)`
>> used to calculate the address of the integer inside the struct. Your
>> observation "The bizarre thing is that even this looks correct: the
>> debugInt is called first with @foo, then @foo+4, and the stores seem to be
>> going to the right addresses as well: @foo and @foo+4!" at the level of the
>> MachineInstr dump rules out problems before that.
>>
>> After MachineInstr comes MC to emit the object file, but `foo+4` is one
>> of the most basic relocation types, so I doubt that there's a bug in the
>> lowering there or else "everything" would be broken.
>> Just to verify though, checking assembly of a small example across 32-bit
>> targets of all 3 object file formats looks fine at a glance (MC is getting
>> the +4 addend, though you would need to run `llvm-objdump -d -r` to see the
>> actual relocation in the binary).
>> https://godbolt.org/g/0Owzf5
>> https://godbolt.org/g/n0qzmg
>> https://godbolt.org/g/kAOvkQ
>>
>> Beyond MC, you already have your static object file. If that is fine,
>> then in a JIT context you might be running into issues with
>> RuntimeDyld. The actual GEP's that clang generates are identical to the
>> ones in your code, further suggesting that this is JIT specific and that
>> static links are unaffected (if you could verify that, it would help to
>> narrow down the possibilities).
>> Maybe look at the output of `llvm-objdump -d -r` on a static .o file
>> generated from your IR and see where the relocation is handled
>> in lib/ExecutionEngine/RuntimeDyld (this will depend on your platform;
>> grepping for the name of the relocation shown by llvm-objdump should find
>> the right code to look at).
>>
>> By the way, what platform are you JIT'ing on? I noticed that it is a
>> 32-bit target, and I suspect that the 32-bit support in the JIT
>> infrastructure isn't as well tested / commonly used as the 64-bit code,
>> possibly explaining why this sort of bug could sneak through.
>>
>> -- Sean Silva
>>
>>
>>>
>>> define { i8*, i32 } @"__anonToplevel/0"() prefix { i8*, i32 } (i32)*
>>> @"XEP:__anonToplevel/0" {
>>> entry:
>>> %0 = call { i8*, i32 }* @identity({ i8*, i32 }* nonnull @foo)
>>> %1 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
>>> %2 = getelementptr { i8*, i32 }, { i8*, i32 }* %0, i32 0, i32 0
>>> %3 = ptrtoint { i8*, i32 }* %0 to i32
>>> %4 = call { i8*, i32 } @debugInt(i32 %3)
>>> store i8* @FixnumClass, i8** %2, align 4
>>> %5 = getelementptr { i8*, i32 }, { i8*, i32 }* %0, i32 0, i32 1
>>> %6 = ptrtoint i32* %5 to i32
>>> %7 = call { i8*, i32 } @debugInt(i32 %6)
>>> store i32 123, i32* %5, align 4
>>> %8 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
>>> store i8* @FixnumClass, i8** %2, align 4
>>> store i32 123, i32* %5, align 4
>>> %9 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
>>> call void @setGlobal({ i8*, i32 }* %0, { i8*, i32 } { i8*
>>> @FixnumClass, i32 123 })
>>> %10 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
>>> ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
>>> }
>>>
>>> Output, now with correct addresses out of the GEPs, and memory being
>>> modified as expected:
>>>
>>> p = 02F80000
>>> class: 00000000
>>> datum: 00000000
>>> x = 02F80000
>>> x = 02F80004
>>> p = 02F80000
>>> class: 028D3E98
>>> datum: 0000007B
>>> p = 02F80000
>>> class: 028D3E98
>>> datum: 0000007B
>>> p = 02F80000
>>> class: 028D3E98
>>> datum: 0000007B
>>>
>>> Cheers,
>>>
>>> -- nikodemus
>>>
>>>
>>> On Mon, Jun 5, 2017 at 10:57 PM, Nikodemus Siivola <
>>> nikodemus at random-state.net> wrote:
>>>
>>>> Since the getelementptrs were implicitly generated by the
>>>> CreateStore/Load I'm not sure how to get access to them.
>>>>
>>>> So I hacked the assignment to be done thrice: once using a manual
>>>> decomposition into two GEPs and stores, once using the "big" CreateStore,
>>>> once via the setGlobal function, printing addresses and memory contents at
>>>> each point to the degree that I have access to them.
>>>>
>>>> It seems the following GEPs compute the same address?! I can buy myself
>>>> not understanding how GEP works and doing it wrong, but
>>>> builder.CreateStore() creates what look like identical GEPs implicitly...
>>>>
>>>> i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0,
>>>> i32 0), align 4
>>>> i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0,
>>>> i32 1), align 4
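As a sanity check on what those two GEPs should yield, here is a small sketch of the expected layout. `Slot32` and `fieldAddress` are hypothetical names introduced only for illustration. The pointer field is modelled as a 32-bit integer on the assumption that the target is i386, so the result does not depend on the machine compiling the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for `{ i8*, i32 }` as laid out on a 32-bit target.
// The pointer field is modelled as uint32_t so that the layout does not
// depend on the pointer width of the machine compiling this sketch.
struct Slot32 {
  uint32_t cls;   // field 0: the i8* class pointer (4 bytes on i386)
  uint32_t datum; // field 1: the i32 datum
};

static_assert(offsetof(Slot32, cls) == 0, "field 0 starts at the base");
static_assert(offsetof(Slot32, datum) == 4, "field 1 follows the pointer");

// Address that a two-index GEP `base, i32 0, i32 field` should produce
// for Slot32: the base plus the byte offset of the selected field.
uint32_t fieldAddress(uint32_t base, unsigned field) {
  assert(field < 2 && "Slot32 has exactly two fields");
  const size_t offset =
      field == 0 ? offsetof(Slot32, cls) : offsetof(Slot32, datum);
  return base + static_cast<uint32_t>(offset);
}
```

For a base of 0x03C10000 the two fields come out at 0x03C10000 and 0x03C10004, so the GEPs as written are not expected to alias. If both print the same address at run time, the offset is being lost somewhere after the IR.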
>>>>
>>>> The details.
>>>>
>>>> This is the relevant part from my codegen:
>>>>
>>>> auto ty = val->getType();
>>>> cout << "val type:" << endl;
>>>> ty->dump();
>>>> cout << "ptr type:" << endl;
>>>> ptr->getType()->dump();
>>>> // Print memory
>>>> ctx.EmitCall1("debugPointer", ptr);
>>>> // Set class pointer
>>>> auto c = ctx.bld.CreateExtractValue(val, 0, "class");
>>>> auto cp = ctx.bld.CreateConstGEP2_32(ty, ptr, 0, 0);
>>>> auto cx = ctx.bld.CreatePtrToInt(cp, ctx.Int32Type());
>>>> ctx.EmitCall1("debugInt", cx);
>>>> ctx.bld.CreateStore(c, cp);
>>>> // Set datum
>>>> auto d = ctx.bld.CreateExtractValue(val, 1, "datum");
>>>> auto dp = ctx.bld.CreateConstGEP2_32(ty, ptr, 0, 1);
>>>> auto dx = ctx.bld.CreatePtrToInt(dp, ctx.Int32Type());
>>>> ctx.EmitCall1("debugInt", dx);
>>>> ctx.bld.CreateStore(d, dp);
>>>> // Print memory
>>>> ctx.EmitCall1("debugPointer", ptr);
>>>> // Do the same with a single store
>>>> ctx.bld.CreateStore(val, ptr);
>>>> // Print memory
>>>> ctx.EmitCall1("debugPointer", ptr);
>>>> // Call out
>>>> ctx.EmitCall2("setGlobal", ptr, val);
>>>> // Print memory
>>>> ctx.EmitCall1("debugPointer", ptr);
>>>>
>>>> Here is the compile-time output showing types of the value and the
>>>> pointer:
>>>>
>>>> val type:
>>>> { i8*, i32 }
>>>> ptr type:
>>>> { i8*, i32 }*
>>>>
>>>> Here is the IR dump for the function (after a couple of passes), right
>>>> before it's fed to the JIT:
>>>>
>>>> define { i8*, i32 } @"__anonToplevel/0"() prefix { i8*, i32 } (i32)*
>>>> @"XEP:__anonToplevel/0" {
>>>> entry:
>>>> %0 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
>>>> %1 = call { i8*, i32 } @debugInt(i32 ptrtoint ({ i8*, i32 }* @foo to
>>>> i32))
>>>> store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, {
>>>> i8*, i32 }* @foo, i32 0, i32 0), align 4
>>>> %2 = call { i8*, i32 } @debugInt(i32 ptrtoint (i32* getelementptr
>>>> inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1) to i32))
>>>> store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32
>>>> }* @foo, i32 0, i32 1), align 4
>>>> %3 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
>>>> store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, {
>>>> i8*, i32 }* @foo, i32 0, i32 0), align 4
>>>> store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32
>>>> }* @foo, i32 0, i32 1), align 4
>>>> %4 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
>>>> call void @setGlobal({ i8*, i32 }* nonnull @foo, { i8*, i32 } { i8*
>>>> @FixnumClass, i32 123 })
>>>> %5 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
>>>> ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
>>>> }
>>>>
>>>> Here is the runtime output from calling the JITed function, including
>>>> memory addresses and contents, with my annotations:
>>>>
>>>> # Before
>>>> p = 03C10000
>>>> class: 00000000
>>>> datum: 00000000
>>>> # Should be address of the class slot --> correct
>>>> x = 03C10000
>>>> # Should be address of the datum slot, ie address of class slot + 4 -->
>>>> incorrect
>>>> x = 03C10000
>>>> # Yeah, both values went to class slot, so actual class pointer got
>>>> clobbered
>>>> p = 03C10000
>>>> class: 0000007B
>>>> datum: 00000000
>>>> # Same result from the single CreateStore
>>>> p = 03C10000
>>>> class: 0000007B
>>>> datum: 00000000
>>>> # Calling out to setGlobal as in my first email works
>>>> p = 03C10000
>>>> class: 039D2E98
>>>> datum: 0000007B
>>>>
>>>> Finally, I didn't manage nice disassembly yet, so here is the last
>>>> output from --print-after-all for the function. The bizarre thing is that
>>>> even this looks correct: the debugInt is called first with @foo, then
>>>> @foo+4, and the stores seem to be going to the right addresses as well:
>>>> @foo and @foo+4!
>>>>
>>>> BB#0: derived from LLVM BB %entry
>>>> PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>>> CFI_INSTRUCTION <call frame instruction>
>>>> CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX
>>>> %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>,
>>>> %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>>> %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
>>>> CFI_INSTRUCTION <call frame instruction>
>>>> PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>>> CFI_INSTRUCTION <call frame instruction>
>>>> CALLpcrel32 <ga:@debugInt>, <regmask %BH %BL %BP %BPL %BX %DI
>>>> %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>,
>>>> %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>>> %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
>>>> CFI_INSTRUCTION <call frame instruction>
>>>> MOV32mi %noreg, 1, %noreg, <ga:@foo>, %noreg,
>>>> <ga:@JazzFixnumClass>; mem:ST4[getelementptr inbounds ({ i8*, i32 }, {
>>>> i8*, i32 }* @foo, i32 0, i32 0)]
>>>> PUSHi32 <ga:@foo+4>, %ESP<imp-def>, %ESP<imp-use>
>>>> CFI_INSTRUCTION <call frame instruction>
>>>> CALLpcrel32 <ga:@debugInt>, <regmask %BH %BL %BP %BPL %BX %DI
>>>> %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>,
>>>> %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>>> %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
>>>> CFI_INSTRUCTION <call frame instruction>
>>>> MOV32mi %noreg, 1, %noreg, <ga:@foo+4>, %noreg, 123;
>>>> mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0,
>>>> i32 1)]
>>>> PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>>> CFI_INSTRUCTION <call frame instruction>
>>>> CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX
>>>> %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>,
>>>> %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>>> %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
>>>> CFI_INSTRUCTION <call frame instruction>
>>>> MOV32mi %noreg, 1, %noreg, <ga:@foo>, %noreg,
>>>> <ga:@JazzFixnumClass>; mem:ST4[getelementptr inbounds ({ i8*, i32 }, {
>>>> i8*, i32 }* @foo, i32 0, i32 0)]
>>>> MOV32mi %noreg, 1, %noreg, <ga:@foo+4>, %noreg, 123;
>>>> mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0,
>>>> i32 1)]
>>>> PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>>> CFI_INSTRUCTION <call frame instruction>
>>>> CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX
>>>> %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>,
>>>> %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>>> %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
>>>> CFI_INSTRUCTION <call frame instruction>
>>>> PUSH32i8 123, %ESP<imp-def>, %ESP<imp-use>
>>>> CFI_INSTRUCTION <call frame instruction>
>>>> PUSHi32 <ga:@JazzFixnumClass>, %ESP<imp-def>, %ESP<imp-use>
>>>> CFI_INSTRUCTION <call frame instruction>
>>>> PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>>> CFI_INSTRUCTION <call frame instruction>
>>>> CALLpcrel32 <ga:@setGlobal>, <regmask %BH %BL %BP %BPL %BX %DI
>>>> %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>
>>>> %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 12,
>>>> %EFLAGS<imp-def,dead>
>>>> CFI_INSTRUCTION <call frame instruction>
>>>> PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>>> CFI_INSTRUCTION <call frame instruction>
>>>> CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX
>>>> %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>,
>>>> %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>>> %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
>>>> CFI_INSTRUCTION <call frame instruction>
>>>> %EAX<def> = MOV32ri <ga:@JazzFixnumClass>
>>>> %EDX<def> = MOV32ri 123
>>>> RETL %EAX<kill>, %EDX<kill>
>>>>
>>>> Also, I have essentially identical code working perfectly fine when the
>>>> memory being written to is from @alloca.
>>>>
>>>> I am completely clueless. Any suggestions most welcome.
>>>>
>>>> Cheers,
>>>>
>>>> -- nikodemus
>>>>
>>>>
>>>
>>
>