Moritz Angermann via llvm-dev
2017-Dec-01 10:30 UTC
[llvm-dev] Some strange i64 behavior with arm 32bit. (Raspberry Pi)
Hi Tim,

thanks for the swift response!

@debug is defined in the same module, which makes this all the more confusing.

The target information from the working example is:

target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
target triple = "armv6kz--linux-gnueabihf"

and from the GHC-produced module:

target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
target triple = "arm-unknown-linux-gnueabihf"

However, there is one more thing I could think of: ARM does allow mixed mode, I believe. Since the code from the GHC-produced module is called from outside of the module, could the endianness be set there prior to entering the function?

The working module contains main directly and is not called from a main function in a different module.

I've also tried to define a regular C function with the same code and called that from within the ghccc function, with the same (incorrect) results.

Any further ideas I could explore?

Cheers,
Moritz

> On Dec 1, 2017, at 4:26 PM, Tim Northover <t.p.northover at gmail.com> wrote:
>
> Hi Moritz,
>
>> If someone could offer some hint, where to look further for debugging this, I'd very much appreciate the advice!
>> I'm a bit lost right now how to figure out why I end up getting swapped words.
>
> If one file was compiled for big-endian ARM and the other for
> little-endian that could be the result. I'm not aware of any other
> possible cause and from local tests I don't think the "ghccc" alone
> explains the difference.
>
> So maybe some glitch in how GHC was configured on your system? What's
> the triple at the top of the GHC module and the module containing the
> definition of @debug?
>
> Cheers.
>
> Tim.
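For readers unfamiliar with the symptom, here is a minimal sketch of what "swapped words" means for an i64 on a 32-bit target: the value is handled as two 32-bit words, and the two sides disagree about which word is the low half. The helper names are hypothetical and this only illustrates the observed effect, not its cause.

```python
MASK32 = 0xFFFFFFFF


def split_words(value):
    """Split a 64-bit value into its (low, high) 32-bit words."""
    return value & MASK32, (value >> 32) & MASK32


def join_words(low, high):
    """Reassemble a 64-bit value from a low and a high 32-bit word."""
    return (high << 32) | low


value = 0x0000000100000002
low, high = split_words(value)

# The reader treats the words in the opposite order from the writer.
swapped = join_words(high, low)

print(f"written: {value:#018x}")
print(f"read:    {swapped:#018x}")
```

A mismatch in endianness is one way to end up here, but so is any disagreement about where the 8 bytes of the value start.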
Moritz Angermann via llvm-dev
2017-Dec-03 07:26 UTC
[llvm-dev] Some strange i64 behavior with arm 32bit. (Raspberry Pi)
Alright, so after some more debugging (injecting print statements at the LLVM IR level), I came across the following:

GHC has the following code for the C into STG and back bridge: `StgRun`, which is defined in https://github.com/ghc/ghc/blob/master/rts/StgCRun.c; the resulting LLVM IR ends up being:

```
; Function Attrs: nounwind
define hidden %struct.StgRegTable* @StgRun(i8* ()* ()*, %struct.StgRegTable*) local_unnamed_addr #0 {
  %3 = tail call %struct.StgRegTable* asm sideeffect "stmfd sp!, {r4-r11, ip, lr}\0A\09vstmdb sp!, {d8-d11}\0A\09sub sp, sp, $3\0A\09mov r4, $2\0A\09bx $1\0A\09.globl StgReturn\0A\09.type StgReturn, %function\0AStgReturn:\0A\09add sp, sp, $3\0A\09mov $0, r7\0A\09vldmia sp!, {d8-d11}\0A\09ldmfd sp!, {r4-r11, ip, lr}\0A\09", "=r,r,r,i,~{r4},~{r5},~{r6},~{r7},~{r8},~{r9},~{r10},~{r12},~{lr}"(i8* ()* ()* %0, %struct.StgRegTable* %1, i32 8192) #1, !srcloc !3
  ret %struct.StgRegTable* %3
}
```

For better readability, the assembly reads:

    stmfd sp!, {r4-r11, ip, lr}
    vstmdb sp!, {d8-d11}
    sub sp, sp, $3
    mov r4, $2
    bx $1
    .globl StgReturn
    .type StgReturn, %function
StgReturn:
    add sp, sp, $3
    mov $0, r7
    vldmia sp!, {d8-d11}
    ldmfd sp!, {r4-r11, ip, lr}

This results in the following assembly being emitted (for arm-unknown-linux-gnueabihf):

```
00000074 <StgRun>:
  74: e92d4ff0  push  {r4, r5, r6, r7, r8, r9, sl, fp, lr}
  78: e28db01c  add   fp, sp, #28, 0
  7c: e92d5ff0  push  {r4, r5, r6, r7, r8, r9, sl, fp, ip, lr}
  80: ed2d8b08  vpush {d8-d11}
  84: e24dda02  sub   sp, sp, #8192 ; 0x2000
  88: e1a04001  mov   r4, r1
  8c: e12fff10  bx    r0

00000090 <StgReturn>:
  90: e28dda02  add   sp, sp, #8192 ; 0x2000
  94: e1a00007  mov   r0, r7
  98: ecbd8b08  vpop  {d8-d11}
  9c: e8bd5ff0  pop   {r4, r5, r6, r7, r8, r9, sl, fp, ip, lr}
  a0: e8bd8ff0  pop   {r4, r5, r6, r7, r8, r9, sl, fp, pc}
```

While adding extra printf statements, I found that with a `printf` call after the assembly and before the `ret`, the generated code looks slightly different:

```
00000074 <StgRun>:
  74: e92d4ff0  push  {r4, r5, r6, r7, r8, r9, sl, fp, lr}
  78: e28db01c  add   fp, sp, #28, 0
  7c: e24dd004  sub   sp, sp, #4, 0
  80: e92d5ff0  push  {r4, r5, r6, r7, r8, r9, sl, fp, ip, lr}
  84: ed2d8b08  vpush {d8-d11}
  88: e24dda02  sub   sp, sp, #8192 ; 0x2000
  8c: e1a04001  mov   r4, r1
  90: e12fff10  bx    r0

00000094 <StgReturn>:
  94: e28dda02  add   sp, sp, #8192 ; 0x2000
  98: e1a00007  mov   r0, r7
  9c: ecbd8b08  vpop  {d8-d11}
  a0: e8bd5ff0  pop   {r4, r5, r6, r7, r8, r9, sl, fp, ip, lr}
  a4: e58d0000  str   r0, [sp]
  a8: e3a00002  mov   r0, #2, 0
  ac: ebfffffe  bl    44 <.LdebugEnd>
  b0: e59d0000  ldr   r0, [sp]
  b4: e24bd01c  sub   sp, fp, #28, 0
  b8: e8bd8ff0  pop   {r4, r5, r6, r7, r8, r9, sl, fp, pc}
```

and we can see that an additional `sp = sp - 4` was added (at 7c).

With the log statement in StgRun, subsequent log statements work so far.

Now I wonder:
a) Could I write this logic in LLVM IR directly, without having to resort to assembly?
b) Could I force LLVM to emit 32 instead of 28 somehow, to make sure my sp is 8-byte aligned?

Of course I'm happy to take any other ideas as well.

Cheers,
Moritz
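The alignment arithmetic behind the two listings can be checked with a short sketch. It assumes sp is 8-byte aligned on entry to StgRun (which the ARM procedure call standard requires at public interfaces) and simply sums the stack adjustments visible in the disassembly; the names below are illustrative only.

```python
WORD = 4   # bytes per core register (r0-r15)
DREG = 8   # bytes per VFP double register (d0-d31)


def misalignment(adjustments, alignment=8):
    """Return how far sp is from `alignment` after subtracting all adjustments,
    assuming sp was aligned before the first one."""
    return sum(adjustments) % alignment


# First listing: prologue push of 9 registers, then the inline asm.
without_printf = [
    9 * WORD,    # push {r4-r9, sl, fp, lr}
    10 * WORD,   # push {r4-r9, sl, fp, ip, lr}  (stmfd)
    4 * DREG,    # vpush {d8-d11}                (vstmdb)
    8192,        # sub sp, sp, #8192
]

# Second listing: same, plus the extra `sub sp, sp, #4` at 7c.
with_printf = without_printf[:1] + [4] + without_printf[1:]

print("without printf:", misalignment(without_printf))
print("with printf:   ", misalignment(with_printf))
```

Under that assumption the first variant enters the STG code with sp off by 4 bytes, because the 9-register push is an odd number of words, while the second variant is 8-byte aligned again.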
Moritz Angermann via llvm-dev
2017-Dec-03 13:52 UTC
[llvm-dev] Some strange i64 behavior with arm 32bit. (Raspberry Pi)
Ok... after some more digging it turned out that the underlying issue was a bug in my code generator. For the record, I'll just note down the issue.

My code generator generated /unpacked/ structs for simplicity reasons, and because I thought, incorrectly, that we (GHC) generated GEP accessors. We don't! GHC computes absolute offsets into those structs. As such, generating /unpacked/ structs is futile (e.g. { i32, i64 } does not guarantee that the i64 is at offset +4; there might be padding), and all I needed to change was to generate packed instead of unpacked structs.

However, I still believe that the code gen for the C to STG bridge should add a `sub sp, sp, #4` line to the inline assembly *if* it emits the `vstmdb sp!, {d8-d11}` part, to ensure that the stack is 8-byte aligned.

Thank you.

Cheers,
Moritz
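The padding issue can be reproduced outside LLVM. The sketch below uses Python's `struct` module as a stand-in: the `@` prefix lays fields out with the host's native alignment (analogous to an unpacked { i32, i64 }), while `=` uses no padding (analogous to a packed <{ i32, i64 }>). The offsets printed are those of the host running the script, not necessarily those of the ARM target; with the `i64:64` datalayout quoted earlier, the unpacked i64 would sit at offset 8.

```python
import struct

VALUE = 0x1122334455667788


def i64_offset(prefix):
    """Offset at which an i64 following an i32 would start.
    "i0q" is an i32 plus a zero-length i64, so its size is that offset."""
    return struct.calcsize(prefix + "i0q")


def read_i64_at(buf, offset):
    """Read an i64 at an absolute byte offset, as GHC-style code would."""
    return struct.unpack_from("=q", buf, offset)[0]


native_offset = i64_offset("@")   # host ABI layout, may include padding
packed_offset = i64_offset("=")   # no padding, always 4

native_buf = struct.pack("@iq", 1, VALUE)
packed_buf = struct.pack("=iq", 1, VALUE)

print("native offset:", native_offset, "packed offset:", packed_offset)
# Reading at the hard-coded offset 4 is only correct for the packed layout.
print("packed @4:", hex(read_i64_at(packed_buf, 4)))
print("native @4:", hex(read_i64_at(native_buf, 4)))
```

On hosts where the native offset is 8, the last line reads four padding bytes plus half of the real value, which is the kind of garbage a hard-coded offset into an unpacked struct produces.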