Zhao, Weiming via llvm-dev
2017-Apr-05 17:14 UTC
[llvm-dev] compiler-rt, v4.0: arm\udivsi3.S broken for division by zero
Yes, it's a bug. Please review https://reviews.llvm.org/D31716 On 4/5/2017 3:50 AM, Renato Golin wrote:> On 21 March 2017 at 18:32, Peter Jakubek via llvm-dev > <llvm-dev at lists.llvm.org> wrote: >> I think the current implementation for the call "bl __aeabi_idiv0" in >> builtins\arm\udivsi3.S is broken. >> At least for the case that __aeabi_idiv0 returns. (the provided >> implementation does) >> >> Since LR is not preserved, the following JMP(lr) results in an endless loop. >> >> Or is this an intentional change of the behavior? > Hi Peter, > > That is most certainly a bug. Weiming's patch was supposed to only > introduce Thumb1 code, not transform div0 into a busy loop. :) > > >> The file contains another implementation enabled by __ARM_ARCH_EXT_IDIV__. >> This uses "b" instead of "bl". >> (This works as in previous versions) > The comment on the patch makes that clear: > > bl __aeabi_idiv0 // due to relocation limit, can't use b. > > I'm not sure why the bottom one is fine and the top one isn't, wrt. > range, but we may have to add a trampoline here. > > Weiming, Saleem, comments? > > cheers, > --renato-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Peter Jakubek via llvm-dev
2017-Apr-05 18:40 UTC
[llvm-dev] compiler-rt, v4.0: arm\udivsi3.S broken for division by zero
I recommend the other incarnation of the call to divby0 at line 256 is also modified to use bl instead of b for the same reason. Or just remove one of the two to avoid confusion in the future? Cheers, Peter Am 05.04.2017 um 19:14 schrieb Zhao, Weiming:> Yes, it's a bug. > > Please review https://reviews.llvm.org/D31716 > > > On 4/5/2017 3:50 AM, Renato Golin wrote: >> On 21 March 2017 at 18:32, Peter Jakubek via llvm-dev >> <llvm-dev at lists.llvm.org> wrote: >>> I think the current implementation for the call "bl __aeabi_idiv0" in >>> builtins\arm\udivsi3.S is broken. >>> At least for the case that __aeabi_idiv0 returns. (the provided >>> implementation does) >>> >>> Since LR is not preserved, the following JMP(lr) results in an >>> endless loop. >>> >>> Or is this an intentional change of the behavior? >> Hi Peter, >> >> That is most certainly a bug. Weiming's patch was supposed to only >> introduce Thumb1 code, not transform div0 into a busy loop. :) >> >> >>> The file contains another implementation enabled by >>> __ARM_ARCH_EXT_IDIV__. >>> This uses "b" instead of "bl". >>> (This works as in previous versions) >> The comment on the patch makes that clear: >> >> bl __aeabi_idiv0 // due to relocation limit, can't use b. >> >> I'm not sure why the bottom one is fine and the top one isn't, wrt. >> range, but we may have to add a trampoline here. >> >> Weiming, Saleem, comments? >> >> cheers, >> --renato >
Zhao, Weiming via llvm-dev
2017-Apr-05 19:21 UTC
[llvm-dev] compiler-rt, v4.0: arm\udivsi3.S broken for division by zero
Using "b " is correct as it's a tail call. However, for Thumb1, it can't use "b" due to +/- 2 KB range. "BL label" has a much wider range of +/- 4MB. The reason we have two blocks of "divby0": conditional branch in Thumb1 has even shorter range -252 to +258, so it can't jump to the block around line 258. So we cloned the "divby0" block. I think we can fold the 2nd "divby0" into line 39 to make it clear. Thanks, Weiming On 4/5/2017 11:40 AM, Peter Jakubek wrote:> I recommend the other incarnation of the call to divby0 at line 256 is > also > modified to use bl instead of b for the same reason. > > Or just remove one of the two to avoid confusion in the future? > > Cheers, > > Peter > > Am 05.04.2017 um 19:14 schrieb Zhao, Weiming: >> Yes, it's a bug. >> >> Please review https://reviews.llvm.org/D31716 >> >> >> On 4/5/2017 3:50 AM, Renato Golin wrote: >>> On 21 March 2017 at 18:32, Peter Jakubek via llvm-dev >>> <llvm-dev at lists.llvm.org> wrote: >>>> I think the current implementation for the call "bl __aeabi_idiv0" in >>>> builtins\arm\udivsi3.S is broken. >>>> At least for the case that __aeabi_idiv0 returns. (the provided >>>> implementation does) >>>> >>>> Since LR is not preserved, the following JMP(lr) results in an >>>> endless loop. >>>> >>>> Or is this an intentional change of the behavior? >>> Hi Peter, >>> >>> That is most certainly a bug. Weiming's patch was supposed to only >>> introduce Thumb1 code, not transform div0 into a busy loop. :) >>> >>> >>>> The file contains another implementation enabled by >>>> __ARM_ARCH_EXT_IDIV__. >>>> This uses "b" instead of "bl". >>>> (This works as in previous versions) >>> The comment on the patch makes that clear: >>> >>> bl __aeabi_idiv0 // due to relocation limit, can't use b. >>> >>> I'm not sure why the bottom one is fine and the top one isn't, wrt. >>> range, but we may have to add a trampoline here. >>> >>> Weiming, Saleem, comments? >>> >>> cheers, >>> --renato >>-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation