I've been diagnosing this bug:
http://llvm.org/bugs/show_bug.cgi?id=17827
Summary: I think the following program miscompiles at -O1 because the fact
that 'f0' is a signed 3-bit value is lost in the unoptimized LLVM IR. How
do we fix this?
$ cat bitfield.c
/* %struct.S = type { i8, [3 x i8] } ??? */
struct S {
  int f0:3;
} a;
int foo (int p) {
  struct S c = a;
  c.f0 = p & 6;
  return c.f0 < 1;
}
int main () {
  return foo (4);
}
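
The arithmetic this test case relies on can be sketched without bit-fields at all. The sketch assumes the usual two's-complement behavior, where storing into a signed 3-bit field keeps the low three bits and treats bit 2 as the sign bit. The names sext3 and foo_expected are made up for illustration and are not part of the bug report.

```c
/* Hypothetical helper: interpret the low 3 bits of v as a signed
 * two's-complement value in the range [-4, 3]. */
int sext3(unsigned v) {
  v &= 7u;
  return (v & 4u) ? (int)v - 8 : (int)v;
}

/* Models foo() from bitfield.c under the assumption stated above:
 *   c.f0 = p & 6;  return c.f0 < 1;                              */
int foo_expected(int p) {
  int f0 = sext3((unsigned)p & 6u);
  return f0 < 1;
}
```

Under that assumption, foo (4) stores the bit pattern 100, which reads back as -4, so the comparison is true and main would be expected to return 1.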
On Nov 15, 2013, at 3:42 PM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote:
> I've been diagnosing this bug:
> http://llvm.org/bugs/show_bug.cgi?id=17827
>
> Summary: I think the following program miscompiles at -O1 because the fact
> that 'f0' is a signed 3-bit value is lost in the unoptimized LLVM IR. How
> do we fix this?

I don’t have the C/C++ standards in front of me but IIRC whether a char/short/int/long/long long bitfield is signed or unsigned is implementation defined. You need to explicitly specify signed or unsigned in order to have any guarantee of the signedness, e.g. signed int.
I actually think it is a problem with the optimizer, as Kay first thought. -instcombine seems to be turning "((x and 6) shl 5) slt 32" into "(x and 6) slt 1". If the comparison were unsigned or the shl had an nsw flag, I think this would be okay. Since neither of these is true, I don't think this transformation is correct.

H.
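
One way to see why the -instcombine rewrite is suspect is to model both forms of the comparison on 8-bit values. This is only a sketch: the 8-bit width is an assumption taken from the i8 in the struct type quoted in the test case, and as_i8, cmp_before, and cmp_after are hypothetical names, not anything in LLVM.

```c
/* Hypothetical helper: reinterpret the low 8 bits of v as a signed
 * two's-complement i8 value. */
int as_i8(unsigned v) {
  v &= 0xFFu;
  return (v >= 128u) ? (int)v - 256 : (int)v;
}

/* Models "((x and 6) shl 5) slt 32" on i8 (assumed width). */
int cmp_before(unsigned x) {
  return as_i8((x & 6u) << 5) < 32;
}

/* Models "(x and 6) slt 1" on i8. */
int cmp_after(unsigned x) {
  return as_i8(x & 6u) < 1;
}
```

For x = 4 the shift produces the bit pattern 10000000, which is negative as a signed 8-bit value, so the first form is true while the second is false. The two forms agree for x = 0 and x = 2, which is consistent with the shift being harmless only when it does not reach the sign bit.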
Invalidating this bug with a language technicality would be a great way out. :)

But I don't see anything in the standard to suggest that a plain 'int' bitfield is any different from a 'signed int' bitfield, and even if that were true, I don't see any difference in codegen whether I specify 'signed' explicitly or not. So at the least, clang or llvm still has a bug for the explicit 'signed' case, from what I can tell.

On Fri, Nov 15, 2013 at 8:41 PM, Mark Lacey <mark.lacey at apple.com> wrote:
> I don’t have the C/C++ standards in front of me but IIRC whether a
> char/short/int/long/long long bitfield is signed or unsigned is
> implementation defined. You need to explicitly specify signed or unsigned
> in order to have any guarantee of the signedness, e.g. signed int.
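
The signedness question in this exchange can be probed directly on a given compiler. The following is a minimal sketch with made-up struct and function names. It stores -1, which is representable in a signed 3-bit field and wraps to 7 in an unsigned one, so it avoids the out-of-range store used in the original test case.

```c
struct Plain    { int f0 : 3; };
struct Explicit { signed int f0 : 3; };

/* Returns 1 if a plain 'int' bit-field reads -1 back as negative
 * on this compiler, 0 if the field behaves as unsigned. */
int plain_int_bitfield_is_signed(void) {
  struct Plain s;
  s.f0 = -1;
  return s.f0 < 0;
}

/* Same probe with an explicitly 'signed int' bit-field. */
int explicit_signed_bitfield_is_signed(void) {
  struct Explicit s;
  s.f0 = -1;
  return s.f0 < 0;
}
```

If both functions return the same value on a compiler, the 'signed' keyword makes no difference there, which would match the observation that the codegen is identical either way.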
Duncan P. N. Exon Smith
2013-Nov-16  21:55 UTC
[LLVMdev] struct with signed bitfield (PR17827)
On 2013 Nov 15, at 19:41, Mark Lacey <mark.lacey at apple.com> wrote:
> I don’t have the C/C++ standards in front of me but IIRC whether a
> char/short/int/long/long long bitfield is signed or unsigned is
> implementation defined. You need to explicitly specify signed or unsigned
> in order to have any guarantee of the signedness, e.g. signed int.

Section 3.9.1 of the C++11 standard [1] defines short/int/long/long long as signed. Bit-fields are discussed in 9.6 and have lots of implementation-defined behavior, but I don’t see anything about signedness. The ABI (e.g., [2]) defines whether char is signed or unsigned.

[1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3485.pdf
[2]: http://www.cs.tufts.edu/comp/40/readings/amd64-abi.pdf