Renato Golin
2011-Sep-22  10:29 UTC
[LLVMdev] question on difference of bitcode between C and C++
On 22 September 2011 03:30, Eli Friedman <eli.friedman at gmail.com> wrote:
>> I am trying to find such difference of bitcode between C and C++.
>
> There isn't any difference in that sense... in IR, a constructor is
> just a function call, a reference is just a pointer, etc.

Hi Fei,

While Clang (like others) lowers C++ into C semantics and lowers that into IR, there are some constructs that exist in C++ and not in C. The IR has the same features, but some assumptions about the semantics are different.

I can give you two examples:

1. Classes in C++ are like C structures in IR, but the C++ ABI makes it difficult to express Base/Derived classes in pure structures. (http://www.systemcall.org/blog/2011/01/cpp-class-sizes/ and http://www.systemcall.org/blog/2011/03/cpp-class-sizes-2/).

So, if your pass depends on identifying the same types, you could end up thinking that the types are different, but they're not. They're just different struct representations (base vs. derived) of the same type.

2. Virtual table tables encode their entries in two different ways: addresses and offsets, and the two representations normally live in the same global static array in IR. So, while the type of the array is int (or pointer), it contains both addresses and offsets from addresses, bitcast to the type of the global.

These are not the only C-lowerings that C++ front-ends do, but they give you a taste of what to expect. As Eli said, most of C++ can be lowered into C-like structures and the IR is very similar, but some semantic interpretations are done differently, and the internal IR (that deals with those objects) is slightly different.

--
cheers,
--renato

http://systemcall.org/
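To make the two examples above a bit more concrete, here is a minimal C++ sketch. It is only an illustration under stated assumptions, not what Clang actually emits. The names `Base`, `Derived`, `model_table`, `call_slot` and `offset_slot` are all hypothetical. The first part relies on the Itanium C++ ABI rule that a derived class's members may be placed in the tail padding of a non-POD base, which is one reason a front-end may need more than one struct representation of the same class. The second part is a hand-written model of a single array whose slots hold either an offset or an address, stored under one element type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Part 1: base vs. derived layout.
// Base has a user-provided constructor, so the Itanium C++ ABI does not
// treat it as a POD for layout purposes.
struct Base {
    Base() : i(0), c(0) {}
    int  i;
    char c;  // typically followed by tail padding up to alignof(int)
};

struct Derived : Base {
    Derived() : d(0) {}
    char d;  // on Itanium-ABI targets this may land in Base's tail padding
};

// True when adding `d` did not grow the object, i.e. the tail padding of
// Base was reused. The answer depends on the target ABI.
bool derived_reuses_tail_padding() {
    return sizeof(Derived) == sizeof(Base);
}

// Part 2: hand-written model of a table mixing offsets and addresses.
// This is an assumption-laden stand-in, not a real compiler-generated table.
typedef std::intptr_t Slot;  // one element type for every slot
typedef int (*Method)();

int method_a() { return 1; }
int method_b() { return 2; }

const std::size_t kNumSlots = 3;
Slot model_table[kNumSlots] = {
    -16,                                // slot 0: an offset, stored as an integer
    reinterpret_cast<Slot>(&method_a),  // slot 1: a function address
    reinterpret_cast<Slot>(&method_b),  // slot 2: a function address
};

// Interprets slot `index` as a function address and calls it.
// Only slots 1 and 2 hold addresses in this model.
int call_slot(std::size_t index) {
    assert(index >= 1 && index < kNumSlots);
    return reinterpret_cast<Method>(model_table[index])();
}

// Interprets slot 0 as an offset.
std::ptrdiff_t offset_slot() {
    return static_cast<std::ptrdiff_t>(model_table[0]);
}
```

Nothing in the element type of `model_table` says which slots are offsets and which are addresses. A pass that only looks at the array's type has the same problem with the real tables: it has to know the layout convention to interpret each entry.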
Anna Zaks
2011-Sep-22  21:33 UTC
[LLVMdev] question on difference of bitcode between C and C++
Another difference is the presence of exceptions in C++, which would require you to handle more IR instructions. This might not matter, depending on the type of analysis you do.

See: http://llvm.org/docs/LangRef.html#i_invoke

(Note that there is a substantial rewrite of exception handling going into 3.0.)

Anna.
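As a rough illustration of the exception handling point above, here is a small C++ sketch of the kind of source that usually leads a front-end to emit `invoke` instead of a plain `call`. All names (`Guard`, `may_throw`, `guarded_call`, `safe_call`, `live_guards`) are hypothetical, and the exact IR depends on the compiler version, the optimization level, and whether exceptions are enabled.

```cpp
#include <cassert>
#include <stdexcept>

// Counts Guard objects that are currently alive.
int live_guards = 0;

// A local object with a non-trivial destructor. Its destructor must run
// even when the enclosing function is left by an exception.
struct Guard {
    Guard()  { ++live_guards; }
    ~Guard() { --live_guards; }
};

// A call that may throw.
int may_throw(int x) {
    if (x < 0) {
        throw std::runtime_error("negative input");
    }
    return x * 2;
}

// Because `g` needs cleanup on the exceptional path, the call to may_throw
// is typically lowered to an `invoke` with an unwind destination that runs
// ~Guard(), rather than to a plain `call`.
int guarded_call(int x) {
    Guard g;
    return may_throw(x);
}

// Catches the exception. By the time the handler runs, unwinding has
// already destroyed the Guard created in guarded_call.
int safe_call(int x) {
    try {
        return guarded_call(x);
    } catch (const std::runtime_error&) {
        assert(live_guards == 0);
        return -1;
    }
}
```

The equivalent C code would have no destructor and no unwind path, so the same call site would normally be an ordinary `call`. An analysis written against C-derived bitcode therefore needs to handle `invoke` and its two successor blocks when it is run on C++-derived bitcode.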
Fei Jia
2011-Sep-23  00:49 UTC
[LLVMdev] question on difference of bitcode between C and C++
Thanks! Your suggestions are really helpful to me. I am verifying the differences and will provide my findings soon.

--
Best Regards,
Fei Jia