On Fri, Mar 30, 2012 at 15:22, Eli Friedman <eli.friedman at gmail.com> wrote:

> On Fri, Mar 30, 2012 at 12:12 PM, Sean Hunt <scshunt at csclub.uwaterloo.ca> wrote:
> > Why is it that high (>127) bytes in symbol names get mangled by LLVM into
> > _XX_, where XX is the hex representation of the character? Is this required
> > by ELF or some similar standard? This behavior is inconsistent with GCC.
>
> I think it's just so that we have a way to actually write out the
> symbol into the assembly file. What does gcc do?
>
> -Eli

It emits the high bits literally. The consequence is that UTF-8-encoded
identifiers come out in UTF-8:

scshunt at natural-flavours:~$ gcc -fextended-identifiers -std=c99 -x c -c -o test.o -
int i\u03bb;
scshunt at natural-flavours:~$ nm test.o
00000004 C iλ
scshunt at natural-flavours:~$

As you can see, the nm output includes the literal lambda.

Sean

On Fri, Mar 30, 2012 at 6:17 PM, Sean Hunt <scshunt at csclub.uwaterloo.ca> wrote:

> On Fri, Mar 30, 2012 at 15:22, Eli Friedman <eli.friedman at gmail.com> wrote:
>> On Fri, Mar 30, 2012 at 12:12 PM, Sean Hunt <scshunt at csclub.uwaterloo.ca> wrote:
>> > Why is it that high (>127) bytes in symbol names get mangled by LLVM into
>> > _XX_, where XX is the hex representation of the character? Is this required
>> > by ELF or some similar standard? This behavior is inconsistent with GCC.
>>
>> I think it's just so that we have a way to actually write out the
>> symbol into the assembly file. What does gcc do?
>>
>> -Eli
>
> It emits the high bits literally. The consequence is that UTF-8-encoded
> identifiers come out in UTF-8:
>
> scshunt at natural-flavours:~$ gcc -fextended-identifiers -std=c99 -x c -c -o test.o -
> int i\u03bb;
> scshunt at natural-flavours:~$ nm test.o
> 00000004 C iλ
> scshunt at natural-flavours:~$
>
> As you can see, the nm output includes the literal lambda.

Okay... then we should probably support that as well. Might need to
be a bit careful to make sure the assembly files work correctly.

-Eli

On Fri, Mar 30, 2012 at 21:22, Eli Friedman <eli.friedman at gmail.com> wrote:

> Okay... then we should probably support that as well. Might need to
> be a bit careful to make sure the assembly files work correctly.
>
> -Eli

You mean machine assembly and not IR, right?

Sean
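
To make the behavior under discussion concrete, here is a minimal Python sketch of the escaping scheme as it is described in the thread: each byte above 127 in the UTF-8 encoding of a name becomes _XX_, where XX is the byte's hex value. This is not LLVM's actual code. The function name mangle_high_bytes is hypothetical, the use of uppercase hex digits is an assumption, and the sketch handles only high bytes, leaving all other characters untouched.

```python
def mangle_high_bytes(name: str) -> str:
    """Escape every byte > 127 in the UTF-8 encoding of `name` as _XX_.

    Illustrative only: hypothetical helper, uppercase hex is an assumption.
    """
    out = []
    for byte in name.encode("utf-8"):
        if byte > 127:
            # High byte: write it as underscore, two hex digits, underscore.
            out.append("_%02X_" % byte)
        else:
            # Plain ASCII byte: pass through unchanged.
            out.append(chr(byte))
    return "".join(out)


# The identifier from the gcc example above: 'i' followed by U+03BB (lambda),
# whose UTF-8 encoding is the two bytes 0xCE 0xBB.
print(mangle_high_bytes("i\u03bb"))  # prints i_CE__BB_
```

Under these assumptions the symbol from the gcc example would appear as i_CE__BB_ rather than the literal iλ that nm shows for the gcc-built object, which is the inconsistency the thread is about.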