David Woodhouse
2015-Aug-05 14:23 UTC
[llvm-dev] [BUG] Incorrect ASCII escape characters on Mac
On Wed, 2015-08-05 at 10:02 -0400, Ramkumar Ramachandra wrote:
>
> -@5 = internal global [10 x i8] c"\22\D0\12\F4!\00\15\F9\EC\E1"
> -@6 = internal global [10 x i8] c"\D0\19\FB+\FD\F8#\03\E2\11"
> +@5 = internal global [10 x i8] c"\22Ð\12ô!\00\15ùìá"
> +@6 = internal global [10 x i8] c"Ð\19û+ýø#\03â\11"
>
> The diff is between Linux and Mac, where lines added are from Mac.
> Both the @5 character sequences represent:
>
> 34 208 18 244 33 0 21 249 236 225

Not in this century, they don't.

That Ð, for example, is U+00D0 LATIN CAPITAL LETTER ETH, which in any
21st century system should be represented by the UTF-8 bytes 195,144.

Your string "\22Ð\12ô!\00\15ùìá" is much more likely to be:

34 195 144 18 195 180 33 0 21 195 185 195 172 195 161

Your "Linux" version is encoding the bytes directly and not making
assumptions about character sets.

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse at intel.com                              Intel Corporation
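The byte arithmetic above can be checked with a short sketch. It assumes each of the ten original bytes is read as a Latin-1 (ISO-8859-1) code point and then written back out as UTF-8, which is what a text tool would typically do with the "Mac" form of the string. The helper names `latin1ToUtf8` and `checkUtf8Example` are hypothetical and are not part of LLVM.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: treat every input byte as a Latin-1 code point
// (U+0000..U+00FF) and emit its UTF-8 encoding.
std::vector<uint8_t> latin1ToUtf8(const std::vector<uint8_t> &bytes) {
  std::vector<uint8_t> out;
  for (uint8_t b : bytes) {
    if (b < 0x80) {
      out.push_back(b); // ASCII range: one byte, unchanged
    } else {
      // U+0080..U+00FF: two bytes, 110000xx 10xxxxxx
      out.push_back(static_cast<uint8_t>(0xC0 | (b >> 6)));
      out.push_back(static_cast<uint8_t>(0x80 | (b & 0x3F)));
    }
  }
  return out;
}

// Checks the numbers quoted in the thread: ten raw bytes become fifteen.
void checkUtf8Example() {
  const std::vector<uint8_t> raw = {34, 208, 18, 244, 33, 0, 21, 249, 236, 225};
  const std::vector<uint8_t> expected = {34,  195, 144, 18,  195, 180, 33, 0,
                                         21,  195, 185, 195, 172, 195, 161};
  assert(latin1ToUtf8(raw) == expected);
}
```

Calling `checkUtf8Example()` from a `main` exercises the case discussed here: 208 (0xD0) alone turns into the pair 195,144, so the printed "Mac" text no longer round-trips to the original ten bytes.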
Ramkumar Ramachandra
2015-Aug-05 20:49 UTC
[llvm-dev] [BUG] Incorrect ASCII escape characters on Mac
Thanks. Whose fault is it exactly, though? I can reproduce the issue with
this construction code:

    IRBuilder<> builder(getGlobalContext());
    auto M = new llvm::Module("main", getGlobalContext());
    std::vector<Type *> ATyAr;
    auto i8Ty = builder.getInt8Ty();
    auto i8ArTy = ArrayType::get(builder.getInt8Ty(), 10);
    auto pi8ArTy = PointerType::get(i8ArTy, 0);
    auto FTy = FunctionType::get(i8ArTy, ATyAr, false);
    auto Fcn = cast<Function>(M->getOrInsertFunction("testFcn", FTy));
    auto BB = BasicBlock::Create(builder.getContext(), "entry", Fcn);
    builder.SetInsertPoint(BB);

    std::vector<int> rawData = { 34, -48, 18, -12, 33, 0, 21, -7, -20, -31 };
    std::vector<Constant *> coercedData;
    for (auto el : rawData) {
      // The third argument is the isSigned flag, not a radix.
      coercedData.push_back(ConstantInt::get(i8Ty, el, /* isSigned = */ true));
    }
    auto V = ConstantArray::get(i8ArTy, ArrayRef<Constant *>(coercedData));
    builder.CreateRet(V);

    std::string scratchspace;
    raw_string_ostream scratch(scratchspace);
    auto aaw = new AssemblyAnnotationWriter;
    M->print(scratch, aaw);

It prints fine on Linux, but prints those weird characters on Mac. Is it
just the pretty-printer that is broken?

Thanks.

Ram
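The `rawData` values are signed, while the byte list quoted earlier in the thread (34 208 18 244 ...) is unsigned. A minimal sketch of the correspondence, assuming the usual 8-bit two's-complement view of an `i8`; the helper name `toBytes` is hypothetical and only illustrates the conversion:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: reinterpret each value as the 8-bit pattern an i8
// constant would hold, then read that pattern as an unsigned byte.
std::vector<uint8_t> toBytes(const std::vector<int> &values) {
  std::vector<uint8_t> out;
  out.reserve(values.size());
  for (int v : values) {
    assert(v >= -128 && v <= 255 && "value does not fit in 8 bits");
    out.push_back(static_cast<uint8_t>(v)); // modulo 256: -48 becomes 208
  }
  return out;
}
```

With the vector from the message, `toBytes({34, -48, 18, -12, 33, 0, 21, -7, -20, -31})` gives 34 208 18 244 33 0 21 249 236 225, so the construction code and the byte list describe the same array.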
Ramkumar Ramachandra via llvm-dev
2015-Aug-06 16:07 UTC
[llvm-dev] [BUG] Incorrect ASCII escape characters on Mac
*sigh* So, it turns out that it's just a bug in dump(), which doesn't
bother me enough. Sure enough, the i8 array itself is fine.

    IRBuilder<> Builder(getGlobalContext());
    auto M = new llvm::Module("main", getGlobalContext());
    std::vector<Type *> ATyAr;
    auto FTy = FunctionType::get(Builder.getVoidTy(), ATyAr, false);
    auto ExecutionHandle =
        cast<Function>(M->getOrInsertFunction("main", FTy));
    auto BB = BasicBlock::Create(Builder.getContext(), "entry",
                                 ExecutionHandle);
    Builder.SetInsertPoint(BB);

    std::vector<int> rawData = { 34, -48, 18, -12, 33, 0, 21, -7, -20, -31 };
    std::vector<Constant *> coercedData;
    auto i8Ty = Builder.getInt8Ty();
    for (auto el : rawData) {
      coercedData.push_back(ConstantInt::get(i8Ty, el, /* isSigned = */ true));
    }

    auto i8PtrTy = Builder.getInt8PtrTy();
    ArrayRef<Type *> ArgTys(i8PtrTy);
    FunctionType *PrintfTy =
        FunctionType::get(Builder.getInt32Ty(), ArgTys, /* IsVarArgs = */ true);
    auto PrintfHandle =
        dyn_cast<Function>(M->getOrInsertFunction("printf", PrintfTy));
    PrintfHandle->setCallingConv(CallingConv::C);
    auto FormatStringPtr = Builder.CreateGlobalStringPtr("%d ");

    auto i8ArTy = ArrayType::get(i8Ty, 10);
    auto StrConstant =
        ConstantArray::get(i8ArTy, ArrayRef<Constant *>(coercedData));
    auto GV = new GlobalVariable(*M, StrConstant->getType(), true,
                                 GlobalValue::PrivateLinkage, StrConstant);
    auto ConstantZero =
        ConstantInt::get(Type::getInt32Ty(Builder.getContext()), 0);
    for (auto i = 0; i < 10; i++) {
      std::vector<Value *> ThisElIdx {
        ConstantZero,
        ConstantInt::get(Type::getInt32Ty(Builder.getContext()), i)
      };
      auto FirstEl = Builder.CreateGEP(GV, ArrayRef<Value *>(ThisElIdx));
      auto LoadedV = Builder.CreateLoad(FirstEl);
      Builder.CreateCall2(PrintfHandle, FormatStringPtr, LoadedV);
    }
    Builder.CreateRet(nullptr);

    LLVMInitializeNativeTarget();
    LLVMInitializeNativeAsmPrinter();
    auto EE = EngineBuilder(M).create();
    assert(EE && "Error creating MCJIT with EngineBuilder");

    typedef int (*MainFTy)();
    union {
      uint64_t raw;
      MainFTy usable;
    } functionPointer;
    functionPointer.raw = (uint64_t)EE->getPointerToFunction(ExecutionHandle);
    assert(functionPointer.usable && "no main function found");

    testing::internal::CaptureStdout();
    functionPointer.usable();
    auto Captured = testing::internal::GetCapturedStdout();
    // check Captured
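For comparison with the two printed forms at the top of the thread, here is a sketch of an escaper that does not depend on the platform's character tables. It is not LLVM's printer code; `escapeBytes` and `checkEscapeExample` are hypothetical names, and the rule used (keep printable ASCII except `"` and `\`, write everything else as a backslash plus two hex digits) is an assumption chosen because it reproduces the "Linux" output quoted above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: escape a byte array without consulting the locale.
// Only bytes 0x20..0x7E other than '"' and '\\' are emitted literally.
std::string escapeBytes(const std::vector<uint8_t> &bytes) {
  static const char hex[] = "0123456789ABCDEF";
  std::string out;
  for (uint8_t b : bytes) {
    const bool printableAscii = b >= 0x20 && b <= 0x7E;
    if (printableAscii && b != '"' && b != '\\') {
      out.push_back(static_cast<char>(b));
    } else {
      out.push_back('\\');
      out.push_back(hex[b >> 4]);   // high nibble
      out.push_back(hex[b & 0x0F]); // low nibble
    }
  }
  return out;
}

// Checks the @5 array from the thread against its "Linux" spelling.
void checkEscapeExample() {
  const std::vector<uint8_t> raw = {34, 208, 18, 244, 33, 0, 21, 249, 236, 225};
  assert(escapeBytes(raw) == "\\22\\D0\\12\\F4!\\00\\15\\F9\\EC\\E1");
}
```

A printer that instead asks the C library whether a byte is printable could give different answers on different platforms for bytes above 0x7F. Whether that is what happens in dump() is not established in this thread, so treat it as one possible explanation only.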