David Woodhouse
2015-Aug-05 14:23 UTC
[llvm-dev] [BUG] Incorrect ASCII escape characters on Mac
On Wed, 2015-08-05 at 10:02 -0400, Ramkumar Ramachandra wrote:
>
> -@5 = internal global [10 x i8] c"\22\D0\12\F4!\00\15\F9\EC\E1"
> -@6 = internal global [10 x i8] c"\D0\19\FB+\FD\F8#\03\E2\11"
> +@5 = internal global [10 x i8] c"\22Ð\12ô!\00\15ùìá"
> +@6 = internal global [10 x i8] c"Ð\19û+ýø#\03â\11"
>
> The diff is between Linux and Mac, where lines added are from Mac.
> Both the @5 character sequences represent:
>
> 34 208 18 244 33 0 21 249 236 225

Not in this century, they don't. That Ð, for example, is U+00D0 LATIN
CAPITAL LETTER ETH, which in any 21st century system should be
represented by the UTF-8 bytes 195,144.

Your string "\22Ð\12ô!\00\15ùìá" is much more likely to be:

34 195 144 18 195 180 33 0 21 195 185 195 172 195 161

Your "Linux" version is encoding the bytes directly and not making
assumptions about character sets.

--
David Woodhouse                            Open Source Technology Centre
David.Woodhouse at intel.com                              Intel Corporation
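The byte counts in the reply above can be checked with a small sketch. It assumes the Mac output is read as text in which each non-ASCII character is a code point in the Latin-1 range (U+0080 to U+00FF) that is then stored as UTF-8. The names `appendUtf8` and `latin1ToUtf8` are hypothetical helpers for illustration, not part of LLVM.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append the UTF-8 encoding of one code point to `out`.
// Assumption: only code points up to U+07FF are handled (one- or two-byte
// sequences), which is enough for the Latin-1 range discussed in the thread.
void appendUtf8(uint32_t codePoint, std::vector<uint8_t> &out) {
  assert(codePoint <= 0x7FF && "sketch only handles 1- and 2-byte sequences");
  if (codePoint < 0x80) {
    // ASCII: the byte is its own encoding.
    out.push_back(static_cast<uint8_t>(codePoint));
  } else {
    // Two-byte form: 110xxxxx 10xxxxxx.
    out.push_back(static_cast<uint8_t>(0xC0 | (codePoint >> 6)));
    out.push_back(static_cast<uint8_t>(0x80 | (codePoint & 0x3F)));
  }
}

// Treat every input byte as a Latin-1 code point and encode it as UTF-8.
std::vector<uint8_t> latin1ToUtf8(const std::vector<uint8_t> &bytes) {
  std::vector<uint8_t> out;
  for (uint8_t b : bytes)
    appendUtf8(b, out);
  return out;
}
```

Passing the ten values 34 208 18 244 33 0 21 249 236 225 to `latin1ToUtf8` gives the fifteen bytes 34 195 144 18 195 180 33 0 21 195 185 195 172 195 161, which is the sequence quoted in the reply.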
Ramkumar Ramachandra
2015-Aug-05 20:49 UTC
[llvm-dev] [BUG] Incorrect ASCII escape characters on Mac
Thanks. Whose fault is it exactly though? I can reproduce the issue
with this construction code:
IRBuilder<> builder(getGlobalContext());
auto M = new llvm::Module("main", getGlobalContext());
std::vector<Type *> ATyAr;
auto i8Ty = builder.getInt8Ty();
auto i8ArTy = ArrayType::get(builder.getInt8Ty(), 10);
auto pi8ArTy = PointerType::get(i8ArTy, 0);
auto FTy = FunctionType::get(i8ArTy, ATyAr, false);
auto Fcn =
    cast<Function>(M->getOrInsertFunction("testFcn", FTy));
auto BB = BasicBlock::Create(builder.getContext(), "entry", Fcn);
builder.SetInsertPoint(BB);
std::vector<int> rawData = { 34 , -48 , 18 , -12 , 33 , 0 , 21 ,
                             -7 , -20 , -31 };
std::vector<Constant *> coercedData;
for (auto el : rawData) {
  // The third argument is the isSigned flag, not a radix.
  coercedData.push_back(ConstantInt::get(i8Ty, el, /* isSigned = */ true));
}
auto V = ConstantArray::get(i8ArTy, ArrayRef<Constant *>(coercedData));
builder.CreateRet(V);
std::string scratchspace;
raw_string_ostream scratch(scratchspace);
auto aaw = new AssemblyAnnotationWriter;
M->print(scratch, aaw);
It prints fine on Linux, but prints those weird characters on Mac.
Is it just the pretty-printer that is broken?
Thanks.
Ram
Ramkumar Ramachandra via llvm-dev
2015-Aug-06 16:07 UTC
[llvm-dev] [BUG] Incorrect ASCII escape characters on Mac
*sigh*
So, it turns out that it's just a bug in dump(), which doesn't bother
me much. Sure enough, the i8 array itself is fine:
IRBuilder<> Builder(getGlobalContext());
auto M = new llvm::Module("main", getGlobalContext());
std::vector<Type *> ATyAr;
auto FTy = FunctionType::get(Builder.getVoidTy(), ATyAr, false);
auto ExecutionHandle =
    cast<Function>(M->getOrInsertFunction("main", FTy));
auto BB = BasicBlock::Create(Builder.getContext(), "entry",
                             ExecutionHandle);
Builder.SetInsertPoint(BB);
std::vector<int> rawData = { 34 , -48 , 18 , -12 , 33 , 0 , 21 ,
                             -7 , -20 , -31 };
std::vector<Constant *> coercedData;
auto i8Ty = Builder.getInt8Ty();
for (auto el : rawData) {
  // The third argument is the isSigned flag, not a radix.
  coercedData.push_back(ConstantInt::get(i8Ty, el, /* isSigned = */ true));
}
auto i8PtrTy = Builder.getInt8PtrTy();
ArrayRef<Type *> ArgTys(i8PtrTy);
FunctionType *PrintfTy = FunctionType::get(Builder.getInt32Ty(),
                                           ArgTys, /* IsVarArgs = */ true);
auto PrintfHandle =
    dyn_cast<Function>(M->getOrInsertFunction("printf", PrintfTy));
PrintfHandle->setCallingConv(CallingConv::C);
auto FormatStringPtr = Builder.CreateGlobalStringPtr("%d ");
auto i8ArTy = ArrayType::get(i8Ty, 10);
auto StrConstant =
    ConstantArray::get(i8ArTy, ArrayRef<Constant *>(coercedData));
auto GV = new GlobalVariable(*M, StrConstant->getType(), true,
                             GlobalValue::PrivateLinkage, StrConstant);
auto ConstantZero =
    ConstantInt::get(Type::getInt32Ty(Builder.getContext()), 0);
for (auto i = 0; i < 10; i++) {
  std::vector<Value *> ThisElIdx { ConstantZero,
      ConstantInt::get(Type::getInt32Ty(Builder.getContext()), i) };
  auto FirstEl = Builder.CreateGEP(GV, ArrayRef<Value *>(ThisElIdx));
  auto LoadedV = Builder.CreateLoad(FirstEl);
  Builder.CreateCall2(PrintfHandle, FormatStringPtr, LoadedV);
}
Builder.CreateRet(nullptr);
LLVMInitializeNativeTarget();
LLVMInitializeNativeAsmPrinter();
auto EE = EngineBuilder(M).create();
assert(EE && "Error creating MCJIT with EngineBuilder");
typedef int (*MainFTy)();
union {
  uint64_t raw;
  MainFTy usable;
} functionPointer;
functionPointer.raw =
    (uint64_t)EE->getPointerToFunction(ExecutionHandle);
assert(functionPointer.usable && "no main function found");
testing::internal::CaptureStdout();
functionPointer.usable();
auto Captured = testing::internal::GetCapturedStdout();
// check Captured
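
The thread does not say why dump() differs between the two systems. One possible explanation, stated here only as an assumption, is a printable-character check that depends on the platform or locale. As a minimal sketch of the behaviour seen in the "Linux" output, the hypothetical helper `escapeBytes` below (not an LLVM function) escapes every byte outside printable ASCII, plus `"` and `\`, as a backslash followed by two hex digits, without consulting the locale.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Escape a sequence of byte values the way the "Linux" output in this thread
// looks: printable ASCII is copied through, everything else becomes \XX.
// Negative inputs such as -48 are reduced to their unsigned byte value (208).
std::string escapeBytes(const std::vector<int> &values) {
  static const char hexDigits[] = "0123456789ABCDEF";
  std::string out;
  for (int v : values) {
    assert(v >= -128 && v <= 255 && "value must fit in one byte");
    uint8_t byte = static_cast<uint8_t>(v);
    // Fixed ASCII range check; deliberately not isprint(), which may
    // depend on the current locale.
    bool printableAscii = byte >= 0x20 && byte <= 0x7E;
    if (printableAscii && byte != '"' && byte != '\\') {
      out.push_back(static_cast<char>(byte));
    } else {
      out.push_back('\\');
      out.push_back(hexDigits[byte >> 4]);
      out.push_back(hexDigits[byte & 0x0F]);
    }
  }
  return out;
}
```

Called with the rawData values 34, -48, 18, -12, 33, 0, 21, -7, -20, -31, this returns the text `\22\D0\12\F4!\00\15\F9\EC\E1`, which matches the @5 line on the Linux side of the diff.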