Hi!

Of course nobody wants to implement Unicode support for Windows, because
Windows should support a UTF-8 locale and Windows is obsolete anyway ;-)

But there is a simple solution: use boost::filesystem::path everywhere you
use file names and paths, for example in clang::FileManager::getFile.
With version 3, opening a file is easy: std::fstream file(path.c_str()).
Internally, boost::filesystem::path uses the native encoding, which is
UTF-16 on Windows, but you won't notice it since it recodes 8-bit strings
automatically (which is a no-op on Unix and Mac OS).

If you don't want to become dependent on Boost, I suggest reimplementing
the most important features, always using 8-bit strings, and then having
something like this:

#ifdef HAVE_BOOST
namespace fs = boost::filesystem;
#else
// simple implementation here
#endif

-Jochen
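A minimal sketch of what the "#ifdef HAVE_BOOST" approach above might look like in practice. The fallback class and the leaf_name helper are hypothetical illustrations, not code from LLVM or Boost; the fallback mirrors only a handful of boost::filesystem::path member names, stores plain 8-bit strings, and assumes '/' as the only separator.

```cpp
#include <cassert>
#include <string>

#ifdef HAVE_BOOST
#include <boost/filesystem.hpp>
namespace fs = boost::filesystem;
#else
// Hypothetical fallback: a tiny stand-in that keeps paths as 8-bit strings
// and copies a few member names from boost::filesystem::path.
namespace fs {
class path {
public:
    path() = default;
    path(const char* s) : value_(s) {}
    path(const std::string& s) : value_(s) {}

    const char* c_str() const { return value_.c_str(); }
    const std::string& string() const { return value_; }
    bool empty() const { return value_.empty(); }

    // Append a component, inserting '/' when the left side lacks one.
    path& operator/=(const path& rhs) {
        if (!value_.empty() && value_.back() != '/') value_ += '/';
        value_ += rhs.value_;
        return *this;
    }

    // Last component of the path (everything after the final '/').
    path filename() const {
        std::string::size_type pos = value_.find_last_of('/');
        return pos == std::string::npos ? *this : path(value_.substr(pos + 1));
    }

private:
    std::string value_;
};

inline path operator/(path lhs, const path& rhs) {
    lhs /= rhs;
    return lhs;
}
}  // namespace fs
#endif

// Hypothetical caller: uses only the subset of the interface that both
// branches provide, so it compiles with or without HAVE_BOOST.
std::string leaf_name(const fs::path& dir, const std::string& file) {
    assert(!file.empty());
    return (dir / file).filename().string();
}
```

Callers written against this shared subset do not need to know which branch was compiled in. Anything beyond the subset (directory iteration, locale handling) would have to be added to the fallback by hand.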
On Nov 25, 2010, at 5:01 PM, Jochen Wilhelmy <j.wilhelmy at arcor.de> wrote:
> But there is a simple solution: use boost::filesystem::path everywhere you
> use file names and paths, for example in clang::FileManager::getFile.
> [...]

This happens to be very close to the code I'm working on now (I assume
this post was prompted by my patches). I'll be adding Unicode support to
the Windows implementation; however, paths will remain UTF-8 encoded
outside of System.

I really wish UCS-2 never existed ;/.

- Michael Spencer
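To illustrate the idea of keeping paths UTF-8 everywhere and converting only at the system boundary, here is a rough, portable sketch of the UTF-8 to UTF-16 step. The function name utf8_to_utf16 is hypothetical and this is not the code from the patches mentioned above; a real Windows implementation would more likely call the Win32 conversion routines. The sketch rejects malformed sequences but, as an acknowledged simplification, does not reject overlong encodings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: decode a UTF-8 byte string and re-encode it as UTF-16.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80) {
            cp = lead;
        } else if ((lead & 0xE0) == 0xC0) {
            cp = lead & 0x1F; extra = 1;
        } else if ((lead & 0xF0) == 0xE0) {
            cp = lead & 0x0F; extra = 2;
        } else if ((lead & 0xF8) == 0xF0) {
            cp = lead & 0x07; extra = 3;
        } else {
            throw std::invalid_argument("invalid UTF-8 lead byte");
        }
        if (in.size() - i <= extra)
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
            throw std::invalid_argument("code point not encodable in UTF-16");
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Guaranteed by the range check above.
            assert(cp <= 0x10FFFF);
            cp -= 0x10000;  // split the remaining 20 bits into a surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

With a helper of this kind, only the platform layer ever sees UTF-16; the rest of the code base keeps passing ordinary 8-bit strings around.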
On 25.11.2010 23:56, Michael Spencer wrote:
> This happens to be very close to the code I'm working on now (I assume
> this post was prompted by my patches). I'll be adding unicode support
> to the Windows implementation, however, paths will remain utf-8
> encoded outside of System.

No, this post was prompted by my switching to boost::filesystem version 3
in my own code; llvm/clang 2.8 was the only lib with no Unicode support
on Windows.

Will your code be API-compatible with boost::filesystem? I ask because
boost::filesystem may become part of the standard, and it is possible to
imbue() a locale on boost::filesystem. While this feature is not needed
on Unix/Mac OS, it gives you global control over whether you want to use
ANSI or Unicode on Windows. If you implement your own code that always
uses UTF-8, this may break compatibility with the Windows ANSI encoding
if you don't take care. And why reinvent the wheel?
Maybe you could even copy/paste the Boost implementation and use the
#ifdef HAVE_BOOST approach.

-Jochen