thr3ads.net - llvm dev - [LLVMdev] [cfe-dev] Unicode path handling on Windows [Oct 2011]

Bryce Cogswell

2011-Oct-04 06:24 UTC

[LLVMdev] [cfe-dev] Unicode path handling on Windows

That should be fine. I don't believe the concern about performing a
char-by-char conversion is valid; for example the NTFS-3G driver uses a
simplistic upcase table and seems to work fine. I suspect Windows does the same.
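
To make the upcase-table idea concrete, here is a minimal sketch of comparing two UTF-16 names one code unit at a time through a lookup table. The names `makeAsciiUpcaseTable` and `equalIgnoringCase` are hypothetical, and the table folds only ASCII a-z as a stand-in; a real NTFS table is read from the volume and covers far more characters. This is not NTFS-3G's or Windows' actual code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an NTFS-style upcase table: one entry per
// UTF-16 code unit. Only ASCII a-z is folded here (an assumption made for
// illustration); every other code unit maps to itself.
inline std::vector<char16_t> makeAsciiUpcaseTable() {
  std::vector<char16_t> table(65536);
  for (std::size_t i = 0; i != table.size(); ++i)
    table[i] = static_cast<char16_t>(i);
  for (char16_t c = u'a'; c <= u'z'; ++c)
    table[c] = static_cast<char16_t>(c - u'a' + u'A');
  return table;
}

// Compare two names code unit by code unit after mapping each unit
// through the table. No locale and no multi-unit sequences are involved.
inline bool equalIgnoringCase(const std::u16string &a,
                              const std::u16string &b,
                              const std::vector<char16_t> &table) {
  if (a.size() != b.size())
    return false;
  for (std::size_t i = 0; i != a.size(); ++i)
    if (table[a[i]] != table[b[i]])
      return false;
  return true;
}
```

Because every code unit is mapped independently, the comparison cannot change the length of a name, which is why a char-by-char conversion is sufficient for this kind of matching.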


On Oct 3, 2011, at 1:12 PM, Nikola Smiljanic wrote:
> How about this:
> 
> for (int i = 0; i != NumWChars; ++i)
>         absPath[i] = std::tolower(absPath[i], std::locale());
> 
> seems to be working just fine?
> 
> On Mon, Oct 3, 2011 at 9:27 PM, Bryce Cogswell <bryceco at gmail.com> wrote:
> Right, but maybe if you switch to using tolower_l() and pass an appropriate
> locale you can get it to work the same way. I'm not sure what locale that
> would have to be, but it needs to match whatever NTFS uses for its $upcase file.
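
For reference, the loop quoted above can be wrapped into a self-contained helper. This is only a sketch: the function name `lowerPath` is hypothetical, and which locale matches the NTFS upcase table is left open here, just as it is in the thread. The default argument mirrors the `std::locale()` used in the quoted loop.

```cpp
#include <locale>
#include <string>

// Lowercase a wide-character path one character at a time using the
// two-argument std::tolower from <locale>. The locale is a parameter
// because the thread does not settle which one matches NTFS.
inline std::wstring lowerPath(std::wstring path,
                              const std::locale &loc = std::locale()) {
  for (wchar_t &ch : path)
    ch = std::tolower(ch, loc);
  return path;
}
```

Calling `lowerPath(L"C:\\Foo\\BAR.H", std::locale::classic())` folds the ASCII letters and leaves separators and punctuation untouched.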

Nikola Smiljanic

2011-Oct-04 09:19 UTC

[LLVMdev] [cfe-dev] Unicode path handling on Windows

In that case I think that this is it :)

On Tue, Oct 4, 2011 at 8:24 AM, Bryce Cogswell <bryceco at gmail.com> wrote:
> That should be fine. I don't believe the concern about performing a
> char-by-char conversion is valid; for example the NTFS-3G driver uses a
> simplistic upcase table and seems to work fine. I suspect Windows does the
> same.
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: clang.patch
Type: application/octet-stream
Size: 15811 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20111004/2af4c8db/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm.patch
Type: application/octet-stream
Size: 4685 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20111004/2af4c8db/attachment-0001.obj>

Aaron Ballman

2011-Oct-04 12:25 UTC

[LLVMdev] [cfe-dev] Unicode path handling on Windows

On Tue, Oct 4, 2011 at 4:19 AM, Nikola Smiljanic <popizdeh at gmail.com> wrote:
> In that case I think that this is it :)
>
> On Tue, Oct 4, 2011 at 8:24 AM, Bryce Cogswell <bryceco at gmail.com> wrote:
>>
>> That should be fine. I don't believe the concern about performing a
>> char-by-char conversion is valid; for example the NTFS-3G driver uses a
>> simplistic upcase table and seems to work fine. I suspect Windows does the
>> same.

Index: include/llvm/Support/FileSystem.h
===================================================================
--- include/llvm/Support/FileSystem.h	(revision 141071)
+++ include/llvm/Support/FileSystem.h	(working copy)
@@ -436,6 +436,32 @@

+  for (int i = 0; i != argc; ++i)
+  {
+    // check lenght

You may want to correct the typo ("lenght" should be "length").

Index: lib/Basic/ConvertUTF.c
===================================================================
--- lib/Basic/ConvertUTF.c	(revision 141071)
+++ lib/Basic/ConvertUTF.c	(working copy)
@@ -218,74 +218,7 @@

+      /* Figure out how many bytes the result will require */
+      if (ch < (UTF32)0x80) {      bytesToWrite = 1;
+      } else if (ch < (UTF32)0x800) {     bytesToWrite = 2;
+      } else if (ch < (UTF32)0x10000) {   bytesToWrite = 3;
+      } else if (ch < (UTF32)0x110000) {  bytesToWrite = 4;
+      } else {                            bytesToWrite = 3;
+      ch = UNI_REPLACEMENT_CHAR;
+      }

I think this should be formatted more like this to meet our coding
standards (but am not 100% sure, so I will defer to others):

if (ch < (UTF32)0x80)
  bytesToWrite = 1;
else if (ch < (UTF32)0x800)
  bytesToWrite = 2;
...
else {
  bytesToWrite = 3;
  ch = UNI_REPLACEMENT_CHAR;
}

Those are the only minor nitpicks, well done!

~Aaron
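
As an illustration of the byte-count logic under review, here is a minimal, self-contained sketch of turning one UTF-32 code point into UTF-8. The names `utf8BytesFor`, `encodeUtf8` and `kReplacementChar` are hypothetical and are not taken from ConvertUTF.c; the thresholds and the U+FFFD fallback follow the quoted patch. Surrogate code points are not rejected here, which is an assumption made to keep the example short.

```cpp
#include <string>

// U+FFFD, the Unicode replacement character (three bytes in UTF-8).
constexpr char32_t kReplacementChar = 0xFFFD;

// Return how many UTF-8 bytes the code point needs. Values at or above
// 0x110000 are outside Unicode, so they are replaced with U+FFFD.
inline unsigned utf8BytesFor(char32_t &ch) {
  if (ch < 0x80)
    return 1;
  if (ch < 0x800)
    return 2;
  if (ch < 0x10000)
    return 3;
  if (ch < 0x110000)
    return 4;
  ch = kReplacementChar;
  return 3;
}

// Encode one code point as UTF-8, filling continuation bytes from the
// end of the sequence and writing the lead byte last.
inline std::string encodeUtf8(char32_t ch) {
  static const unsigned char firstByteMark[] = {0x00, 0x00, 0xC0, 0xE0, 0xF0};
  const unsigned n = utf8BytesFor(ch);
  std::string out(n, '\0');
  for (unsigned i = n - 1; i > 0; --i) {
    out[i] = static_cast<char>((ch & 0x3F) | 0x80); // 10xxxxxx
    ch >>= 6;
  }
  out[0] = static_cast<char>(ch | firstByteMark[n]);
  return out;
}
```

Using early returns keeps each threshold on its own line, in the same spirit as the unbraced `if`/`else if` layout suggested above.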
