Hi,

I've got a preliminary version of the pure Ruby version of win32-dir in CVS. However, I was hoping to work out the Unicode issue. Run this:

    from = "C:\\test"
    to = "Ελλάσ"
    Dir.mkdir(from) unless File.exists?(from)
    Dir.create_junction(to, from)

It works, but my Explorer (and DOS) window shows the name garbled. I don't think it's a font encoding issue within Explorer, since I can create a folder called "Ελλάσ" manually within Explorer and it looks correct.

Any ideas?

Dan
Hi,

2006/5/26, Daniel Berger <djberg96 at gmail.com>:
> I've got a preliminary version of the pure Ruby version of win32-dir in
> CVS. However, I was hoping to work out the Unicode issue. Run this:
>
> from = "C:\\test"
> to = "Ελλάσ"
> Dir.mkdir(from) unless File.exists?(from)
> Dir.create_junction(to, from)
>
> It works, but my explorer (and dos) window shows the name garbled. I
> don't think it's a font encoding issue within Explorer, since I can
> create a folder called "Ελλάσ" manually within Explorer and it looks
> correct.

Can you check whether the folder name is Unicode or ANSI?
How about modifying the create_junction method like this:

    def self.create_junction(to, from, unicode = false)
      # Normalize the paths
      to.tr!('/', "\\")
      from.tr!('/', "\\")

      to_path    = 0.chr * 260
      from_path  = 0.chr * 260
      buf_target = 0.chr * 260

      if GetFullPathName(from, from_path.size, from_path, 0) == 0
        raise StandardError, 'GetFullPathName() failed: ' + get_last_error
      end

      if unicode
        if GetFullPathNameW(to, to_path.size, to_path, 0) == 0
          raise StandardError, 'GetFullPathName() failed: ' + get_last_error
        end
      else
        if GetFullPathName(to, to_path.size, to_path, 0) == 0
          raise StandardError, 'GetFullPathName() failed: ' + get_last_error
        end
      end

      to_path   = to_path.split("\0\0").first
      from_path = from_path.split(0.chr).first

      # You can create a junction to a directory that already exists, so
      # long as it's empty.
      if unicode
        rv = CreateDirectoryW(to_path, 0)
        if rv == 0 && rv != ERROR_ALREADY_EXISTS
          raise StandardError, 'CreateDirectory() failed: ' + get_last_error
        end

        handle = CreateFileW(
          to_path,
          GENERIC_READ | GENERIC_WRITE,
          0,
          0,
          OPEN_EXISTING,
          FILE_FLAG_OPEN_REPARSE_POINT | FILE_FLAG_BACKUP_SEMANTICS,
          0
        )
      else
        rv = CreateDirectory(to_path, 0)
        if rv == 0 && rv != ERROR_ALREADY_EXISTS
          raise StandardError, 'CreateDirectory() failed: ' + get_last_error
        end

        handle = CreateFile(
          to_path,
          GENERIC_READ | GENERIC_WRITE,
          0,
          0,
          OPEN_EXISTING,
          FILE_FLAG_OPEN_REPARSE_POINT | FILE_FLAG_BACKUP_SEMANTICS,
          0
        )
      end
      ...
    end

And call it like this:

    from = "C:\\test"
    Dir.mkdir(from) unless File.exists?(from)

    # unicode folder name
    to = "\x95\x03\xBB\x03\xBB\x03\xAC\x03\xC3\x03\0\0" # "Ελλάσ"
    Dir.create_junction(to, from, true)

    # ansi folder name
    to = "ansi"
    Dir.create_junction(to, from, false)

Regards,

Park Heesob
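The escaped byte string in the call above is the UTF-16LE form of the folder name: two bytes per character, followed by a two-byte terminator. As a quick check, here is a minimal sketch that assumes a Ruby with String#encode (1.9 or later, which postdates this thread). The helper name wide_name is made up for illustration and is not part of win32-dir.

```ruby
# Hypothetical helper (not part of win32-dir): encode a name as UTF-16LE and
# append the two-byte NUL terminator that the W functions expect.
def wide_name(name)
  name.encode("UTF-16LE").b + "\0\0".b
end

# The folder name from the thread, written with \u escapes so the encoding
# of the source file does not matter.
greek = "\u0395\u03BB\u03BB\u03AC\u03C3"

# Print the bytes in the same \xNN notation used in the message above.
puts wide_name(greek).bytes.map { |b| format("\\x%02X", b) }.join
```

The printed bytes should match the string passed to create_junction in the message, with the terminator shown as \x00\x00.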
Heesob Park wrote:
<snip>
> How about modifying the create_junction method like this:
>
> def self.create_junction(to, from, unicode = false)
<snip>
> And call it like this:
>
> # unicode folder name
> to = "\x95\x03\xBB\x03\xBB\x03\xAC\x03\xC3\x03\0\0" # "Ελλάσ"
> Dir.create_junction(to, from, true)
<snip>

Hi Heesob,

That came through as a single line for some reason...

Anyway, I modified the approach somewhat. Instead of adding a 'unicode' flag, I altered the windows-pr package to wrap methods like this:

    def CreateDirectory(path, attributes)
      if $KCODE != 'NONE'
        CreateDirectoryW.call(path, attributes) != 0
      else
        CreateDirectory.call(path, attributes) != 0
      end
    end

Within dir.rb, I did this:

    if $KCODE != 'NONE'
      to_path = to_path.split("\0\0").first
    else
      to_path = to_path.split(0.chr).first
    end

However, after making those changes I got this:

    dir.rb:123:in `create_junction': DeviceIoControl() failed: The data
    present in the reparse point buffer is invalid. (RuntimeError)

Does rdb need to be packed differently?

BTW, I also tinkered with the idea of using IsTextUnicode():

    def CreateDirectory(path, attributes)
      if IsTextUnicode(path, path.size, 0)
        CreateDirectoryW.call(path, attributes) != 0
      else
        CreateDirectory.call(path, attributes) != 0
      end
    end

But this didn't seem to handle UCS-2, i.e. it worked for "\x95\x03\xBB\x03\xBB\x03\xAC\x03\xC3\x03\0\0" but not "Ελλάσ". I imagine it's also slower.

Regards,

Dan
Hi,

2006/5/27, Daniel Berger <djberg96 at gmail.com>:
<snip>
> However, after making those changes I got this:
>
> dir.rb:123:in `create_junction': DeviceIoControl() failed: The data
> present in the reparse point buffer is invalid. (RuntimeError)
>
> Does rdb need to be packed differently?
<snip>

I have managed to create a junction with the create_junction method.

First, modify multi_to_wide and wide_to_multi to support both UTF-8 and ANSI strings. For English users UTF-8 encoding is sufficient, but users of other languages such as Korean need both UTF-8 and ANSI.

    def multi_to_wide(str)
      cp = ($KCODE == 'UTF8') ? CP_UTF8 : CP_ACP
      buf = 0.chr * 260
      int = MultiByteToWideChar(cp, 0, str, -1, buf, buf.size)

      if int > 0
        buf[0, int*2]
      else
        str
      end
    end

    def wide_to_multi(str)
      cp = ($KCODE == 'UTF8') ? CP_UTF8 : CP_ACP
      buf = 0.chr * 260
      int = WideCharToMultiByte(cp, 0, str, -1, buf, buf.size, 0, 0)

      if int > 0
        buf[0, int]
      else
        str
      end
    end

Second, modify the create_junction method like this:

    def self.create_junction(to, from)
      # Normalize the paths
      to.tr!('/', "\\")
      from.tr!('/', "\\")

      to_path    = 0.chr * 260
      from_path  = 0.chr * 260
      buf_target = 0.chr * 260

      from = multi_to_wide(from)
      to   = multi_to_wide(to)

      if GetFullPathNameW.call(from, from_path.size, from_path, 0) == 0
        raise StandardError, 'GetFullPathName() failed: ' + get_last_error
      end

      if GetFullPathNameW.call(to, to_path.size, to_path, 0) == 0
        raise StandardError, 'GetFullPathName() failed: ' + get_last_error
      end

      to_path   = to_path.strip + "\0\0\0"
      from_path = from_path.strip + "\0\0\0"

      # You can create a junction to a directory that already exists, so
      # long as it's empty.
      rv = CreateDirectoryW.call(to_path, 0)
      if rv == 0 && GetLastError.call != ERROR_ALREADY_EXISTS
        raise StandardError, 'CreateDirectory() failed: ' + get_last_error
      end

      handle = CreateFileW.call(
        to_path,
        GENERIC_READ | GENERIC_WRITE,
        0,
        0,
        OPEN_EXISTING,
        FILE_FLAG_OPEN_REPARSE_POINT | FILE_FLAG_BACKUP_SEMANTICS,
        0
      )

      if handle == INVALID_HANDLE_VALUE
        raise StandardError, 'CreateFile() failed: ' + get_last_error
      end

      buf_target = multi_to_wide("\\??\\")
      length = buf_target.size-2
      buf_target = buf_target[0,length] + from_path
      length = buf_target.size-2
      wide_string = buf_target

      # REPARSE_JDATA_BUFFER
      rdb = [
        "0xA0000003L".hex, # ReparseTag (IO_REPARSE_TAG_MOUNT_POINT)
        length + 12,       # ReparseDataLength
        0,                 # Reserved
        0,                 # SubstituteNameOffset
        length,            # SubstituteNameLength
        length + 2,        # PrintNameOffset
        0,                 # PrintNameLength
        wide_string        # PathBuffer
      ].pack('LSSSSSSa' + (length + 4).to_s)

      bytes = [length].pack('L')

      bool = DeviceIoControl(
        handle,
        CTL_CODE(FILE_DEVICE_FILE_SYSTEM, 41, METHOD_BUFFERED, FILE_ANY_ACCESS),
        rdb,
        rdb.size,
        0,
        0,
        bytes,
        0
      )

      unless bool
        error = 'DeviceIoControl() failed: ' + get_last_error
        RemoveDirectoryW.call(to_path)
        CloseHandle(handle)
        raise error
      end

      CloseHandle(handle)

      self
    end

Make test.rb like this:

    from = "C:\\test"
    to = "Ελλάσ"
    Dir.mkdir(from) unless File.exists?(from)
    Dir.create_junction(to, from)

The file must be saved as UTF-8 in DOS format.

Run it with the KCODE option like this:

    ruby -Ku test.rb

Regards,

Park Heesob
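The buffer arithmetic in create_junction (length + 12, length + 2, and the control code) can be checked without calling any Windows API. The following is a sketch only: the helper names ctl_code and mount_point_buffer are hypothetical, it assumes a Ruby with String#encode, and it assumes FILE_DEVICE_FILE_SYSTEM is 9 while METHOD_BUFFERED and FILE_ANY_ACCESS are 0, as in the Windows headers.

```ruby
# CTL_CODE macro from the Windows headers, written out as plain arithmetic.
def ctl_code(device_type, function, transfer_method, access)
  (device_type << 16) | (access << 14) | (function << 2) | transfer_method
end

# Build a mount-point reparse buffer for +target+ (hypothetical helper name).
def mount_point_buffer(target)
  subst = ("\\??\\" + target).encode("UTF-16LE").b # substitute name, no terminator
  len   = subst.bytesize
  header = [
    0xA0000003, # ReparseTag (IO_REPARSE_TAG_MOUNT_POINT)
    len + 12,   # ReparseDataLength: 8 bytes of offsets/lengths + name + 2 terminators
    0,          # Reserved
    0,          # SubstituteNameOffset
    len,        # SubstituteNameLength (bytes, terminator excluded)
    len + 2,    # PrintNameOffset (just past the substitute name's terminator)
    0           # PrintNameLength (empty print name)
  ].pack("Vv6")
  # PathBuffer: substitute name + NUL, then empty print name + NUL.
  header + subst + "\0\0\0\0".b
end

buf = mount_point_buffer("C:\\test")
puts buf.bytesize
puts format("0x%08X", ctl_code(9, 41, 0, 0))
```

For "C:\\test" the substitute name is 11 characters (22 bytes), so the whole buffer is 42 bytes and ReparseDataLength is 34. The control code comes out as 0x000900A4.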
Heesob Park wrote:
<snip>
> I have managed to create a junction with the create_junction method.
<snip>
> The file must be saved as UTF-8 in DOS format.
>
> Run it with the KCODE option like this:
>
> ruby -Ku test.rb

Thanks Heesob, but I'm getting some weird segfaults with wide character functions and buffers over 245 characters. I posted about this to ruby-talk as well. Here's some sample code that demonstrates the problem:

    require 'Win32API'

    GetFullPathNameW = Win32API.new('kernel32', 'GetFullPathNameW', 'PLPP', 'L')

    path = "C:\\test"
    buf = 0.chr * 260 # 245 or less works ok

    if GetFullPathNameW.call(path, buf.size, buf, 0) == 0
      puts "Failed"
      exit
    end

    p buf.split("\0\0").first # BOOM!

I'm not sure what the significance of 245 or less is. I can inspect 'buf', copy and paste it to a separate editor as a string and run ops on it with no problem, so I'm very curious as to what's making Ruby segfault.

Anyway, I think I'm not going to worry about Unicode support for the initial release of the pure Ruby version for now.

Thanks,

Dan
Hi,

2006/5/28, Daniel Berger <djberg96 at gmail.com>:
<snip>
> Thanks Heesob, but I'm getting some weird segfaults with wide character
> functions and buffers over 245 characters.
<snip>
> I'm not sure what the significance of 245 or less is. I can inspect
> 'buf', copy and paste it to a separate editor as a string and run ops on
> it with no problem, so I'm very curious as to what's making Ruby segfault.

I'm very curious too.

Your sample code doesn't segfault for me. But I ran into segfaults several times while modifying the create_junction method. The behaviour is very unstable: when I insert a call to p, it sometimes runs OK. The crash was mainly in the split method or the get_last_error method.

I guess it's not related to 245 or 260; it looks like an underlying C pointer memory access failure.
Can you give me sample code that segfaults reliably?

Regards,

Park Heesob
Heesob Park wrote:
<snip>
> I guess it's not related to 245 or 260; it looks like an underlying C
> pointer memory access failure.
> Can you give me sample code that segfaults reliably?

To add even more mystery to this problem, that code isn't segfaulting for me at the moment, though it was regularly last night (I still have the console window open that shows the segfaults to prove it to myself).

I saw the same behavior you mentioned - inserting a 'puts' would sometimes cause code that was previously segfaulting to suddenly work.

Well, let's put the Unicode stuff on hold for now. Perhaps deep inspection of string.c some day will reveal what the potential problem is.

Many thanks,

Dan
Hi,

2006/5/28, Daniel Berger <djberg96 at gmail.com>:
<snip>
> Well, let's put the Unicode stuff on hold for now. Perhaps deep
> inspection of string.c some day will reveal what the potential problem is.

I have found out what the problem is.
It's not a bug in Ruby or Windows; it's just a bug in the code.

First try this:

    require 'Win32API'
    GetFullPathNameW = Win32API.new('kernel32','GetFullPathNameW','PLPP', 'L')
    for i in 1..100
      path = "c:\\test"
      buf = 0.chr * 260
      if GetFullPathNameW.call(path, buf.size, buf, 0) == 0
        puts "Failed"
      end
      p buf.split("\0\0").first
    end

It will cause various errors, such as
uninitialized constant GetFullPathNameW (NameError)
or a segfault.

Next, try this:

    require 'Win32API'
    GetFullPathNameW = Win32API.new('kernel32','GetFullPathNameW','PLPP', 'L')
    for i in 1..100
      path = "c:\\test"
      buf = 0.chr * 260
      # buf.size/2 -> actual length of buf
      if GetFullPathNameW.call(path, buf.size/2, buf, 0) == 0
        puts "Failed"
      end
      p buf.split("\0\0").first
    end

It runs OK, but the result is not correct.

Next, try this:

    require 'Win32API'
    GetFullPathNameW = Win32API.new('kernel32','GetFullPathNameW','PLPP', 'L')
    for i in 1..100
      # append \0 to path
      path = "c:\\test\0"
      buf = 0.chr * 260
      # buf.size/2 -> actual length of buf in unicode string
      if GetFullPathNameW.call(path, buf.size/2, buf, 0) == 0
        puts "Failed"
      end
      p buf.split("\0\0").first
    end

It runs OK. The result is correct.

Finally, the complete and correct code looks like this:

    require 'Win32API'
    GetFullPathNameW = Win32API.new('kernel32','GetFullPathNameW','PLPP', 'L')
    for i in 1..100
      path = "c\0:\0\\\0t\0e\0s\0t\0\0"
      buf = 0.chr * 260
      # buf.size/2 -> actual length of buf in unicode string
      if GetFullPathNameW.call(path, buf.size/2, buf, 0) == 0
        puts "Failed"
      end
      buf = buf.split("\0\0").first
      buf = (buf.size % 2).zero? ? buf : buf+"\0"
      p buf
    end

Remember, a Ruby string is implicitly terminated with a single "\0", but a UTF-16 string requires a double "\0". For ASCII characters this means three trailing "\0" bytes: one for the high byte of the last ASCII character and two for the string terminator.

Regards,

Park Heesob
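The odd-length fix-up after split("\0\0") in the last example is needed because the split can match across a character boundary: the high byte of the final ASCII character plus the first byte of the terminator. A sketch of an aligned alternative, assuming a Ruby with String#encode and using a simulated buffer instead of a real API call (wide_cstr is a hypothetical helper name):

```ruby
# Hypothetical helper: return the text before the first UTF-16 NUL code unit.
# Scanning 16-bit units keeps the match aligned, unlike split("\0\0").
def wide_cstr(buf)
  units = buf.unpack("v*")              # 16-bit little-endian code units
  len   = units.index(0) || units.size  # stop at the first NUL unit
  units.first(len).pack("v*").force_encoding("UTF-16LE")
end

# Simulate what a W function leaves behind: UTF-16LE text, then NUL padding.
buf = "c:\\test".encode("UTF-16LE").b + ("\0".b * 20)

naive = buf.split("\0\0".b).first
puts naive.bytesize                  # odd: the high byte of "t" was eaten
puts wide_cstr(buf).encode("UTF-8")

# The size passed to a W function counts characters, not bytes.
puts buf.bytesize / 2
```

Here the naive split returns 13 bytes for a 7-character path, while the aligned scan returns all 14. The last line is the same buf.size/2 adjustment made in the examples above.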
Heesob Park wrote:
<snip>
> I have found out what the problem is.
> It's not a bug in Ruby or Windows; it's just a bug in the code.
<snip>
> Remember, a Ruby string is implicitly terminated with a single "\0", but
> a UTF-16 string requires a double "\0".
<snip>

While I understand why this code works, I'm still not entirely clear why the previous code would cause the interpreter to segfault. Bad pointer address?

In any case, excellent work, thank you!

Now I'm trying to work out a general approach for the windows-pr stuff. Given a method like this:

    def GetFullPathName(file, buf, buf_size, part)
      if $KCODE != 'NONE'
        GetFullPathNameW.call(file, buf, buf_size, part)
      else
        GetFullPathName.call(file, buf, buf_size, part)
      end
    end

Should I modify it to try to do a best-guess?

    if $KCODE != 'NONE'
      GetFullPathNameW.call(file, buf, buf_size/2, part)
    end

Or do you think that's the user's job?

Thanks,

Dan
Hi,

2006/5/29, Daniel Berger <djberg96 at gmail.com>:
<snip>
> While I understand why this code works, I'm still not entirely clear why
> the previous code would cause the interpreter to segfault. Bad pointer
> address?

Yes. A Ruby string is not just a character array; it is actually a structure, and a corrupted structure causes unexpected behaviour.

> In any case, excellent work, thank you!

You are welcome.

> Now I'm trying to work out a general approach for the windows-pr stuff.
> Given a method like this:
>
> def GetFullPathName(file, buf, buf_size, part)
>   if $KCODE != 'NONE'
>     GetFullPathNameW.call(file, buf, buf_size, part)
>   else
>     GetFullPathName.call(file, buf, buf_size, part)
>   end
> end
>
> Should I modify it to try to do a best-guess?
>
> if $KCODE != 'NONE'
>   GetFullPathNameW.call(file, buf, buf_size/2, part)
> end

Be careful: before calling GetFullPathNameW, "file" must already be a UTF-16 string.
I recommend that every call use the W function, and if the string is not UTF-16, convert it to UTF-16 first.
$KCODE should be used only to determine whether the string is UTF-8, not to decide whether to call the ANSI function or the W function.
Because the Ruby interpreter cannot handle a UTF-16 source file, the user has little chance of using a UTF-16 string literal in real-world Ruby code.

Sample code looks like this:

    from = multi_to_wide(from)
    to   = multi_to_wide(to)

    # Sizes are in wide characters, hence size/2.
    if GetFullPathNameW.call(from, from_path.size/2, from_path, 0) == 0
      raise StandardError, 'GetFullPathName() failed: ' + get_last_error
    end

    if GetFullPathNameW.call(to, to_path.size/2, to_path, 0) == 0
      raise StandardError, 'GetFullPathName() failed: ' + get_last_error
    end

> Or do you think that's the user's job?

I think the user wouldn't care whether the file name is a Unicode or an ANSI string.

Regards,

Park Heesob
Heesob Park wrote:
<snip>
> Yes. A Ruby string is not just a character array; it is actually a
> structure, and a corrupted structure causes unexpected behaviour.

Ah, right.

<snip>
> Be careful: before calling GetFullPathNameW, "file" must already be a
> UTF-16 string.
> I recommend that every call use the W function, and if the string is not
> UTF-16, convert it to UTF-16 first.
> $KCODE should be used only to determine whether the string is UTF-8, not
> to decide whether to call the ANSI function or the W function.

What would you recommend then? How should I determine within Ruby if the string being passed to a function is UTF16? IsTextUnicode()? Something else?

How would you define GetFullPathName within file.rb (from windows-pr) then, for example?

Thanks,

Dan
2006/5/29, Daniel Berger <djberg96 at gmail.com>:
<snip>
> What would you recommend then? How should I determine within Ruby if
> the string being passed to a function is UTF16? IsTextUnicode()?
> Something else?

A user is unlikely to call the function with a UTF-16 string by accident. A user who wants to pass a UTF-16 string on purpose can use a utf16 flag.

> How would you define GetFullPathName within file.rb (from windows-pr)
> then, for example?

How about this?

    def GetFullPathName(file, buf_size, buf, part, utf16 = false)
      file = multi_to_wide(file) unless utf16
      GetFullPathNameW.call(file, buf.size/2, buf, part)
    end

Regards,

Park Heesob
Heesob Park wrote:
<snip>
> How about this?
>
> def GetFullPathName(file, buf_size, buf, part, utf16 = false)
>   file = multi_to_wide(file) unless utf16
>   GetFullPathNameW.call(file, buf.size/2, buf, part)
> end

Except that means adding an extra argument to a lot of methods.

Hm....what about:

    def GetFullPathName(file, buf_size, buf, part)
      file = multi_to_wide(file) unless IsTextUnicode(file)
      GetFullPathNameW.call(file, buf.size/2, buf, part)
    end

A little more work for me, but less for the user to remember.

Will that work? Or do you think IsTextUnicode() is too unreliable? (Sorry if you answered this previously)

Regards,

Dan
Hi,

2006/5/29, Daniel Berger <djberg96 at gmail.com>:
<snip>
> Hm....what about:
>
> def GetFullPathName(file, buf_size, buf, part)
>   file = multi_to_wide(file) unless IsTextUnicode(file)
>   GetFullPathNameW.call(file, buf.size/2, buf, part)
> end
>
> A little more work for me, but less for the user to remember.

That's OK for internal use. But what if the user wants the result of the function? Does every function then need an a->w; call W; w->a conversion like this?

    def GetFullPathName(file, buf_size, buf, part)
      file = multi_to_wide(file) unless IsTextUnicode(file)
      GetFullPathNameW.call(file, buf.size/2, buf, part)
      buf = wide_to_multi(buf)
    end

I recommend separating it into two functions like this:

    def GetFullPathName(file, buf_size, buf, part)
      GetFullPathName.call(file, buf.size/2, buf, part)
    end

    def GetFullPathNameW(file, buf_size, buf, part)
      file = multi_to_wide(file) unless IsTextUnicode(file)
      GetFullPathNameW.call(file, buf.size/2, buf, part)
    end

> Will that work? Or do you think IsTextUnicode() is too unreliable?

I think it is a useful function. I just don't want to make the code slow by calling another API function :)

Regards,

Park Heesob
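The a->w; call W; w->a shape discussed above can be tried out without Windows by swapping the API call for a stand-in. This is only a sketch under several assumptions: it uses a Ruby where every String carries its encoding (1.9 or later), so the "is it already UTF-16?" question is answered by the string itself rather than by IsTextUnicode() or a flag. The names fake_w_call, to_wide, from_wide and full_path_name are all hypothetical.

```ruby
# Hypothetical stand-in for a W function such as GetFullPathNameW: it copies
# the wide input into buf and returns the number of UTF-16 characters written,
# or 0 if buf (sized in characters, not bytes) is too small.
def fake_w_call(wide_in, max_chars, buf)
  units = wide_in.unpack("v*")
  units = units.first(units.index(0) || units.size)
  return 0 if units.size + 1 > max_chars
  out = units.pack("v*")
  buf[0, out.bytesize] = out
  units.size
end

# a -> w: convert only if the string is not already UTF-16LE, then terminate.
def to_wide(str)
  wide = str.encoding == Encoding::UTF_16LE ? str : str.encode("UTF-16LE")
  wide.b + "\0\0".b
end

# w -> a: take the first n characters of the buffer and return UTF-8.
def from_wide(buf, n_chars)
  buf.byteslice(0, n_chars * 2).force_encoding("UTF-16LE").encode("UTF-8")
end

# The wrapper shape discussed above: convert in, call W, convert out.
def full_path_name(path)
  buf = "\0".b * 520 # room for 260 wide characters
  n = fake_w_call(to_wide(path), buf.bytesize / 2, buf)
  raise "call failed" if n.zero?
  from_wide(buf, n)
end

puts full_path_name("C:\\test")
puts full_path_name("C:\\test".encode("UTF-16LE"))
```

Both calls print the same path, so the caller never has to say which encoding it passed in. The extra conversions are the cost Heesob mentions, which is the argument for keeping separate ANSI and W wrappers.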