Hi all,

Looking at the IO.readlines source in io.c, it looks to me like they grab 8k chunks, split on the input record separator, and buffer accordingly.

Since it looks like ReadFileScatter() does some of that work automatically (in page file sized chunks), I thought I'd give it a try. Here's what I've got, but it doesn't work. I have an incorrect parameter in the call to ReadFileScatter(). So, I've either got the size wrong, bad alignment or I need to pass in a packed data structure of some sort.

Any ideas? BTW, you'll want to grab the latest windows-pr from CVS in order to run this code.

Thanks,

Dan

# WinIO.readlines
require 'windows/handle'
require 'windows/error'
require 'windows/system_info'
require 'windows/nio'
require 'windows/file'

class WinIO
   extend Windows::Error
   extend Windows::Handle
   extend Windows::NIO
   extend Windows::File
   extend Windows::MSVCRT::IO
   extend Windows::SystemInfo
   include Windows::File

   def self.readlines(file, sep = $INPUT_RECORD_SEPARATOR)
      handle = CreateFile(
         file,
         GENERIC_READ,
         FILE_SHARE_READ,
         nil,
         OPEN_EXISTING,
         FILE_FLAG_OVERLAPPED | FILE_FLAG_NO_BUFFERING,
         nil
      )

      if handle == INVALID_HANDLE_VALUE
         raise SystemCallError.new(GetLastError())
      end

      sysbuf = 0.chr * 40
      GetSystemInfo(sysbuf)

      page_size = sysbuf[8,4].unpack('L')[0] # dwPageSize
      file_size = File.size(file)

      # FILE_SEGMENT_ELEMENT
      fse_struct = (0.chr * page_size) + (0.chr * 8) # Buffer + Align.
      seg_array = (0.chr * (file_size / fse_struct.size)) + 0.chr
      olapped = 0.chr * 20

      bool = ReadFileScatter(handle, seg_array, file_size, nil, olapped)

      unless bool
         raise SystemCallError.new(GetLastError())
      end

      sleep 0.01 unless HasOverlappedIoCompleted(olapped)

      unless CloseHandle(handle)
         raise SystemCallError.new(GetLastError())
      end

      seg_array.split(sep)
   end
end
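For readers who want to see the chunk-and-split idea described at the top of this message, here is a minimal pure-Ruby sketch. It is not the io.c implementation: the method name `chunked_readlines` is hypothetical, and the 8192-byte default only mirrors the "8k chunks" remark above.

```ruby
require 'stringio'

# Read an IO in fixed-size chunks and split the data into lines on `sep`.
# Illustrative only: the name and the default chunk size are assumptions.
def chunked_readlines(io, sep = "\n", chunk_size = 8192)
  lines = []
  pending = String.new            # bytes read so far that do not yet end in sep

  while (chunk = io.read(chunk_size))
    pending << chunk
    # Emit every complete line currently sitting in the buffer.
    while (idx = pending.index(sep))
      lines << pending.slice!(0, idx + sep.bytesize)
    end
  end

  lines << pending unless pending.empty?   # last line with no trailing separator
  lines
end

p chunked_readlines(StringIO.new("one\ntwo\nthree"), "\n", 4)
# => ["one\n", "two\n", "three"]
```

The small chunk size in the usage line is only there to force lines to straddle chunk boundaries, which is the case the `pending` buffer exists to handle.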
Hi,

----- Original Message -----
From: "Daniel Berger" <djberg96 at gmail.com>
To: "win32utils-devel" <win32utils-devel at rubyforge.org>
Sent: Tuesday, October 09, 2007 8:27 PM
Subject: [Win32utils-devel] Playing with ReadFileScatter()

> Hi all,
>
> Looking at the IO.readlines source in io.c, it looks to me like they
> grab 8k chunks, split on the input record separator, and buffer accordingly.
>
> Since it looks like ReadFileScatter() does some of that work
> automatically (in page file sized chunks), I thought I'd give it a try.
> Here's what I've got, but it doesn't work. I have an incorrect parameter
> in the call to ReadFileScatter(). So, I've either got the size wrong,
> bad alignment or I need to pass in a packed data structure of some sort.
>
> Any ideas? BTW, you'll want to grab the latest windows-pr from CVS in
> order to run this code.
>
> Thanks,
>
> Dan
>
<snip>

Here is a complete working source using ReadFileScatter.

Notice that VirtualAlloc should be redefined. Since ReadFileScatter requires a page-aligned memory buffer, I allocated the buffer using VirtualAlloc.
Regards,

Park Heesob

# WinIO.readlines
require 'English' # for $INPUT_RECORD_SEPARATOR
require 'windows/handle'
require 'windows/error'
require 'windows/system_info'
require 'windows/nio'
require 'windows/file'
require 'windows/synchronize'
require 'windows/msvcrt/io'
require 'windows/msvcrt/buffer'
require 'windows/memory'

class WinIO
   extend Windows::Error
   extend Windows::Handle
   extend Windows::NIO
   extend Windows::File
   extend Windows::Synchronize
   extend Windows::MSVCRT::IO
   extend Windows::MSVCRT::Buffer
   extend Windows::SystemInfo
   extend Windows::Memory
   include Windows::File
   include Windows::Memory

   PAGE_READWRITE = 4
   ERROR_IO_PENDING = 997
   MEM_RELEASE = 0x8000

   def self.readlines(file, sep = $INPUT_RECORD_SEPARATOR)
      handle = CreateFile(
         file,
         GENERIC_READ,
         FILE_SHARE_READ,
         nil,
         OPEN_EXISTING,
         FILE_FLAG_OVERLAPPED | FILE_FLAG_NO_BUFFERING,
         nil
      )

      if handle == INVALID_HANDLE_VALUE
         raise SystemCallError.new(GetLastError())
      end

      sysbuf = 0.chr * 40
      GetSystemInfo(sysbuf)

      page_size = sysbuf[4,4].unpack('L')[0] # dwPageSize should be sysbuf[4,4], not sysbuf[8,4]
      file_size = File.size(file)
      page_num = (file_size*1.0 / page_size).ceil # pages needed to hold the whole file

      # ReadFileScatter needs page-aligned buffers, so allocate them with VirtualAlloc
      API.new('VirtualAlloc', 'LLLL', 'L') # redefined VirtualAlloc
      base_address = VirtualAlloc(nil, page_size * page_num, MEM_COMMIT, PAGE_READWRITE)

      # One address per page
      buf_list = []
      for i in 0...page_num
         buf_list.push(base_address+page_size*i)
      end

      # FILE_SEGMENT_ELEMENT array: one 64-bit element per page plus a NULL terminator
      seg_array = buf_list.pack('Q*') + 0.chr * 8
      olapped = 0.chr * 20

      unless ReadFileScatter(handle, seg_array, page_size * page_num, nil, olapped)
         error = GetLastError()
         if error != ERROR_IO_PENDING
            raise SystemCallError.new(error)
         end
      end

      sleep 0.01 unless HasOverlappedIoCompleted(olapped)

      unless CloseHandle(handle)
         raise SystemCallError.new(GetLastError())
      end

      # Copy the data out of the allocated pages into a Ruby string
      buffer = 0.chr * file_size
      memcpy(buffer, buf_list[0], file_size)
      VirtualFree(base_address,0,MEM_RELEASE)
      buffer.split(sep).map {|x| x+"\n"}
   end
end

a = IO.readlines('c:/work/java.txt')
b = WinIO.readlines('c:/work/java.txt',"\r\n")
p a==b
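The buffer bookkeeping in the code above can be tried without any Win32 calls. The sketch below is only an illustration under stated assumptions: the helper names (`page_size_from`, `pages_for`, `segment_array`) are hypothetical, the SYSTEM_INFO buffer is faked, and the base address is a made-up number standing in for what VirtualAlloc would return.

```ruby
# dwPageSize is the DWORD at byte offset 4 of SYSTEM_INFO, right after the
# 4-byte processor-architecture union (hence sysbuf[4,4], not sysbuf[8,4]).
def page_size_from(sysbuf)
  sysbuf[4, 4].unpack1('L')
end

# Number of whole pages needed to hold file_size bytes (integer ceiling,
# avoiding the float conversion).
def pages_for(file_size, page_size)
  (file_size + page_size - 1) / page_size
end

# Pack one 64-bit FILE_SEGMENT_ELEMENT per page, followed by the NULL
# element that terminates the array.
def segment_array(base_address, page_size, page_count)
  addresses = Array.new(page_count) { |i| base_address + i * page_size }
  (addresses + [0]).pack('Q*')
end

# Fake SYSTEM_INFO: 4 bytes of architecture info, then dwPageSize = 4096.
sysbuf     = [0, 4096].pack('LL') + ("\0" * 32)
page_size  = page_size_from(sysbuf)
page_count = pages_for(5590, page_size)      # the size of file2.txt in the benchmark
segs       = segment_array(0x10000, page_size, page_count)

p page_size              # => 4096
p page_count             # => 2
p segs.unpack('Q*')      # => [65536, 69632, 0]
```

Note that the number of bytes requested from ReadFileScatter in the code above is `page_size * page_num`, not the file size, which is why the page count is rounded up.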
> -----Original Message-----
> From: win32utils-devel-bounces at rubyforge.org
> [mailto:win32utils-devel-bounces at rubyforge.org] On Behalf Of
> Park Heesob
> Sent: Tuesday, October 09, 2007 8:57 AM
> To: Development and ideas for win32utils projects
> Subject: Re: [Win32utils-devel] Playing with ReadFileScatter()

<snip>

> Here is a complete working source using ReadFileScatter.
>
> Notice that VirtualAlloc should be redefined. Since
> ReadFileScatter requires a page-aligned memory buffer,
> I allocated the buffer using VirtualAlloc.

<snip>

Thanks! That works.

I was curious about performance, so I did some benchmarks. It seems to get dramatically worse as the file size gets larger. Is this to be expected?

Here's a benchmark program I created. The 5 files end up at the following sizes (in bytes):

      550 file1.txt
    5,590 file2.txt
   56,890 file3.txt
  578,890 file4.txt
5,888,890 file5.txt

A little profiling indicates that the worst of it is the buffer.split & map at the end. Here it is with split and map:

                            user    system     total        real
IO.readlines file1      0.000000  0.000000  0.000000 (  0.000000)
IO.readlines file2      0.010000  0.000000  0.010000 (  0.010000)
IO.readlines file3      0.030000  0.000000  0.030000 (  0.030000)
IO.readlines file4      0.891000  0.020000  0.911000 (  1.111000)
IO.readlines file5      0.922000  0.020000  0.942000 (  0.942000)
WinIO.readlines file1   0.000000  0.010000  0.010000 (  0.010000)
WinIO.readlines file2   0.000000  0.000000  0.000000 (  0.130000)
WinIO.readlines file3   0.120000  0.030000  0.150000 (  0.280000)
WinIO.readlines file4   1.522000  0.040000  1.562000 (  2.824000)
WinIO.readlines file5  12.448000  0.431000 12.879000 ( 23.514000)

Here it is if we remove the split and map. Much better, but still slow.
                            user    system     total        real
IO.readlines file1      0.000000  0.000000  0.000000 (  0.000000)
IO.readlines file2      0.000000  0.010000  0.010000 (  0.010000)
IO.readlines file3      0.030000  0.010000  0.040000 (  0.050000)
IO.readlines file4      0.811000  0.040000  0.851000 (  1.042000)
IO.readlines file5      0.902000  0.020000  0.922000 (  1.031000)
WinIO.readlines file1   0.010000  0.000000  0.010000 (  0.010000)
WinIO.readlines file2   0.000000  0.010000  0.010000 (  0.110000)
WinIO.readlines file3   0.060000  0.010000  0.070000 (  0.200000)
WinIO.readlines file4   0.420000  0.030000  0.450000 (  1.743000)
WinIO.readlines file5   4.427000  0.140000  4.567000 ( 13.980000)

# winio_bench.rb
$:.unshift Dir.pwd
require 'benchmark'
require 'winio'

fh1 = File.open("file1.txt", "w")
fh2 = File.open("file2.txt", "w")
fh3 = File.open("file3.txt", "w")
fh4 = File.open("file4.txt", "w")
fh5 = File.open("file5.txt", "w")

s = "The quick brown fox jumped over the lazy dog's back "

10.times{ |n| fh1.puts s + n.to_s }
fh1.close
puts "File 1 created"

100.times{ |n| fh2.puts s + n.to_s }
fh2.close
puts "File 2 created"

1000.times{ |n| fh3.puts s + n.to_s }
fh3.close
puts "File 3 created"

10000.times{ |n| fh4.puts s + n.to_s }
fh4.close
puts "File 4 created"

100000.times{ |n| fh5.puts s + n.to_s }
fh5.close
puts "File 5 created"

MAX = 10

Benchmark.bm(35) do |x|
   x.report("IO.readlines file1"){
      MAX.times{ IO.readlines('file1.txt') }
   }
   x.report("IO.readlines file2"){
      MAX.times{ IO.readlines('file2.txt') }
   }
   x.report("IO.readlines file3"){
      MAX.times{ IO.readlines('file3.txt') }
   }
   x.report("IO.readlines file4"){
      MAX.times{ IO.readlines('file4.txt') }
   }
   x.report("IO.readlines file5"){
      MAX.times{ IO.readlines('file4.txt') } # as posted: reads file4.txt, not file5.txt
   }
   x.report("WinIO.readlines file1"){
      MAX.times{ WinIO.readlines('file1.txt') }
   }
   x.report("WinIO.readlines file2"){
      MAX.times{ WinIO.readlines('file2.txt') }
   }
   x.report("WinIO.readlines file3"){
      MAX.times{ WinIO.readlines('file3.txt') }
   }
   x.report("WinIO.readlines file4"){
      MAX.times{ WinIO.readlines('file4.txt') }
   }
   x.report("WinIO.readlines file5"){
      MAX.times{ WinIO.readlines('file5.txt') }
   }
end

Regards,

Dan

This communication is the property of Qwest and may contain confidential or privileged information. Unauthorized use of this communication is strictly prohibited and may be unlawful. If you have received this communication in error, please immediately notify the sender by reply e-mail and destroy all copies of the communication and any attachments.
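Since the profiling in this message points at the final `buffer.split(sep).map` step, here is a small sketch of one alternative worth measuring. It assumes a Ruby version where `String#lines` accepts a separator argument, and it has not been benchmarked against the numbers above. The sample data is made up.

```ruby
buffer = "alpha\r\nbeta\r\ngamma\r\n"
sep    = "\r\n"

# Approach from the thread: drop the separator, then append "\n" to each
# piece, which builds a second array and a new string per line.
via_split = buffer.split(sep).map { |line| line + "\n" }

# Alternative: String#lines keeps the separator on each element, so no
# per-line concatenation is needed.
via_lines = buffer.lines(sep)

p via_split   # => ["alpha\n", "beta\n", "gamma\n"]
p via_lines   # => ["alpha\r\n", "beta\r\n", "gamma\r\n"]
```

The two results are not interchangeable: `lines(sep)` preserves the original "\r\n" terminators, whereas the split-and-map version rewrites them to "\n" so that the output compares equal to `IO.readlines`.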
Hi,

Just a quick followup. I noticed that VirtualAlloc() works on my Windows XP Pro and Win2k Pro boxes, but not on my XP Home laptop - it returns an empty string (NULL I presume, for failure).

Are you all seeing the same thing? The docs don't specifically mention XP Pro as a requirement, but I wouldn't be entirely surprised.

Also, I added ERROR_IO_PENDING and MEM_RELEASE to the appropriate files in windows-pr. I should probably move the PAGE_xxx constants over to the Windows::Memory module as well (they're currently in file-mapping.rb).

Regards,

Dan
Hi,

2007/10/11, Daniel Berger <djberg96 at gmail.com>:
> Hi,
>
> Just a quick followup. I noticed that VirtualAlloc() works on my Windows
> XP Pro and Win2k Pro boxes, but not on my XP Home laptop - it returns an
> empty string (NULL I presume, for failure).
>
> Are you all seeing the same thing? The docs don't specifically mention
> XP Pro as a requirement, but I wouldn't be entirely surprised.

On my XP Home desktop, it works fine.

The ReadFileScatter part of the source code needs to be modified like this:

unless ReadFileScatter(handle, seg_array, page_size * page_num, nil, olapped)
   error = GetLastError()
   if error == ERROR_IO_PENDING
      while not HasOverlappedIoCompleted(olapped); end
   else
      raise SystemCallError.new(error)
   end
end

I noticed that setting the benchmark iteration count to MAX = 1 gives very different results from MAX = 10. I guess it is due to GC.

Some tests show that for a large file (about 30MB), ReadFileScatter reads faster than IO.readlines.

> Also, I added ERROR_IO_PENDING and MEM_RELEASE to the appropriate files
> in windows-pr. I should probably move the PAGE_xxx constants over to the
> Windows::Memory module as well (they're currently in file-mapping.rb).
>
> Regards,
>
> Dan

Regards,

Park Heesob
> -----Original Message-----
> From: win32utils-devel-bounces at rubyforge.org
> [mailto:win32utils-devel-bounces at rubyforge.org] On Behalf Of Heesob Park
> Sent: Wednesday, October 10, 2007 10:37 PM
> To: Development and ideas for win32utils projects
> Subject: Re: [Win32utils-devel] Playing with ReadFileScatter()

<snip>

Had to do some hand-editing of this reply - MS Outlook is being stupid today...

> On my XP Home desktop, it works fine.

Odd. I wonder if I configured something that's causing it to fail. The system settings show that I should have 1.5gb of page file space. I'll play with it some more tonight.

> The ReadFileScatter part of the source code needs to be modified like this:
>
> unless ReadFileScatter(handle, seg_array, page_size * page_num, nil, olapped)
>    error = GetLastError()
>    if error == ERROR_IO_PENDING
>       while not HasOverlappedIoCompleted(olapped); end
>    else
>       raise SystemCallError.new(error)
>    end
> end

Ok, made that change. I also moved the GetSystemInfo out of the method itself and made it a constant that gets set when the file gets loaded.

> I noticed that setting the benchmark iteration count to MAX = 1 gives very
> different results from MAX = 10. I guess it is due to GC.
>
> Some tests show that for a large file (about 30MB), ReadFileScatter
> reads faster than IO.readlines.

At work I get wildly different results from even a single iteration, even with a standard MRI IO.readlines call. The first time I call it, IO.readlines (not WinIO.readlines) takes about 9 seconds on a 30mb file. If I run it again in quick succession, it only takes 2 seconds! I'm guessing there's some caching going on. Similarly, when I run WinIO.readlines, the speed improves if I do successive iterations. Same for an equivalent Perl script.

This makes benchmarking somewhat problematic. But, if I just look at the *first* run (waiting a few minutes between each run), our scatter method seems to be about 2 seconds faster than Ruby's builtin IO.readlines method.
I'll play around with it some more tonight, though.

Regards,

Dan
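Given the first-run versus repeat-run differences described in this thread, one way to keep them visible is to time every call separately instead of summing MAX iterations into one figure. The sketch below is only an illustration: `time_each_run` is a hypothetical helper, the file is a throwaway temp file, and calling `GC.start` before each run is an assumption about how to reduce the GC noise mentioned earlier, not a measured fix.

```ruby
require 'benchmark'
require 'tempfile'

# Time each call on its own so that a slow first run is not averaged
# into later, faster ones. Returns an array of wall-clock seconds.
def time_each_run(runs)
  Array.new(runs) do
    GC.start                        # begin each run from a similar heap state
    Benchmark.realtime { yield }
  end
end

Tempfile.create('winio_bench') do |f|
  1000.times { |n| f.puts "The quick brown fox jumped over the lazy dog's back #{n}" }
  f.flush

  timings = time_each_run(3) { IO.readlines(f.path) }
  timings.each_with_index { |t, i| puts format('run %d: %.6fs', i + 1, t) }
end
```

A file that has just been written is most likely already in the operating system's cache, so this sketch will not reproduce the multi-second first-run penalty by itself. It only shows how to report runs individually.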