Hi Nick,
Thank you for using Gluster and for sending us such a detailed description of the
problem you are seeing. We will try a run with exactly the same switches and
config as you mention and see if we can reproduce this in-house to make debugging
easier.
Regards,
Tejas.
----- Original Message -----
From: "Nick Birkett" <nick at streamline-computing.com>
To: gluster-users at gluster.org
Sent: Wednesday, December 23, 2009 3:04:43 PM GMT +05:30 Chennai, Kolkata,
Mumbai, New Delhi
Subject: [Gluster-users] gluster 3.0 read hangs
I ran some benchmarks last week using 2.0.8: a single server with 8 Intel
e1000e NICs bonded with mode=balance-alb. All worked fine and I got some good
results using 8 clients. All Gigabit. A sketch of the bonding setup follows.
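For completeness, a minimal balance-alb bonding setup on a RHEL 5 box looks
roughly like this (the interface names, slave count, and miimon value here are
illustrative assumptions; only the mode and the 192.168.100.200 address come
from this post):

  # /etc/modprobe.conf - load the bonding driver in adaptive load balancing mode
  alias bond0 bonding
  options bond0 mode=balance-alb miimon=100

  # /etc/sysconfig/network-scripts/ifcfg-bond0 - the bonded interface
  DEVICE=bond0
  IPADDR=192.168.100.200
  NETMASK=255.255.255.0
  ONBOOT=yes
  BOOTPROTO=none

  # /etc/sysconfig/network-scripts/ifcfg-eth0 - repeat for each slave NIC
  DEVICE=eth0
  MASTER=bond0
  SLAVE=yes
  ONBOOT=yes
  BOOTPROTO=none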
The benchmarks did 2 passes of IOZONE in network mode, using 1-8 threads
per client and 1-8 clients. Each client used 32 GByte files.
All jobs completed successfully. It takes about 32 hours to run
through all cases. (A sketch of the host file this mode needs is below.)
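For anyone wanting to reproduce: iozone's network distribution mode (-+m) takes
a host file with one entry per client thread, each line giving the hostname,
the working directory, and the path to the iozone binary. The hostnames, sizes,
and file names below are illustrative, not the exact files from these runs:

  # hosts.example - one line per client entry: host, workdir, iozone path
  comp00 /data2/sccomp /opt/iozone/bin/iozone
  comp01 /data2/sccomp /opt/iozone/bin/iozone

  # e.g. 2 threads, 4 GB per thread, write/rewrite (-i 0) and read/re-read (-i 1)
  /opt/iozone/bin/iozone -+m hosts.example -s 4g -T -i 0 -i 1 -t 2 \
      -F /data2/sccomp/f0 /data2/sccomp/f1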
Yesterday I updated to 3.0.0 (server and clients) and re-configured the
server and client vol files using glusterfs-volgen (renamed some of the
vol names).
Red Hat EL5 binary packages from the GlusterFS site were installed:
glusterfs-server-3.0.0-1.x86_64
glusterfs-common-3.0.0-1.x86_64
glusterfs-client-3.0.0-1.x86_64
All works mainly OK, except every so often the IOZONE job just stops and
network I/O drops to zero. It always happens during either a read or re-read
test, just as the IOZONE read test starts. It doesn't happen every time, and a
run may go for several hours without incident. This has happened 6 times on
different test cases (threads/clients).
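If useful, this is roughly what I can capture next time it hangs (the log path
assumes a stock 3.0.0 package install; port 6996 is from the vol files below):

  # find the client process and see what syscall it is blocked in
  ps ax | grep '[g]lusterfs'
  strace -f -p <pid-of-glusterfs-client>
  # check whether the TCP connection to the server is still established
  netstat -tn | grep 6996
  # look for errors in the client log
  tail -100 /var/log/glusterfs/*.log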
Has anyone else noticed this? Perhaps I have done something wrong?
Vol files attached - I know I don't need a distribute volume over a single
remote vol; it's part of a larger test with multiple vols.
Sample outputs attached. 4 clients with 4 files per client ran fine; 4 clients
with 8 files per client hung at re-read on the 2nd pass of IOZONE. All jobs
with 5 clients and 8 clients ran to completion.
Thanks,
Nick
volume brick00.server-e
  type protocol/client
  option transport-type tcp
  option transport.socket.nodelay on
  option transport.remote-port 6996
  option remote-host 192.168.100.200  # can be IP or hostname
  option remote-subvolume brick00
end-volume

volume distribute
  type cluster/distribute
  subvolumes brick00.server-e
end-volume

volume writebehind
  type performance/write-behind
  option cache-size 4MB
  subvolumes distribute
end-volume

volume readahead
  type performance/read-ahead
  option page-count 4
  subvolumes writebehind
end-volume

volume iocache
  type performance/io-cache
  option cache-size 1GB
  option cache-timeout 1
  subvolumes readahead
end-volume

volume quickread
  type performance/quick-read
  option cache-timeout 1
  option max-file-size 64kB
  subvolumes iocache
end-volume

volume statprefetch
  type performance/stat-prefetch
  subvolumes quickread
end-volume
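For reference, the client volfile above is mounted along these lines (the
volfile path is an assumption; /data2 is the mount point used in these runs):

  # mount the GlusterFS client using the volfile above
  glusterfs -f /etc/glusterfs/client.vol /data2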
#glusterfsd_keep=0
volume posix00
  type storage/posix
  option directory /data/data00
end-volume

volume locks00
  type features/locks
  subvolumes posix00
end-volume

volume brick00
  type performance/io-threads
  option thread-count 8
  subvolumes locks00
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option transport.socket.listen-port 6996
  option transport.socket.nodelay on
  option auth.addr.brick00.allow *
  subvolumes brick00
end-volume
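And the matching server side is started along these lines (again, the volfile
path is an assumption):

  # start the GlusterFS server daemon with the volfile above
  glusterfsd -f /etc/glusterfs/glusterfsd.vol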
=========================================================
Cluster name : Delldemo
Arch : x86_64
SGE job submitted : Tue Dec 22 22:21:38 GMT 2009
Number of CPUS 8
Running Parallel IOZONE on ral03
Creating files in /data2/sccomp
NTHREADS=4
Total data size = 48196 MBytes
Running loop 1 of 2
Iozone: Performance Test of File I/O
Version $Revision: 3.326 $
Compiled for 64 bit mode.
Build: linux
Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
Al Slater, Scott Rhine, Mike Wisner, Ken Goss
Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy,
Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root.
Run began: Tue Dec 22 22:21:38 2009
Network distribution mode enabled.
File size set to 12338176 KB
Command line used: /opt/iozone/bin/iozone -+m hosts.556 -s 12049m -S 8192 -T -i
0 -i 1 -t 16 -F /data2/sccomp/BIG.0.comp03.streamline
/data2/sccomp/BIG.1.comp03.streamline /data2/sccomp/BIG.2.comp03.streamline
/data2/sccomp/BIG.3.comp03.streamline /data2/sccomp/BIG.0.ral02.streamline
/data2/sccomp/BIG.1.ral02.streamline /data2/sccomp/BIG.2.ral02.streamline
/data2/sccomp/BIG.3.ral02.streamline /data2/sccomp/BIG.0.ral03.streamline
/data2/sccomp/BIG.1.ral03.streamline /data2/sccomp/BIG.2.ral03.streamline
/data2/sccomp/BIG.3.ral03.streamline /data2/sccomp/BIG.0.ral04.streamline
/data2/sccomp/BIG.1.ral04.streamline /data2/sccomp/BIG.2.ral04.streamline
/data2/sccomp/BIG.3.ral04.streamline
Output is in Kbytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 8192 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
Throughput test with 16 threads
Each thread writes a 12338176 Kbyte file in 4 Kbyte records
Test running:
Children see throughput for 16 initial writers = 424003.62 KB/sec
Min throughput per thread = 26480.14 KB/sec
Max throughput per thread = 26517.04 KB/sec
Avg throughput per thread = 26500.23 KB/sec
Min xfer = 12321928.00 KB
Test running:
Children see throughput for 16 rewriters = 424109.61 KB/sec
Min throughput per thread = 26483.30 KB/sec
Max throughput per thread = 26530.66 KB/sec
Avg throughput per thread = 26506.85 KB/sec
Min xfer = 12316680.00 KB
Test running:
Children see throughput for 16 readers = 454358.62 KB/sec
Min throughput per thread = 28298.30 KB/sec
Max throughput per thread = 28592.02 KB/sec
Avg throughput per thread = 28397.41 KB/sec
Min xfer = 12211568.00 KB
Test running:
Children see throughput for 16 re-readers = 459262.06 KB/sec
Min throughput per thread = 28600.55 KB/sec
Max throughput per thread = 28892.20 KB/sec
Avg throughput per thread = 28703.88 KB/sec
Min xfer = 12219504.00 KB
Test cleanup:
iozone test complete.
Running loop 2 of 2
Iozone: Performance Test of File I/O
Version $Revision: 3.326 $
Compiled for 64 bit mode.
Build: linux
Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
Al Slater, Scott Rhine, Mike Wisner, Ken Goss
Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy,
Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root.
Run began: Tue Dec 22 22:53:33 2009
Network distribution mode enabled.
File size set to 12338176 KB
Command line used: /opt/iozone/bin/iozone -+m hosts.556 -s 12049m -S 8192 -T -i
0 -i 1 -t 16 -F /data2/sccomp/BIG.0.comp03.streamline
/data2/sccomp/BIG.1.comp03.streamline /data2/sccomp/BIG.2.comp03.streamline
/data2/sccomp/BIG.3.comp03.streamline /data2/sccomp/BIG.0.ral02.streamline
/data2/sccomp/BIG.1.ral02.streamline /data2/sccomp/BIG.2.ral02.streamline
/data2/sccomp/BIG.3.ral02.streamline /data2/sccomp/BIG.0.ral03.streamline
/data2/sccomp/BIG.1.ral03.streamline /data2/sccomp/BIG.2.ral03.streamline
/data2/sccomp/BIG.3.ral03.streamline /data2/sccomp/BIG.0.ral04.streamline
/data2/sccomp/BIG.1.ral04.streamline /data2/sccomp/BIG.2.ral04.streamline
/data2/sccomp/BIG.3.ral04.streamline
Output is in Kbytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 8192 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
Throughput test with 16 threads
Each thread writes a 12338176 Kbyte file in 4 Kbyte records
Test running:
Children see throughput for 16 initial writers = 425851.12 KB/sec
Min throughput per thread = 26593.95 KB/sec
Max throughput per thread = 26634.84 KB/sec
Avg throughput per thread = 26615.70 KB/sec
Min xfer = 12319368.00 KB
Test running:
Children see throughput for 16 rewriters = 424954.77 KB/sec
Min throughput per thread = 26459.38 KB/sec
Max throughput per thread = 26656.61 KB/sec
Avg throughput per thread = 26559.67 KB/sec
Min xfer = 12247176.00 KB
Test running:
Children see throughput for 16 readers = 459433.33 KB/sec
Min throughput per thread = 28449.77 KB/sec
Max throughput per thread = 28964.50 KB/sec
Avg throughput per thread = 28714.58 KB/sec
Min xfer = 12119024.00 KB
Test running:
Children see throughput for 16 re-readers = 458413.46 KB/sec
Min throughput per thread = 28457.53 KB/sec
Max throughput per thread = 28831.23 KB/sec
Avg throughput per thread = 28650.84 KB/sec
Min xfer = 12178288.00 KB
Test cleanup:
iozone test complete.
---------------
Job output ends
========================================================
SGE job: finished date = Tue Dec 22 23:25:20 GMT 2009
Total run time : 1 Hours 3 Minutes 42 Seconds
Time in seconds: 3822 Seconds
========================================================
=========================================================
Cluster name : Delldemo
Arch : x86_64
SGE job submitted : Tue Dec 22 23:25:30 GMT 2009
Number of CPUS 8
Running Parallel IOZONE on comp01
Creating files in /data2/sccomp
NTHREADS=8
Total data size = 32240 MBytes
Running loop 1 of 2
Iozone: Performance Test of File I/O
Version $Revision: 3.326 $
Compiled for 64 bit mode.
Build: linux
Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
Al Slater, Scott Rhine, Mike Wisner, Ken Goss
Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy,
Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root.
Run began: Tue Dec 22 23:25:30 2009
Network distribution mode enabled.
File size set to 4126720 KB
Command line used: /opt/iozone/bin/iozone -+m hosts.557 -s 4030m -S 512 -T -i 0
-i 1 -t 32 -F /data2/sccomp/BIG.0.comp00.streamline
/data2/sccomp/BIG.1.comp00.streamline /data2/sccomp/BIG.2.comp00.streamline
/data2/sccomp/BIG.3.comp00.streamline /data2/sccomp/BIG.4.comp00.streamline
/data2/sccomp/BIG.5.comp00.streamline /data2/sccomp/BIG.6.comp00.streamline
/data2/sccomp/BIG.7.comp00.streamline /data2/sccomp/BIG.0.comp01.streamline
/data2/sccomp/BIG.1.comp01.streamline /data2/sccomp/BIG.2.comp01.streamline
/data2/sccomp/BIG.3.comp01.streamline /data2/sccomp/BIG.4.comp01.streamline
/data2/sccomp/BIG.5.comp01.streamline /data2/sccomp/BIG.6.comp01.streamline
/data2/sccomp/BIG.7.comp01.streamline /data2/sccomp/BIG.0.comp02.streamline
/data2/sccomp/BIG.1.comp02.streamline /data2/sccomp/BIG.2.comp02.streamline
/data2/sccomp/BIG.3.comp02.streamline /data2/sccomp/BIG.4.comp02.streamline
/data2/sccomp/BIG.5.comp02.streamline /data2/sccomp/BIG.6.comp02.streamline
/data2/sccomp/BIG.7.comp02.streamline /data2/sccomp/BIG.0.ral01.streamline
/data2/sccomp/BIG.1.ral01.streamline
Command line too long to save completely.
Output is in Kbytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 512 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
Throughput test with 32 threads
Each thread writes a 4126720 Kbyte file in 4 Kbyte records
Test running:
Children see throughput for 32 initial writers = 431608.71 KB/sec
Min throughput per thread = 13462.86 KB/sec
Max throughput per thread = 13516.74 KB/sec
Avg throughput per thread = 13487.77 KB/sec
Min xfer = 4110728.00 KB
Test running:
Children see throughput for 32 rewriters = 433205.56 KB/sec
Min throughput per thread = 13512.67 KB/sec
Max throughput per thread = 13550.23 KB/sec
Avg throughput per thread = 13537.67 KB/sec
Min xfer = 4116360.00 KB
Test running:
Children see throughput for 32 readers = 458239.61 KB/sec
Min throughput per thread = 13983.61 KB/sec
Max throughput per thread = 14699.36 KB/sec
Avg throughput per thread = 14319.99 KB/sec
Min xfer = 3925872.00 KB
Test running:
Children see throughput for 32 re-readers = 457589.70 KB/sec
Min throughput per thread = 13990.14 KB/sec
Max throughput per thread = 14654.56 KB/sec
Avg throughput per thread = 14299.68 KB/sec
Min xfer = 3939696.00 KB
Test cleanup:
iozone test complete.
Running loop 2 of 2
Iozone: Performance Test of File I/O
Version $Revision: 3.326 $
Compiled for 64 bit mode.
Build: linux
Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
Al Slater, Scott Rhine, Mike Wisner, Ken Goss
Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy,
Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root.
Run began: Tue Dec 22 23:48:31 2009
Network distribution mode enabled.
File size set to 4126720 KB
Command line used: /opt/iozone/bin/iozone -+m hosts.557 -s 4030m -S 512 -T -i 0
-i 1 -t 32 -F /data2/sccomp/BIG.0.comp00.streamline
/data2/sccomp/BIG.1.comp00.streamline /data2/sccomp/BIG.2.comp00.streamline
/data2/sccomp/BIG.3.comp00.streamline /data2/sccomp/BIG.4.comp00.streamline
/data2/sccomp/BIG.5.comp00.streamline /data2/sccomp/BIG.6.comp00.streamline
/data2/sccomp/BIG.7.comp00.streamline /data2/sccomp/BIG.0.comp01.streamline
/data2/sccomp/BIG.1.comp01.streamline /data2/sccomp/BIG.2.comp01.streamline
/data2/sccomp/BIG.3.comp01.streamline /data2/sccomp/BIG.4.comp01.streamline
/data2/sccomp/BIG.5.comp01.streamline /data2/sccomp/BIG.6.comp01.streamline
/data2/sccomp/BIG.7.comp01.streamline /data2/sccomp/BIG.0.comp02.streamline
/data2/sccomp/BIG.1.comp02.streamline /data2/sccomp/BIG.2.comp02.streamline
/data2/sccomp/BIG.3.comp02.streamline /data2/sccomp/BIG.4.comp02.streamline
/data2/sccomp/BIG.5.comp02.streamline /data2/sccomp/BIG.6.comp02.streamline
/data2/sccomp/BIG.7.comp02.streamline /data2/sccomp/BIG.0.ral01.streamline
/data2/sccomp/BIG.1.ral01.streamline
Command line too long to save completely.
Output is in Kbytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 512 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
Throughput test with 32 threads
Each thread writes a 4126720 Kbyte file in 4 Kbyte records
Test running:
Children see throughput for 32 initial writers = 432863.52 KB/sec
Min throughput per thread = 13489.46 KB/sec
Max throughput per thread = 13564.23 KB/sec
Avg throughput per thread = 13526.99 KB/sec
Min xfer = 4104456.00 KB
Test running:
Children see throughput for 32 rewriters = 433386.73 KB/sec
Min throughput per thread = 13525.65 KB/sec
Max throughput per thread = 13553.97 KB/sec
Avg throughput per thread = 13543.34 KB/sec
Min xfer = 4118280.00 KB
Test running:
Children see throughput for 32 readers = 458043.86 KB/sec
Min throughput per thread = 13969.76 KB/sec
Max throughput per thread = 14944.34 KB/sec
Avg throughput per thread = 14313.87 KB/sec
Min xfer = 3857648.00 KB
Test running:
_______________________________________________
Gluster-users mailing list
Gluster-users at gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users