Some questions regarding the working of glusterfs striped over multiple
servers (glusterfs 3.0.4).

(1)
I notice that when writing files I seem to get a file of approximately
the same size on each file server, e.g. using the /tmp (ext3)
filesystem on each of 8 servers:

comp00: -rw-r----- 1 sccomp users 4208066560 Jun 28 16:13 /tmp/BIG0
comp01: -rw-r----- 1 sccomp users 4208197632 Jun 28 16:13 /tmp/BIG0
comp02: -rw-r----- 1 sccomp users 4208328704 Jun 28 16:13 /tmp/BIG0
comp03: -rw-r----- 1 sccomp users 4208459776 Jun 28 16:13 /tmp/BIG0
comp04: -rw-r----- 1 sccomp users 4208590848 Jun 28 16:13 /tmp/BIG0
comp05: -rw-r----- 1 sccomp users 4208721920 Jun 28 16:13 /tmp/BIG0
comp06: -rw-r----- 1 sccomp users 4208852992 Jun 28 16:13 /tmp/BIG0
comp07: -rw-r----- 1 sccomp users 4208984064 Jun 28 16:13 /tmp/BIG0

This corresponds to a 4 Gbyte file BIG0 on the glusterfs filesystem.
I was expecting a 0.5 Gbyte file on each server.

(2)
When running the IOZONE benchmark with this setup I get very poor write
performance (this would be expected if the whole file were being written
to each server). The read performance is as expected, e.g.:

115 Mbytes/s write
600 Mbytes/s read

I am wondering whether my setup is faulty or whether this is expected.

Thanks,

Nick
On 06/28/2010 11:23 AM, Nick Birkett wrote:
> Some questions regarding the working of glusterfs striped over multiple
> servers (glusterfs 3.0.4).
>
> (1)
> I notice that when writing files I seem to get a file of approximately
> the same size on each file server, e.g. using the /tmp (ext3)
> filesystem on each of 8 servers:
>
> comp00: -rw-r----- 1 sccomp users 4208066560 Jun 28 16:13 /tmp/BIG0
> comp01: -rw-r----- 1 sccomp users 4208197632 Jun 28 16:13 /tmp/BIG0
> comp02: -rw-r----- 1 sccomp users 4208328704 Jun 28 16:13 /tmp/BIG0
> comp03: -rw-r----- 1 sccomp users 4208459776 Jun 28 16:13 /tmp/BIG0
> comp04: -rw-r----- 1 sccomp users 4208590848 Jun 28 16:13 /tmp/BIG0
> comp05: -rw-r----- 1 sccomp users 4208721920 Jun 28 16:13 /tmp/BIG0
> comp06: -rw-r----- 1 sccomp users 4208852992 Jun 28 16:13 /tmp/BIG0
> comp07: -rw-r----- 1 sccomp users 4208984064 Jun 28 16:13 /tmp/BIG0
>
> This corresponds to a 4 Gbyte file BIG0 on the glusterfs filesystem.
> I was expecting a 0.5 Gbyte file on each server.

What you're seeing is the effect of your local filesystem storing sparse
files efficiently. When you do "ls -l" what you see is not the number of
bytes used but the offset one beyond the last byte written. Since you're
doing an eight-way stripe, seven out of eight blocks within each copy of
the file will not be written, and that fact will be properly noted by the
local filesystem. If you use "du" instead of "ls" you should see the
expected result. (There's a quick stand-alone demonstration at the end of
this message.)

> (2)
> When running the IOZONE benchmark with this setup I get very poor write
> performance (this would be expected if the whole file were being written
> to each server). The read performance is as expected, e.g.:
>
> 115 Mbytes/s write
> 600 Mbytes/s read

Are you sure you're getting uncached read numbers? (See the note below on
ruling that out.) Also, what kind of network are you using, and what were
the exact arguments you used for iozone? These numbers might be expected,
or they might be anomalous, but it's hard to tell without that info.
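For anyone following along, here is a minimal sketch of the sparse-file
effect described above. It runs on any Linux box with GNU coreutils, no
GlusterFS involved; the path /tmp/sparse-demo is just an example name:

    # Write 1 MB at an offset of 1 GB; the skipped range becomes a hole.
    dd if=/dev/zero of=/tmp/sparse-demo bs=1M count=1 seek=1024

    # "ls -l" reports the apparent size: one past the last byte written.
    ls -l /tmp/sparse-demo    # ~1074790400 bytes

    # "du" reports the blocks actually allocated: only the 1 MB written.
    du -h /tmp/sparse-demo    # ~1.0M

    # "stat" shows both numbers side by side.
    stat -c 'apparent=%s bytes, allocated=%b blocks of %B bytes' /tmp/sparse-demo

The same thing happens on each brick under the stripe translator: every
server's copy carries the full file length, but only that server's stripe
blocks are allocated, so "ls" shows ~4 GB everywhere while "du" should
show roughly 0.5 GB per server.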
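On the caching question: one way to make sure the read pass isn't being
served from the client's page cache, assuming root access and a
2.6.16-or-later kernel, is to drop caches between the write and read
phases. A sketch, not a prescription:

    # Flush dirty pages, then drop page cache, dentries and inodes.
    sync
    echo 3 > /proc/sys/vm/drop_caches

Doing the same on the servers rules out server-side caching as well.
Alternatively, if your iozone build supports it, the -I option requests
direct I/O (O_DIRECT) and bypasses the cache entirely, or you can simply
use a test file well beyond the client's RAM size.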