Pranith Kumar Karampuri
2016-Jan-12 03:32 UTC
[Gluster-users] High I/O And Processor Utilization
On 01/12/2016 08:52 AM, Lindsay Mathieson wrote:
> On 11/01/16 15:37, Krutika Dhananjay wrote:
>> Kyle,
>>
>> Based on the testing we have done from our end, we've found that
>> 512MB is a good number that is neither too big nor too small,
>> and provides good performance both on the IO side and with respect to
>> self-heal.
>
> Hi Krutika, I experimented a lot with different chunk sizes and didn't
> find all that much difference between 4MB and 1GB.
>
> But benchmarks are tricky things - I used Crystal Diskmark inside a
> VM, which is probably not the best assessment. And two of the bricks
> on my replica 3 are very slow, just test drives, not production. So I
> guess that would affect things :)
>
> These are my current settings - what do you use?
>
> Volume Name: datastore1
> Type: Replicate
> Volume ID: 1261175d-64e1-48b1-9158-c32802cc09f0
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: vnb.proxmox.softlog:/vmdata/datastore1
> Brick2: vng.proxmox.softlog:/vmdata/datastore1
> Brick3: vna.proxmox.softlog:/vmdata/datastore1
> Options Reconfigured:
> network.remote-dio: enable
> cluster.eager-lock: enable
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> performance.stat-prefetch: off
> performance.strict-write-ordering: on
> performance.write-behind: off
> nfs.enable-ino32: off
> nfs.addr-namelookup: off
> nfs.disable: on
> performance.cache-refresh-timeout: 4
> performance.io-thread-count: 32
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> client.event-threads: 4
> server.event-threads: 4
> cluster.self-heal-window-size: 256
> features.shard-block-size: 512MB
> features.shard: on
> performance.readdir-ahead: off

Most of these tests were done by Paul Cuzner (CCed).

Pranith
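P.S. In case it helps anyone else tuning this: assuming a volume named "datastore1" as in the config above, the shard options are applied with the usual volume-set commands, e.g.

    gluster volume set datastore1 features.shard on
    gluster volume set datastore1 features.shard-block-size 512MB

As far as I know, a changed block size only applies to files created after the change; existing VM images keep the shard size they were written with.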
I thought I might take a minute to update everyone on this situation.

I rebuilt the GlusterFS volume using a shard size of 256MB and then imported all the VMs back onto the cluster. I rebuilt it from scratch rather than just doing an export/import on the data so I could start everything fresh. I wish now that I had used 512MB, but unfortunately I didn't see that suggestion in time.

Anyway, the good news is that the system load has greatly decreased, and the systems are all now in a usable state. The bad news is that I am still seeing a lot of heals and am not sure why. Because of that, I am also still seeing the drives slow down from over 110 MB/sec without Gluster running on them to ~25 MB/sec with bricks running on the drives, so it seems to me there is still an issue here.

Also, as fate would have it, I am hitting a bug in the Gluster NFS implementation on this version (3.7) that requires setting nfs.acl to off, so I am also hoping for a fix for that soon. I think this is the bug, but I'm not sure: https://bugzilla.redhat.com/show_bug.cgi?id=1238318

So to sum up, things are working, but the performance leaves much to be desired, mostly (I suspect) due to all the heals taking place.

- Kyle
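P.S. For anyone following along and hitting the same issues: the pending heals can be watched with the standard heal-info command, and the ACL workaround mentioned above is a single volume-set, e.g. (substituting your own volume name):

    gluster volume heal <volname> info
    gluster volume set <volname> nfs.acl off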