Geoffrey Letessier
2015-Jun-02 21:45 UTC
[Gluster-users] GlusterFS 3.7 - slow/poor performances
Hi Ben,

I just checked my messages log files, both on client and server, and I don't find any of the hung tasks you noticed on yours.

As you can read below, I don't see the performance issue with a simple dd; I think my issue concerns sets of small files (tens of thousands or more).

[root at nisus test]# ddt -t 10g /mnt/test/
Writing to /mnt/test/ddt.8362 ... syncing ... done.
sleeping 10 seconds ... done.
Reading from /mnt/test/ddt.8362 ... done.
  10240MiB     KiB/s   CPU%
Write         114770      4
Read           40675      4

for info: /mnt/test concerns the "single (v2)" GlusterFS volume

[root at nisus test]# ddt -t 10g /mnt/fhgfs/
Writing to /mnt/fhgfs/ddt.8380 ... syncing ... done.
sleeping 10 seconds ... done.
Reading from /mnt/fhgfs/ddt.8380 ... done.
  10240MiB     KiB/s   CPU%
Write         102591      1
Read           98079      2

Do you have an idea how to tune/optimize the performance settings and/or the TCP settings (MTU, etc.)?

-----------------------------------------------------------
|             | UNTAR  | DU    | FIND   | TAR    | RM     |
-----------------------------------------------------------
| single      | ~3m45s | ~43s  | ~47s   | ~3m10s | ~3m15s |
-----------------------------------------------------------
| replicated  | ~5m10s | ~59s  | ~1m6s  | ~1m19s | ~1m49s |
-----------------------------------------------------------
| distributed | ~4m18s | ~41s  | ~57s   | ~2m24s | ~1m38s |
-----------------------------------------------------------
| dist-repl   | ~8m18s | ~1m4s | ~1m11s | ~1m24s | ~2m40s |
-----------------------------------------------------------
| native FS   | ~11s   | ~4s   | ~2s    | ~56s   | ~10s   |
-----------------------------------------------------------
| BeeGFS      | ~3m43s | ~15s  | ~3s    | ~1m33s | ~46s   |
-----------------------------------------------------------
| single (v2) | ~3m6s  | ~14s  | ~32s   | ~1m2s  | ~44s   |
-----------------------------------------------------------

for info:
- BeeGFS is a distributed FS (4 bricks, 2 bricks per server and 2 servers)
- single (v2): simple gluster volume with default settings

I also note that I get the same tar/untar performance issue with FhGFS/BeeGFS, but the rest (DU, FIND, RM) looks OK.

Thank you very much for your reply and help.
Geoffrey
-----------------------------------------------
Geoffrey Letessier
Responsable informatique & ingénieur système
CNRS - UPR 9080 - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at cnrs.fr

Le 2 juin 2015 à 21:53, Ben Turner <bturner at redhat.com> a écrit :

> I am seeing problems on 3.7 as well.  Can you check /var/log/messages on both the clients and servers for hung tasks like:
>
> Jun 2 15:23:14 gqac006 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Jun 2 15:23:14 gqac006 kernel: iozone        D 0000000000000001     0 21999      1 0x00000080
> Jun 2 15:23:14 gqac006 kernel: ffff880611321cc8 0000000000000082 ffff880611321c18 ffffffffa027236e
> Jun 2 15:23:14 gqac006 kernel: ffff880611321c48 ffffffffa0272c10 ffff88052bd1e040 ffff880611321c78
> Jun 2 15:23:14 gqac006 kernel: ffff88052bd1e0f0 ffff88062080c7a0 ffff880625addaf8 ffff880611321fd8
> Jun 2 15:23:14 gqac006 kernel: Call Trace:
> Jun 2 15:23:14 gqac006 kernel: [<ffffffffa027236e>] ? rpc_make_runnable+0x7e/0x80 [sunrpc]
> Jun 2 15:23:14 gqac006 kernel: [<ffffffffa0272c10>] ? rpc_execute+0x50/0xa0 [sunrpc]
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff810aaa21>] ? ktime_get_ts+0xb1/0xf0
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811242d0>] ? sync_page+0x0/0x50
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8152a1b3>] io_schedule+0x73/0xc0
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8112430d>] sync_page+0x3d/0x50
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8152ac7f>] __wait_on_bit+0x5f/0x90
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff81124543>] wait_on_page_bit+0x73/0x80
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8109eb80>] ? wake_bit_function+0x0/0x50
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8113a525>] ? pagevec_lookup_tag+0x25/0x40
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8112496b>] wait_on_page_writeback_range+0xfb/0x190
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff81124b38>] filemap_write_and_wait_range+0x78/0x90
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811c07ce>] vfs_fsync_range+0x7e/0x100
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811c08bd>] vfs_fsync+0x1d/0x20
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811c08fe>] do_fsync+0x3e/0x60
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811c0950>] sys_fsync+0x10/0x20
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
>
> Do you see a perf problem with just a simple DD or do you need a more complex workload to hit the issue?  I think I saw an issue with metadata performance that I am trying to run down, let me know if you can see the problem with simple DD reads / writes or if we need to do some sort of dir / metadata access as well.
>
> -b
>
> ----- Original Message -----
>> From: "Geoffrey Letessier" <geoffrey.letessier at cnrs.fr>
>> To: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
>> Cc: gluster-users at gluster.org
>> Sent: Tuesday, June 2, 2015 8:09:04 AM
>> Subject: Re: [Gluster-users] GlusterFS 3.7 - slow/poor performances
>>
>> Hi Pranith,
>>
>> I'm sorry but I cannot bring you any comparison, because the comparison would be
>> distorted by the fact that in my HPC cluster in production the network technology
>> is InfiniBand QDR and my volumes are quite different (bricks in RAID6 (12x2TB),
>> 2 bricks per server and 4 servers in my pool).
>>
>> Concerning your request, in attachments you can find all the expected results,
>> hoping it can help you solve this serious performance issue (maybe I need to
>> play with glusterfs parameters?).
>>
>> Thank you very much in advance,
>> Geoffrey
>> ------------------------------------------------------
>> Geoffrey Letessier
>> Responsable informatique & ingénieur système
>> UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
>> Institut de Biologie Physico-Chimique
>> 13, rue Pierre et Marie Curie - 75005 Paris
>> Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr
>>
>> Le 2 juin 2015 à 10:09, Pranith Kumar Karampuri <pkarampu at redhat.com> a écrit :
>>
>> hi Geoffrey,
>> Since you are saying it happens on all types of volumes, let's do the following:
>> 1) Create a dist-repl volume
>> 2) Set the options etc. you need.
>> 3) Enable gluster volume profiling using "gluster volume profile <volname> start"
>> 4) Run the workload
>> 5) Give the output of "gluster volume profile <volname> info"
>>
>> Repeat the steps above on the new and the old version you are comparing with.
>> That should give us insight into what could be causing the slowness.
>>
>> Pranith
>> On 06/02/2015 03:22 AM, Geoffrey Letessier wrote:
>>
>> Dear all,
>>
>> I have a crash-test cluster where I have tested the new version of GlusterFS
>> (v3.7) before upgrading my HPC cluster in production.
>> But all my tests show me very, very low performance.
>>
>> For my benches, as you can read below, I run some operations (untar, du, find,
>> tar, rm) on the Linux kernel sources, dropping caches, each on distributed,
>> replicated, distributed-replicated and single (single brick) volumes, and on
>> the native FS of one brick.
>>
>> # time (echo 3 > /proc/sys/vm/drop_caches; tar xJf ~/linux-4.1-rc5.tar.xz; sync; echo 3 > /proc/sys/vm/drop_caches)
>> # time (echo 3 > /proc/sys/vm/drop_caches; du -sh linux-4.1-rc5/; echo 3 > /proc/sys/vm/drop_caches)
>> # time (echo 3 > /proc/sys/vm/drop_caches; find linux-4.1-rc5/ | wc -l; echo 3 > /proc/sys/vm/drop_caches)
>> # time (echo 3 > /proc/sys/vm/drop_caches; tar czf linux-4.1-rc5.tgz linux-4.1-rc5/; echo 3 > /proc/sys/vm/drop_caches)
>> # time (echo 3 > /proc/sys/vm/drop_caches; rm -rf linux-4.1-rc5.tgz linux-4.1-rc5/; echo 3 > /proc/sys/vm/drop_caches)
>>
>> And here are the process times:
>>
>> -----------------------------------------------------------
>> |             | UNTAR  | DU    | FIND   | TAR    | RM     |
>> -----------------------------------------------------------
>> | single      | ~3m45s | ~43s  | ~47s   | ~3m10s | ~3m15s |
>> -----------------------------------------------------------
>> | replicated  | ~5m10s | ~59s  | ~1m6s  | ~1m19s | ~1m49s |
>> -----------------------------------------------------------
>> | distributed | ~4m18s | ~41s  | ~57s   | ~2m24s | ~1m38s |
>> -----------------------------------------------------------
>> | dist-repl   | ~8m18s | ~1m4s | ~1m11s | ~1m24s | ~2m40s |
>> -----------------------------------------------------------
>> | native FS   | ~11s   | ~4s   | ~2s    | ~56s   | ~10s   |
>> -----------------------------------------------------------
>>
>> I get the same results with default configurations as with custom configurations.
>>
>> If I look at the output of the ifstat command, I note that my IO write processes
>> never exceed 3 MB/s...
>>
>> The EXT4 native FS seems to be faster (roughly 15-20%, but no more) than the XFS one.
>>
>> My [test] storage cluster is composed of 2 identical servers (bi-CPU Intel Xeon
>> X5355, 8GB of RAM, 2x2TB HDD (no RAID) and Gb Ethernet).
>>
>> My volume settings:
>> single: 1 server, 1 brick
>> replicated: 2 servers, 1 brick each
>> distributed: 2 servers, 2 bricks each
>> dist-repl: 2 bricks in the same server and replica 2
>>
>> All seems to be OK in the gluster status command line.
>>
>> Do you have an idea why I obtain such bad results?
>> Thanks in advance.
>> Geoffrey
>> -----------------------------------------------
>> Geoffrey Letessier
>> Responsable informatique & ingénieur système
>> CNRS - UPR 9080 - Laboratoire de Biochimie Théorique
>> Institut de Biologie Physico-Chimique
>> 13, rue Pierre et Marie Curie - 75005 Paris
>> Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at cnrs.fr
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
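For the small-file workloads above (the kernel-source untar/du/find/tar/rm runs), a reasonable next step is the per-brick profiling Pranith describes in the quoted thread, followed by adjusting a few threading/caching volume options. Below is a minimal sketch assuming a volume named testvol (a placeholder); the option names are standard "gluster volume set" options in 3.7, but the values are only illustrative starting points, not validated settings for this cluster.

# Capture a per-brick FOP/latency profile around one benchmark run
gluster volume profile testvol start
# ... run the untar/du/find/tar/rm workload on a client ...
gluster volume profile testvol info
gluster volume profile testvol stop

# Options commonly tuned for many-small-file workloads (illustrative values)
gluster volume set testvol client.event-threads 4
gluster volume set testvol server.event-threads 4
gluster volume set testvol performance.io-thread-count 32
gluster volume set testvol performance.cache-size 256MB
gluster volume set testvol performance.readdir-ahead on

The profile output shows, per brick, the latency and call count of each file operation (LOOKUP, CREATE, WRITE, FSYNC, ...), which is usually enough to tell whether the untar time is going into data writes or into metadata round-trips.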
Hello,

Concerning the 3.5.3 version of GlusterFS, I hit a strange issue this morning when writing a file while the quota is exceeded. One person in my lab, whose quota is exceeded (but she didn't know it), tried to modify a file; because of the exceeded quota she was unable to save it and decided to exit VI. Now her file is empty/blank, as you can read below:

pdsh at lucifer: cl-storage3: ssh exited with exit code 2
cl-storage1: ---------T 2 tarus amyloid_team 0 19 févr. 12:34 /export/brick_home/brick1/amyloid_team/tarus/project/ab1-40-x1_sen304-x2_inh3-x2/remd_charmm22star_scripts/remd_115.sh
cl-storage1: -rwxrw-r-- 2 tarus amyloid_team 0  8 juin  12:38 /export/brick_home/brick2/amyloid_team/tarus/project/ab1-40-x1_sen304-x2_inh3-x2/remd_charmm22star_scripts/remd_115.sh

In addition, I don't understand why, my volume being a distributed volume with replication (cl-storage[1,3] is replicated only on cl-storage[2,4]), I have two "same" files (same complete path) on two different bricks (as you can read above).

Thanks in advance for your help and clarification.
Geoffrey
------------------------------------------------------
Geoffrey Letessier
Responsable informatique & ingénieur système
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr

> Le 2 juin 2015 à 23:45, Geoffrey Letessier <geoffrey.letessier at cnrs.fr> a écrit :
> [...]
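Two quick checks may clarify both points in the message above: the quota state can be read with the quota CLI, and the zero-byte, mode "---------T" entry on the other brick is what a DHT link-to file looks like, which would explain the "same" path appearing on two bricks. A minimal sketch, with home_vol as a placeholder volume name:

# Show configured limits and current usage for every directory with a quota
gluster volume quota home_vol list

# On the brick server, dump the xattrs of the suspect entry; a DHT link-to
# file carries trusted.glusterfs.dht.linkto pointing at the subvolume that
# actually holds the data
getfattr -d -m . -e hex \
  /export/brick_home/brick1/amyloid_team/tarus/project/ab1-40-x1_sen304-x2_inh3-x2/remd_charmm22star_scripts/remd_115.sh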
Geoffrey Letessier
2015-Jun-08 12:37 UTC
[Gluster-users] GlusterFS 3.7 - slow/poor performances
Hello,

Do you know more about this?

In addition, do you know how to "activate" RDMA for my volume with Intel/QLogic QDR? Currently I mount my volumes with the RDMA transport-type option (both on the server and the client side), but I notice all streams are using the TCP stack, and my bandwidth never exceeds 2.0-2.5 Gb/s (250-300 MB/s).

Thanks in advance,
Geoffrey
------------------------------------------------------
Geoffrey Letessier
Responsable informatique & ingénieur système
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr

> Le 2 juin 2015 à 23:45, Geoffrey Letessier <geoffrey.letessier at cnrs.fr> a écrit :
> [...]
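On the RDMA question in the last message: a mount-side transport option alone is not enough if the volume itself was created with the default tcp transport; the volume has to carry the rdma (or tcp,rdma) transport for the bricks to listen on RDMA at all. A minimal sketch follows, with myvol, the server names and the brick paths as placeholders; note also that whether verbs RDMA performs well on Intel/QLogic TrueScale QDR hardware is a separate question, since those HCAs are primarily tuned for PSM, so IPoIB may remain the practical path there.

# Check which transport the existing volume was created with
gluster volume info myvol | grep -i transport

# Example of creating a volume that supports both transports (placeholders)
gluster volume create myvol replica 2 transport tcp,rdma \
    server1:/export/brick1/myvol server2:/export/brick1/myvol

# Mount from a client explicitly over RDMA
mount -t glusterfs -o transport=rdma server1:/myvol /mnt/myvol

If "gluster volume info" reports only tcp, the rdma mount option is silently ignored, which would match the observation that all streams stay on the TCP stack.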