Geoffrey Letessier
2015-Jun-02 21:45 UTC
[Gluster-users] GlusterFS 3.7 - slow/poor performances
Hi Ben,

I just checked my messages log files, both on client and server, and I don't find any of the hung tasks you noticed on yours.

As you can read below, I don't see the performance issue with a simple dd; I think my issue concerns sets of small files (tens of thousands or more).

[root at nisus test]# ddt -t 10g /mnt/test/
Writing to /mnt/test/ddt.8362 ... syncing ... done.
sleeping 10 seconds ... done.
Reading from /mnt/test/ddt.8362 ... done.
  10240MiB     KiB/s   CPU%
Write         114770      4
Read           40675      4

for info: /mnt/test concerns the "single (v2)" GlusterFS volume

[root at nisus test]# ddt -t 10g /mnt/fhgfs/
Writing to /mnt/fhgfs/ddt.8380 ... syncing ... done.
sleeping 10 seconds ... done.
Reading from /mnt/fhgfs/ddt.8380 ... done.
  10240MiB     KiB/s   CPU%
Write         102591      1
Read           98079      2

Do you have an idea how to tune/optimize the performance settings and/or the TCP settings (MTU, etc.)?

-----------------------------------------------------------
|             | UNTAR  | DU    | FIND   | TAR    | RM     |
-----------------------------------------------------------
| single      | ~3m45s | ~43s  | ~47s   | ~3m10s | ~3m15s |
-----------------------------------------------------------
| replicated  | ~5m10s | ~59s  | ~1m6s  | ~1m19s | ~1m49s |
-----------------------------------------------------------
| distributed | ~4m18s | ~41s  | ~57s   | ~2m24s | ~1m38s |
-----------------------------------------------------------
| dist-repl   | ~8m18s | ~1m4s | ~1m11s | ~1m24s | ~2m40s |
-----------------------------------------------------------
| native FS   | ~11s   | ~4s   | ~2s    | ~56s   | ~10s   |
-----------------------------------------------------------
| BeeGFS      | ~3m43s | ~15s  | ~3s    | ~1m33s | ~46s   |
-----------------------------------------------------------
| single (v2) | ~3m6s  | ~14s  | ~32s   | ~1m2s  | ~44s   |
-----------------------------------------------------------

for info:
- BeeGFS is a distributed FS (4 bricks, 2 bricks per server and 2 servers)
- single (v2): simple gluster volume with default settings

I also note that I get the same tar/untar performance issue with FhGFS/BeeGFS, but the rest (DU, FIND, RM) looks OK.

Thank you very much for your reply and help.
Geoffrey
-----------------------------------------------
Geoffrey Letessier
Responsable informatique & ingénieur système
CNRS - UPR 9080 - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at cnrs.fr

Le 2 juin 2015 à 21:53, Ben Turner <bturner at redhat.com> a écrit :

> I am seeing problems on 3.7 as well.  Can you check /var/log/messages on both the clients and servers for hung tasks like:
>
> Jun 2 15:23:14 gqac006 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Jun 2 15:23:14 gqac006 kernel: iozone        D 0000000000000001     0 21999      1 0x00000080
> Jun 2 15:23:14 gqac006 kernel: ffff880611321cc8 0000000000000082 ffff880611321c18 ffffffffa027236e
> Jun 2 15:23:14 gqac006 kernel: ffff880611321c48 ffffffffa0272c10 ffff88052bd1e040 ffff880611321c78
> Jun 2 15:23:14 gqac006 kernel: ffff88052bd1e0f0 ffff88062080c7a0 ffff880625addaf8 ffff880611321fd8
> Jun 2 15:23:14 gqac006 kernel: Call Trace:
> Jun 2 15:23:14 gqac006 kernel: [<ffffffffa027236e>] ? rpc_make_runnable+0x7e/0x80 [sunrpc]
> Jun 2 15:23:14 gqac006 kernel: [<ffffffffa0272c10>] ? rpc_execute+0x50/0xa0 [sunrpc]
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff810aaa21>] ? ktime_get_ts+0xb1/0xf0
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811242d0>] ? sync_page+0x0/0x50
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8152a1b3>] io_schedule+0x73/0xc0
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8112430d>] sync_page+0x3d/0x50
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8152ac7f>] __wait_on_bit+0x5f/0x90
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff81124543>] wait_on_page_bit+0x73/0x80
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8109eb80>] ? wake_bit_function+0x0/0x50
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8113a525>] ? pagevec_lookup_tag+0x25/0x40
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8112496b>] wait_on_page_writeback_range+0xfb/0x190
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff81124b38>] filemap_write_and_wait_range+0x78/0x90
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811c07ce>] vfs_fsync_range+0x7e/0x100
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811c08bd>] vfs_fsync+0x1d/0x20
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811c08fe>] do_fsync+0x3e/0x60
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811c0950>] sys_fsync+0x10/0x20
> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
>
> Do you see a perf problem with just a simple DD or do you need a more complex workload to hit the issue?  I think I saw an issue with metadata performance that I am trying to run down, let me know if you can see the problem with simple DD reads / writes or if we need to do some sort of dir / metadata access as well.
>
> -b
>
> ----- Original Message -----
>> From: "Geoffrey Letessier" <geoffrey.letessier at cnrs.fr>
>> To: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
>> Cc: gluster-users at gluster.org
>> Sent: Tuesday, June 2, 2015 8:09:04 AM
>> Subject: Re: [Gluster-users] GlusterFS 3.7 - slow/poor performances
>>
>> Hi Pranith,
>>
>> I'm sorry but I cannot bring you any comparison, because the comparison would be
>> distorted by the fact that in my HPC cluster in production the network technology
>> is InfiniBand QDR and my volumes are quite different (bricks in RAID6 (12x2TB),
>> 2 bricks per server and 4 servers in my pool).
>>
>> Concerning your request, in attachments you can find all the expected results,
>> hoping it can help you solve this serious performance issue (maybe I need to
>> play with glusterfs parameters?).
>>
>> Thank you very much in advance,
>> Geoffrey
>> ------------------------------------------------------
>> Geoffrey Letessier
>> Responsable informatique & ingénieur système
>> UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
>> Institut de Biologie Physico-Chimique
>> 13, rue Pierre et Marie Curie - 75005 Paris
>> Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr
>>
>> Le 2 juin 2015 à 10:09, Pranith Kumar Karampuri <pkarampu at redhat.com> a écrit :
>>
>> hi Geoffrey,
>> Since you are saying it happens on all types of volumes, let's do the following:
>> 1) Create a dist-repl volume
>> 2) Set the options etc. you need.
>> 3) Enable gluster volume profiling using "gluster volume profile <volname> start"
>> 4) Run the workload
>> 5) Give the output of "gluster volume profile <volname> info"
>>
>> Repeat the steps above on the new and the old version you are comparing with.
>> That should give us insight into what could be causing the slowness.
>>
>> Pranith
>> On 06/02/2015 03:22 AM, Geoffrey Letessier wrote:
>>
>> Dear all,
>>
>> I have a crash-test cluster where I have tested the new version of GlusterFS
>> (v3.7) before upgrading my HPC cluster in production.
>> But all my tests show me very, very low performance.
>>
>> For my benches, as you can read below, I run some operations (untar, du, find,
>> tar, rm) on the Linux kernel sources, dropping caches, each on distributed,
>> replicated, distributed-replicated and single (single brick) volumes, and on
>> the native FS of one brick.
>>
>> # time (echo 3 > /proc/sys/vm/drop_caches; tar xJf ~/linux-4.1-rc5.tar.xz; sync; echo 3 > /proc/sys/vm/drop_caches)
>> # time (echo 3 > /proc/sys/vm/drop_caches; du -sh linux-4.1-rc5/; echo 3 > /proc/sys/vm/drop_caches)
>> # time (echo 3 > /proc/sys/vm/drop_caches; find linux-4.1-rc5/ | wc -l; echo 3 > /proc/sys/vm/drop_caches)
>> # time (echo 3 > /proc/sys/vm/drop_caches; tar czf linux-4.1-rc5.tgz linux-4.1-rc5/; echo 3 > /proc/sys/vm/drop_caches)
>> # time (echo 3 > /proc/sys/vm/drop_caches; rm -rf linux-4.1-rc5.tgz linux-4.1-rc5/; echo 3 > /proc/sys/vm/drop_caches)
>>
>> And here are the process times:
>>
>> -----------------------------------------------------------
>> |             | UNTAR  | DU    | FIND   | TAR    | RM     |
>> -----------------------------------------------------------
>> | single      | ~3m45s | ~43s  | ~47s   | ~3m10s | ~3m15s |
>> -----------------------------------------------------------
>> | replicated  | ~5m10s | ~59s  | ~1m6s  | ~1m19s | ~1m49s |
>> -----------------------------------------------------------
>> | distributed | ~4m18s | ~41s  | ~57s   | ~2m24s | ~1m38s |
>> -----------------------------------------------------------
>> | dist-repl   | ~8m18s | ~1m4s | ~1m11s | ~1m24s | ~2m40s |
>> -----------------------------------------------------------
>> | native FS   | ~11s   | ~4s   | ~2s    | ~56s   | ~10s   |
>> -----------------------------------------------------------
>>
>> I get the same results with default configurations as with custom configurations.
>>
>> If I look at the output of the ifstat command, I note that my IO write processes
>> never exceed 3 MB/s...
>>
>> The EXT4 native FS seems to be faster (roughly 15-20%, but no more) than the XFS one.
>>
>> My [test] storage cluster is composed of 2 identical servers (bi-CPU Intel Xeon
>> X5355, 8GB of RAM, 2x2TB HDD (no RAID) and Gb Ethernet).
>>
>> My volume settings:
>> single: 1 server, 1 brick
>> replicated: 2 servers, 1 brick each
>> distributed: 2 servers, 2 bricks each
>> dist-repl: 2 bricks in the same server and replica 2
>>
>> All seems to be OK in the gluster status command line.
>>
>> Do you have an idea why I obtain such bad results?
>> Thanks in advance.
>> Geoffrey
>> -----------------------------------------------
>> Geoffrey Letessier
>> Responsable informatique & ingénieur système
>> CNRS - UPR 9080 - Laboratoire de Biochimie Théorique
>> Institut de Biologie Physico-Chimique
>> 13, rue Pierre et Marie Curie - 75005 Paris
>> Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at cnrs.fr
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
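For the small-file workloads above (the kernel-source untar/du/find/tar/rm runs), a reasonable next step is the per-brick profiling Pranith describes in the quoted thread, followed by adjusting a few threading/caching volume options. Below is a minimal sketch assuming a volume named testvol (a placeholder); the option names are standard "gluster volume set" options in 3.7, but the values are only illustrative starting points, not validated settings for this cluster.

# Capture a per-brick FOP/latency profile around one benchmark run
gluster volume profile testvol start
# ... run the untar/du/find/tar/rm workload on a client ...
gluster volume profile testvol info
gluster volume profile testvol stop

# Options commonly tuned for many-small-file workloads (illustrative values)
gluster volume set testvol client.event-threads 4
gluster volume set testvol server.event-threads 4
gluster volume set testvol performance.io-thread-count 32
gluster volume set testvol performance.cache-size 256MB
gluster volume set testvol performance.readdir-ahead on

The profile output shows, per brick, the latency and call count of each file operation (LOOKUP, CREATE, WRITE, FSYNC, ...), which is usually enough to tell whether the untar time is going into data writes or into metadata round-trips.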
Hello,

Concerning the 3.5.3 version of GlusterFS, I hit a strange issue this morning when writing a file while the quota is exceeded. One person in my lab, whose quota is exceeded (but she didn't know it), tried to modify a file; because of the exceeded quota she was unable to save it and decided to exit VI. Now her file is empty/blank, as you can read below:

pdsh at lucifer: cl-storage3: ssh exited with exit code 2
cl-storage1: ---------T 2 tarus amyloid_team 0 19 févr. 12:34 /export/brick_home/brick1/amyloid_team/tarus/project/ab1-40-x1_sen304-x2_inh3-x2/remd_charmm22star_scripts/remd_115.sh
cl-storage1: -rwxrw-r-- 2 tarus amyloid_team 0  8 juin  12:38 /export/brick_home/brick2/amyloid_team/tarus/project/ab1-40-x1_sen304-x2_inh3-x2/remd_charmm22star_scripts/remd_115.sh

In addition, I don't understand why, my volume being a distributed volume with replication (cl-storage[1,3] is replicated only on cl-storage[2,4]), I have two "same" files (same complete path) on two different bricks (as you can read above).

Thanks in advance for your help and clarification.
Geoffrey
------------------------------------------------------
Geoffrey Letessier
Responsable informatique & ingénieur système
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr

> Le 2 juin 2015 à 23:45, Geoffrey Letessier <geoffrey.letessier at cnrs.fr> a écrit :
> [...]
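Two quick checks may clarify both points in the message above: the quota state can be read with the quota CLI, and the zero-byte, mode "---------T" entry on the other brick is what a DHT link-to file looks like, which would explain the "same" path appearing on two bricks. A minimal sketch, with home_vol as a placeholder volume name:

# Show configured limits and current usage for every directory with a quota
gluster volume quota home_vol list

# On the brick server, dump the xattrs of the suspect entry; a DHT link-to
# file carries trusted.glusterfs.dht.linkto pointing at the subvolume that
# actually holds the data
getfattr -d -m . -e hex \
  /export/brick_home/brick1/amyloid_team/tarus/project/ab1-40-x1_sen304-x2_inh3-x2/remd_charmm22star_scripts/remd_115.sh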
Geoffrey Letessier
2015-Jun-08 12:37 UTC
[Gluster-users] GlusterFS 3.7 - slow/poor performances
Hello,

Do you know more about this?

In addition, do you know how to "activate" RDMA for my volume with Intel/QLogic QDR? Currently I mount my volumes with the RDMA transport-type option (both on the server and the client side), but I notice all streams are using the TCP stack, and my bandwidth never exceeds 2.0-2.5 Gb/s (250-300 MB/s).

Thanks in advance,
Geoffrey
------------------------------------------------------
Geoffrey Letessier
Responsable informatique & ingénieur système
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr

> Le 2 juin 2015 à 23:45, Geoffrey Letessier <geoffrey.letessier at cnrs.fr> a écrit :
> [...]
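On the RDMA question in the last message: a mount-side transport option alone is not enough if the volume itself was created with the default tcp transport; the volume has to carry the rdma (or tcp,rdma) transport for the bricks to listen on RDMA at all. A minimal sketch follows, with myvol, the server names and the brick paths as placeholders; note also that whether verbs RDMA performs well on Intel/QLogic TrueScale QDR hardware is a separate question, since those HCAs are primarily tuned for PSM, so IPoIB may remain the practical path there.

# Check which transport the existing volume was created with
gluster volume info myvol | grep -i transport

# Example of creating a volume that supports both transports (placeholders)
gluster volume create myvol replica 2 transport tcp,rdma \
    server1:/export/brick1/myvol server2:/export/brick1/myvol

# Mount from a client explicitly over RDMA
mount -t glusterfs -o transport=rdma server1:/myvol /mnt/myvol

If "gluster volume info" reports only tcp, the rdma mount option is silently ignored, which would match the observation that all streams stay on the TCP stack.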