Geoffrey Letessier
2015-Aug-09 08:53 UTC
[Gluster-users] Cascading errors and very bad write performance
Hi Mathieu,

First of all, thanks for replying.

I have applied your suggestion, but there is no change: the brick logs are still growing on the server where the file is written, with lines like these:

[...]
[2015-08-09 08:16:57.856987] W [marker-quota.c:3379:_mq_initiate_quota_txn] 0-vol_home-marker: parent is NULL for <gfid:64c302ab-2171-4656-8e5f-47e474de80b6>, aborting updation txn
[2015-08-09 08:16:57.857691] W [marker-quota.c:3379:_mq_initiate_quota_txn] 0-vol_home-marker: parent is NULL for <gfid:64c302ab-2171-4656-8e5f-47e474de80b6>, aborting updation txn
[2015-08-09 08:16:57.858403] W [marker-quota.c:3379:_mq_initiate_quota_txn] 0-vol_home-marker: parent is NULL for <gfid:64c302ab-2171-4656-8e5f-47e474de80b6>, aborting updation txn
[2015-08-09 08:16:57.859226] W [marker-quota.c:3379:_mq_initiate_quota_txn] 0-vol_home-marker: parent is NULL for <gfid:64c302ab-2171-4656-8e5f-47e474de80b6>, aborting updation txn
[2015-08-09 08:16:57.859982] W [marker-quota.c:3379:_mq_initiate_quota_txn] 0-vol_home-marker: parent is NULL for <gfid:64c302ab-2171-4656-8e5f-47e474de80b6>, aborting updation txn
The message "W [MSGID: 113001] [posix.c:3700:posix_get_ancestry_non_directory] 0-vol_home-posix: listxattr failed on/export/brick_home/brick2/data/.glusterfs/64/c3/64c302ab-2171-4656-8e5f-47e474de80b6 [Aucun fichier ou dossier de ce type]" repeated 149711 times between [2015-08-09 08:15:17.811919] and [2015-08-09 08:16:57.859754]
[2015-08-09 08:16:59.629692] W [MSGID: 113001] [posix.c:3700:posix_get_ancestry_non_directory] 0-vol_home-posix: listxattr failed on/export/brick_home/brick2/data/.glusterfs/64/c3/64c302ab-2171-4656-8e5f-47e474de80b6 [Aucun fichier ou dossier de ce type]
[...]
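(For what it is worth, the gfid named in those messages can be checked directly on the storage node that hosts the brick. This is only a sketch: the brick path and gfid are copied from the log above, the test-file path on the brick is a guess, and it relies on the fact that, for regular files, the .glusterfs/<xx>/<yy>/<gfid> entry is normally a hard link to the real file.)

# stat /export/brick_home/brick2/data/.glusterfs/64/c3/64c302ab-2171-4656-8e5f-47e474de80b6    ## does the gfid handle really return ENOENT, as the log claims?
# find /export/brick_home/brick2/data -samefile /export/brick_home/brick2/data/.glusterfs/64/c3/64c302ab-2171-4656-8e5f-47e474de80b6    ## map the gfid back to a real path (regular files only)
# getfattr -d -m . -e hex /export/brick_home/brick2/data/ddt.12247    ## dump the trusted.* xattrs of the test file, if it landed on this brick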
and here is the ddt output:

# ddt -t 35g /home/
Writing to /home/ddt.12247 ... syncing ... done.
sleeping 10 seconds ... done.
Reading from /home/ddt.12247 ... done.
 35840MiB    KiB/s  CPU%
Write       184737     2
Read        484209     3

For just one write of a single 35GB file (with blank log files beforehand):

# grep "parent is NULL" /var/log/glusterfs/bricks/export-brick_home-brick2-data.log | wc -l
286720

# grep "xattr" /var/log/glusterfs/bricks/export-brick_home-brick2-data.log | wc -l
5

# wc -l /var/log/glusterfs/bricks/export-brick_home-brick2-data.log
286733 /var/log/glusterfs/bricks/export-brick_home-brick2-data.log

and the other kinds of lines in the brick log file:

# grep -vE "(xattr|parent is NULL)" /var/log/glusterfs/bricks/export-brick_home-brick2-data.log
[2015-08-09 08:13:16.368705] I [MSGID: 115034] [server.c:397:_check_for_auth_option] 0-/export/brick_home/brick2/data: skip format check for non-addr auth option auth.login./export/brick_home/brick2/data.allow
[2015-08-09 08:13:16.368858] I [MSGID: 115034] [server.c:397:_check_for_auth_option] 0-/export/brick_home/brick2/data: skip format check for non-addr auth option auth.login.dffafb7e-3ff2-4e91-b30b-eb87c6cfe621.password
[2015-08-09 08:13:16.368953] E [MSGID: 115041] [server.c:833:reconfigure] 0-vol_home-server: Reconfigure not found for transport
[2015-08-09 08:13:16.377119] I [glusterfsd-mgmt.c:1512:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2015-08-09 08:13:16.393164] I [glusterfsd-mgmt.c:1512:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2015-08-09 08:13:16.402136] I [glusterfsd-mgmt.c:1512:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2015-08-09 08:13:16.410998] I [glusterfsd-mgmt.c:1512:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2015-08-09 08:22:16.000685] E [MSGID: 113104] [posix-handle.c:154:posix_make_ancestryfromgfid] 0-vol_home-posix: could not read the link from the gfid handle /export/brick_home/brick2/data/.glusterfs/b3/7a/b37a7750-f250-4ab4-8b29-bba519b6dc69 [Aucun fichier ou dossier de ce type]
[2015-08-09 08:32:17.000668] E [MSGID: 113104] [posix-handle.c:154:posix_make_ancestryfromgfid] 0-vol_home-posix: could not read the link from the gfid handle /export/brick_home/brick2/data/.glusterfs/b3/7a/b37a7750-f250-4ab4-8b29-bba519b6dc69 [Aucun fichier ou dossier de ce type]

There is no change in the logs if I run the command as a regular user, only slightly better write performance and slightly lower read performance:

$ ddt -t 35g /home/admin_team/letessier/
Writing to /home/admin_team/letessier/ddt.12489 ... syncing ... done.
sleeping 10 seconds ... done.
Reading from /home/admin_team/letessier/ddt.12489 ... done.
 35840MiB    KiB/s  CPU%
Write       280981     3
Read        313502     2

Any other idea?

Frankly, I am very frustrated: we stopped our scientific computing production more than six weeks ago and, because of these cascading GlusterFS issues, we are still unable to restart it...

Thanks again,
Geoffrey
------------------------------------------------------
Geoffrey Letessier
Responsable informatique & ingénieur système
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr

On 8 August 2015 at 10:02, Mathieu Chateau <mathieu.chateau at lotp.fr> wrote:

> Maybe related to the insecure port issue reported?
>
> Try with:
>
> gluster volume set xxx server.allow-insecure on
>
> Regards,
> Mathieu CHATEAU
> http://www.lotp.fr
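(For reference, that change is usually applied in two places before it takes effect; a sketch, assuming the volume is named vol_home and the stock /etc/glusterfs/glusterd.vol location, to be double-checked before use:)

# gluster volume set vol_home server.allow-insecure on    ## per-volume: accept clients connecting from non-privileged ports
# vi /etc/glusterfs/glusterd.vol                           ## add the next line inside the "volume management" block:
      option rpc-auth-allow-insecure on
# systemctl restart glusterd                               ## or: service glusterd restart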
> 2015-08-07 23:47 GMT+02:00 Geoffrey Letessier <geoffrey.letessier at cnrs.fr>:
>
> I am not really sure I fully understand your answer.
>
> I tried to set inode-lru-limit to 1, but I cannot notice any positive effect.
>
> When I re-run the ddt application, I see two kinds of messages:
> [2015-08-07 21:29:21.792156] W [marker-quota.c:3379:_mq_initiate_quota_txn] 0-vol_home-marker: parent is NULL for <gfid:5a32328a-7fd9-474e-9bc6-cafde9c41af7>, aborting updation txn
> [2015-08-07 21:29:21.792176] W [marker-quota.c:3379:_mq_initiate_quota_txn] 0-vol_home-marker: parent is NULL for <gfid:5a32328a-7fd9-474e-9bc6-cafde9c41af7>, aborting updation txn
>
> and/or:
> [2015-08-07 21:44:19.279971] E [marker-quota.c:2990:mq_start_quota_txn_v2] 0-vol_home-marker: contribution node list is empty (31d7bf88-b63a-4731-a737-a3dce73b8cd1)
> [2015-08-07 21:41:26.177095] E [dict.c:1418:dict_copy_with_ref] (-->/usr/lib64/glusterfs/3.7.3/xlator/protocol/server.so(server_resolve_inode+0x60) [0x7f85e9a6a410] -->/usr/lib64/glusterfs/3.7.3/xlator/protocol/server.so(resolve_gfid+0x88) [0x7f85e9a6a188] -->/usr/lib64/libglusterfs.so.0(dict_copy_with_ref+0xa4) [0x3e99c20674] ) 0-dict: invalid argument: dict [Argument invalide]
>
> And concerning the bad IO performance...
>
> [letessier at node031 ~]$ ddt -t 35g /home/admin_team/letessier/
> Writing to /home/admin_team/letessier/ddt.25259 ... syncing ... done.
> sleeping 10 seconds ... done.
> Reading from /home/admin_team/letessier/ddt.25259 ... done.
>  35840MiB    KiB/s  CPU%
> Write       277451     3
> Read        188682     1
> [letessier at node031 ~]$ logout
> [root at node031 ~]# ddt -t 35g /home/
> Writing to /home/ddt.25559 ... syncing ... done.
> sleeping 10 seconds ... done.
> Reading from /home/ddt.25559 ... done.
>  35840MiB    KiB/s  CPU%
> Write       196539     2
> Read        438944     3
> Notice the read/write throughput differences when I am root and when I am a regular user.
>
> Thanks.
> Geoffrey
> ------------------------------------------------------
> Geoffrey Letessier
> Responsable informatique & ingénieur système
> UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
> Institut de Biologie Physico-Chimique
> 13, rue Pierre et Marie Curie - 75005 Paris
> Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr
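(If the option referred to above is network.inode-lru-limit, a value of 1 leaves the bricks with essentially no inode cache; a sketch of putting it back, assuming the volume name vol_home and the usual default of 16384:)

# gluster volume reset vol_home network.inode-lru-limit        ## back to the default value
# gluster volume set vol_home network.inode-lru-limit 16384    ## or set it explicitly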
> On 7 August 2015 at 14:57, Vijaikumar M <vmallika at redhat.com> wrote:
>
>> On Friday 07 August 2015 05:34 PM, Geoffrey Letessier wrote:
>>
>>> Hi Vijay,
>>>
>>> My brick-log issue and the big performance problem began when I upgraded Gluster to version 3.7.3; before that, write throughput was good enough (~500MBs) -but not as good as with GlusterFS 3.5.3 (especially with distributed volumes)- and I did not notice these problems in the brick logs.
>>>
>>> OK... live update:
>>>
>>> I just disabled quota for my home volume and now the performance looks relatively better (around 300MBs), but I still see the logs (on storage1 and its replica storage2) growing with only this kind of line:
>>> [2015-08-07 11:16:51.746142] E [dict.c:1418:dict_copy_with_ref] (-->/usr/lib64/glusterfs/3.7.3/xlator/protocol/server.so(server_resolve_inode+0x60) [0x7f85e9a6a410] -->/usr/lib64/glusterfs/3.7.3/xlator/protocol/server.so(resolve_gfid+0x88) [0x7f85e9a6a188] -->/usr/lib64/libglusterfs.so.0(dict_copy_with_ref+0xa4) [0x3e99c20674] ) 0-dict: invalid argument: dict [Argument invalide]
>>>
>> We have root-caused the log issue; bug# 1244613 tracks it.
>>
>>> After a few minutes, my write throughput now seems correct (~550MBs) but the logs are still growing (not to say exploding). So part of the problem seems to originate in the quota management system.
>>> ... after a few minutes (and still only one client connected), it is now the read operation that is very, very slow -I'm going crazy! :/-
>>> # ddt -t 50g /home/
>>> Writing to /home/ddt.11293 ... syncing ... done.
>>> sleeping 10 seconds ... done.
>>> Reading from /home/ddt.11293 ... done.
>>>  35840MiB    KiB/s  CPU%
>>> Write       568201     5
>>> Read        567008     4
>>> # ddt -t 50g /home/
>>> Writing to /home/ddt.11397 ... syncing ... done.
>>> sleeping 10 seconds ... done.
>>> Reading from /home/ddt.11397 ... done.
>>>  51200MiB    KiB/s  CPU%
>>> Write       573631     5
>>> Read        164716     1
>>>
>>> and my logs are still exploding...
>>>
>>> After re-enabling quota on my volume:
>>> # ddt -t 50g /home/
>>> Writing to /home/ddt.11817 ... syncing ... done.
>>> sleeping 10 seconds ... done.
>>> Reading from /home/ddt.11817 ... done.
>>>  51200MiB    KiB/s  CPU%
>>> Write       269608     3
>>> Read        160219     1
>>>
>>> Thanks
>>> Geoffrey
>>> ------------------------------------------------------
>>> Geoffrey Letessier
>>> Responsable informatique & ingénieur système
>>> UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
>>> Institut de Biologie Physico-Chimique
>>> 13, rue Pierre et Marie Curie - 75005 Paris
>>> Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr
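(For reference, the quota toggling described above maps onto the commands below; a sketch, assuming the volume is vol_home. Disabling quota drops the configured limits, so they have to be re-created after quota is re-enabled; the path and size on the last line are placeholders only.)

# gluster volume quota vol_home disable
# gluster volume quota vol_home enable
# gluster volume quota vol_home list                          ## configured limits and current usage
# gluster volume quota vol_home limit-usage /some/dir 1TB     ## re-create a limit (placeholder path and size)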
>>>
>>> On 7 August 2015 at 06:28, Vijaikumar M <vmallika at redhat.com> wrote:
>>>
>>>> Hi Geoffrey,
>>>>
>>>> Some performance improvements have been made to quota in glusterfs-3.7.3.
>>>> Could you upgrade to glusterfs-3.7.3 and see if this helps?
>>>>
>>>> Thanks,
>>>> Vijay
>>>>
>>>> On Friday 07 August 2015 05:02 AM, Geoffrey Letessier wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> No ideas to help me fix this issue? (huge logs, write performance divided by 4, etc.)
>>>>>
>>>>> For comparison, here are two volumes:
>>>>>
>>>>> - home: distributed over 4 bricks / 2 nodes (and replicated on 4 other bricks / 2 other nodes):
>>>>> # ddt -t 35g /home
>>>>> Writing to /home/ddt.24172 ... syncing ... done.
>>>>> sleeping 10 seconds ... done.
>>>>> Reading from /home/ddt.24172 ... done.
>>>>>  33792MiB    KiB/s  CPU%
>>>>> Write       103659     1
>>>>> Read        391955     3
>>>>>
>>>>> - workdir: distributed over 4 bricks / 2 nodes (on the same RAID volumes and servers as home):
>>>>> # ddt -t 35g /workdir
>>>>> Writing to /workdir/ddt.24717 ... syncing ... done.
>>>>> sleeping 10 seconds ... done.
>>>>> Reading from /workdir/ddt.24717 ... done.
>>>>>  35840MiB    KiB/s  CPU%
>>>>> Write       738314     4
>>>>> Read        536497     4
>>>>>
>>>>> For information, previously on version 3.5.3-2 I obtained roughly 1.1GBs for the workdir volume and ~550-600MBs for home.
>>>>>
>>>>> All my tests (cp, rsync, etc.) give me the same result (write throughput between 100MBs and 150MBs).
>>>>>
>>>>> Thanks.
>>>>> Geoffrey
>>>>> ------------------------------------------------------
>>>>> Geoffrey Letessier
>>>>> Responsable informatique & ingénieur système
>>>>> UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
>>>>> Institut de Biologie Physico-Chimique
>>>>> 13, rue Pierre et Marie Curie - 75005 Paris
>>>>> Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr
>>>>>
>>>>> On 5 August 2015 at 10:40, Geoffrey Letessier <geoffrey.letessier at cnrs.fr> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> In addition, given that I re-enabled logging (brick-log-level = INFO instead of CRITICAL) only for the duration of the file creation (i.e. a few minutes), have you noticed the log sizes and the number of lines inside:
>>>>>> # ls -lh storage*
>>>>>> -rw------- 1 letessier staff  18M  5 aoû 00:54 storage1__export-brick_home-brick1-data.log
>>>>>> -rw------- 1 letessier staff 2,1K  5 aoû 00:54 storage1__export-brick_home-brick2-data.log
>>>>>> -rw------- 1 letessier staff  15M  5 aoû 00:56 storage2__export-brick_home-brick1-data.log
>>>>>> -rw------- 1 letessier staff 2,1K  5 aoû 00:54 storage2__export-brick_home-brick2-data.log
>>>>>> -rw------- 1 letessier staff  47M  5 aoû 00:55 storage3__export-brick_home-brick1-data.log
>>>>>> -rw------- 1 letessier staff 2,1K  5 aoû 00:54 storage3__export-brick_home-brick2-data.log
>>>>>> -rw------- 1 letessier staff  47M  5 aoû 00:55 storage4__export-brick_home-brick1-data.log
>>>>>> -rw------- 1 letessier staff 2,1K  5 aoû 00:55 storage4__export-brick_home-brick2-data.log
>>>>>>
>>>>>> # wc -l storage*
>>>>>>   55381 storage1__export-brick_home-brick1-data.log
>>>>>>      17 storage1__export-brick_home-brick2-data.log
>>>>>>   41636 storage2__export-brick_home-brick1-data.log
>>>>>>      17 storage2__export-brick_home-brick2-data.log
>>>>>>  270360 storage3__export-brick_home-brick1-data.log
>>>>>>      17 storage3__export-brick_home-brick2-data.log
>>>>>>  270358 storage4__export-brick_home-brick1-data.log
>>>>>>      17 storage4__export-brick_home-brick2-data.log
>>>>>>  637803 total
>>>>>>
>>>>>> If I leave brick-log-level at INFO, the brick log files on each server will consume all of my /var partition capacity within only a few hours or days...
>>>>>>
>>>>>> Thanks in advance,
>>>>>> Geoffrey
>>>>>> ------------------------------------------------------
>>>>>> Geoffrey Letessier
>>>>>> Responsable informatique & ingénieur système
>>>>>> UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
>>>>>> Institut de Biologie Physico-Chimique
>>>>>> 13, rue Pierre et Marie Curie - 75005 Paris
>>>>>> Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr
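(To keep /var from filling up while this is being investigated, the brick log level can be lowered and the brick logs rotated more aggressively; a sketch, assuming the volume is vol_home. Most GlusterFS packages already ship a logrotate policy for these files, so the stanza below is only a stop-gap illustration.)

# gluster volume set vol_home diagnostics.brick-log-level WARNING    ## or CRITICAL, as mentioned above
# cat > /etc/logrotate.d/glusterfs-bricks-tmp << 'EOF'
/var/log/glusterfs/bricks/*.log {
    daily
    rotate 7
    compress
    missingok
    notifempty
    copytruncate
}
EOF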
>>>>>>> >>>>>>> ------------------------------------------------------ >>>>>>> Geoffrey Letessier >>>>>>> Responsable informatique & ing?nieur syst?me >>>>>>> UPR 9080 - CNRS - Laboratoire de Biochimie Th?orique >>>>>>> Institut de Biologie Physico-Chimique >>>>>>> 13, rue Pierre et Marie Curie - 75005 Paris >>>>>>> Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr >>>>>>> <bricks-logs.tgz> >>>>>> >>>>> >>>> >>> >> > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > http://www.gluster.org/mailman/listinfo/gluster-users >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20150809/c9b48f55/attachment.html>
Raghavendra Talur
2015-Aug-10 14:04 UTC
[Gluster-users] Cascading errors and very bad write performance
We will look into the issue from the RDMA side and let you know if we find something.

A few more details about your setup are needed:

1. ls -l output on all the brick mounts.
2. Did you change the transport type of the volume while the volume was running?

Raghavendra Talur
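(A sketch of how those two items could be collected, assuming the brick paths seen in the logs and the volume name vol_home; to be run on each storage node:)

# ls -l /export/brick_home/brick1/data /export/brick_home/brick2/data    ## item 1: listing of the brick mounts
# gluster volume info vol_home | grep -i transport                       ## item 2: transport type currently configured
# gluster volume status vol_home                                         ## which bricks and ports are actually online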
On Sun, Aug 9, 2015 at 2:23 PM, Geoffrey Letessier <geoffrey.letessier at cnrs.fr> wrote:

> [...]