Geoffrey Letessier
2015-Aug-07 21:47 UTC
[Gluster-users] Cascading errors and very bad write performance
I'm not really sure I fully understand your answer. I tried setting inode-lru-limit to 1, but I cannot see any positive effect.

When I re-run the ddt application, I see 2 kinds of messages:

[2015-08-07 21:29:21.792156] W [marker-quota.c:3379:_mq_initiate_quota_txn] 0-vol_home-marker: parent is NULL for <gfid:5a32328a-7fd9-474e-9bc6-cafde9c41af7>, aborting updation txn
[2015-08-07 21:29:21.792176] W [marker-quota.c:3379:_mq_initiate_quota_txn] 0-vol_home-marker: parent is NULL for <gfid:5a32328a-7fd9-474e-9bc6-cafde9c41af7>, aborting updation txn

and/or:

[2015-08-07 21:44:19.279971] E [marker-quota.c:2990:mq_start_quota_txn_v2] 0-vol_home-marker: contribution node list is empty (31d7bf88-b63a-4731-a737-a3dce73b8cd1)
[2015-08-07 21:41:26.177095] E [dict.c:1418:dict_copy_with_ref] (-->/usr/lib64/glusterfs/3.7.3/xlator/protocol/server.so(server_resolve_inode+0x60) [0x7f85e9a6a410] -->/usr/lib64/glusterfs/3.7.3/xlator/protocol/server.so(resolve_gfid+0x88) [0x7f85e9a6a188] -->/usr/lib64/libglusterfs.so.0(dict_copy_with_ref+0xa4) [0x3e99c20674] ) 0-dict: invalid argument: dict [Argument invalide]

And concerning the bad I/O performance:

[letessier at node031 ~]$ ddt -t 35g /home/admin_team/letessier/
Writing to /home/admin_team/letessier/ddt.25259 ... syncing ... done.
sleeping 10 seconds ... done.
Reading from /home/admin_team/letessier/ddt.25259 ... done.
        35840MiB    KiB/s   CPU%
Write     277451        3
Read      188682        1
[letessier at node031 ~]$ logout
[root at node031 ~]# ddt -t 35g /home/
Writing to /home/ddt.25559 ... syncing ... done.
sleeping 10 seconds ... done.
Reading from /home/ddt.25559 ... done.
        35840MiB    KiB/s   CPU%
Write     196539        2
Read      438944        3

Notice the read/write throughput differences when I'm root and when I'm a simple user.

Thanks.
Geoffrey
------------------------------------------------------
Geoffrey Letessier
Responsable informatique & ingénieur système
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr

On 7 Aug 2015, at 14:57, Vijaikumar M <vmallika at redhat.com> wrote:

>
> On Friday 07 August 2015 05:34 PM, Geoffrey Letessier wrote:
>> Hi Vijay,
>>
>> My brick-log issue and the big performance problem began when I upgraded Gluster to version 3.7.3; before that, write throughput was good enough (~500MB/s), though not as good as with GlusterFS 3.5.3 (especially with distributed volumes), and I didn't notice these problems in the brick logs.
>>
>> OK, trying things live:
>>
>> I just disabled the quota for my home volume and now my performance appears to be relatively better (around 300MB/s), but I still see the logs (from storage1 and its replica storage2) growing with only this kind of line:
>> [2015-08-07 11:16:51.746142] E [dict.c:1418:dict_copy_with_ref] (-->/usr/lib64/glusterfs/3.7.3/xlator/protocol/server.so(server_resolve_inode+0x60) [0x7f85e9a6a410] -->/usr/lib64/glusterfs/3.7.3/xlator/protocol/server.so(resolve_gfid+0x88) [0x7f85e9a6a188] -->/usr/lib64/libglusterfs.so.0(dict_copy_with_ref+0xa4) [0x3e99c20674] ) 0-dict: invalid argument: dict [Argument invalide]
>>
> We have root-caused the log issue; bug# 1244613 tracks this issue.
>
>
>> After a few minutes my write throughput seems to be correct again (~550MB/s), but the logs are still growing (not to say exploding). So one part of the problem looks like it originates in the quota management system.
>> ... and after a few more minutes (still with only 1 client connected), it is now the read operation which is very, very slow (I'm going to go crazy! :/):
>> # ddt -t 50g /home/
>> Writing to /home/ddt.11293 ... syncing ... done.
>> sleeping 10 seconds ... done.
>> Reading from /home/ddt.11293 ... done.
>>         35840MiB    KiB/s   CPU%
>> Write     568201        5
>> Read      567008        4
>> # ddt -t 50g /home/
>> Writing to /home/ddt.11397 ... syncing ... done.
>> sleeping 10 seconds ... done.
>> Reading from /home/ddt.11397 ... done.
>>         51200MiB    KiB/s   CPU%
>> Write     573631        5
>> Read      164716        1
>>
>> and my logs are still exploding...
>>
>> After having re-enabled the quota on my volume:
>> # ddt -t 50g /home/
>> Writing to /home/ddt.11817 ... syncing ... done.
>> sleeping 10 seconds ... done.
>> Reading from /home/ddt.11817 ... done.
>>         51200MiB    KiB/s   CPU%
>> Write     269608        3
>> Read      160219        1
>>
>> Thanks
>> Geoffrey
>> ------------------------------------------------------
>> Geoffrey Letessier
>> Responsable informatique & ingénieur système
>> UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
>> Institut de Biologie Physico-Chimique
>> 13, rue Pierre et Marie Curie - 75005 Paris
>> Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr
>>
>> On 7 Aug 2015, at 06:28, Vijaikumar M <vmallika at redhat.com> wrote:
>>
>>> Hi Geoffrey,
>>>
>>> Some performance improvements have been made to quota in glusterfs-3.7.3.
>>> Could you upgrade to glusterfs-3.7.3 and see if this helps?
>>>
>>> Thanks,
>>> Vijay
>>>
>>>
>>> On Friday 07 August 2015 05:02 AM, Geoffrey Letessier wrote:
>>>> Hi,
>>>>
>>>> Any idea to help me fix this issue (big logs, write performance divided by 4, etc.)?
>>>>
>>>> For comparison, here are two volumes:
>>>>
>>>> - home: distributed on 4 bricks / 2 nodes (and replicated on 4 other bricks / 2 other nodes):
>>>> # ddt -t 35g /home
>>>> Writing to /home/ddt.24172 ... syncing ... done.
>>>> sleeping 10 seconds ... done.
>>>> Reading from /home/ddt.24172 ... done.
>>>>         33792MiB    KiB/s   CPU%
>>>> Write     103659        1
>>>> Read      391955        3
>>>>
>>>> - workdir: distributed on 4 bricks / 2 nodes (on the same RAID volumes and servers as home):
>>>> # ddt -t 35g /workdir
>>>> Writing to /workdir/ddt.24717 ... syncing ... done.
>>>> sleeping 10 seconds ... done.
>>>> Reading from /workdir/ddt.24717 ... done.
>>>>         35840MiB    KiB/s   CPU%
>>>> Write     738314        4
>>>> Read      536497        4
>>>>
>>>> For information, previously on version 3.5.3-2 I obtained roughly 1.1GB/s for the workdir volume and ~550-600MB/s for home.
>>>>
>>>> All my tests (cp, rsync, etc.) give me the same result (write throughput between 100MB/s and 150MB/s).
>>>>
>>>> Thanks.
>>>> Geoffrey
>>>> ------------------------------------------------------
>>>> Geoffrey Letessier
>>>> Responsable informatique & ingénieur système
>>>> UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
>>>> Institut de Biologie Physico-Chimique
>>>> 13, rue Pierre et Marie Curie - 75005 Paris
>>>> Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr
>>>>
>>>> On 5 Aug 2015, at 10:40, Geoffrey Letessier <geoffrey.letessier at cnrs.fr> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> In addition, knowing that I reactivated logging (brick-log-level = INFO instead of CRITICAL) only for the duration of the file creation (i.e. a few minutes), have you noticed the log sizes and the number of lines inside them:
>>>>> # ls -lh storage*
>>>>> -rw------- 1 letessier staff  18M  5 août 00:54 storage1__export-brick_home-brick1-data.log
>>>>> -rw------- 1 letessier staff 2,1K  5 août 00:54 storage1__export-brick_home-brick2-data.log
>>>>> -rw------- 1 letessier staff  15M  5 août 00:56 storage2__export-brick_home-brick1-data.log
>>>>> -rw------- 1 letessier staff 2,1K  5 août 00:54 storage2__export-brick_home-brick2-data.log
>>>>> -rw------- 1 letessier staff  47M  5 août 00:55 storage3__export-brick_home-brick1-data.log
>>>>> -rw------- 1 letessier staff 2,1K  5 août 00:54 storage3__export-brick_home-brick2-data.log
>>>>> -rw------- 1 letessier staff  47M  5 août 00:55 storage4__export-brick_home-brick1-data.log
>>>>> -rw------- 1 letessier staff 2,1K  5 août 00:55 storage4__export-brick_home-brick2-data.log
>>>>>
>>>>> # wc -l storage*
>>>>>    55381 storage1__export-brick_home-brick1-data.log
>>>>>       17 storage1__export-brick_home-brick2-data.log
>>>>>    41636 storage2__export-brick_home-brick1-data.log
>>>>>       17 storage2__export-brick_home-brick2-data.log
>>>>>   270360 storage3__export-brick_home-brick1-data.log
>>>>>       17 storage3__export-brick_home-brick2-data.log
>>>>>   270358 storage4__export-brick_home-brick1-data.log
>>>>>       17 storage4__export-brick_home-brick2-data.log
>>>>>   637803 total
>>>>>
>>>>> If I leave brick-log-level at INFO, the brick log files on each server will consume all of my /var partition capacity within only a few hours/days...
>>>>>
>>>>> Thanks in advance,
>>>>> Geoffrey
>>>>> ------------------------------------------------------
>>>>> Geoffrey Letessier
>>>>> Responsable informatique & ingénieur système
>>>>> UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
>>>>> Institut de Biologie Physico-Chimique
>>>>> 13, rue Pierre et Marie Curie - 75005 Paris
>>>>> Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr
>>>>>
>>>>> On 5 Aug 2015, at 01:12, Geoffrey Letessier <geoffrey.letessier at cnrs.fr> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Since the problem mentioned previously (all the errors noticed in the brick log files), I am seeing very, very bad performance: my write performance is divided by 4 compared to before (and it was not so good to begin with).
>>>>>> Now, for a write of a 33GB file, my write throughput is around 150MB/s (over InfiniBand); before, it was around 550-600MB/s, and this with both the RDMA and TCP protocols.
>>>>>>
>>>>>> During this test, more than 40,000 error lines (like the following) were added to the brick log files:
>>>>>> [2015-08-04 22:34:27.337622] E [dict.c:1418:dict_copy_with_ref] (-->/usr/lib64/glusterfs/3.7.3/xlator/protocol/server.so(server_resolve_inode+0x60) [0x7f021c6f7410] -->/usr/lib64/glusterfs/3.7.3/xlator/protocol/server.so(resolve_gfid+0x88) [0x7f021c6f7188] -->/usr/lib64/libglusterfs.so.0(dict_copy_with_ref+0xa4) [0x7f0229cba674] ) 0-dict: invalid argument: dict [Argument invalide]
>>>>>>
>>>>>> All brick log files are in attachments.
>>>>>>
>>>>>> Thanks in advance for all your help and fixes,
>>>>>> Best,
>>>>>> Geoffrey
>>>>>>
>>>>>> PS: is it possible to easily downgrade GlusterFS from 3.7 to a previous version (for example v3.5)?
>>>>>>
>>>>>> ------------------------------------------------------
>>>>>> Geoffrey Letessier
>>>>>> Responsable informatique & ingénieur système
>>>>>> UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
>>>>>> Institut de Biologie Physico-Chimique
>>>>>> 13, rue Pierre et Marie Curie - 75005 Paris
>>>>>> Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr
>>>>>> <bricks-logs.tgz>
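For readers who want to reproduce the tuning steps discussed in this thread, the settings mentioned above (quota on/off, inode-lru-limit, brick-log-level) are all driven through the gluster CLI. A minimal sketch, assuming the volume is named vol_home (as the 0-vol_home-marker log prefix suggests) and using illustrative values rather than recommendations:

# Show the volume layout and any reconfigured options
gluster volume info vol_home

# Raise the inode LRU cache limit (50000 is an illustrative value, not a recommendation)
gluster volume set vol_home network.inode-lru-limit 50000

# Toggle quota off/on around a benchmark run, as done in the thread
gluster volume quota vol_home disable
gluster volume quota vol_home enable

# Quieten the brick logs again once debugging is finished
gluster volume set vol_home diagnostics.brick-log-level CRITICAL

# Optional sanity check with plain dd if ddt is not available (size is illustrative)
dd if=/dev/zero of=/home/ddtest.bin bs=1M count=35840 conv=fdatasync

Re-running the same benchmark after each individual change makes it easier to see which setting is responsible for the throughput swings reported above.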
Mathieu Chateau
2015-Aug-08 08:02 UTC
[Gluster-users] Cascading errors and very bad write performance
Maybe related to the insecure-port issue that was reported? Try with:

gluster volume set xxx server.allow-insecure on

Cordialement,
Mathieu CHATEAU
http://www.lotp.fr

2015-08-07 23:47 GMT+02:00 Geoffrey Letessier <geoffrey.letessier at cnrs.fr>:

> I'm not really sure I fully understand your answer.
> [...]
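A sketch of how the suggestion above is typically applied, assuming the volume is vol_home; pairing the volume option with the glusterd.vol setting is the usual companion step for insecure-port problems, but verify both against the documentation for your 3.7.x release:

# On any server: allow client connections from non-privileged (>1024) ports to this volume
gluster volume set vol_home server.allow-insecure on

# Usual companion step on every server: add the line
#   option rpc-auth-allow-insecure on
# to /etc/glusterfs/glusterd.vol, then restart the management daemon
service glusterd restart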