Resent to gluster-devel and gluster-users. Does anyone have any ideas about
what might be causing this issue?
------ Forwarded Message ------
From: "David Robinson" <drobinson at corvidtec.com>
To: "Shyam" <srangana at redhat.com>
Cc: "Patrick Glomski" <patrick.glomski at corvidtec.com>
Sent: 8/17/2015 6:09:07 PM
Subject: Re[2]: [Red Hat Bugzilla] Your Outstanding Requests
Shyam,
I am revisiting this with some additional testing. I think that a lot
of the "small file performance" issues that people bring up could be
related to this bug. I know that it is causing major problems on my
production system, which continues to slow down as files are added. We
currently have 200TB on that system, and it is becoming fairly unusable
due to the increasing slowdowns we are seeing: for example, 'ls -R boost'
has gone from 5 seconds to over 3 minutes.

The test is repeatable, and it does not appear to be related to XFS or
to the system hardware or configuration. Let me know if you are okay with
me posting this to the gluster-users and gluster-devel lists. I would
like to know if anyone else is seeing the same behavior.
The steps I took are:

1. Run speedtest1 (see attached; a sketch of it follows this list). It:
   - creates a gluster volume (testbrick01) and mounts it
   - times the operations (the tar extraction is done 10x to show the
     slowdown; the slowdown is directly proportional to the number of
     files on the volume)
   - repeats the timings directly on the XFS partition (/data/brick01)
2. Reboot the machine and run speedtest2 (see attached), which repeats
   the timings from speedtest1 to show the slowdown after the reboot.
   *** After the reboot, all of the operations are significantly slowed
   down. The slowdown isn't just in the creation of the symbolic links
   in .glusterfs as previously thought. ***
   The timings done after the reboot directly on the XFS partition
   (/data/brick01) do not show any slowdown, which indicates that it
   isn't an XFS issue, right?
3. Run speedtest3 (see attached): create a new gluster volume
   (testbrick02), mount it, and repeat the above steps to show that
   everything works fine on a fresh volume.
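In outline, speedtest1 does something like the following (a sketch only;
the attached script is authoritative, and the tarball path, mount point,
and run directories below are placeholders of mine):

    # create and start a single-brick volume on the XFS partition
    gluster volume create testbrick01 gfstest.corvidtec.com:/data/brick01/testbrick01
    gluster volume start testbrick01

    # fuse-mount the volume
    mkdir -p /testbrick01
    mount -t glusterfs gfstest.corvidtec.com:/testbrick01 /testbrick01
    cd /testbrick01

    # extract the boost tarball 10x so that files accumulate on the volume
    for i in $(seq 1 10); do
        mkdir run$i
        time tar -xPf /root/boost_1_57_0.tar -C run$i
    done

    # time the metadata-heavy operations on the gluster mount
    time du -h -s .
    time find . > /dev/null
    time ls -R . > /dev/null

    # repeat the same timings directly on the XFS brick for comparison
    cd /data/brick01/testbrick01
    time du -h -s .
    time find . > /dev/null
    time ls -R . > /dev/null

speedtest2 is essentially the timing block re-run against the existing
volume after the reboot, and speedtest3 is the same as speedtest1 with
testbrick02 in place of testbrick01.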
After a reboot, the gluster volume slows down significantly, and the
amount of the slowdown is proportional to the number of files (not the
amount of space used). After only 10 extractions, there are roughly
7.5 million files in the volume, and the extraction time has gone from
1.5 minutes to 5 minutes. The "du -h" went from 8 seconds to
30 seconds. Note that the speedtest2 timings will keep getting
worse as you add more files to the gluster volume. Summary of timing
results:
Gluster Timing

Test                 speedtest1   speedtest2   speedtest3
gluster: tar -xPf    1m30s        3.5-8m       1m30s
gluster: du -h -s    8s           30s          7s
gluster: find        8s           27s          7s
gluster: ls -R       8s           18s          7s
xfs: tar -xPf        2s           2s           2s
xfs: du -h -s        0.07s        0.07s        0.07s
xfs: find            0.09s        0.09s        0.09s
xfs: ls -R           0.13s        0.13s        0.13s
Note that the good news is that there is absolutely no slowdown in the
system if you do not reboot, so gluster performs extremely well up to
the point of a reboot. I created a test volume and extracted the boost
directory 100 times to create 70 million files, and there was no
detectable slowdown even with this extremely large number of files.
However, after rebooting the system, the tar extraction went to 20+
minutes and an 'ls -R' went to almost 3 minutes.
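For that test, the population loop was essentially the 10x loop from the
sketch above, scaled up (again, the tarball path is a placeholder):

    # extract boost 100 times into separate directories (~70 million files)
    for i in $(seq 1 100); do
        mkdir run$i
        time tar -xPf /root/boost_1_57_0.tar -C run$i
    done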
Are the gluster startup options after a reboot different from those used
when the volume is initially created?
Are the mount options used during system startup different from those
used by a manual mount command?
Do you have any other ideas about what could be different after a reboot?
Keep in mind that after a reboot, I can create a new gluster volume and
extract the files many, many times with no slowdown. The slowdown only
occurs on an existing gluster volume after a machine reboot. Even after a
reboot, if I create a clean volume, there is no slowdown until the next
machine reboot.
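One way I can think of to check for such differences is to capture and
diff the mount state across the reboot; roughly (a sketch, with nothing
assumed beyond the fuse mount):

    # before the reboot: record the mount options and glusterfs process arguments
    grep gluster /proc/mounts > /root/mounts.before
    ps axww | grep '[g]lusterfs' > /root/procs.before

    # after the reboot (remounting first if it isn't automatic): record again and compare
    grep gluster /proc/mounts > /root/mounts.after
    ps axww | grep '[g]lusterfs' > /root/procs.after
    diff /root/mounts.before /root/mounts.after
    diff /root/procs.before /root/procs.after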
Note: The log files attached have the "No data available" messages
filtered out to reduce the file size; there were an enormous number of
these (a sample of the filtering command follows the excerpt below). One
of my colleagues posted to the list about these errors in 3.7.3.

>[2015-08-17 17:03:37.270219] W [fuse-bridge.c:1230:fuse_err_cbk] 0-glusterfs-fuse: 6643: REMOVEXATTR() /boost_1_57_0/boost/accumulators/accumulators.hpp => -1 (No data available)
>[2015-08-17 17:03:37.271004] W [fuse-bridge.c:1230:fuse_err_cbk] 0-glusterfs-fuse: 6646: REMOVEXATTR() /boost_1_57_0/boost/accumulators/accumulators.hpp => -1 (No data available)
>[2015-08-17 17:03:37.271663] W [fuse-bridge.c:1230:fuse_err_cbk] 0-glusterfs-fuse: 6648: REMOVEXATTR() /boost_1_57_0/boost/accumulators/accumulators.hpp => -1 (No data available)
>[2015-08-17 17:03:37.274273] W [fuse-bridge.c:1230:fuse_err_cbk] 0-glusterfs-fuse: 6662: REMOVEXATTR() /boost_1_57_0/boost/accumulators/accumulators_fwd.hpp => -1 (No data available)
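The filtering I used on the attached logs was along these lines (the raw
log file name is a placeholder):

    # strip the extremely numerous REMOVEXATTR warnings before attaching
    grep -v 'No data available' testbrick01.log.1 | gzip > testbrick01.log.1.small.gz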
System details: Scientific Linux 7.1, XFS, the latest gluster (3.7.3),
and the simplest volume possible (a single brick without replication,
distribution, etc.):

>>[root@gfstest ~]# cat /etc/redhat-release
>>Scientific Linux release 7.1 (Nitrogen)
>>[root@gfstest ~]# uname -a
>>Linux gfstest.corvidtec.com 3.10.0-229.11.1.el7.x86_64 #1 SMP Wed Aug 5 14:37:37 CDT 2015 x86_64 x86_64 x86_64 GNU/Linux
>>[root@gfstest ~]# gluster volume info testbrick01
>>
>>Volume Name: testbrick01
>>Type: Distribute
>>Volume ID: 4a003c5c-caef-4838-b62e-ac5d574aadcf
>>Status: Started
>>Number of Bricks: 1
>>Transport-type: tcp
>>Bricks:
>>Brick1: gfstest.corvidtec.com:/data/brick01/testbrick01
>>Options Reconfigured:
>>performance.readdir-ahead: on
>>[root@gfstest ~]# xfs_info /data/brick01
>>meta-data=/dev/sdb1              isize=512    agcount=22, agsize=268435455 blks
>>         =                       sectsz=512   attr=2, projid32bit=1
>>         =                       crc=0        finobt=0
>>data     =                       bsize=4096   blocks=5859966208, imaxpct=5
>>         =                       sunit=0      swidth=0 blks
>>naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
>>log      =internal               bsize=4096   blocks=521728, version=2
>>         =                       sectsz=512   sunit=0 blks, lazy-count=1
>>realtime =none                   extsz=4096   blocks=0, rtextents=0
>>
David
------ Original Message ------
From: "Shyam" <srangana at redhat.com>
To: "David Robinson" <drobinson at corvidtec.com>
Sent: 8/3/2015 5:30:15 PM
Subject: Re: [Red Hat Bugzilla] Your Outstanding Requests
Hi David,

Not much. The perf results, on viewing on our local machines, still had
some symbols missing. We also had a release in between that wrapped up
last week, so things are a bit calmer now.

I am thinking of running this test again internally, since there was a
point where I was able to see this issue in house, so that I can show it
first-hand to the file system folks. I have talked to the perf engineer
about it, and we are working towards a free slot when we can test this;
that should be towards the end of this week or the beginning of the next,
as the systems are tied up in benchmarking runs for the release.

Regards,
Shyam
-------------- next part --------------
Attachments (non-text parts scrubbed by the list):
  speedtest1                  <http://www.gluster.org/pipermail/gluster-users/attachments/20150818/a96f7d44/attachment-0006.obj>
  speedtest2                  <http://www.gluster.org/pipermail/gluster-users/attachments/20150818/a96f7d44/attachment-0007.obj>
  speedtest3                  <http://www.gluster.org/pipermail/gluster-users/attachments/20150818/a96f7d44/attachment-0008.obj>
  speedtest1.log              <http://www.gluster.org/pipermail/gluster-users/attachments/20150818/a96f7d44/attachment-0009.obj>
  speedtest2.log              <http://www.gluster.org/pipermail/gluster-users/attachments/20150818/a96f7d44/attachment-0010.obj>
  speedtest3.log              <http://www.gluster.org/pipermail/gluster-users/attachments/20150818/a96f7d44/attachment-0011.obj>
  testbrick01.log.1.small.gz  <http://www.gluster.org/pipermail/gluster-users/attachments/20150818/a96f7d44/attachment-0003.gz>
  testbrick01.log.2.small.gz  <http://www.gluster.org/pipermail/gluster-users/attachments/20150818/a96f7d44/attachment-0004.gz>
  testbrick02.log.1.small.gz  <http://www.gluster.org/pipermail/gluster-users/attachments/20150818/a96f7d44/attachment-0005.gz>