Geoffrey Letessier
2015-Jun-20 00:12 UTC
[Gluster-users] GlusterFS 3.5.3 - untar: very poor performance
Dear all,

I just noticed that I/O operations on the main volume of my HPC cluster have become impressively poor. Running some file operations on a compressed Linux kernel source archive (roughly 80 MB, with some 52,000 files inside), the untar operation alone can take more than half an hour, as you can read below:

#######################################################
################ UNTAR time consumed  ################
#######################################################

real	32m42.967s
user	0m11.783s
sys	0m15.050s

#######################################################
################# DU time consumed  ##################
#######################################################

557M	linux-4.1-rc6

real	0m25.060s
user	0m0.068s
sys	0m0.344s

#######################################################
################# FIND time consumed  ################
#######################################################

52663

real	0m25.687s
user	0m0.084s
sys	0m0.387s

#######################################################
################# GREP time consumed  ################
#######################################################

7952

real	2m15.890s
user	0m0.887s
sys	0m2.777s

#######################################################
################# TAR time consumed  #################
#######################################################

real	1m5.551s
user	0m26.536s
sys	0m2.609s

#######################################################
################# RM time consumed  ##################
#######################################################

real	2m51.485s
user	0m0.167s
sys	0m1.663s

For information, this volume is a distributed replicated one, composed of 4 servers with 2 bricks each. Each brick is a 12-drive RAID6 vdisk with good native performance (around 1.2 GB/s).

In comparison, when I use dd to generate a 100 GB file on the same volume, my write throughput is around 1 GB/s on the client side and 500 MB/s on the server side (because of replication):

Client side:
[root@node056 ~]# ifstat -i ib0
       ib0
 KB/s in  KB/s out
 3251.45  1.09e+06
 3139.80  1.05e+06
 3185.29  1.06e+06
 3293.84  1.09e+06
 ...

Server side:
[root@lucifer ~]# ifstat -i ib0
       ib0
 KB/s in  KB/s out
561818.1   1746.42
560020.3   1737.92
526337.1   1648.20
513972.7   1613.69
 ...

dd command:
[root@node056 ~]# dd if=/dev/zero of=/home/root/test.dd bs=1M count=100000
100000+0 records in
100000+0 records out
104857600000 bytes (105 GB) copied, 202.99 s, 517 MB/s

So this issue does not seem to come from the network (InfiniBand, in this case).

You can find attached a set of files:
 - mybench.sh: the bench script
 - benches.txt: the output of my "bench"
 - profile.txt: gluster volume profile taken during the "bench"
 - vol_status.txt: gluster volume status
 - vol_info.txt: gluster volume info

Can someone help me fix this? It is very critical, because this volume is on an HPC cluster in production.

Thanks in advance,
Geoffrey
-----------------------------------------------
Geoffrey Letessier
IT manager & systems engineer
CNRS - UPR 9080 - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@cnrs.fr
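For readers who cannot retrieve the attachments, below is a minimal sketch of a bench script equivalent to the operations timed above. The working directory, tarball name, and grep pattern are illustrative assumptions, not the attached mybench.sh verbatim:

#!/bin/bash
# Minimal bench sketch: times the same six operations reported above.
# Assumptions: the Gluster volume is mounted at /mnt/home and the
# kernel source tarball is already present there.
cd /mnt/home/bench || exit 1
TARBALL=linux-4.1-rc6.tar.xz
SRCDIR=linux-4.1-rc6

echo "UNTAR"; time tar xf "$TARBALL"
echo "DU";    time du -sh "$SRCDIR"
echo "FIND";  time find "$SRCDIR" -type f | wc -l
echo "GREP";  time grep -r cpufreq "$SRCDIR" | wc -l   # search pattern is illustrative
echo "TAR";   time tar cf "$SRCDIR.tar" "$SRCDIR"
echo "RM";    time rm -rf "$SRCDIR" "$SRCDIR.tar"

Run it from a directory on the Gluster mount, so every operation exercises the volume rather than local disk.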
Geoffrey Letessier
2015-Jun-20 00:34 UTC
[Gluster-users] GlusterFS 3.5.3 - untar: very poor performance
Re,

For comparison, here is the output of the same script run on a distributed-only volume (2 of the 4 servers described previously, with 2 bricks each):

#######################################################
################ UNTAR time consumed  ################
#######################################################

real	1m44.698s
user	0m8.891s
sys	0m8.353s

#######################################################
################# DU time consumed  ##################
#######################################################

554M	linux-4.1-rc6

real	0m21.062s
user	0m0.100s
sys	0m1.040s

#######################################################
################# FIND time consumed  ################
#######################################################

52663

real	0m21.325s
user	0m0.104s
sys	0m1.054s

#######################################################
################# GREP time consumed  ################
#######################################################

7952

real	0m43.618s
user	0m0.922s
sys	0m3.626s

#######################################################
################# TAR time consumed  #################
#######################################################

real	0m50.577s
user	0m29.745s
sys	0m4.086s

#######################################################
################# RM time consumed  ##################
#######################################################

real	0m41.133s
user	0m0.171s
sys	0m2.522s

The difference in performance is amazing!

Geoffrey
-----------------------------------------------
Geoffrey Letessier
IT manager & systems engineer
CNRS - UPR 9080 - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@cnrs.fr

On 20 Jun 2015, at 02:12, Geoffrey Letessier <geoffrey.letessier@cnrs.fr> wrote:

> [original message quoted in full above; trimmed]
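For reference, a profile like the attached profile.txt can be captured with the standard gluster CLI while the bench runs; VOLNAME below is a placeholder, since the actual volume name does not appear in these messages:

# Capture per-brick FOP statistics for the duration of the bench.
gluster volume profile VOLNAME start
./mybench.sh                                   # run the bench while profiling is active
gluster volume profile VOLNAME info > profile.txt
gluster volume profile VOLNAME stop

The info output lists per-brick, per-FOP latency statistics, which is what makes runs on the replicated and distributed-only volumes directly comparable.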