thr3ads.net - Gluster users - [Gluster-users] Gluster performance on the small files [Feb 2015]

Ben Turner

2015-Feb-16 22:16 UTC

[Gluster-users] Gluster performance on the small files

----- Original Message -----
> From: "Joe Julian" <joe at julianfamily.org>
> To: "Punit Dambiwal" <hypunit at gmail.com>, gluster-users at gluster.org, "Humble Devassy Chirammal" <humble.devassy at gmail.com>
> Sent: Monday, February 16, 2015 3:32:31 PM
> Subject: Re: [Gluster-users] Gluster performance on the small files
> 
> 
> On 02/12/2015 10:58 PM, Punit Dambiwal wrote:
> 
> 
> 
> Hi,
> 
> I have seen that Gluster performance is dead slow on small files, even
> though I am using SSDs. The performance is so bad that I get better
> performance from my SAN with normal SATA disks.
>
> I am using distributed-replicated GlusterFS with replica count=2, and all
> the bricks are on SSDs.
> 
> 
> 
> root at vm3:~# dd bs=64k count=4k if=/dev/zero of=test oflag=dsync
> 
> 4096+0 records in
> 
> 4096+0 records out
> 
> 268435456 bytes (268 MB) copied, 57.3145 s, 4.7 MB/s
> 
This seems pretty slow, even if you are using gigabit.  Here is what I get:

[root at gqac031 smallfile]# dd bs=64k count=4k if=/dev/zero of=/gluster-emptyvol/test oflag=dsync
4096+0 records in
4096+0 records out
268435456 bytes (268 MB) copied, 10.5965 s, 25.3 MB/s

FYI, this is on my 2-node pure replica volume with spinning disks and 10G
networking. The disks are RAID 6, which is not set up for smallfile
workloads; for smallfile I normally use RAID 10.

The single-threaded dd process is definitely a bottleneck here. The power of
distributed systems is in doing things in parallel across clients / threads.
You may want to try smallfile:

http://www.gluster.org/community/documentation/index.php/Performance_Testing

Smallfile command used:

python /small-files/smallfile/smallfile_cli.py --operation create --threads 8 --file-size 64 --files 10000 --top /gluster-emptyvol/ --pause 1000 --host-set "client1, client2"

total threads = 16
total files = 157100
total data =     9.589 GB
 98.19% of requested files processed, minimum is  70.00
41.271602 sec elapsed time
3806.491454 files/sec
3806.491454 IOPS
237.905716 MB/sec
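
As a sanity check, the rates in that output can be reproduced from the totals. This is only a sketch: the function name smallfile_rates is made up for illustration, and it assumes --file-size is given in KB and that smallfile counts 1 MB as 1024 KB (both assumptions are consistent with the figures above, but check them against your smallfile version).

```shell
#!/bin/sh
# Hypothetical helper: recompute smallfile's summary rates from its totals.
# $1 = files processed, $2 = file size in KB (assumed unit), $3 = elapsed seconds
smallfile_rates() {
    awk -v f="$1" -v kb="$2" -v t="$3" 'BEGIN {
        rate = f / t                                  # files per second
        printf "%.2f files/sec\n", rate
        printf "%.2f MB/sec\n", rate * kb / 1024      # assumes 1 MB = 1024 KB
    }'
}

# Totals taken from the run above: 157100 files of 64 KB in 41.271602 s.
smallfile_rates 157100 64 41.271602
```

With one 64 KB write per file, files/sec and IOPS come out the same, which is why the output reports identical values for both.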

If you wanted to do something similar with DD you could do:

<my script>
#!/bin/bash
# Start four synchronous dd writers in parallel, one output file each.
for i in $(seq 1 4)
do
    dd bs=64k count=4k if=/dev/zero of=/gluster-emptyvol/test$i oflag=dsync &
done
# Block until every background dd has exited.
wait

# time myscript.sh

Then do the math to figure out the MB / sec of the system.
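
A rough sketch of that math, in case it helps. The function name aggregate_mb_per_sec is made up for illustration, and the 20-second elapsed time in the last call is a placeholder, not a measurement; substitute the "real" value that time reports. It uses 1 MB = 1,000,000 bytes, which matches how dd printed the figures above.

```shell
#!/bin/sh
# Hypothetical helper: aggregate throughput of N parallel dd writers.
# $1 = number of writers, $2 = bytes written by each, $3 = elapsed seconds
aggregate_mb_per_sec() {
    awk -v n="$1" -v b="$2" -v t="$3" 'BEGIN {
        printf "%.1f\n", (n * b) / t / 1000000   # decimal MB, as dd reports
    }'
}

# Single-writer runs from this thread, as a check against dd's own output:
aggregate_mb_per_sec 1 268435456 10.5965   # the 25.3 MB/s run
aggregate_mb_per_sec 1 268435456 57.3145   # the 4.7 MB/s run

# Four writers of 268435456 bytes each; 20 s is a placeholder elapsed time.
aggregate_mb_per_sec 4 268435456 20
```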

-b 
> 
> 
> root at vm3:~# dd bs=64k count=4k if=/dev/zero of=test conv=fdatasync
> 
> 4096+0 records in
> 
> 4096+0 records out
> 
> 268435456 bytes (268 MB) copied, 1.80093 s, 149 MB/s
> 
> 
> 
> How small is your VM image? The image is the file that GlusterFS is serving,
> not the small files within it. Perhaps the filesystem you're using within
> your VM is inefficient with regard to how it handles disk writes.
>
> I believe your concept of "small file" performance is misunderstood, as is
> often the case with this phrase. The "small file" issue has to do with the
> overhead of finding and checking the validity of any file, but with a small
> file the percentage of time doing those checks is proportionally greater.
> With your VM image, that file is already open. There are no self-heal checks
> or lookups that are happening in your tests, so that overhead is not the
> problem.
> 
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

Punit Dambiwal

2015-Feb-17 02:03 UTC

[Gluster-users] Gluster performance on the small files

Hi Vijay,

Please find the volume info below:

[root at cpu01 ~]# gluster volume info

Volume Name: ds01
Type: Distributed-Replicate
Volume ID: 369d3fdc-c8eb-46b7-a33e-0a49f2451ff6
Status: Started
Number of Bricks: 48 x 2 = 96
Transport-type: tcp
Bricks:
Brick1: cpu01:/bricks/1/vol1
Brick2: cpu02:/bricks/1/vol1
Brick3: cpu03:/bricks/1/vol1
Brick4: cpu04:/bricks/1/vol1
Brick5: cpu01:/bricks/2/vol1
Brick6: cpu02:/bricks/2/vol1
Brick7: cpu03:/bricks/2/vol1
Brick8: cpu04:/bricks/2/vol1
Brick9: cpu01:/bricks/3/vol1
Brick10: cpu02:/bricks/3/vol1
Brick11: cpu03:/bricks/3/vol1
Brick12: cpu04:/bricks/3/vol1
Brick13: cpu01:/bricks/4/vol1
Brick14: cpu02:/bricks/4/vol1
Brick15: cpu03:/bricks/4/vol1
Brick16: cpu04:/bricks/4/vol1
Brick17: cpu01:/bricks/5/vol1
Brick18: cpu02:/bricks/5/vol1
Brick19: cpu03:/bricks/5/vol1
Brick20: cpu04:/bricks/5/vol1
Brick21: cpu01:/bricks/6/vol1
Brick22: cpu02:/bricks/6/vol1
Brick23: cpu03:/bricks/6/vol1
Brick24: cpu04:/bricks/6/vol1
Brick25: cpu01:/bricks/7/vol1
Brick26: cpu02:/bricks/7/vol1
Brick27: cpu03:/bricks/7/vol1
Brick28: cpu04:/bricks/7/vol1
Brick29: cpu01:/bricks/8/vol1
Brick30: cpu02:/bricks/8/vol1
Brick31: cpu03:/bricks/8/vol1
Brick32: cpu04:/bricks/8/vol1
Brick33: cpu01:/bricks/9/vol1
Brick34: cpu02:/bricks/9/vol1
Brick35: cpu03:/bricks/9/vol1
Brick36: cpu04:/bricks/9/vol1
Brick37: cpu01:/bricks/10/vol1
Brick38: cpu02:/bricks/10/vol1
Brick39: cpu03:/bricks/10/vol1
Brick40: cpu04:/bricks/10/vol1
Brick41: cpu01:/bricks/11/vol1
Brick42: cpu02:/bricks/11/vol1
Brick43: cpu03:/bricks/11/vol1
Brick44: cpu04:/bricks/11/vol1
Brick45: cpu01:/bricks/12/vol1
Brick46: cpu02:/bricks/12/vol1
Brick47: cpu03:/bricks/12/vol1
Brick48: cpu04:/bricks/12/vol1
Brick49: cpu01:/bricks/13/vol1
Brick50: cpu02:/bricks/13/vol1
Brick51: cpu03:/bricks/13/vol1
Brick52: cpu04:/bricks/13/vol1
Brick53: cpu01:/bricks/14/vol1
Brick54: cpu02:/bricks/14/vol1
Brick55: cpu03:/bricks/14/vol1
Brick56: cpu04:/bricks/14/vol1
Brick57: cpu01:/bricks/15/vol1
Brick58: cpu02:/bricks/15/vol1
Brick59: cpu03:/bricks/15/vol1
Brick60: cpu04:/bricks/15/vol1
Brick61: cpu01:/bricks/16/vol1
Brick62: cpu02:/bricks/16/vol1
Brick63: cpu03:/bricks/16/vol1
Brick64: cpu04:/bricks/16/vol1
Brick65: cpu01:/bricks/17/vol1
Brick66: cpu02:/bricks/17/vol1
Brick67: cpu03:/bricks/17/vol1
Brick68: cpu04:/bricks/17/vol1
Brick69: cpu01:/bricks/18/vol1
Brick70: cpu02:/bricks/18/vol1
Brick71: cpu03:/bricks/18/vol1
Brick72: cpu04:/bricks/18/vol1
Brick73: cpu01:/bricks/19/vol1
Brick74: cpu02:/bricks/19/vol1
Brick75: cpu03:/bricks/19/vol1
Brick76: cpu04:/bricks/19/vol1
Brick77: cpu01:/bricks/20/vol1
Brick78: cpu02:/bricks/20/vol1
Brick79: cpu03:/bricks/20/vol1
Brick80: cpu04:/bricks/20/vol1
Brick81: cpu01:/bricks/21/vol1
Brick82: cpu02:/bricks/21/vol1
Brick83: cpu03:/bricks/21/vol1
Brick84: cpu04:/bricks/21/vol1
Brick85: cpu01:/bricks/22/vol1
Brick86: cpu02:/bricks/22/vol1
Brick87: cpu03:/bricks/22/vol1
Brick88: cpu04:/bricks/22/vol1
Brick89: cpu01:/bricks/23/vol1
Brick90: cpu02:/bricks/23/vol1
Brick91: cpu03:/bricks/23/vol1
Brick92: cpu04:/bricks/23/vol1
Brick93: cpu01:/bricks/24/vol1
Brick94: cpu02:/bricks/24/vol1
Brick95: cpu03:/bricks/24/vol1
Brick96: cpu04:/bricks/24/vol1
Options Reconfigured:
nfs.disable: off
user.cifs: enable
auth.allow: 10.10.0.*
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
server.allow-insecure: on
[root at cpu01 ~]#

Thanks,
punit

