thr3ads.net - Gluster users - [Gluster-users] Poor performance with small files [Apr 2015]

Ben Turner

2015-Apr-29 19:03 UTC

[Gluster-users] Poor performance with small files

----- Original Message -----
> From: "Ron Trompert" <ron.trompert at surfsara.nl>
> To: gluster-users at gluster.org
> Sent: Wednesday, April 29, 2015 1:25:59 PM
> Subject: [Gluster-users] Poor performance with small files
> 
> Hi,
> 
> We run gluster as the storage solution for our Owncloud-based sync and share
> service. At the moment we have about 30 million files in the system
> which add up to a little more than 30TB. Most of these files are, as you
> may expect, very small, i.e. in the 100KB ballpark. For about a year
> everything ran perfectly fine. We run 3.6.2 by the way.
Upgrade to 3.6.3 and set client.event-threads and server.event-threads to at
least 4:

"Previously, epoll thread did socket event-handling and the same thread was
used for serving the client or processing the response received from the server.
Due to this, other requests were in a queue until the current epoll thread
completed its operation. With multi-threaded epoll, events are distributed, which
improves the performance due to the parallel processing of requests/responses
received."

Here are the guidelines for tuning them:

https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3/html/Administration_Guide/Small_File_Performance_Enhancements.html

In my testing with epoll threads at 4 I saw an increase of between 15% and 50%,
depending on the workload.
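
For reference, both settings are ordinary volume options, so they are applied per volume with `gluster volume set`. The sketch below is only an illustration: it builds the two commands for a placeholder volume name (`myvol`, which is not taken from this thread) and prints them instead of running them. The helper names are made up for the example, and the options are only accepted on a release that ships multi-threaded epoll.

```python
import subprocess

# The two option names come from the advice above; everything else is illustrative.
EVENT_THREAD_OPTIONS = ("client.event-threads", "server.event-threads")


def event_thread_commands(volume, threads=4):
    """Build one `gluster volume set` command per event-thread option."""
    if threads < 1:
        raise ValueError("threads must be a positive integer")
    return [
        ["gluster", "volume", "set", volume, option, str(threads)]
        for option in EVENT_THREAD_OPTIONS
    ]


def apply_commands(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # Needs the gluster CLI and sufficient privileges on a cluster node.
            subprocess.run(cmd, check=True)


# "myvol" is a hypothetical volume name used only for this example.
apply_commands(event_thread_commands("myvol", threads=4))
```

With `dry_run=True` (the default) nothing is changed on the cluster, so the printed commands can be reviewed and run by hand.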

There are several smallfile perf enhancements in the works:

* http://www.gluster.org/community/documentation/index.php/Features/Feature_Smallfile_Perf

* Lookup unhashed is the next feature and should be ready with 3.7 (correct me if
I am wrong).

* If you are using RAID 6 you may want to do some testing with RAID 10 or JBOD,
but the benefits here only come into play with a lot of concurrent access (30+
processes / threads working with different files).

* Tiering may help here if you want to add some SSDs; this is also a 3.7 feature.

HTH!

-b
> 
> Now we are trying to commission new hardware. We have done this by
> adding the new nodes to our cluster and using the add-brick and
> remove-brick procedure to get the data to the new nodes. In a week we
> have migrated only 8.5TB this way. What are we doing wrong here? Is
> there a way to improve the gluster performance on small files?
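
As a rough sanity check on the figures quoted above (8.5TB in one week, about 30TB in total, files of around 100KB), the average migration rate can be worked out directly. This is back-of-the-envelope arithmetic only; it assumes decimal units and that the 100KB average also holds for the data already migrated, and neither assumption comes from the thread.

```python
# Assumptions (not from the thread): decimal units, uniform ~100KB files.
TB = 10**12
KB = 10**3

migrated_bytes = 8.5 * TB        # moved so far, per the message above
elapsed_seconds = 7 * 24 * 3600  # "in a week"
total_bytes = 30 * TB            # approximate size of the whole data set
avg_file_bytes = 100 * KB        # "in the 100KB ball park"

rate_bytes_per_s = migrated_bytes / elapsed_seconds
rate_mb_per_s = rate_bytes_per_s / 10**6
files_per_s = rate_bytes_per_s / avg_file_bytes
days_for_all = total_bytes / rate_bytes_per_s / 86400

print(f"average rate:       {rate_mb_per_s:.1f} MB/s")
print(f"files per second:   {files_per_s:.0f}")
print(f"days for all 30TB:  {days_for_all:.1f}")
```

Under these assumptions the rate is more naturally read as files per second than as bandwidth, which is why the per-file cost matters more here than raw throughput.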
> 
> I have another question. If you want to set up a gluster that will
> contain lots of very small files, what would be good practice for setting
> things up in terms of configuration, sizes of bricks relative to memory and
> number of cores, number of bricks per node etc.?
> 
> 
> 
> Best regards and thanks in advance,
> 
> Ron
> 
> 
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>

Sander Zijlstra

2015-Apr-29 22:28 UTC

[Gluster-users] Poor performance with small files

Hi Ben,

Thanks for the extensive response.

3.6.3 has been released recently, hasn't it? We just upgraded to 3.6.2 two weeks
ago because our new glusterfs servers were freshly installed and indeed got the
latest packages available at the time of installation.

As the remove-brick operation is still running, can we stop this process without
problems? In the RedHat documents it's stated as a "technology preview" but in
the github documentation for glusterfs it's stated that the data migration will
simply stop and leave the data as is, at the time of stopping the remove-brick.

Met vriendelijke groet / kind regards,

Sander Zijlstra

Ron Trompert

2015-Apr-30 05:25 UTC

[Gluster-users] Poor performance with small files

Hi Ben,

Thanks for the info.

Cheers,

Ron


