Pranith Kumar Karampuri
2018-Jul-27 08:31 UTC
[Gluster-users] Gluster 3.12.12: performance during heal and in general
On Fri, Jul 27, 2018 at 1:32 PM, Hu Bert <revirii at googlemail.com> wrote:

> 2018-07-27 9:22 GMT+02:00 Pranith Kumar Karampuri <pkarampu at redhat.com>:
> >
> > On Fri, Jul 27, 2018 at 12:36 PM, Hu Bert <revirii at googlemail.com> wrote:
> >>
> >> 2018-07-27 8:52 GMT+02:00 Pranith Kumar Karampuri <pkarampu at redhat.com>:
> >> >
> >> > On Fri, Jul 27, 2018 at 11:53 AM, Hu Bert <revirii at googlemail.com> wrote:
> >> >>
> >> >> > Do you already have all the 190000 directories created? If not,
> >> >> > could you find out which of the paths need it and do a stat
> >> >> > directly instead of find?
> >> >>
> >> >> Quite probably not all of them have been created (but counting how
> >> >> many would take very long...). Hm, maybe running stat in a double
> >> >> loop (thx to our directory structure) would help. Something like
> >> >> this (may not be 100% correct):
> >> >>
> >> >> for a in {100..999}; do
> >> >>   for b in {100..999}; do
> >> >>     stat /$a/$b/
> >> >>   done
> >> >> done
> >> >>
> >> >> Should run stat on all directories. I think I'll give this a try.
> >> >
> >> > Just to prevent these being served from a cache, it is probably
> >> > better to do this from a fresh mount?
> >> >
> >> > --
> >> > Pranith
> >>
> >> Good idea. I'll install the glusterfs client on a little-used machine,
> >> so there should be no caching. Thx! Have a good weekend when the time
> >> comes :-)
> >
> > If this proves effective, what you also need to do is unmount and mount
> > again, something like:
> >
> > mount
> > for a in {100..999}; do
> >   for b in {100..999}; do
> >     stat /$a/$b/
> >   done
> > done
> > umount
>
> I'll see what is possible over the weekend.
>
> Btw.: I've seen in the munin stats that the disk utilization for
> bricksdd1 on the healthy gluster servers is between 70% (night) and
> almost 99% (daytime). So it looks like the basic problem is the disk,
> which seems unable to work any faster? If so, (heal) performance won't
> improve with this setup, I assume.

It could be saturating in the day. But if enough self-heals are going on,
even in the night it should have been close to 100%.

> Maybe switching to RAID10 (conventional hard disks), SSDs, or even adding
> 3 additional gluster servers (distributed replicated) could help?

It definitely will give better protection against hardware failure. The
failure domain will be smaller.

--
Pranith
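The double loop discussed above can be made a little more concrete. Below is a minimal sketch, assuming the two-level 100..999/100..999 directory layout under a freshly mounted client; the `stat_all` function name, the mount point, and the `xargs` parallelism level are assumptions, not something from the thread. Running several stat processes in parallel may speed up the walk over the ~810,000 candidate directories:

```shell
#!/bin/sh
# Sketch: stat every directory of the two-level 100..999/100..999
# layout. Paths are generated by the nested loops and fed to xargs,
# which runs several stat processes in parallel (-P 8); errors for
# directories that do not exist yet are suppressed. The final wc -l
# counts how many of the candidate directories actually exist.
stat_all() {
    root="$1"; lo="$2"; hi="$3"
    for a in $(seq "$lo" "$hi"); do
        for b in $(seq "$lo" "$hi"); do
            printf '%s/%s/%s\n' "$root" "$a" "$b"
        done
    done | xargs -n 100 -P 8 stat -c '%n' 2>/dev/null | wc -l
}

# From a freshly mounted client (so nothing is served from a cache),
# e.g.:  stat_all /mnt/glusterfs 100 999
```

As suggested in the thread, run this from a fresh mount and unmount afterwards, so no lookups are served from a client-side cache.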
Hu Bert
2018-Jul-27 08:47 UTC
[Gluster-users] Gluster 3.12.12: performance during heal and in general
>> Btw.: I've seen in the munin stats that the disk utilization for
>> bricksdd1 on the healthy gluster servers is between 70% (night) and
>> almost 99% (daytime). So it looks like the basic problem is the disk,
>> which seems unable to work any faster? If so, (heal) performance won't
>> improve with this setup, I assume.
>
> It could be saturating in the day. But if enough self-heals are going on,
> even in the night it should have been close to 100%.

Lowest utilization was 70% over night, but I'll check this
evening/weekend. Also that 'stat...' is running.

>> Maybe switching to RAID10 (conventional hard disks), SSDs, or even adding
>> 3 additional gluster servers (distributed replicated) could help?
>
> It definitely will give better protection against hardware failure. The
> failure domain will be smaller.

What, in your opinion, would be better for performance?

- Having 3 servers and RAID10 (with conventional disks)
- Having 3 additional servers with 4 hdds (JBOD) each (distribute
  replicate, replica 3)
- SSDs? (would be quite expensive to reach the storage amount we have
  at the moment)

Just curious. It seems we'll have to adjust our setup during winter anyway :-)

Thanks again :-)
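For the JBOD option above (3 additional servers, 4 disks each, replica 3), gluster groups consecutive bricks on the `volume create` command line into replica sets, so each disk's brick should be listed across all three servers before moving on to the next disk, giving a 4 x 3 distributed-replicate volume. A sketch with hypothetical server names (gluster4..gluster6) and brick paths:

```shell
#!/bin/sh
# Build the brick list so that each group of three consecutive bricks
# (one disk on each of the three servers) forms one replica set.
# Server names and brick paths are hypothetical.
BRICKS=""
for disk in 1 2 3 4; do
    for srv in gluster4 gluster5 gluster6; do
        BRICKS="$BRICKS $srv:/gluster/brick$disk"
    done
done
echo "gluster volume create newvol replica 3$BRICKS"
```

If the bricks were instead listed server by server, all three replicas of a set would land on the same host, which defeats the point of replica 3.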
Hu Bert
2018-Aug-01 07:31 UTC
[Gluster-users] Gluster 3.12.12: performance during heal and in general
Hello :-)

Just wanted to give a short report...

>> It could be saturating in the day. But if enough self-heals are going on,
>> even in the night it should have been close to 100%.
>
> Lowest utilization was 70% over night, but I'll check this
> evening/weekend. Also that 'stat...' is running.

At the moment 1.1 TB of 2.0 TB have been healed; disk utilization is still
between 100% (day) and 70% (night). So this will take another 10-14 days.

> What, in your opinion, would be better for performance?
>
> - Having 3 servers and RAID10 (with conventional disks)
> - Having 3 additional servers with 4 hdds (JBOD) each (distribute
>   replicate, replica 3)
> - SSDs? (would be quite expensive to reach the storage amount we have
>   at the moment)
>
> Just curious. It seems we'll have to adjust our setup during winter anyway :-)

Well, we'll definitely rethink our setup this autumn :-)
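The "another 10-14 days" figure is easy to sanity-check with shell arithmetic. The healed/total sizes are from the report above; the average daily heal rate is a hypothetical value (utilization swings between 70% and 100%, so pick it off the munin graphs for your own setup):

```shell
#!/bin/sh
# Back-of-the-envelope estimate of the remaining heal time.
# healed/total figures are from the report; the average heal rate
# is an assumption, not a measured value.
total_gb=2000
healed_gb=1100
rate_gb_per_day=75   # hypothetical average throughput

remaining_gb=$((total_gb - healed_gb))
days_left=$((remaining_gb / rate_gb_per_day))
echo "$remaining_gb GB remaining, roughly $days_left days to go"
# prints: 900 GB remaining, roughly 12 days to go
```

With a 75 GB/day assumption this lands at ~12 days, consistent with the 10-14 day estimate in the report.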