Amudhan P
2017-Apr-18 08:36 UTC
[Gluster-users] How to Speed UP heal process in Glusterfs 3.10.1
I actually ran this command (find /mnt/gluster -d -exec getfattr -h -n
trusted.ec.heal {} \; > /dev/null) on a specific folder to trigger heal,
but it also showed no difference in speed.

I was asking about reading data in the same disperse set, like an 8+2
disperse config: if one disk is replaced and heal is in process, and the
client reads data which is available on the rest of the 9 disks.

I am sure there was no bottleneck on network/disk IO in my case.

I have tested 3.10.1 heal with disperse.shd-max-threads = 4; heal
completed 27GB of data in 13m15s. So it works well in a test environment,
but the production environment differs.

On Tue, Apr 18, 2017 at 12:47 PM, Serkan Çoban <cobanserkan at gmail.com> wrote:
> You can increase heal speed by running the below command from a client:
> find /mnt/gluster -d -exec getfattr -h -n trusted.ec.heal {} \; > /dev/null
>
> You can write a script with different folders to make it parallel.
>
> In my case I saw 6TB of data healed within 7-8 days with the above
> command running.
>
>> did you face any issue in reading data from the rest of the good bricks
>> in the set, like slow read < KB/s?
> No, nodes generally have balanced network/disk IO during heal.
>
> You should make detailed tests on a non-prod cluster and try to find the
> optimum heal configuration for your use case.
> Our new servers are on the way; in a couple of months I will also do
> detailed tests with 3.10.x and parallel disperse heal, and will post the
> results here.
>
> On Tue, Apr 18, 2017 at 9:51 AM, Amudhan P <amudhan83 at gmail.com> wrote:
>> Serkan,
>>
>> I initially changed shd-max-thread from 1 to 2 and saw a little
>> difference; changing it to 4 & 8 doesn't make any difference.
>> Disk write speed was about <1MB, and data passed through the network to
>> the healing node from the other nodes was 4MB combined.
>>
>> Also, I tried ls -l from the mount point on the folders and files which
>> need to be healed, but have not seen any difference in performance.
>>
>> But after 3 days of the heal process running, disk write speed had
>> increased to 9-11MB, and data passed through the network to the healing
>> node from the other nodes was 40MB combined.
>>
>> Still, 14GB of data remains to be healed when comparing to the other
>> disks in the set.
>>
>> I saw in another thread you also had the issue with heal speed; did you
>> face any issue in reading data from the rest of the good bricks in the
>> set, like slow read < KB/s?
>>
>> On Mon, Apr 17, 2017 at 2:05 PM, Serkan Çoban <cobanserkan at gmail.com> wrote:
>>> Normally I see 8-10MB/sec/brick heal speed with gluster 3.7.11.
>>> I tested parallel heal for disperse with version 3.9.0 and saw that it
>>> increases the heal speed to 20-40MB/sec.
>>> I tested with shd-max-threads 2, 4, and 8 and saw that the best
>>> performance was achieved with 2 or 4 threads.
>>> You can try to start with 2, then test with 4 and 8 and compare the
>>> results.
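The "script with different folders to make it parallel" suggested above
might be sketched as follows. This is only a sketch: /mnt/gluster, the
worker count of 4, and -depth (the portable spelling of find's -d) are
assumptions to adapt to your cluster.

```shell
#!/bin/sh
# Sketch: run the heal-trigger find in parallel, one worker per
# top-level directory of the mount, 4 workers at a time.
MOUNT=${MOUNT:-/mnt/gluster}

# List first-level directories and hand each one to a worker that walks
# it, reading trusted.ec.heal on every entry to trigger self-heal.
# The trailing ':' keeps each worker's exit status at 0 so xargs does
# not abort the sweep on per-file getfattr errors.
find "$MOUNT" -mindepth 1 -maxdepth 1 -type d -print0 2>/dev/null |
  xargs -0 -r -n 1 -P 4 sh -c \
    'find "$0" -depth -exec getfattr -h -n trusted.ec.heal {} \; >/dev/null 2>&1; :'

echo "heal trigger sweep finished"
```

Raising -P spreads the getfattr load across more directories at once,
but also raises read load on the good bricks, so it is worth testing
small values first on a non-prod cluster.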
Serkan Çoban
2017-Apr-18 09:59 UTC
[Gluster-users] How to Speed UP heal process in Glusterfs 3.10.1
> I was asking about reading data in the same disperse set, like an 8+2
> disperse config: if one disk is replaced and heal is in process, and the
> client reads data which is available on the rest of the 9 disks.

My use case is write heavy; we barely read data, so I do not know whether
read speed degrades during heal. But I do know that write speed does not
change during heal.

How big are your files? How many files are there on average in each
directory?

On Tue, Apr 18, 2017 at 11:36 AM, Amudhan P <amudhan83 at gmail.com> wrote:
> I actually ran this command (find /mnt/gluster -d -exec getfattr -h -n
> trusted.ec.heal {} \; > /dev/null) on a specific folder to trigger heal,
> but it also showed no difference in speed.
>
> I was asking about reading data in the same disperse set, like an 8+2
> disperse config: if one disk is replaced and heal is in process, and the
> client reads data which is available on the rest of the 9 disks.
>
> I am sure there was no bottleneck on network/disk IO in my case.
>
> I have tested 3.10.1 heal with disperse.shd-max-threads = 4; heal
> completed 27GB of data in 13m15s. So it works well in a test
> environment, but the production environment differs.
>
> [snip]
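For context, the test-environment figure quoted above (27GB healed in
13m15s with disperse.shd-max-threads = 4) works out to roughly 34 MB/s
overall, a quick back-of-envelope check:

```shell
# Back-of-envelope heal rate from the figures in the thread:
# 27GB healed in 13m15s.
size_mb=$((27 * 1024))      # 27GB expressed in MB
secs=$((13 * 60 + 15))      # 13m15s = 795 seconds
echo "$((size_mb / secs)) MB/s"   # prints "34 MB/s"
```

That is consistent with the 20-40MB/sec range reported earlier for
parallel disperse heal, versus the 8-10MB/sec/brick seen on 3.7.11.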