Diego Remolina
2019-Jan-15 01:18 UTC
[Gluster-users] Too good to be true speed improvements?
Dear all,

I was running gluster 3.10.12 on a pair of servers and recently upgraded to 4.1.6. There is a cron job that runs nightly on one machine, which rsyncs the data on the servers over to another machine for backup purposes. The rsync operation runs on one of the gluster servers, which mounts the gluster volume via FUSE on /export.

When using 3.10.12, this process would start at 8:00 PM nightly and, when the servers had been freshly rebooted, usually finish at around 4:30 AM. From that point on, things would start taking a bit longer and stabilize at around 7-9 AM depending on the actual file changes, and at some point the servers would start eating up so much RAM (up to 30 GB) that I would have to reboot them to bring things back to normal, as the file system would become extremely slow (perhaps the memory leak I have read about was present in 3.10.x).

After upgrading to 4.1.6 over the weekend, I was shocked to see the rsync process finish in about 1 hour and 26 minutes, compared to 8 hours 30 minutes with the older version. This is a nice speed-up; however, I can only ask myself what has changed so drastically that this process is now so fast. Have there really been improvements in 4.1.6 that could speed this up so dramatically? In both of my test cases there would not really have been a lot to copy via rsync, given that the fresh reboots are done on Saturday after the sync has finished from the day before.

In general, the servers (which are accessed via Samba by Windows clients) are much faster and more responsive since the update to 4.1.6. Tonight I will have the first rsync run that actually has to copy the day's changes, and then I will have another point of comparison.

I am still using FUSE mounts for Samba, due to prior problems with vfs = gluster, which are still present in Samba 4.8.3-4 and already documented in bugs; patches exist, but no official updated Samba packages have been released yet. Since I was going from 3.10.12 to 4.1.6, I also did not want to change other things, so that I could track any issues related only to the change in gluster versions and eliminate other complexity.

The file system currently has about 16 TB of data in 5142816 files and 696544 directories.

I just ran the following code to count files and dirs, and it took 67 min 38.957 s to complete on this gluster volume:
https://github.com/ChristopherSchultz/fast-file-count

# time ( /root/sbin/dircnt /export )
/export contains 5142816 files and 696544 directories

real    67m38.957s
user    0m6.225s
sys     0m48.939s

The gluster options set on the volume are: https://termbin.com/yxtd

# gluster v status export
Status of volume: export
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.0.1.7:/bricks/hdds/brick           49157     0          Y       13986
Brick 10.0.1.6:/bricks/hdds/brick           49153     0          Y       9953
Self-heal Daemon on localhost               N/A       N/A        Y       21934
Self-heal Daemon on 10.0.1.5                N/A       N/A        Y       4598
Self-heal Daemon on 10.0.1.6                N/A       N/A        Y       14485

Task Status of Volume export
------------------------------------------------------------------------------
There are no active volume tasks

True, there is a third server here, but it has no bricks on it.

Thoughts?

Diego
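
For context, the nightly job mentioned above is just a cron entry on one of the gluster servers that rsyncs the FUSE mount to the backup machine. The actual script is not included in this message; a minimal sketch, with a placeholder backup host, destination path, and script name, would be along these lines:

  # crontab entry on the gluster server: start the backup at 8:00 PM nightly
  0 20 * * * /usr/local/sbin/nightly-backup.sh

  #!/bin/bash
  # nightly-backup.sh (placeholder name): copy the FUSE-mounted volume to the
  # backup host, preserving attributes and removing files deleted from the source
  rsync -aHAX --delete --numeric-ids /export/ backuphost:/backup/export/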
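
On the Samba point: "vfs = gluster" above refers to exporting the volume through Samba's vfs_glusterfs module instead of through the FUSE mount. Purely as an illustration (share names and option values here are made up; only the option names follow the vfs_glusterfs manpage), the two setups differ roughly like this in smb.conf:

  # current setup: share the FUSE mount like any local directory
  [export]
      path = /export
      read only = no

  # avoided for now because of the Samba 4.8.3-4 bugs mentioned above:
  # let Samba talk to gluster directly via libgfapi
  [export-gfapi]
      vfs objects = glusterfs
      glusterfs:volume = export
      glusterfs:volfile_server = localhost
      path = /
      kernel share modes = no
      read only = no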
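
As a rough cross-check of the dircnt numbers that does not require building the C tool, the same counts can be gathered with plain find (this will also be slow, since either way every directory has to be read over the FUSE mount):

  # count regular files and directories on the FUSE mount
  time find /export -type f | wc -l
  time find /export -type d | wc -l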
Davide Obbi
2019-Jan-15 10:02 UTC
[Gluster-users] [External] Too good to be true speed improvements?
On Tue, Jan 15, 2019 at 2:18 AM Diego Remolina <dijuremo at gmail.com> wrote:
> After upgrading to 4.1.6 over the weekend, I was shocked to see the rsync
> process finish in about 1 hour and 26 minutes. This is compared to 8 hours
> 30 mins with the older version. This is a nice speed up, however, I can
> only ask myself what has changed so drastically that this process is now
> so fast. Have there really been improvements in 4.1.6 that could speed
> this up so dramatically?
Hi Diego,

Besides the actual improvements made in the code, I think new releases may also set volume options by default that previously had different settings. It would have been interesting to diff "gluster volume get <volname> all" before and after the upgrade. Out of curiosity, and because I am trying to figure out volume options for rsync-type workloads, can you share that output anyway, along with "gluster volume info <volname>"?

thanks
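
The pre-upgrade snapshot obviously cannot be taken after the fact, but for anyone planning a similar upgrade, capturing and diffing the effective options is a one-liner per side, something like:

  # before the upgrade
  gluster volume get export all > /root/export-options-3.10.12.txt
  # ... upgrade ...
  gluster volume get export all > /root/export-options-4.1.6.txt
  diff -u /root/export-options-3.10.12.txt /root/export-options-4.1.6.txt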