Diego Remolina
2019-Jan-15 13:28 UTC
[Gluster-users] [External] Too good to be true speed improvements?
Hi Davide,

The options information is already provided in the prior e-mail; see the
termbin.com link for the options of the volume after the 4.1.6 upgrade.

The gluster options set on the volume are:
https://termbin.com/yxtd

This is the other piece:

# gluster v info export

Volume Name: export
Type: Replicate
Volume ID: b4353b3f-6ef6-4813-819a-8e85e5a95cff
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.0.1.7:/bricks/hdds/brick
Brick2: 10.0.1.6:/bricks/hdds/brick
Options Reconfigured:
performance.stat-prefetch: on
performance.cache-min-file-size: 0
network.inode-lru-limit: 65536
performance.cache-invalidation: on
features.cache-invalidation: on
performance.md-cache-timeout: 600
features.cache-invalidation-timeout: 600
performance.cache-samba-metadata: on
transport.address-family: inet
server.allow-insecure: on
performance.cache-size: 10GB
cluster.server-quorum-type: server
nfs.disable: on
performance.io-thread-count: 64
performance.io-cache: on
cluster.lookup-optimize: on
cluster.readdir-optimize: on
server.event-threads: 5
client.event-threads: 5
performance.cache-max-file-size: 256MB
diagnostics.client-log-level: INFO
diagnostics.brick-log-level: INFO
cluster.server-quorum-ratio: 51%

Now, I did create a backup of /var/lib/glusterd, so if you tell me how to
pull information from there to compare, I can do it.

I compared the file /var/lib/glusterd/vols/export/info and it is the same
in both, though the entries are in a different order.

Diego
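For reference, a minimal sketch of one way to capture the full effective
option set for such a comparison, along the lines of the "gluster volume
get" suggestion quoted further below (the output file name here is only an
illustration, not something used in the thread):

# gluster volume get export all > /root/export-options-after-upgrade.txt
# gluster volume info export   >> /root/export-options-after-upgrade.txt
  (the file name is a placeholder; any path works)

A matching dump captured while still on 3.10.12 would then let the two
option sets be diffed to spot defaults that changed between releases.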
On Tue, Jan 15, 2019 at 5:03 AM Davide Obbi <davide.obbi at booking.com> wrote:
>
> On Tue, Jan 15, 2019 at 2:18 AM Diego Remolina <dijuremo at gmail.com> wrote:
>
>> Dear all,
>>
>> I was running gluster 3.10.12 on a pair of servers and recently upgraded
>> to 4.1.6. There is a cron job that runs nightly on one machine, which
>> rsyncs the data on the servers over to another machine for backup
>> purposes. The rsync operation runs on one of the gluster servers, which
>> mounts the gluster volume via fuse on /export.
>>
>> When using 3.10.12, this process would start at 8:00 PM nightly and
>> usually end at around 4:30 AM when the servers had been freshly rebooted.
>> From that point, things would start taking a bit longer and stabilize,
>> ending at around 7-9 AM depending on actual file changes, and at some
>> point the servers would start eating up so much RAM (up to 30GB) that I
>> would have to reboot them to bring things back to normal, as the file
>> system would become extremely slow (perhaps the memory leak I have read
>> about was present in 3.10.x).
>>
>> After upgrading to 4.1.6 over the weekend, I was shocked to see the rsync
>> process finish in about 1 hour and 26 minutes, compared to 8 hours 30
>> minutes with the older version. This is a nice speed-up; however, I can
>> only ask myself what has changed so drastically that this process is now
>> so fast. Have there really been improvements in 4.1.6 that could speed
>> this up so dramatically? In both of my test cases there would not really
>> have been a lot to copy via rsync, given that the fresh reboots are done
>> on Saturday after the sync has finished from the day before.
>>
>> In general, the servers (which are accessed via Samba for Windows
>> clients) are much faster and more responsive since the update to 4.1.6.
>> Tonight I will have the first rsync run which will actually have to copy
>> the day's changes, and I will have another point of comparison.
>>
>> I am still using fuse mounts for Samba, due to prior problems with
>> vfs = gluster, which are currently present in Samba 4.8.3-4 and already
>> documented in bugs, for which patches exist, but no official updated
>> Samba packages have been released yet. Since I was going from 3.10.12 to
>> 4.1.6, I also did not want to change other things, to make sure I could
>> track any issues related only to the change in gluster versions and
>> eliminate other complexity.
>>
>> The file system currently has about 16TB of data in
>> 5142816 files and 696544 directories.
>>
>> I've just run the following code to count files and dirs, and it took
>> 67 mins 38.957 secs to complete on this gluster volume:
>> https://github.com/ChristopherSchultz/fast-file-count
>>
>> # time ( /root/sbin/dircnt /export )
>> /export contains 5142816 files and 696544 directories
>>
>> real    67m38.957s
>> user    0m6.225s
>> sys     0m48.939s
>>
>> The gluster options set on the volume are:
>> https://termbin.com/yxtd
>>
>> # gluster v status export
>> Status of volume: export
>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>> ------------------------------------------------------------------------------
>> Brick 10.0.1.7:/bricks/hdds/brick           49157     0          Y       13986
>> Brick 10.0.1.6:/bricks/hdds/brick           49153     0          Y       9953
>> Self-heal Daemon on localhost               N/A       N/A        Y       21934
>> Self-heal Daemon on 10.0.1.5                N/A       N/A        Y       4598
>> Self-heal Daemon on 10.0.1.6                N/A       N/A        Y       14485
>>
>> Task Status of Volume export
>> ------------------------------------------------------------------------------
>> There are no active volume tasks
>>
>> Truth be told, there is a 3rd server here, but no bricks on it.
>>
>> Thoughts?
>>
>> Diego
>
> Hi Diego,
>
> Besides the actual improvements made in the code, I think new releases
> might set volume options by default that previously had different
> settings. It would have been interesting to diff "gluster volume get
> <volname> all" before and after the upgrade. Just out of curiosity, and
> because I am trying to figure out volume options for rsync-type workloads,
> can you share the command output anyway, along with
> "gluster volume info <volname>"?
>
> thanks
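For context, a minimal sketch of the kind of nightly cron + rsync job
described above; the 8:00 PM start time and the /export fuse mount are from
the thread, while the destination host, target path, and rsync flags are
placeholders only:

# /etc/cron.d/gluster-backup -- hypothetical example; "backuphost" and /backup/export are placeholders
0 20 * * * root rsync -aHAX --delete /export/ backuphost:/backup/export/

Because rsync stats every file and directory on the fuse mount to build its
file list, metadata-related options such as performance.md-cache-timeout and
the cache-invalidation settings listed above are typically what matter most
for this kind of workload.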
Davide Obbi
2019-Jan-15 19:03 UTC
[Gluster-users] [External] Too good to be true speed improvements?
I think you can find the volume options by doing a grep -R option
/var/lib/glusterd/vols/, and the .vol files show the options.
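Expanding on that suggestion, a sketch of how the grep output could be
compared against the backup of /var/lib/glusterd mentioned earlier in the
thread, using bash process substitution (the /root/glusterd-backup path is a
placeholder, since the actual backup location is not given):

# diff <(grep -R "option " /root/glusterd-backup/vols/export/ | sort) \
       <(grep -R "option " /var/lib/glusterd/vols/export/ | sort)
  (replace /root/glusterd-backup with wherever the pre-upgrade copy lives)

Any option lines that glusterd writes differently into the generated .vol
files after the upgrade should show up in that diff.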
On Tue, Jan 15, 2019 at 2:28 PM Diego Remolina <dijuremo at gmail.com> wrote:
> [...]

--
Davide Obbi
Senior System Administrator
Booking.com B.V.