Strahil
2019-Jul-04 04:18 UTC
[Gluster-users] Extremely low performance - am I doing something wrong?
I think it's related to the sync type of oflag. Do you have a RAID controller on each brick to immediately take the data into the cache?

Best Regards,
Strahil Nikolov

On Jul 3, 2019 23:15, Vladimir Melnik <v.melnik at tucha.ua> wrote:
>
> Indeed, I wouldn't be surprised if I had around 80-100 MB/s, but 10-15
> MB/s is really few. :-(
>
> Even when I mount a filesystem on the same GlusterFS node, I have the
> following result:
> 10485760 bytes (10 MB) copied, 0.409856 s, 25.6 MB/s
> 10485760 bytes (10 MB) copied, 0.38967 s, 26.9 MB/s
> 10485760 bytes (10 MB) copied, 0.466758 s, 22.5 MB/s
> 10485760 bytes (10 MB) copied, 0.412075 s, 25.4 MB/s
> 10485760 bytes (10 MB) copied, 0.381626 s, 27.5 MB/s
>
> At the same time on the same node when I'm writing directly to the disk:
> 10485760 bytes (10 MB) copied, 0.0326612 s, 321 MB/s
> 10485760 bytes (10 MB) copied, 0.0302878 s, 346 MB/s
> 10485760 bytes (10 MB) copied, 0.0352449 s, 298 MB/s
> 10485760 bytes (10 MB) copied, 0.0316872 s, 331 MB/s
> 10485760 bytes (10 MB) copied, 0.0333189 s, 315 MB/s
>
> Can't explain it to myself. Are replica-3 volumes really so slow?
>
> On Wed, Jul 03, 2019 at 03:16:45PM -0400, Dmitry Filonov wrote:
> > Well, if your network is limited to 100MB/s then it doesn't matter if
> > storage is capable of doing 300+MB/s.
> > But 15 MB/s is still way less than 100 MB/s
> >
> > P.S. just tried on my gluster and found out that am getting ~15MB/s on
> > replica 3 volume on SSDs and... 2MB/s on replica 3 volume on HDDs.
> > Something to look at next week.
> >
> > --
> > Dmitry Filonov
> > Linux Administrator
> > SBGrid Core | Harvard Medical School
> > 250 Longwood Ave, SGM-114
> > Boston, MA 02115
> >
> > On Wed, Jul 3, 2019 at 12:18 PM Vladimir Melnik <v.melnik at tucha.ua> wrote:
> >
> > > Thank you, it helped a little:
> > >
> > > $ for i in {1..5}; do { dd if=/dev/zero of=/mnt/glusterfs1/test.tmp bs=1M
> > > count=10 oflag=sync; rm -f /mnt/glusterfs1/test.tmp; } done 2>&1 | grep
> > > copied
> > > 10485760 bytes (10 MB) copied, 0.738968 s, 14.2 MB/s
> > > 10485760 bytes (10 MB) copied, 0.725296 s, 14.5 MB/s
> > > 10485760 bytes (10 MB) copied, 0.681508 s, 15.4 MB/s
> > > 10485760 bytes (10 MB) copied, 0.85566 s, 12.3 MB/s
> > > 10485760 bytes (10 MB) copied, 0.661457 s, 15.9 MB/s
> > >
> > > But 14-15 MB/s is still quite far from the actual storage's performance
> > > (200-3000 MB/s). :-(
> > >
> > > Here's full configuration dump (just in case):
> > >
> > > Option                                  Value
> > > ------                                  -----
> > > cluster.lookup-unhashed                 on
> > > cluster.lookup-optimize                 on
> > > cluster.min-free-disk                   10%
> > > cluster.min-free-inodes                 5%
> > > cluster.rebalance-stats                 off
> > > cluster.subvols-per-directory           (null)
> > > cluster.readdir-optimize                off
> > > cluster.rsync-hash-regex                (null)
> > > cluster.extra-hash-regex                (null)
> > > cluster.dht-xattr-name                  trusted.glusterfs.dht
> > > cluster.randomize-hash-range-by-gfid    off
> > > cluster.rebal-throttle                  normal
> > > cluster.lock-migration                  off
> > > cluster.force-migration                 off
> > > cluster.local-volume-name               (null)
> > > cluster.weighted-rebalance              on
> > > cluster.switch-pattern                  (null)
> > > cluster.entry-change-log                on
> > > cluster.read-subvolume                  (null)
> > > cluster.read-subvolume-index            -1
> > > cluster.read-hash-mode                  1
> > > cluster.background-self-heal-count      8
> > > cluster.metadata-self-heal              off
> > > cluster.data-self-heal                  off
> > > cluster.entry-self-heal                 off
> > > cluster.self-heal-daemon                on
> > > cluster.heal-timeout                    600
> > > cluster.self-heal-window-size           1
> > > cluster.data-change-log                 on
> > > cluster.metadata-change-log             on
> > > cluster.data-self-heal-algorithm        full
> > > cluster.eager-lock                      enable
> > > disperse.eager-lock                     on
> > > disperse.other-eager-lock               on
> > > disperse.eager-lock-timeout             1
> > > disperse.other-eager-lock-timeout       1
> > > cluster.quorum-type                     auto
> > > cluster.quorum-count                    (null)
> > > cluster.choose-local                    off
> > > cluster.self-heal-readdir-size          1KB
> > > cluster.post-op-delay-secs              1
> > > cluster.ensure-durability               on
> > > cluster.consistent-metadata             no
> > > cluster.heal-wait-queue-length          128
> > > cluster.favorite-child-policy           none
> > > cluster.full-lock                       yes
> > > diagnostics.latency-measurement         off
> > > diagnostics.dump-fd-stats               off
> > > diagnostics.count-fop-hits              off
> > > diagnostics.brick-log-level             INFO
> > > diagnostics.client-log-level            INFO
> > > diagnostics.brick-sys-log-level         CRITICAL
> > > diagnostics.client-sys-log-level        CRITICAL
> > > diagnostics.brick-logger                (null)
> > > diagnostics.client-logger               (null)
> > > diagnostics.brick-log-format            (null)
> > > diagnostics.client-log-format
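As a side note, one quick way to tell how much of the slowdown comes from per-write syncing rather than from raw throughput is to repeat the same dd with different sync semantics and compare the results. This is only a minimal sketch; it reuses the mount point and file name from the test quoted above:

$ dd if=/dev/zero of=/mnt/glusterfs1/test.tmp bs=1M count=10 oflag=sync      # O_SYNC: every write waits for stable storage before returning
$ dd if=/dev/zero of=/mnt/glusterfs1/test.tmp bs=1M count=10 conv=fdatasync  # buffered writes, a single fdatasync at the very end
$ dd if=/dev/zero of=/mnt/glusterfs1/test.tmp bs=1M count=10                 # buffered only, no explicit sync
$ rm -f /mnt/glusterfs1/test.tmp

If the conv=fdatasync run is much faster than the oflag=sync run, the dominant cost is the per-write sync round trip to the replicas rather than bandwidth, and a controller write-back cache on the bricks can only partially hide that.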
Vladimir Melnik
2019-Jul-04 09:28 UTC
[Gluster-users] Extremely low performance - am I doing something wrong?
All 4 virtual machines working as nodes of the cluster are located on the
same physical server. The server has 6 SSD modules and a RAID controller
with a BBU. RAID level is 10, write-back cache is enabled.

Moreover, each node of the GlusterFS cluster shows normal performance when
it writes to the disk where the brick resides, even with oflag=sync:
> 10485760 bytes (10 MB) copied, 0.0326612 s, 321 MB/s
> 10485760 bytes (10 MB) copied, 0.0302878 s, 346 MB/s
> 10485760 bytes (10 MB) copied, 0.0352449 s, 298 MB/s
> 10485760 bytes (10 MB) copied, 0.0316872 s, 331 MB/s
> 10485760 bytes (10 MB) copied, 0.0333189 s, 315 MB/s

So, the disk is OK and the network is OK, I'm 100% sure. It seems to be a
GlusterFS-related issue: either something needs to be tweaked or this is
normal performance for a replica-3 cluster.

On Thu, Jul 04, 2019 at 07:18:21AM +0300, Strahil wrote:
> I think it's related to the sync type of oflag.
> Do you have a RAID controller on each brick to immediately take the data into the cache?
>
> Best Regards,
> Strahil Nikolov
>
> On Jul 3, 2019 23:15, Vladimir Melnik <v.melnik at tucha.ua> wrote:
> >
> > Indeed, I wouldn't be surprised if I had around 80-100 MB/s, but 10-15
> > MB/s is really few. :-(
> >
> > Even when I mount a filesystem on the same GlusterFS node, I have the
> > following result:
> > 10485760 bytes (10 MB) copied, 0.409856 s, 25.6 MB/s
> > 10485760 bytes (10 MB) copied, 0.38967 s, 26.9 MB/s
> > 10485760 bytes (10 MB) copied, 0.466758 s, 22.5 MB/s
> > 10485760 bytes (10 MB) copied, 0.412075 s, 25.4 MB/s
> > 10485760 bytes (10 MB) copied, 0.381626 s, 27.5 MB/s
> >
> > At the same time on the same node when I'm writing directly to the disk:
> > 10485760 bytes (10 MB) copied, 0.0326612 s, 321 MB/s
> > 10485760 bytes (10 MB) copied, 0.0302878 s, 346 MB/s
> > 10485760 bytes (10 MB) copied, 0.0352449 s, 298 MB/s
> > 10485760 bytes (10 MB) copied, 0.0316872 s, 331 MB/s
> > 10485760 bytes (10 MB) copied, 0.0333189 s, 315 MB/s
> >
> > Can't explain it to myself. Are replica-3 volumes really so slow?
> >
> > On Wed, Jul 03, 2019 at 03:16:45PM -0400, Dmitry Filonov wrote:
> > > Well, if your network is limited to 100MB/s then it doesn't matter if
> > > storage is capable of doing 300+MB/s.
> > > But 15 MB/s is still way less than 100 MB/s
> > >
> > > P.S. just tried on my gluster and found out that am getting ~15MB/s on
> > > replica 3 volume on SSDs and... 2MB/s on replica 3 volume on HDDs.
> > > Something to look at next week.
> > >
> > > --
> > > Dmitry Filonov
> > > Linux Administrator
> > > SBGrid Core | Harvard Medical School
> > > 250 Longwood Ave, SGM-114
> > > Boston, MA 02115
> > >
> > > On Wed, Jul 3, 2019 at 12:18 PM Vladimir Melnik <v.melnik at tucha.ua> wrote:
> > >
> > > > Thank you, it helped a little:
> > > >
> > > > $ for i in {1..5}; do { dd if=/dev/zero of=/mnt/glusterfs1/test.tmp bs=1M
> > > > count=10 oflag=sync; rm -f /mnt/glusterfs1/test.tmp; } done 2>&1 | grep
> > > > copied
> > > > 10485760 bytes (10 MB) copied, 0.738968 s, 14.2 MB/s
> > > > 10485760 bytes (10 MB) copied, 0.725296 s, 14.5 MB/s
> > > > 10485760 bytes (10 MB) copied, 0.681508 s, 15.4 MB/s
> > > > 10485760 bytes (10 MB) copied, 0.85566 s, 12.3 MB/s
> > > > 10485760 bytes (10 MB) copied, 0.661457 s, 15.9 MB/s
> > > >
> > > > But 14-15 MB/s is still quite far from the actual storage's performance
> > > > (200-3000 MB/s). :-(
> > > >
> > > > Here's full configuration dump (just in case):
> > > >
> > > > Option                                  Value
> > > > ------                                  -----
> > > > cluster.lookup-unhashed                 on
> > > > cluster.lookup-optimize                 on
> > > > cluster.min-free-disk                   10%
> > > > cluster.min-free-inodes                 5%
> > > > cluster.rebalance-stats                 off
> > > > cluster.subvols-per-directory           (null)
> > > > cluster.readdir-optimize                off
> > > > cluster.rsync-hash-regex                (null)
> > > > cluster.extra-hash-regex                (null)
> > > > cluster.dht-xattr-name                  trusted.glusterfs.dht
> > > > cluster.randomize-hash-range-by-gfid    off
> > > > cluster.rebal-throttle                  normal
> > > > cluster.lock-migration                  off
> > > > cluster.force-migration                 off
> > > > cluster.local-volume-name               (null)
> > > > cluster.weighted-rebalance              on
> > > > cluster.switch-pattern                  (null)
> > > > cluster.entry-change-log                on
> > > > cluster.read-subvolume                  (null)
> > > > cluster.read-subvolume-index            -1
> > > > cluster.read-hash-mode                  1
> > > > cluster.background-self-heal-count      8
> > > > cluster.metadata-self-heal              off
> > > > cluster.data-self-heal                  off
> > > > cluster.entry-self-heal                 off
> > > > cluster.self-heal-daemon                on
> > > > cluster.heal-timeout                    600
> > > > cluster.self-heal-window-size           1
> > > > cluster.data-change-log                 on
> > > > cluster.metadata-change-log             on
> > > > cluster.data-self-heal-algorithm        full
> > > > cluster.eager-lock                      enable
> > > > disperse.eager-lock                     on
> > > > disperse.other-eager-lock               on
> > > > disperse.eager-lock-timeout             1
> > > > disperse.other-eager-lock-timeout       1
> > > > cluster.quorum-type                     auto
> > > > cluster.quorum-count                    (null)
> > > > cluster.choose-local                    off
> > > > cluster.self-heal-readdir-size          1KB
> > > > cluster.post-op-delay-secs              1
> > > > cluster.ensure-durability               on
> > > > cluster.consistent-metadata             no
> > > > cluster.heal-wait-queue-length          128
> > > > cluster.favorite-child-policy           none
> > > > cluster.full-lock                       yes
> > > > diagnostics.latency-measurement         off
> > > > diagnostics.dump-fd-stats               off
> > > > diagnostics.count-fop-hits              off
> > > > diagnostics.brick-log-level             INFO
> > > > diagnostics.client-log-level            INFO
> > > > diagnostics.brick-sys-log-level         CRITICAL
> > > > diagnostics.client-sys-log-level        CRITICAL
> > > > diagnostics.brick-logger                (null)
> > > > diagnostics.client-logger               (null)
> > > > diagnostics.brick-log-format            (null)
> > > > diagnostics.client-log-format

--
V.Melnik
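As a follow-up thought: if the disk and the network are both ruled out, GlusterFS's built-in profiler can show which file operations the time is actually being spent in. A rough sketch, assuming the volume is called VOLNAME (substitute the real volume name) and reusing the same dd test as above:

$ gluster volume profile VOLNAME start
$ for i in {1..5}; do dd if=/dev/zero of=/mnt/glusterfs1/test.tmp bs=1M count=10 oflag=sync; rm -f /mnt/glusterfs1/test.tmp; done
$ gluster volume profile VOLNAME info    # per-brick latency broken down by FOP (WRITE, FSYNC, LOOKUP, ...)
$ gluster volume profile VOLNAME stop

If most of the latency shows up in WRITE/FSYNC evenly across all three bricks, that points at the synchronous replication path itself rather than at any single brick or the client side.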