Pranith Kumar Karampuri
2017-Apr-14 06:50 UTC
[Gluster-users] Slow write times to gluster disk
On Sat, Apr 8, 2017 at 10:28 AM, Ravishankar N <ravishankar at redhat.com> wrote:

> Hi Pat,
>
> I'm assuming you are using the gluster native (FUSE) mount. If it helps, you
> could try mounting the volume via gluster NFS (gnfs) and then see if there is
> an improvement in speed. FUSE mounts are slower than gnfs mounts, but you get
> the benefit of avoiding a single point of failure: unlike FUSE mounts, if the
> gluster node containing the gnfs server goes down, all mounts done using that
> node will fail. For FUSE mounts, you could try tweaking the write-behind
> xlator settings to see if that helps. See the performance.write-behind and
> performance.write-behind-window-size options in `gluster volume set help`.
> Of course, even for gnfs mounts, you can achieve fail-over by using CTDB.

Ravi,
Do you have any data that suggests fuse mounts are slower than gNFS servers?

Pat,
I see that I am late to the thread, but do you happen to have "profile info"
of the workload? You can follow
https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Monitoring%20Workload/
to get the information.

> Thanks,
> Ravi
>
> On 04/08/2017 12:07 AM, Pat Haley wrote:
>
>> Hi,
>>
>> We noticed a dramatic slowness when writing to a gluster disk compared to
>> writing to an NFS disk. Specifically, when using dd (data duplicator) to
>> write a 4.3 GB file of zeros:
>>
>>   - on the NFS disk (/home): 9.5 Gb/s
>>   - on the gluster disk (/gdata): 508 Mb/s
>>
>> The gluster disk is 2 bricks joined together, no replication or anything
>> else. The hardware is (literally) the same:
>>
>>   - one server with 70 hard disks and a hardware RAID card
>>   - 4 disks in a RAID-6 group (the NFS disk)
>>   - 32 disks in a RAID-6 group (the max allowed by the card, /mnt/brick1)
>>   - 32 disks in another RAID-6 group (/mnt/brick2)
>>   - 2 hot spares
>>
>> Some additional information and more test results (after changing the
>> log level):
>>
>> glusterfs 3.7.11 built on Apr 27 2016 14:09:22
>> CentOS release 6.8 (Final)
>> RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3108
>> [Invader] (rev 02)
>>
>> Create the file on /gdata (gluster):
>> [root at mseas-data2 gdata]# dd if=/dev/zero of=/gdata/zero1 bs=1M count=1000
>> 1000+0 records in
>> 1000+0 records out
>> 1048576000 bytes (1.0 GB) copied, 1.91876 s, 546 MB/s
>>
>> Create the file on /home (ext4):
>> [root at mseas-data2 gdata]# dd if=/dev/zero of=/home/zero1 bs=1M count=1000
>> 1000+0 records in
>> 1000+0 records out
>> 1048576000 bytes (1.0 GB) copied, 0.686021 s, 1.5 GB/s - 3 times as fast
>>
>> Copy from /gdata to /gdata (gluster to gluster):
>> [root at mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
>> 2048000+0 records in
>> 2048000+0 records out
>> 1048576000 bytes (1.0 GB) copied, 101.052 s, 10.4 MB/s - really slow
>>
>> Copy from /gdata to /gdata, 2nd time (gluster to gluster):
>> [root at mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
>> 2048000+0 records in
>> 2048000+0 records out
>> 1048576000 bytes (1.0 GB) copied, 92.4904 s, 11.3 MB/s - really slow again
>>
>> Copy from /home to /home (ext4 to ext4):
>> [root at mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero2
>> 2048000+0 records in
>> 2048000+0 records out
>> 1048576000 bytes (1.0 GB) copied, 3.53263 s, 297 MB/s - 30 times as fast
>>
>> Copy from /home to /home (ext4 to ext4):
>> [root at mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero3
>> 2048000+0 records in
>> 2048000+0 records out
>> 1048576000 bytes (1.0 GB) copied, 4.1737 s, 251 MB/s - 30 times as fast
>>
>> As a test, can we copy data directly to the xfs mountpoint (/mnt/brick1)
>> and bypass gluster?
>>
>> Any help you could give us would be appreciated.
>>
>> Thanks
>>
>> --
>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>> Pat Haley                          Email:  phaley at mit.edu
>> Center for Ocean Engineering       Phone:  (617) 253-6824
>> Dept. of Mechanical Engineering    Fax:    (617) 253-8125
>> MIT, Room 5-213                    http://web.mit.edu/phaley/www/
>> 77 Massachusetts Avenue
>> Cambridge, MA 02139-4301

--
Pranith
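For anyone following along, the "profile info" Pranith asks for above comes from
gluster's io-stats counters. A minimal sketch of the workflow from the Monitoring
Workload guide is below; "data-volume" is only a placeholder for the actual volume
name on this setup (check `gluster volume info`):

    # turn on the per-brick io-stats counters (small, constant overhead)
    gluster volume profile data-volume start

    # ...run the slow workload, e.g. the dd tests above...

    # dump per-FOP latencies and read/write block-size histograms for each brick
    gluster volume profile data-volume info

    # switch the counters off again when done
    gluster volume profile data-volume stop

Running `info` once records a baseline; running it again after the workload makes
the "Interval" section show only what happened in between the two calls.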
On 04/14/2017 12:20 PM, Pranith Kumar Karampuri wrote:

> On Sat, Apr 8, 2017 at 10:28 AM, Ravishankar N <ravishankar at redhat.com>
> wrote:
>
>> Hi Pat,
>>
>> I'm assuming you are using the gluster native (FUSE) mount. If it helps,
>> you could try mounting it via gluster NFS (gnfs) and then see if there is
>> an improvement in speed. [...]
>
> Ravi,
> Do you have any data that suggests fuse mounts are slower than gNFS servers?

I have heard anecdotal evidence time and again on the ML and IRC, which is why
I wanted to compare it with NFS numbers on his setup.

> Pat,
> I see that I am late to the thread, but do you happen to have "profile info"
> of the workload?
>
> You can follow
> https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Monitoring%20Workload/
> to get the information.

Yeah, let's see if the profile info shows anything interesting.
-Ravi

> [...]
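Concretely, Ravi's two suggestions would look something like the sketch below.
This is only a sketch: "data-volume" again stands in for the real volume name,
/mnt/gnfs is an arbitrary mountpoint, and the 4MB window size is just an example
value to experiment with (check the defaults with `gluster volume set help`):

    # mount the same volume over gluster NFS (NFSv3) for a side-by-side dd comparison
    mount -t nfs -o vers=3,tcp mseas-data2:/data-volume /mnt/gnfs
    dd if=/dev/zero of=/mnt/gnfs/zero-nfs bs=1M count=1000

    # write-behind tuning for the FUSE mount
    gluster volume set data-volume performance.write-behind on
    gluster volume set data-volume performance.write-behind-window-size 4MB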
Hi Pranith & Ravi,

Sorry for the delay. I have the profile info for the past couple of days just
below. Is this of any help to you, or is there additional information I can
request?

Brick: mseas-data2:/mnt/brick2
------------------------------
Cumulative Stats:

   Block Size      No. of Reads    No. of Writes
   1b+                        6        108032195
   2b+                       38          8352125
   4b+                     1144        141319922
   8b+                      689         13946933
   16b+                    1256         20694915
   32b+                    2756         57845473
   64b+                    5522        714398165
   128b+                  56492         11923303
   256b+                 149462          2537176
   512b+                  64285          5975842
   1024b+                192872        217173849
   2048b+                200488         94536339
   4096b+                300021        112481858
   8192b+                764297         53164978
   16384b+              1613672        330177486
   32768b+              5101884         35098110
   65536b+             14470916         19969017
   131072b+          4958306977       2243344759
   262144b+                   0              547

 %-latency   Avg-latency    Min-Latency    Max-Latency    No. of calls   Fop
 ---------   -----------    -----------    -----------    ------------   ----
      0.00       0.00 us        0.00 us        0.00 us         4052087   FORGET
      0.00       0.00 us        0.00 us        0.00 us         6381234   RELEASE
      0.00       0.00 us        0.00 us        0.00 us        28716633   RELEASEDIR
      0.00      92.81 us       48.00 us      130.00 us              53   READLINK
      0.00     201.22 us      112.00 us      457.00 us             188   RMDIR
      0.00     169.36 us       53.00 us    20417.00 us             347   SETXATTR
      0.00   20497.89 us      241.00 us    57505.00 us              45   SYMLINK
      0.00     116.97 us       42.00 us    39168.00 us            9172   SETATTR
      0.00     380.06 us       76.00 us   198427.00 us            3133   LINK
      0.00     149.60 us       14.00 us   601941.00 us           14426   INODELK
      0.00     387.81 us       69.00 us   161114.00 us            6617   RENAME
      0.01      96.47 us       14.00 us  1224734.00 us           63599   STATFS
      0.01   25041.48 us      299.00 us    93211.00 us             348   MKDIR
      0.01     380.41 us       31.00 us   561724.00 us           31452   OPEN
      0.02    1346.42 us       64.00 us   226741.00 us           18306   UNLINK
      0.02    2123.19 us       42.00 us   802398.00 us           12370   FTRUNCATE
      0.04   12161.88 us      175.00 us   158072.00 us            3244   MKNOD
      0.07  132801.87 us       39.00 us  3144448.00 us             532   FSYNC
      0.13      89.98 us        4.00 us  5550246.00 us         1492793   FLUSH
      0.45      65.89 us        6.00 us  3608035.00 us         7194229   FSTAT
      0.57   14538.33 us      162.00 us  4577282.00 us           41466   CREATE
      0.70    3183.52 us       16.00 us  4358324.00 us          231728   OPENDIR
      1.67    7559.32 us        8.00 us  4193443.00 us          234012   STAT
      2.26     119.27 us       11.00 us  4491219.00 us        20093638   WRITE
      2.51     207.00 us       10.00 us  4993074.00 us        12884466   READ
      4.17     246.12 us       13.00 us  8857354.00 us        17952607   GETXATTR
     23.72   48775.51 us       14.00 us  5022445.00 us          515770   READDIRP
     63.65    1238.53 us       25.00 us  4483760.00 us        54507520   LOOKUP

    Duration: 9810315 seconds
   Data Read: 651660783328883 bytes
Data Written: 305412177327433 bytes

Interval 0 Stats: identical to the cumulative stats above, except for the call
counts of RELEASE (6381233) and RELEASEDIR (28716630).

Brick: mseas-data2:/mnt/brick1
------------------------------
Cumulative Stats:

   Block Size      No. of Reads    No. of Writes
   1b+                        4        643631512
   2b+                       38         59055444
   4b+                     1482        235532859
   8b+                     1171         31816870
   16b+                    2138         23602175
   32b+                    4748         50161322
   64b+                    9461        711114605
   128b+                  65360         11760241
   256b+                 165954          4078907
   512b+                  94563          6366990
   1024b+                226053        211643393
   2048b+                258803         95831137
   4096b+                383871        155833532
   8192b+               1032345         57850303
   16384b+              2244921        339892660
   32768b+              7588068         38588368
   65536b+             22368398         25195605
   131072b+          5387488199       2463004132
   262144b+                   0              489

 %-latency   Avg-latency    Min-Latency    Max-Latency    No. of calls   Fop
 ---------   -----------    -----------    -----------    ------------   ----
      0.00       0.00 us        0.00 us        0.00 us         4060396   FORGET
      0.00       0.00 us        0.00 us        0.00 us         6244016   RELEASE
      0.00       0.00 us        0.00 us        0.00 us        28716852   RELEASEDIR
      0.00      96.42 us       61.00 us      148.00 us              40   READLINK
      0.00     208.36 us      114.00 us      322.00 us             188   RMDIR
      0.00    2231.61 us       57.00 us   716342.00 us             347   SETXATTR
      0.00   20821.92 us      758.00 us    57852.00 us              38   SYMLINK
      0.00     519.11 us       76.00 us   952378.00 us            3149   LINK
      0.00     196.97 us       50.00 us   736928.00 us            9055   SETATTR
      0.00     164.34 us       18.00 us   736161.00 us           13460   INODELK
      0.00     375.54 us       73.00 us   198362.00 us            6274   RENAME
      0.01   20913.10 us      351.00 us   102696.00 us             348   MKDIR
      0.01     151.39 us       17.00 us   782025.00 us           63598   STATFS
      0.03    1103.67 us       34.00 us   618187.00 us           29597   OPEN
      0.03    2833.17 us       43.00 us  1069257.00 us           11693   FTRUNCATE
      0.04    2267.87 us       61.00 us  3746134.00 us           17859   UNLINK
      0.04   13105.16 us      254.00 us   179505.00 us            3177   MKNOD
      0.05   88496.76 us       21.00 us  1718559.00 us             613   FSYNC
      0.58      73.42 us        6.00 us  1917794.00 us         7848483   FSTAT
      0.71   17177.23 us      177.00 us  7077794.00 us           40554   CREATE
      0.79     585.79 us        3.00 us 11107703.00 us         1322036   FLUSH
      1.72    7459.40 us        9.00 us  2764285.00 us          228033   STAT
      1.96    8350.73 us       19.00 us  2235725.00 us          231728   OPENDIR
      2.60     115.35 us       12.00 us  4196355.00 us        22239110   WRITE
      4.60     313.20 us       10.00 us  6211594.00 us        14494253   READ
      5.98     307.95 us       13.00 us  9885480.00 us        19163193   GETXATTR
     25.68   48514.34 us       17.00 us  4734636.00 us          522162   READDIRP
     55.15    1075.93 us       26.00 us  4291535.00 us        50562855   LOOKUP

    Duration: 9810315 seconds
   Data Read: 708869551853133 bytes
Data Written: 335305857076797 bytes

Interval 0 Stats: identical to the cumulative stats above, except for the call
counts of FORGET (4060397), RELEASE (6244015) and RELEASEDIR (28716850).

On 04/14/2017 02:50 AM, Pranith Kumar Karampuri wrote:

> [...]

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley                          Email:  phaley at mit.edu
Center for Ocean Engineering       Phone:  (617) 253-6824
Dept. of Mechanical Engineering    Fax:    (617) 253-8125
MIT, Room 5-213                    http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA 02139-4301
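On the question from the original mail about bypassing gluster: writing a
brand-new scratch file straight to the brick filesystem is a reasonable way to
get a raw-XFS baseline, as long as you only ever create and delete your own
test file there and never touch files gluster already manages. A hedged sketch
(the file names are arbitrary; conv=fdatasync makes dd report a flushed-to-disk
rate):

    # raw XFS speed on the brick, gluster bypassed
    dd if=/dev/zero of=/mnt/brick1/__ddtest bs=1M count=1000 conv=fdatasync

    # same write through the FUSE mount, for comparison
    dd if=/dev/zero of=/gdata/__ddtest bs=1M count=1000 conv=fdatasync

    # clean up (remove the brick-side file directly on the brick, not via the mount)
    rm -f /mnt/brick1/__ddtest /gdata/__ddtest

Comparing the two numbers separates gluster/FUSE overhead from whatever the
RAID/XFS layer itself can deliver.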
On 04/13/17 23:50, Pranith Kumar Karampuri wrote:

> On Sat, Apr 8, 2017 at 10:28 AM, Ravishankar N <ravishankar at redhat.com>
> wrote:
>
>> Hi Pat,
>>
>> I'm assuming you are using the gluster native (FUSE) mount. If it helps,
>> you could try mounting it via gluster NFS (gnfs) and then see if there is
>> an improvement in speed. [...]
>
> Ravi,
> Do you have any data that suggests fuse mounts are slower than gNFS servers?
>
> Pat,
> I see that I am late to the thread, but do you happen to have "profile info"
> of the workload?

I have done actual testing. For directory ops, NFS is faster because of the
default cache settings in the kernel. For raw throughput, or ops on an open
file, fuse is faster. I have yet to test this, but I expect that with the
newer caching features in 3.8+, even directory-op performance should be
similar to NFS, and more accurate.

> [...]
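For reference, the "newer caching features in 3.8+" mentioned above are the
md-cache/upcall improvements. Below is a sketch from memory rather than
anything tested on this setup: the option names and value ranges should be
verified with `gluster volume set help` on the installed release (older
releases cap performance.md-cache-timeout at 60), and "data-volume" is again a
placeholder for the real volume name:

    # have bricks notify clients when cached metadata becomes stale
    gluster volume set data-volume features.cache-invalidation on
    gluster volume set data-volume features.cache-invalidation-timeout 600

    # keep stat/xattr metadata cached on the client side for longer
    gluster volume set data-volume performance.stat-prefetch on
    gluster volume set data-volume performance.cache-invalidation on
    gluster volume set data-volume performance.md-cache-timeout 600

These mainly help directory and metadata-heavy workloads; they would not be
expected to change raw dd write throughput much.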