Matthias Schniedermeyer
2020-Feb-10 12:25 UTC
[Gluster-users] It appears that readdir is not cached for FUSE mounts
Hi

I would describe our basic use case for Gluster as: "data store for a
cold-standby application".

A specific application is installed on 2 hardware machines; the data is
kept in sync between the 2 machines by a replica-2 Gluster volume
(IOW: "RAID 1").

At any one time only 1 machine has the volume mounted and the
application running. If that machine goes down, the application is
started on the remaining machine. IOW, at any point in time there is
only ever 1 "reader & writer" running.

I profiled a performance problem we have with this application, which
unfortunately we can't modify.

The profile shows many "opendir/readdirp/releasedir" cycles. The
directory in question has about 1000 files, and the application stalls
for several milliseconds any time it decides to do a readdir. The
volume is mounted via FUSE, and it appears that this operation is not
cached at all.

To provide a test case I tried to replicate what the application does.
The problematic operation is nearly perfectly emulated just by running
"ls .".

I created a script that replicates how we use Gluster and demonstrates
that a FUSE mount appears to lack any caching of readdir.

A word about the test environment:

2 identical servers
CPU: dual-socket Xeon E5-2640 v3 (8 cores, 2.60 GHz, HT enabled)
RAM: 128 GB DDR4 ECC (8x16 GB)
Storage: 2 TB Intel P3520 PCIe NVMe SSD
Network: Gluster: 10 Gbit/s direct connect (no switch); external: 1 Gbit/s
OS: CentOS 7.7, installed from the "Minimal" ISO, everything at defaults
Up to date as of 2020-01-21 (kernel: 3.10.0-1062.9.1.el7.x86_64)
SELinux: disabled
SSH key for 1 -> 2 exchanged
Gluster 6.7 packages installed via 'centos-release-gluster6'

See attached: gluster-testcase-no-caching-of-dir-operations-for-fuse.sh

The meat of the test case is a profile of:

ls .

vs:

ls . . . . . . . . . .
(10 dots)

> cat /root/profile-1-times | grep DIR | head -n 3
  0.00      0.00 us      0.00 us      0.00 us       1  RELEASEDIR
  0.27     66.79 us     66.79 us     66.79 us       1  OPENDIR
 98.65  12190.30 us   9390.88 us  14989.73 us       2  READDIRP

> cat /root/profile-10-times | grep DIR | head -n 3
  0.00      0.00 us      0.00 us      0.00 us      10  RELEASEDIR
  0.64    108.02 us     85.72 us    131.96 us      10  OPENDIR
 99.36   8388.64 us   5174.71 us  14808.77 us      20  READDIRP

This test case shows perfect scaling: 10 times the requests result in
10 times the Gluster operations.

I would say that ideally there should be no difference in the number of
Gluster operations, regardless of how often a directory is read in a
short amount of time (with no changes in between).

Is there something we can do to enable caching or otherwise improve
performance?

--
Matthias

Attachment: gluster-testcase-no-caching-of-dir-operations-for-fuse.sh
(application/x-sh, 3023 bytes)
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20200210/6861f5e0/attachment.sh>
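The counters above come from Gluster's built-in volume profiler. A
minimal sketch of the measurement, assuming a volume named 'testvol'
mounted at '/mnt/testvol' (both names are placeholders; the full steps
are in the attached script):

  # Start profiling on the volume and reset any accumulated counters
  gluster volume profile testvol start
  gluster volume profile testvol info clear

  # Run the problematic operation once against the FUSE mount
  ls /mnt/testvol > /dev/null

  # Dump the per-FOP counters; the DIR rows show OPENDIR/READDIRP/RELEASEDIR
  gluster volume profile testvol info | grep DIR | head -n 3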
Strahil Nikolov
2020-Feb-10 15:21 UTC
[Gluster-users] It appears that readdir is not cached for FUSE mounts
On February 10, 2020 2:25:17 PM GMT+02:00, Matthias Schniedermeyer <matthias-gluster-users at maxcluster.de> wrote:
> [... original message snipped ...]

Hi Matthias,

Have you tried the 'readdir-ahead' option? According to the docs it is
useful for "improving sequential directory read performance". I'm not
sure how Gluster defines a sequential directory read, but it's worth
trying.

Also, you can try metadata caching, as described in:
https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.3/html/administration_guide/sect-directory_operations

The actual group should contain the following:
https://github.com/gluster/glusterfs/blob/master/extras/group-metadata-cache

Best Regards,
Strahil Nikolov
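Concretely, both suggestions map to 'gluster volume set' calls. A
minimal sketch, assuming a volume named 'testvol' (a placeholder):

  # Enable server-side pre-fetching of directory entries
  gluster volume set testvol performance.readdir-ahead on

  # Apply the metadata-cache option group linked above; it turns on
  # md-cache with upcall-based cache invalidation
  gluster volume set testvol group metadata-cache

The 'group metadata-cache' shorthand applies all options from the
linked group file in one step; on builds without group support, the
same options can be set individually.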