Pedro Costa
2018-Oct-09 13:58 UTC
[Gluster-users] Quick and small file read/write optimization
Hi,

I've a 1 x 3 replicated GlusterFS 4.1.5 volume, mounted via FUSE on each server at /www for various Node apps that are proxied with nginx. The servers are then load balanced to split traffic. Here's the gvol1 configuration at the moment:

Volume Name: gvol1
Type: Replicate
Volume ID: 384acec2-XXXX-40da-YYYY-5c53d12b3ae2
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: vm0:/srv/brick1/gvol1
Brick2: vm1:/srv/brick1/gvol1
Brick3: vm2:/srv/brick1/gvol1
Options Reconfigured:
cluster.strict-readdir: on
client.event-threads: 4
cluster.lookup-optimize: on
network.inode-lru-limit: 90000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.cache-samba-metadata: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: on
storage.fips-mode-rchecksum: on
features.utime: on
storage.ctime: on
server.event-threads: 4
performance.cache-size: 500MB
performance.read-ahead: on
cluster.readdir-optimize: on
cluster.shd-max-threads: 6
performance.strict-o-direct: on
server.outstanding-rpc-limit: 128
performance.enable-least-priority: off
cluster.nufa: on
performance.nl-cache: on
performance.nl-cache-timeout: 60
performance.cache-refresh-timeout: 10
performance.rda-cache-limit: 128MB
performance.readdir-ahead: on
performance.parallel-readdir: on
disperse.eager-lock: off
network.ping-timeout: 5
cluster.background-self-heal-count: 20
cluster.self-heal-window-size: 2
cluster.self-heal-readdir-size: 2KB

On each restart the apps delete a particular folder and rebuild it from internal packages. During one such operation, on one particular client of the volume, the same warning is logged hundreds of times, sometimes even for the same GFID:

[2018-10-09 13:40:40.579161] W [MSGID: 114061] [client-common.c:2658:client_pre_flush_v2] 0-gvol1-client-2: (7955fd7a-3147-48b3-bf6a-5306ac97e10d) remote_fd is -1. EBADFD [File descriptor in bad state]
[2018-10-09 13:40:40.579313] W [MSGID: 114061] [client-common.c:2658:client_pre_flush_v2] 0-gvol1-client-2: (0ac67ee4-a31e-4989-ba1e-e4f513c1f757) remote_fd is -1. EBADFD [File descriptor in bad state]
[2018-10-09 13:40:40.579707] W [MSGID: 114061] [client-common.c:2658:client_pre_flush_v2] 0-gvol1-client-2: (7ea6106d-29f4-4a19-8eb6-6515ffefb9d3) remote_fd is -1. EBADFD [File descriptor in bad state]
[2018-10-09 13:40:40.579911] W [MSGID: 114061] [client-common.c:2658:client_pre_flush_v2] 0-gvol1-client-2: (7ea6106d-29f4-4a19-8eb6-6515ffefb9d3) remote_fd is -1. EBADFD [File descriptor in bad state]

I assume this is probably because the client didn't catch up with the previous delete? I do control the server (itself a client of the gluster volume) on which the restart happens, and I prevent more than one server from rebuilding the same app at the same time, which makes these logs odd.

I've applied the volume options above after reading most of the entries in this list's archive over the last few weeks, but I'm not sure what else to tweak, because apart from the restart of the apps it is working pretty well.

Any input you may have on this particular scenario would be much appreciated.

Thanks,
P.
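For reference, volume options like those listed above are normally applied with the gluster CLI, one option per command. A minimal sketch, reusing the gvol1 name from the output above; the handful of options shown here are picked purely as illustration, not as a tuning recommendation:

    # set an option: gluster volume set <volname> <option> <value>
    gluster volume set gvol1 performance.md-cache-timeout 600
    gluster volume set gvol1 features.cache-invalidation on
    gluster volume set gvol1 features.cache-invalidation-timeout 600
    gluster volume set gvol1 network.inode-lru-limit 90000

    # list the effective value of every option on the volume
    gluster volume get gvol1 all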
Vlad Kopylov
2018-Oct-10 04:55 UTC
[Gluster-users] Quick and small file read/write optimization
It also matters how you mount it:

glusterfs defaults,_netdev,negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5 0 0

Options Reconfigured:
performance.io-thread-count: 8
server.allow-insecure: on
cluster.shd-max-threads: 12
performance.rda-cache-limit: 128MB
cluster.readdir-optimize: on
cluster.read-hash-mode: 0
performance.strict-o-direct: on
cluster.lookup-unhashed: auto
performance.nl-cache: on
performance.nl-cache-timeout: 600
cluster.lookup-optimize: on
client.event-threads: 4
performance.client-io-threads: on
performance.md-cache-timeout: 600
server.event-threads: 4
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.stat-prefetch: on
performance.cache-invalidation: on
network.inode-lru-limit: 90000
performance.cache-refresh-timeout: 10
performance.enable-least-priority: off
performance.cache-size: 2GB
cluster.nufa: on
cluster.choose-local: on
server.outstanding-rpc-limit: 128
disperse.eager-lock: off
nfs.disable: on
transport.address-family: inet

On Tue, Oct 9, 2018 at 2:33 PM Pedro Costa <pedro at pmc.digital> wrote:
> Hi, I've a 1 x 3 replicated glusterfs 4.1.5 volume, that mounts using fuse on
> each server into /www for various Node apps that are proxied with nginx.
> [...]
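For completeness, the mount options shown at the top of this reply belong in the fourth field of an fstab entry. A minimal sketch, borrowing the vm0 host, the gvol1 volume and the /www mount point from the original post (adjust to your own environment):

    # /etc/fstab entry for the FUSE client mount
    vm0:/gvol1   /www   glusterfs   defaults,_netdev,negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5   0 0

    # equivalent one-off mount for testing the same options
    mount -t glusterfs -o negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5 vm0:/gvol1 /www

Note that direct-io-mode=enable bypasses the kernel page cache on the client, so whether it helps for a small-file workload depends on the access pattern.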