Hi All,

We've got a glusterfs cluster that houses some PHP web sites.

This is generally considered a bad idea, and we can see why.

With performance.nl-cache on it actually turns out to be very reasonable; however, with it turned off, performance is roughly 5x worse, meaning a request that would take sub-500ms now takes 2500ms. In other cases we see far, far worse results, e.g. with nl-cache a request takes ~1500ms, without it ~30s (20x worse).

So why not use nl-cache? Well, it results in readdir reporting files which then fail to open with ENOENT. The cache also never clears, even though the configuration says nl-cache entries should only be cached for 60s. Even with "ls -lah" in affected folders you'll notice question-mark ("?") entries for the attributes on files. If this recovered within a reasonable time (say, a few seconds) that would be fine, but it doesn't.

# gluster volume info
Type: Replicate
Volume ID: cbe08331-8b83-41ac-b56d-88ef30c0f5c7
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Options Reconfigured:
performance.nl-cache: on
cluster.readdir-optimize: on
config.client-threads: 2
config.brick-threads: 4
config.global-threading: on
performance.iot-pass-through: on
storage.fips-mode-rchecksum: on
cluster.granular-entry-heal: enable
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
client.event-threads: 2
server.event-threads: 2
transport.address-family: inet
nfs.disable: on
cluster.metadata-self-heal: off
cluster.entry-self-heal: off
cluster.data-self-heal: off
cluster.self-heal-daemon: on
server.allow-insecure: on
features.ctime: off
performance.io-cache: on
performance.cache-invalidation: on
features.cache-invalidation: on
performance.qr-cache-timeout: 600
features.cache-invalidation-timeout: 600
performance.io-cache-size: 128MB
performance.cache-size: 128MB

Are there any other recommendations, short of abandoning all hope of redundancy and reverting to a single-server setup (for the web code at least)? Currently the cost of the redundancy seems to outweigh the benefit.

Glusterfs version 10.2, with the patch for --inode-table-size. Mounts happen with:

/usr/sbin/glusterfs --acl --reader-thread-count=2 --lru-limit=524288 --inode-table-size=524288 --invalidate-limit=16 --background-qlen=32 --fuse-mountopts=nodev,nosuid,noexec,noatime --process-name fuse --volfile-server=127.0.0.1 --volfile-id=gv_home --fuse-mountopts=nodev,nosuid,noexec,noatime /home

Kind Regards,
Jaco
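P.S. For reference, the nl-cache related options we're aware of look roughly like the following (gv_home is our volume; option names are as we read them from "gluster volume set help" on 10.x -- this is a sketch of what could be tuned, not something we've verified to fix the stale entries):

# negative-lookup-cache tuning -- sketch only, double-check names/values
gluster volume set gv_home performance.nl-cache-timeout 60          # lifetime of negative entries (the 60s mentioned above)
gluster volume set gv_home performance.nl-cache-positive-entry off  # cache only negative lookups, not positive ones

# toggling the cache off and back on should, as far as we understand it,
# drop whatever is cached when entries get stuck -- not verified to be reliable
gluster volume set gv_home performance.nl-cache off
gluster volume set gv_home performance.nl-cache on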
When we used glusterfs for websites, we copied the web dir from gluster to local disk when the frontends booted, then served it from there (rough sketch of the copy step below).

Jaco Kroon <jaco at uls.co.za> wrote on 14 December 2022 at 12:49:
> [...]
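P.S. The copy step can be as simple as an rsync run at boot. The script below is only an illustration (paths and the script name are invented), not our actual setup:

#!/bin/sh
# hypothetical boot-time copy script, e.g. /usr/local/sbin/sync-webroot.sh,
# run from a systemd unit or rc script when the frontend boots
set -e
SRC=/mnt/gluster/webroot/   # the gluster FUSE mount (example path)
DST=/var/www/webroot/       # local directory the web server actually serves
mkdir -p "$DST"
rsync -a --delete "$SRC" "$DST"   # --delete keeps the local copy an exact mirror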
Quick top-post reply.

We host websites on Gluster. It took us a bit of doing to get it working quickly, but here's roughly how we do it. I'm not saying this is *the* way to do it, just a way that works for us:

1. A Gluster volume with sharding enabled, storing the filesystem files used as a backing store for iSCSI.
2. The Gluster servers also provide TGT iSCSI LUNs for multipath access.
3. A cluster of NFS servers connecting to the iSCSI LUNs with multipathd. We run this using Corosync to manage redundancy/failover. Make sure you get this bit right: it's probably the hardest bit, and probably also the bit most ripe for improvement.
4. VMs running web servers access the NFS data for the websites. Each server runs an FS-Cache volume for the NFS-mounted volumes.
5. OPcache/FPM for the dynamic stuff (assuming PHP here).

Your underlying storage needs to be relatively fast/capable, of course. We have our VM volumes on sharded Gluster too.

My theory in setting it up this way was to swap the small-file scenario that Gluster is not very good at for one it is good at: sharded big files.

Anyway, hope that helps a little. A rough sketch of the client-side pieces (steps 4 and 5) is below.
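On each web VM the client side looks roughly like this -- the server name, paths and OPcache values here are illustrative, not our production config:

# step 4: mount the website NFS export with FS-Cache enabled (the "fsc"
# mount option); cachefilesd must be running on the VM for caching to happen
mount -t nfs4 -o fsc,noatime nfs-cluster-vip:/export/websites /var/www/sites

# step 5: a typical OPcache configuration for PHP-FPM (file path assumes a
# Debian-style PHP layout; tune the values to taste)
cat > /etc/php/8.1/fpm/conf.d/10-opcache.ini <<'EOF'
opcache.enable=1
opcache.memory_consumption=256
opcache.max_accelerated_files=50000
; re-stat scripts at most once a minute so most requests never touch NFS
opcache.validate_timestamps=1
opcache.revalidate_freq=60
EOF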
Ronny

Jaco Kroon wrote on 14/12/2022 11:28:
> [...]

-- 
Ronny Adsetts
Technical Director
Amazing Internet Ltd, London
t: +44 20 8977 8943
w: www.amazinginternet.com

Registered office: 85 Waldegrave Park, Twickenham, TW1 4TJ
Registered in England. Company No. 4042957