Artem Russakovskii
2020-May-18 22:10 UTC
[Gluster-users] Extremely slow file listing in folders with many files
Hi,

Does the gluster team have any feedback about this? Resolving the "Found anomalies" issues may be key to resolving the dir list speed issues.

Sincerely,
Artem

--
Founder, Android Police <http://www.androidpolice.com>, APK Mirror <http://www.apkmirror.com/>, Illogical Robot LLC
beerpla.net | @ArtemR <http://twitter.com/ArtemR>

On Thu, Apr 30, 2020 at 10:36 PM Strahil Nikolov <hunter86_bg at yahoo.com> wrote:

> On April 30, 2020 9:05:19 PM GMT+03:00, Artem Russakovskii <archon810 at gmail.com> wrote:
>> I did this on the same prod instance just now.
>>
>> 'find' on a fuse gluster dir with 40k+ files:
>> 1st run: 3m56.261s
>> 2nd run: 0m24.970s
>> 3rd run: 0m24.099s
>>
>> At this point, I killed all gluster services on one of the 4 servers and verified that that brick went offline.
>>
>> 1st run: 0m38.131s
>> 2nd run: 0m19.369s
>> 3rd run: 0m23.576s
>>
>> Nothing conclusive really IMO.
>>
>> Sincerely,
>> Artem
>>
>> On Thu, Apr 30, 2020 at 9:55 AM Strahil Nikolov <hunter86_bg at yahoo.com> wrote:
>>
>>> On April 30, 2020 6:27:10 PM GMT+03:00, Artem Russakovskii <archon810 at gmail.com> wrote:
>>>> Hi Strahil, in the original email I included both the times for the first and subsequent reads on the fuse mounted gluster volume as well as the xfs filesystem the gluster data resides on (this is the brick, right?).
>>>>
>>>> On Thu, Apr 30, 2020, 7:44 AM Strahil Nikolov <hunter86_bg at yahoo.com> wrote:
>>>>
>>>>> On April 30, 2020 4:24:23 AM GMT+03:00, Artem Russakovskii <archon810 at gmail.com> wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> We have 500GB and 10TB 4x1 replicate xfs-based gluster volumes, and the 10TB one especially is extremely slow to do certain things with (and has been since gluster 3.x when we started). We're currently on 5.13.
>>>>>>
>>>>>> The number of files isn't even what I'd consider that great - under 100k per dir.
>>>>>>
>>>>>> Here are some numbers to look at:
>>>>>>
>>>>>> On gluster volume in a dir of 45k files:
>>>>>> The first time
>>>>>>
>>>>>> time find | wc -l
>>>>>> 45423
>>>>>> real 8m44.819s
>>>>>> user 0m0.459s
>>>>>> sys 0m0.998s
>>>>>>
>>>>>> And again
>>>>>>
>>>>>> time find | wc -l
>>>>>> 45423
>>>>>> real 0m34.677s
>>>>>> user 0m0.291s
>>>>>> sys 0m0.754s
>>>>>>
>>>>>> If I run the same operation on the xfs block device itself:
>>>>>> The first time
>>>>>>
>>>>>> time find | wc -l
>>>>>> 45423
>>>>>> real 0m13.514s
>>>>>> user 0m0.144s
>>>>>> sys 0m0.501s
>>>>>>
>>>>>> And again
>>>>>>
>>>>>> time find | wc -l
>>>>>> 45423
>>>>>> real 0m0.197s
>>>>>> user 0m0.088s
>>>>>> sys 0m0.106s
>>>>>>
>>>>>> I'd expect a performance difference here but just as it was several years ago when we started with gluster, it's still huge, and simple file listings are incredibly slow.
>>>>>>
>>>>>> At the time, the team was looking to do some optimizations, but I'm not sure this has happened.
>>>>>>
>>>>>> What can we do to try to improve performance?
>>>>>>
>>>>>> Thank you.
>>>>>>
>>>>>> Some setup values follow.
>>>>>>
>>>>>> xfs_info /mnt/SNIP_block1
>>>>>> meta-data=/dev/sdc               isize=512    agcount=103, agsize=26214400 blks
>>>>>>          =                       sectsz=512   attr=2, projid32bit=1
>>>>>>          =                       crc=1        finobt=1, sparse=0, rmapbt=0
>>>>>>          =                       reflink=0
>>>>>> data     =                       bsize=4096   blocks=2684354560, imaxpct=25
>>>>>>          =                       sunit=0      swidth=0 blks
>>>>>> naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
>>>>>> log      =internal log           bsize=4096   blocks=51200, version=2
>>>>>>          =                       sectsz=512   sunit=0 blks, lazy-count=1
>>>>>> realtime =none                   extsz=4096   blocks=0, rtextents=0
>>>>>>
>>>>>> Volume Name: SNIP_data1
>>>>>> Type: Replicate
>>>>>> Volume ID: SNIP
>>>>>> Status: Started
>>>>>> Snapshot Count: 0
>>>>>> Number of Bricks: 1 x 4 = 4
>>>>>> Transport-type: tcp
>>>>>> Bricks:
>>>>>> Brick1: nexus2:/mnt/SNIP_block1/SNIP_data1
>>>>>> Brick2: forge:/mnt/SNIP_block1/SNIP_data1
>>>>>> Brick3: hive:/mnt/SNIP_block1/SNIP_data1
>>>>>> Brick4: citadel:/mnt/SNIP_block1/SNIP_data1
>>>>>> Options Reconfigured:
>>>>>> cluster.quorum-count: 1
>>>>>> cluster.quorum-type: fixed
>>>>>> network.ping-timeout: 5
>>>>>> network.remote-dio: enable
>>>>>> performance.rda-cache-limit: 256MB
>>>>>> performance.readdir-ahead: on
>>>>>> performance.parallel-readdir: on
>>>>>> network.inode-lru-limit: 500000
>>>>>> performance.md-cache-timeout: 600
>>>>>> performance.cache-invalidation: on
>>>>>> performance.stat-prefetch: on
>>>>>> features.cache-invalidation-timeout: 600
>>>>>> features.cache-invalidation: on
>>>>>> cluster.readdir-optimize: on
>>>>>> performance.io-thread-count: 32
>>>>>> server.event-threads: 4
>>>>>> client.event-threads: 4
>>>>>> performance.read-ahead: off
>>>>>> cluster.lookup-optimize: on
>>>>>> performance.cache-size: 1GB
>>>>>> cluster.self-heal-daemon: enable
>>>>>> transport.address-family: inet
>>>>>> nfs.disable: on
>>>>>> performance.client-io-threads: on
>>>>>> cluster.granular-entry-heal: enable
>>>>>> cluster.data-self-heal-algorithm: full
>>>>>>
>>>>>> Sincerely,
>>>>>> Artem
>>>>>
>>>>> Hi Artem,
>>>>>
>>>>> Have you checked the same at the brick level? How big is the difference?
>>>>>
>>>>> Best Regards,
>>>>> Strahil Nikolov
>>>
>>> Hi Artem,
>>>
>>> My bad, I missed the 'xfs' word... Still, the difference is huge.
>>>
>>> May I ask you to do a test again (pure curiosity) as follows:
>>> 1. Repeat the test from before
>>> 2. Stop 1 brick and test again.
>>>
>>> P.S.: You can try it on the test cluster
>>>
>>> Best Regards,
>>> Strahil Nikolov
>
> Hi Artem,
>
> I was wondering if the 4th replica is adding additional overhead (another dir to check), but the test is not very conclusive.
>
> Actually, the 'anomalies' log entries in your pool could be a symptom of another problem (just like the long listing time).
>
> I will try to reproduce your setup (smaller scale - 1 brick, 50k files) and then will try with 3 bricks.
>
> Best Regards,
> Strahil Nikolov
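For anyone who wants to repeat the cold-versus-warm listing comparison from this thread in a scriptable way, here is a minimal Python sketch. It is not what was run in the thread (that was `time find | wc -l`); it only approximates the same measurement using the standard library. The function names `count_entries` and `time_listing` are made up for this illustration.

```python
import os
import time


def count_entries(root):
    """Count root plus everything beneath it, roughly like `find | wc -l`."""
    total = 1  # find also prints the starting directory itself
    # os.walk does not follow directory symlinks by default, like plain find
    for _dirpath, dirnames, filenames in os.walk(root):
        total += len(dirnames) + len(filenames)
    return total


def time_listing(root, runs=2):
    """Walk root `runs` times; return a list of (entry_count, seconds).

    The first run is usually the cold one and later runs benefit from
    whatever caching the filesystem or client does.
    """
    results = []
    for _ in range(runs):
        start = time.perf_counter()
        entries = count_entries(root)
        results.append((entries, time.perf_counter() - start))
    return results
```

Calling `time_listing` once on a directory under the fuse mount and once on the matching directory on a brick (both paths are site-specific, so none are filled in here) gives pairs of numbers comparable to the `real` times quoted above. Only wall-clock time is reported, not the user/sys split.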
Artem Russakovskii
2020-May-18 22:25 UTC
[Gluster-users] Extremely slow file listing in folders with many files
I launched an ls in a folder with ~100k files/dirs 20 minutes ago. It's still not done, and the log is filled with tens of thousands of these. This makes gluster almost unusable right now.

[2020-05-18 22:23:35.919504] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 5c539141-dbe5-484a-acc0-acc369d2e55b). Holes=1 overlaps=0
[2020-05-18 22:23:35.919526] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 686ed221-0b5a-4412-bd70-385b707cc63e). Holes=1 overlaps=0
[2020-05-18 22:23:35.919534] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 04459d23-f77d-4ad4-baf6-bb5c39dc59b6). Holes=1 overlaps=0
[2020-05-18 22:23:35.919545] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = e0beed96-4f2b-4e4b-8e02-963a159da4ff). Holes=1 overlaps=0
[2020-05-18 22:23:35.919552] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 4393a4a8-3768-4acf-a4bd-33152f79c7b9). Holes=1 overlaps=0
[2020-05-18 22:23:35.919564] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 2ec6e22b-2226-482f-9149-4ef9259f2777). Holes=1 overlaps=0
[2020-05-18 22:23:35.919571] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 6047055d-59ec-405b-ab9d-69cf3112d51b). Holes=1 overlaps=0
[2020-05-18 22:23:35.919585] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 6cc7de36-ddd6-4daf-b219-3d823b5020d1). Holes=1 overlaps=0
[2020-05-18 22:23:35.919592] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 27dd2dfa-f8f2-48e5-b59d-a407293a89f7). Holes=1 overlaps=0
[2020-05-18 22:23:36.286582] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = b4b8fcde-c0d0-4d4e-ae49-c09b4a4f5360). Holes=1 overlaps=0
[2020-05-18 22:23:36.286631] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 80d7664c-aa3d-4675-bea4-389f3bef9d08). Holes=1 overlaps=0
[2020-05-18 22:23:36.286641] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 6b5a6b00-1c77-4c2a-b92a-000f00b03b8c). Holes=1 overlaps=0
[2020-05-18 22:23:36.286649] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = b6be91a5-c7ff-441f-b495-3aaa360c66aa). Holes=1 overlaps=0
[2020-05-18 22:23:36.286663] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 1d22681c-f55c-4021-82ee-86f1a91caa32). Holes=1 overlaps=0
[2020-05-18 22:23:36.286671] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 93d3cecf-b340-46a7-8fcb-7139d6337021). Holes=1 overlaps=0
[2020-05-18 22:23:36.286713] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 93dfb66d-c60c-4708-826b-a1078a85c741). Holes=1 overlaps=0
[2020-05-18 22:23:36.286722] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = eea71270-342f-45e2-8da6-e3bad3d700c7). Holes=1 overlaps=0
[2020-05-18 22:23:36.286729] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 848849a5-20f8-49f9-870f-e5bf928d98e8). Holes=1 overlaps=0
[2020-05-18 22:23:36.286737] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 97e58dce-7d41-4c82-86e9-0e72ca02bbcd). Holes=1 overlaps=0
[2020-05-18 22:23:36.416493] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 40d108b0-5f5f-4c39-a4ce-06cef1bac6bd). Holes=1 overlaps=0
[2020-05-18 22:23:36.416549] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = d7871c87-2954-4615-86c7-2af660614c85). Holes=1 overlaps=0
[2020-05-18 22:23:36.416560] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 3923df3a-5845-4e14-8299-c9bb3e886a80). Holes=1 overlaps=0
[2020-05-18 22:23:36.416568] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 6cd7dbdb-4224-4fe2-9c40-7e146140d993). Holes=1 overlaps=0
[2020-05-18 22:23:36.416578] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 00fba742-5ebb-404a-8628-bfc697285389). Holes=1 overlaps=0
[2020-05-18 22:23:36.416587] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = d7de7b74-3065-4c63-9c2b-80f174c6c407). Holes=1 overlaps=0
[2020-05-18 22:23:36.416594] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = df7c960b-f867-400a-b1ea-3d2aa9205903). Holes=1 overlaps=0
[2020-05-18 22:23:36.416603] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 3d52e68a-3de4-48d8-b3c3-ff9f02499b8d). Holes=1 overlaps=0
[2020-05-18 22:23:36.416618] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 0ba97207-3539-419d-921a-5883bd839f8b). Holes=1 overlaps=0
[2020-05-18 22:23:36.416625] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 9f569e44-6a7f-49ff-be61-4051614679e0). Holes=1 overlaps=0
[2020-05-18 22:23:36.416632] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 3cb0b764-e0b7-4c80-9e6e-e5a6540378c7). Holes=1 overlaps=0
[2020-05-18 22:23:36.416641] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 4d12a428-b082-4f0d-977a-2e1b21a62f9f). Holes=1 overlaps=0
[2020-05-18 22:23:36.416659] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 4b58cd7c-3470-4d3d-be05-1f7a1ce3b885). Holes=1 overlaps=0
[2020-05-18 22:23:36.416667] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 0b5f9eb1-ff2b-4f53-8068-be3459d24893). Holes=1 overlaps=0
[2020-05-18 22:23:36.416676] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 923bf3d7-e964-4f3f-b067-60b1dc3239e2). Holes=1 overlaps=0
[2020-05-18 22:23:36.592293] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = da9de079-2fff-4589-86d1-1edc2ff1e088). Holes=1 overlaps=0
[2020-05-18 22:23:36.592383] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = c4dad7e7-feb5-4346-96ac-fb2f84c59961). Holes=1 overlaps=0
[2020-05-18 22:23:36.592394] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = c28fa677-cffb-42f7-bf58-809f82c5fda8). Holes=1 overlaps=0
[2020-05-18 22:23:36.592409] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 44d5fc11-ec63-4589-b1a5-435cb9956807). Holes=1 overlaps=0
[2020-05-18 22:23:36.592418] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 4a826ae1-2e5e-4cd1-b7aa-4045095cac88). Holes=1 overlaps=0
[2020-05-18 22:23:36.592425] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = a04618e8-5dc9-4c13-8885-9db27c2e9101). Holes=1 overlaps=0
[2020-05-18 22:23:36.592434] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 52d3cae3-0e94-4508-9409-322230c324e4). Holes=1 overlaps=0
[2020-05-18 22:23:36.592441] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 94d989de-2aa5-470c-a3da-5d169b4dd882). Holes=1 overlaps=0
[2020-05-18 22:23:36.592452] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = c2840cd4-a0c0-4577-8169-4c9c1c14d236). Holes=1 overlaps=0
[2020-05-18 22:23:36.592459] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = eef2b761-8919-425a-be50-ee23e3f2bc5d). Holes=1 overlaps=0
[2020-05-18 22:23:36.592470] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 3e698fd4-fff6-4ba6-abe6-1c9e94c7f238). Holes=1 overlaps=0
[2020-05-18 22:23:36.727125] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = d3c978cd-55a9-4bcf-b923-cc328d19b721). Holes=1 overlaps=0
[2020-05-18 22:23:36.727178] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 2a76f859-8162-4022-b721-ea1914d28f63). Holes=1 overlaps=0
[2020-05-18 22:23:36.761998] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 3616ac4b-7320-4a06-8bb2-ac862a5f0f44). Holes=1 overlaps=0
[2020-05-18 22:23:36.762041] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = a1848805-2cdb-4f86-8677-5532ac97ca60). Holes=1 overlaps=0
[2020-05-18 22:23:36.762087] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = b78a7479-6868-4ffe-beca-b0e929d66aec). Holes=1 overlaps=0
[2020-05-18 22:23:36.762099] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 4467d1b6-1438-4912-958e-d56ee13ce2a4). Holes=1 overlaps=0
[2020-05-18 22:23:36.762109] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 34015db8-3885-42d4-97bb-8b6acf11a55d). Holes=1 overlaps=0
[2020-05-18 22:23:36.762117] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 3cc59a68-6c36-40b8-a7ee-76666a3359c8). Holes=1 overlaps=0
[2020-05-18 22:23:37.157837] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 1f367ff0-b483-4985-961a-508175c8583b). Holes=1 overlaps=0
[2020-05-18 22:23:37.157890] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = fd4d4f5f-8638-42a7-9639-d921eabbb360). Holes=1 overlaps=0
[2020-05-18 22:23:37.157935] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = ebf167c8-1c29-40f2-8762-6c05061fba7c). Holes=1 overlaps=0
[2020-05-18 22:23:37.157945] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 7a9ae43c-896a-4c16-994e-440c0d0f0550). Holes=1 overlaps=0
[2020-05-18 22:23:37.157952] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 2ebccb10-7fb5-4b0e-8633-9c3ed6f67f5f). Holes=1 overlaps=0
[2020-05-18 22:23:37.157962] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 71d9ea44-0b98-416f-98a7-0229bcc08596). Holes=1 overlaps=0
[2020-05-18 22:23:37.157975] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = d3c0e707-51af-49cc-b09c-f2ed07aaedd6). Holes=1 overlaps=0
[2020-05-18 22:23:37.157988] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 8c60366b-5d26-46d9-aee0-9ce668edabf0). Holes=1 overlaps=0
[2020-05-18 22:23:37.158008] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = d4a336e7-b1c0-4858-91e3-37b6c8d09b07). Holes=1 overlaps=0
[2020-05-18 22:23:37.465465] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 73c6b711-976a-4f31-9e08-054e55ca6be9). Holes=1 overlaps=0
[2020-05-18 22:23:37.465525] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 6f9c16eb-f0f1-4cb6-b28c-eba2224e4b80). Holes=1 overlaps=0
[2020-05-18 22:23:37.465553] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 9a233480-ebfa-463f-98bc-a2e1a122e315). Holes=1 overlaps=0
[2020-05-18 22:23:37.465566] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 2c33de93-7a33-4d14-b935-80d8643f7109). Holes=1 overlaps=0
[2020-05-18 22:23:37.517495] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = d1733ee9-91e4-44f4-b7ad-d436e6b24059). Holes=1 overlaps=0
[2020-05-18 22:23:37.517554] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = bf870b8f-809b-4369-9532-91c22e3a8c04). Holes=1 overlaps=0
[2020-05-18 22:23:37.517569] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 769fa28d-e7a5-4387-af45-a2d1c5910997). Holes=1 overlaps=0
[2020-05-18 22:23:37.517582] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 62b2aa19-b3bb-4536-b0cc-3bb54305f668). Holes=1 overlaps=0
[2020-05-18 22:23:37.517590] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-apkmirror_data1-dht: Found anomalies in (null) (gfid = 942eeba7-ce64-460d-82c2-5fdbe3341a5b). Holes=1 overlaps=0

Sincerely,
Artem

--
Founder, Android Police <http://www.androidpolice.com>, APK Mirror <http://www.apkmirror.com/>, Illogical Robot LLC
beerpla.net | @ArtemR <http://twitter.com/ArtemR>
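With tens of thousands of "Found anomalies" entries, a quick tally can show how many distinct gfids are involved and whether every entry reports the same Holes/overlaps pair. Below is a minimal Python sketch that parses entries of the form shown in this message. The regular expression is derived only from those sample lines, so other gluster versions may format the message differently, and the names `ANOMALY_RE` and `summarize` are made up for this illustration.

```python
import re
from collections import Counter

# Pattern derived from the MSGID 109063 lines quoted in this thread.
ANOMALY_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] I \[MSGID: 109063\] "
    r".*?Found anomalies in (?P<path>.+?) \(gfid = (?P<gfid>[0-9a-f-]{36})\)\.\s+"
    r"Holes=(?P<holes>\d+) overlaps=(?P<overlaps>\d+)"
)


def summarize(log_text):
    """Tally the 'Found anomalies' entries found in a chunk of client log text."""
    matches = list(ANOMALY_RE.finditer(log_text))
    return {
        "entries": len(matches),
        "unique_gfids": len({m["gfid"] for m in matches}),
        # how often each (holes, overlaps) combination appears
        "by_kind": Counter(
            (int(m["holes"]), int(m["overlaps"])) for m in matches
        ),
        "first": matches[0]["ts"] if matches else None,
        "last": matches[-1]["ts"] if matches else None,
    }
```

Feeding it the contents of the fuse client log (its location depends on the installation, so no path is assumed here) returns a small dict. A `unique_gfids` value close to `entries` would suggest that each directory is being reported once per lookup rather than one directory being reported repeatedly.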