All,

I am new to gluster and have some questions/concerns about some tiering errors that I see in the log files.

OS: CentOS 7.3.1611
Gluster version: 3.10.5
Samba version: 4.6.2

I see the following (scrubbed):

Node 1 /var/log/glusterfs/tier/<vol>/tierd.log:

[2017-10-19 17:52:07.519614] I [MSGID: 109038] [tier.c:1169:tier_migrate_using_query_file] 0-<vol>-tier-dht: Promotion failed for <file>(gfid:edaf97e1-02e0-4838-9d26-71ea3aab22fb)
[2017-10-19 17:52:07.525110] E [MSGID: 109011] [dht-common.c:7188:dht_create] 0-<vol>-hot-dht: no subvolume in layout for path=/path/to/<file>
[2017-10-19 17:52:07.526088] E [MSGID: 109023] [dht-rebalance.c:757:__dht_rebalance_create_dst_file] 0-<vol>-tier-dht: failed to create <file> on <vol>-hot-dht [Input/output error]
[2017-10-19 17:52:07.526111] E [MSGID: 0] [dht-rebalance.c:1696:dht_migrate_file] 0-<vol>-tier-dht: Create dst failed on - <vol>-hot-dht for file - <file>
[2017-10-19 17:52:07.527214] E [MSGID: 109037] [tier.c:969:tier_migrate_link] 0-<vol>-tier-dht: Failed to migrate <file> [No space left on device]
[2017-10-19 17:52:07.527244] I [MSGID: 109038] [tier.c:1169:tier_migrate_using_query_file] 0-<vol>-tier-dht: Promotion failed for <file>(gfid:fb4411c4-a387-4e5f-a2b7-897633ef4aa8)
[2017-10-19 17:52:07.533510] E [MSGID: 109011] [dht-common.c:7188:dht_create] 0-<vol>-hot-dht: no subvolume in layout for path=/path/to/<file>
[2017-10-19 17:52:07.534434] E [MSGID: 109023] [dht-rebalance.c:757:__dht_rebalance_create_dst_file] 0-<vol>-tier-dht: failed to create <file> on <vol>-hot-dht [Input/output error]
[2017-10-19 17:52:07.534453] E [MSGID: 0] [dht-rebalance.c:1696:dht_migrate_file] 0-<vol>-tier-dht: Create dst failed on - <vol>-hot-dht for file - <file>
[2017-10-19 17:52:07.535570] E [MSGID: 109037] [tier.c:969:tier_migrate_link] 0-<vol>-tier-dht: Failed to migrate <file> [No space left on device]
[2017-10-19 17:52:07.535594] I [MSGID: 109038] [tier.c:1169:tier_migrate_using_query_file] 0-<vol>-tier-dht: Promotion failed for <file>(gfid:fba421e7-0500-47c4-bf67-10a40690e13d)
[2017-10-19 17:52:07.541363] E [MSGID: 109011] [dht-common.c:7188:dht_create] 0-<vol>-hot-dht: no subvolume in layout for path=/path/to/<file>
[2017-10-19 17:52:07.542296] E [MSGID: 109023] [dht-rebalance.c:757:__dht_rebalance_create_dst_file] 0-<vol>-tier-dht: failed to create <file> on <vol>-hot-dht [Input/output error]
[2017-10-19 17:52:07.542357] E [MSGID: 0] [dht-rebalance.c:1696:dht_migrate_file] 0-<vol>-tier-dht: Create dst failed on - <vol>-hot-dht for file - <file>
[2017-10-19 17:52:07.543480] E [MSGID: 109037] [tier.c:969:tier_migrate_link] 0-<vol>-tier-dht: Failed to migrate <file> [No space left on device]
[2017-10-19 17:52:07.543521] I [MSGID: 109038] [tier.c:1169:tier_migrate_using_query_file] 0-<vol>-tier-dht: Promotion failed for <file>(gfid:fe6799e1-42e6-43e5-a7eb-ac8facfcbc9f)
[2017-10-19 17:52:07.549959] E [MSGID: 109011] [dht-common.c:7188:dht_create] 0-<vol>-hot-dht: no subvolume in layout for path=/path/to/<file>
[2017-10-19 17:52:07.550901] E [MSGID: 109023] [dht-rebalance.c:757:__dht_rebalance_create_dst_file] 0-<vol>-tier-dht: failed to create <file> on <vol>-hot-dht [Input/output error]
[2017-10-19 17:52:07.550922] E [MSGID: 0] [dht-rebalance.c:1696:dht_migrate_file] 0-<vol>-tier-dht: Create dst failed on - <vol>-hot-dht for file - <file>
[2017-10-19 17:52:07.551896] E [MSGID: 109037] [tier.c:969:tier_migrate_link] 0-<vol>-tier-dht: Failed to migrate <file> [No space left on device]
[2017-10-19 17:52:07.551917] I [MSGID: 109038] [tier.c:1169:tier_migrate_using_query_file] 0-<vol>-tier-dht: Promotion failed for <file>(gfid:ffe3a3f2-b170-43f0-a9fb-97c78e3173eb)
[2017-10-19 17:52:07.551945] E [MSGID: 109037] [tier.c:2565:tier_run] 0-<vol>-tier-dht: Promotion failed

Node 1 /var/log/samba/glusterfs-<vol>-pool.log:

[2017-10-18 17:13:41.481860] E [MSGID: 114031] [client-rpc-fops.c:443:client3_3_open_cbk] 0-<vol>-client-0: remote operation failed. Path: /pool/testing (7d89b9a8-3e5d-4f28-9e57-039fe4416994) [Invalid argument]
[2017-10-18 17:13:41.481860] E [MSGID: 114031] [client-rpc-fops.c:443:client3_3_open_cbk] 0-<vol>-client-1: remote operation failed. Path: /pool/testing (7d89b9a8-3e5d-4f28-9e57-039fe4416994) [Invalid argument]
[2017-10-18 17:13:41.485916] E [MSGID: 109089] [dht-helper.c:517:dht_check_and_open_fd_on_subvol_task] 0-<vol>-tier-dht: Failed to open the fd (0x7f02bf1ff570, flags=00) on file 7d89b9a8-3e5d-4f28-9e57-039fe4416994 @ <vol>-cold-dht [Invalid argument]
[2017-10-18 17:13:41.488223] E [MSGID: 114031] [client-rpc-fops.c:443:client3_3_open_cbk] 0-<vol>-client-0: remote operation failed. Path: /pool/testing (7d89b9a8-3e5d-4f28-9e57-039fe4416994) [Invalid argument]
[2017-10-18 17:13:41.488235] E [MSGID: 114031] [client-rpc-fops.c:443:client3_3_open_cbk] 0-<vol>-client-1: remote operation failed. Path: /pool/testing (7d89b9a8-3e5d-4f28-9e57-039fe4416994) [Invalid argument]
[2017-10-18 17:13:41.489060] E [MSGID: 109089] [dht-helper.c:517:dht_check_and_open_fd_on_subvol_task] 0-<vol>-tier-dht: Failed to open the fd (0x7f02bf1feb50, flags=00) on file 7d89b9a8-3e5d-4f28-9e57-039fe4416994 @ <vol>-cold-dht [Invalid argument]
[2017-10-18 17:13:42.339936] E [MSGID: 114031] [client-rpc-fops.c:443:client3_3_open_cbk] 0-<vol>-client-4: remote operation failed. Path: /pool (34d76e11-412f-4bc6-9a3e-b1f89658f13b) [Invalid argument]
[2017-10-18 17:13:42.339988] E [MSGID: 114031] [client-rpc-fops.c:443:client3_3_open_cbk] 0-<vol>-client-5: remote operation failed. Path: /pool (34d76e11-412f-4bc6-9a3e-b1f89658f13b) [Invalid argument]
[2017-10-18 17:13:42.343769] E [MSGID: 109089] [dht-helper.c:517:dht_check_and_open_fd_on_subvol_task] 0-<vol>-tier-dht: Failed to open the fd (0x7f02bf2012c0, flags=00) on file 34d76e11-412f-4bc6-9a3e-b1f89658f13b @ <vol>-hot-dht [Invalid argument]
[2017-10-18 17:13:42.345374] E [MSGID: 114031] [client-rpc-fops.c:443:client3_3_open_cbk] 0-<vol>-client-4: remote operation failed. Path: /pool (34d76e11-412f-4bc6-9a3e-b1f89658f13b) [Invalid argument]
[2017-10-18 17:13:42.345401] E [MSGID: 114031] [client-rpc-fops.c:443:client3_3_open_cbk] 0-<vol>-client-5: remote operation failed. Path: /pool (34d76e11-412f-4bc6-9a3e-b1f89658f13b) [Invalid argument]
[2017-10-18 17:13:42.346259] E [MSGID: 109089] [dht-helper.c:517:dht_check_and_open_fd_on_subvol_task] 0-<vol>-tier-dht: Failed to open the fd (0x7f02bf201130, flags=00) on file 34d76e11-412f-4bc6-9a3e-b1f89658f13b @ <vol>-hot-dht [Invalid argument]
[2017-10-18 17:13:59.541591] E [MSGID: 108006] [afr-common.c:4808:afr_notify] 0-<vol>-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
[2017-10-18 17:13:59.541748] E [MSGID: 108006] [afr-common.c:4808:afr_notify] 0-<vol>-replicate-1: All subvolumes are down. Going offline until atleast one of them comes back up.
[2017-10-18 17:13:59.541887] E [MSGID: 108006] [afr-common.c:4808:afr_notify] 0-<vol>-replicate-2: All subvolumes are down. Going offline until atleast one of them comes back up.
[2017-10-18 17:13:59.541977] E [MSGID: 108006] [afr-common.c:4808:afr_notify] 0-<vol>-replicate-3: All subvolumes are down. Going offline until atleast one of them comes back up.

Node 2 /var/log/gluster/tier/<vol>/tierd.log:

[2017-10-16 15:54:08.662873] I [MSGID: 109038] [tier.c:1169:tier_migrate_using_query_file] 0-<vol>-tier-dht: Promotion failed for <file>(gfid:fffd714e-b2d2-42d3-a31f-72673276e3d0)
[2017-10-16 16:00:07.201584] I [MSGID: 109038] [tier.c:1169:tier_migrate_using_query_file] 0-<vol>-tier-dht: Promotion failed for <file>(gfid:f10365e1-747b-4985-97b9-8b5dc61ac464)
[2017-10-16 16:00:07.372559] I [MSGID: 109038] [tier.c:1169:tier_migrate_using_query_file] 0-<vol>-tier-dht: Promotion failed for <file>(gfid:f95f17bf-b696-44cd-aae0-d8ac38149aa5)
[2017-10-16 16:06:06.880522] I [MSGID: 109038] [tier.c:1169:tier_migrate_using_query_file] 0-<vol>-tier-dht: Promotion failed for <file>(gfid:ec451f6c-8971-4f9b-a04f-00f96db9b46a)
[2017-10-16 16:06:08.062080] I [MSGID: 109038] [tier.c:1169:tier_migrate_using_query_file] 0-<vol>-tier-dht: Promotion failed for <file>(gfid:e658cd70-3f6d-4b25-8d9f-0d4c24d3ec5d)
[2017-10-16 16:06:08.288298] I [MSGID: 109038] [tier.c:1169:tier_migrate_using_query_file] 0-<vol>-tier-dht: Promotion failed for <file>(gfid:f22df67a-88e5-4fae-aab0-b00e04f9a6e1)
[2017-10-18 15:55:06.446416] I [MSGID: 109028] [dht-rebalance.c:4792:gf_defrag_status_get] 0-glusterfs: Rebalance is in progress. Time taken is 1376671.00 secs
[2017-10-18 15:55:06.446433] I [MSGID: 109028] [dht-rebalance.c:4796:gf_defrag_status_get] 0-glusterfs: Files migrated: 0, size: 0, lookups: 47887089, failures: 3594, skipped: 0
[2017-10-19 00:00:00.501576] I [MSGID: 109038] [tier.c:2391:tier_prepare_compact] 0-<vol>-tier-dht: Start compaction on cold tier
[2017-10-19 00:00:00.502016] I [MSGID: 109038] [tier.c:2403:tier_prepare_compact] 0-<vol>-tier-dht: End compaction on cold tier
[2017-10-19 00:00:00.501608] I [MSGID: 109038] [tier.c:2391:tier_prepare_compact] 0-<vol>-tier-dht: Start compaction on cold tier
[2017-10-19 00:00:00.502076] I [MSGID: 109038] [tier.c:2403:tier_prepare_compact] 0-<vol>-tier-dht: End compaction on cold tier
[2017-10-19 16:03:49.522991] I [MSGID: 109028] [dht-rebalance.c:4792:gf_defrag_status_get] 0-glusterfs: Rebalance is in progress. Time taken is 1463594.00 secs
[2017-10-19 16:03:49.523017] I [MSGID: 109028] [dht-rebalance.c:4796:gf_defrag_status_get] 0-glusterfs: Files migrated: 0, size: 0, lookups: 52790654, failures: 3594, skipped: 0

Node 2 /var/log/samba/glusterfs-<vol>-pool.log:

[2017-10-18 16:49:09.218062] E [MSGID: 114031] [client-rpc-fops.c:443:client3_3_open_cbk] 0-<vol>-client-4: remote operation failed. Path: /pool (34d76e11-412f-4bc6-9a3e-b1f89658f13b) [Invalid argument]
[2017-10-18 16:49:09.218254] E [MSGID: 109089] [dht-helper.c:517:dht_check_and_open_fd_on_subvol_task] 0-<vol>-tier-dht: Failed to open the fd (0x7f009b36bac0, flags=00) on file 34d76e11-412f-4bc6-9a3e-b1f89658f13b @ <vol>-hot-dht [Invalid argument]
[2017-10-18 16:49:09.222783] E [MSGID: 108006] [afr-common.c:4808:afr_notify] 0-<vol>-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
[2017-10-18 16:49:09.222912] E [MSGID: 108006] [afr-common.c:4808:afr_notify] 0-<vol>-replicate-1: All subvolumes are down. Going offline until atleast one of them comes back up.
[2017-10-18 16:49:09.223079] E [MSGID: 108006] [afr-common.c:4808:afr_notify] 0-<vol>-replicate-2: All subvolumes are down. Going offline until atleast one of them comes back up.
[2017-10-18 16:49:09.223200] E [MSGID: 108006] [afr-common.c:4808:afr_notify] 0-<vol>-replicate-3: All subvolumes are down. Going offline until atleast one of them comes back up.

Status:

# gluster vol tier <vol> status

Node     Promoted files   Demoted files   Status        run time in h:m:s
------   --------------   -------------   -----------   -----------------
Node1    190861           0               in progress   408:34:13
Node2    0                0               in progress   408:34:14

Hot tier bricks:

# df -h

/dev/mapper/vg_bricks-brick_nvme1   1.4T   551G   883G   39%   /mnt/brick_nvme1
/dev/mapper/vg_bricks-brick_nvme2   1.4T   512G   922G   36%   /mnt/brick_nvme2

Can anyone point me in the right direction as to what may be going on? Any guidance is greatly appreciated.

Thanks in advance,

HB
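With this many repeated entries, it can help to tally the log by severity, MSGID, and reporting function before reading individual lines. The following is a minimal Python sketch and not part of Gluster: the regular expression is inferred only from the log lines quoted in this thread, and `LOG_LINE` and `summarize` are hypothetical names chosen for illustration. The sample entries are copied from the Node 1 tierd.log excerpt.

```python
import re
from collections import Counter

# Layout inferred from the quoted entries (an assumption, not a Gluster spec):
# [timestamp] LEVEL [MSGID: n] [file.c:line:function] domain: message
LOG_LINE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} [\d:.]+)\] "
    r"(?P<level>[A-Z]) "
    r"\[MSGID: (?P<msgid>\d+)\] "
    r"\[(?P<src>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<domain>\S+): "
    r"(?P<msg>.*)"
)


def summarize(lines):
    """Count entries per (level, MSGID, function); lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m:
            counts[(m["level"], int(m["msgid"]), m["func"])] += 1
    return counts


sample = [
    "[2017-10-19 17:52:07.525110] E [MSGID: 109011] [dht-common.c:7188:dht_create] "
    "0-<vol>-hot-dht: no subvolume in layout for path=/path/to/<file>",
    "[2017-10-19 17:52:07.527214] E [MSGID: 109037] [tier.c:969:tier_migrate_link] "
    "0-<vol>-tier-dht: Failed to migrate <file> [No space left on device]",
    "[2017-10-19 17:52:07.535570] E [MSGID: 109037] [tier.c:969:tier_migrate_link] "
    "0-<vol>-tier-dht: Failed to migrate <file> [No space left on device]",
]

# Most frequent (level, MSGID, function) combinations first.
for key, n in summarize(sample).most_common():
    print(key, n)
```

To run it against a real log, pass an open file object instead of `sample`, since `summarize` only needs an iterable of lines.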
Herb,

What are the high and low watermarks for the tier set at?

# gluster volume get <vol> cluster.watermark-hi
# gluster volume get <vol> cluster.watermark-low

What is the size of the file that failed to migrate as per the following tierd log:

[2017-10-19 17:52:07.519614] I [MSGID: 109038] [tier.c:1169:tier_migrate_using_query_file] 0-<vol>-tier-dht: Promotion failed for <file>(gfid:edaf97e1-02e0-4838-9d26-71ea3aab22fb)

If possible, the output of *gluster volume info* would also help, instead of going to and fro with questions.

--
Milind

On Fri, Oct 20, 2017 at 12:42 AM, Herb Burnswell <herbert.burnswell at gmail.com> wrote:
> All,
>
> I am new to gluster and have some questions/concerns about some tiering
> errors that I see in the log files.
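The watermark question can be related to the `df -h` figures already posted. The sketch below is a hedged illustration only: the values 75 and 90 are assumed here to be the defaults for cluster.watermark-low and cluster.watermark-hi, and the real values should be taken from the `gluster volume get` commands above. `tier_zone` is a hypothetical helper, and its three zones are a simplified reading of how tiering treats hot tier fullness (promotion favoured below the low watermark, promotion stopped above the high one).

```python
def tier_zone(used_pct, watermark_low=75, watermark_hi=90):
    """Classify hot tier fullness against the (assumed) watermark percentages."""
    if used_pct < watermark_low:
        return "below low watermark"
    if used_pct < watermark_hi:
        return "between watermarks"
    return "above high watermark"


# Used and available sizes in G, as reported by df -h in the original post.
bricks = {
    "/mnt/brick_nvme1": (551, 883),
    "/mnt/brick_nvme2": (512, 922),
}

for mount, (used, avail) in bricks.items():
    used_pct = 100 * used / (used + avail)
    print(mount, round(used_pct, 1), tier_zone(used_pct))
```

If the configured watermarks are anywhere near the assumed values, both bricks are well below the low watermark, which would suggest that raw hot tier capacity alone does not explain the "No space left on device" errors.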
There are several "No space left on device" messages. I would first check that free disk space is available for the volume.

On Oct 22, 2017 18:42, "Milind Changire" <mchangir at redhat.com> wrote:
> Herb,
> What are the high and low watermarks for the tier set at ?
>
> # gluster volume get <vol> cluster.watermark-hi
>
> # gluster volume get <vol> cluster.watermark-low
Going offline until atleast one of them comes back up. >> [2017-10-18 16:49:09.223079] E [MSGID: 108006] >> [afr-common.c:4808:afr_notify] 0-<vol>-replicate-2: All subvolumes are >> down. Going offline until atleast one of them comes back up. >> [2017-10-18 16:49:09.223200] E [MSGID: 108006] >> [afr-common.c:4808:afr_notify] 0-<vol>-replicate-3: All subvolumes are >> down. Going offline until atleast one of them comes back up. >> >> Status: >> >> # gluster vol tier <vol> status >> >> Node Promoted files Demoted files Status >> run time in h:m:s >> --------- --------- --------- >> --------- --------- >> Node1 190861 0 in >> progress 408:34:13 >> Node2 0 0 >> in progress 408:34:14 >> >> Hot tier bricks: >> >> # df -h >> >> /dev/mapper/vg_bricks-brick_nvme1 1.4T 551G 883G 39% >> /mnt/brick_nvme1 >> /dev/mapper/vg_bricks-brick_nvme2 1.4T 512G 922G 36% >> /mnt/brick_nvme2 >> >> >> Can anyone point me in the right direction as to what may be going on? >> Any guidance is greatly appreciated. >> >> Thanks in advance, >> >> HB >> >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> http://lists.gluster.org/mailman/listinfo/gluster-users >> > > > > -- > Milind > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > http://lists.gluster.org/mailman/listinfo/gluster-users >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20171022/1ec3b9b9/attachment.html>
Milind - Thank you for the response.

>> What are the high and low watermarks for the tier set at ?

# gluster volume get <vol> cluster.watermark-hi
Option                                  Value
------                                  -----
cluster.watermark-hi                    90

# gluster volume get <vol> cluster.watermark-low
Option                                  Value
------                                  -----
cluster.watermark-low                   75

>> What is the size of the file that failed to migrate as per the following
>> tierd log:
>> [2017-10-19 17:52:07.519614] I [MSGID: 109038]
>> [tier.c:1169:tier_migrate_using_query_file] 0-<vol>-tier-dht: Promotion
>> failed for <file>(gfid:edaf97e1-02e0-4838-9d26-71ea3aab22fb)

The file was a word doc @ 29K in size.

>> If possible, a *gluster volume info* would also help, instead of going to
>> and fro with questions.

# gluster vol info

Volume Name: ctdb
Type: Replicate
Volume ID: f679c476-e0dd-4f3a-9813-1b26016b5384
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: <node1>:/mnt/ctdb_local/brick
Brick2: <node2>:/mnt/ctdb_local/brick
Options Reconfigured:
nfs.disable: on
transport.address-family: inet

Volume Name: <vol>
Type: Tier
Volume ID: 7710ed2f-775e-4dd9-92ad-66407c72b0ad
Status: Started
Snapshot Count: 0
Number of Bricks: 8
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: <node2>:/mnt/brick_nvme1/brick
Brick2: <node1>:/mnt/brick_nvme2/brick
Brick3: <node2>:/mnt/brick_nvme2/brick
Brick4: <node1>:/mnt/brick_nvme1/brick
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick5: <node1>:/mnt/brick1/brick
Brick6: <node2>:/mnt/brick2/brick
Brick7: <node1>:/mnt/brick2/brick
Brick8: <node2>:/mnt/brick1/brick
Options Reconfigured:
cluster.lookup-optimize: on
client.event-threads: 4
server.event-threads: 4
performance.write-behind-window-size: 4MB
performance.cache-size: 16GB
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
nfs.disable: on
transport.address-family: inet
features.ctr-enabled: on
cluster.tier-mode: cache
performance.io-cache: off
performance.quick-read: off
cluster.tier-max-files: 1000000

HB

On Sun, Oct 22, 2017 at 8:41 AM, Milind Changire <mchangir at redhat.com> wrote:

> Herb,
> What are the high and low watermarks for the tier set at ?
>
> # gluster volume get <vol> cluster.watermark-hi
>
> # gluster volume get <vol> cluster.watermark-low
>
> What is the size of the file that failed to migrate as per the following
> tierd log:
>
> [2017-10-19 17:52:07.519614] I [MSGID: 109038]
> [tier.c:1169:tier_migrate_using_query_file] 0-<vol>-tier-dht: Promotion
> failed for <file>(gfid:edaf97e1-02e0-4838-9d26-71ea3aab22fb)
>
> If possible, a *gluster volume info* would also help, instead of going to
> and fro with questions.
>
> --
> Milind
>
> On Fri, Oct 20, 2017 at 12:42 AM, Herb Burnswell <herbert.burnswell at gmail.com> wrote:
>
>> All,
>>
>> I am new to gluster and have some questions/concerns about some tiering
>> errors that I see in the log files.
>> [...]
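As a quick sanity check on the numbers in this thread, the sketch below compares the hot tier brick usage from the `df -h` output (39% and 36%) with the reported watermarks (low 75, high 90). It is an illustration only: `classify_fullness` and `report` are hypothetical helper names, the boundary handling (strictly below low, strictly above high) is an assumption, and per-brick `df` percentages are only an approximation of how the tier daemon itself measures hot tier fullness.

```python
# Values taken from the thread: watermarks from `gluster volume get`,
# usage percentages from the `df -h` output for the hot tier bricks.
WATERMARK_LOW = 75
WATERMARK_HI = 90
HOT_BRICK_USED_PCT = {
    "/mnt/brick_nvme1": 39,
    "/mnt/brick_nvme2": 36,
}


def classify_fullness(used_pct, low=WATERMARK_LOW, hi=WATERMARK_HI):
    """Place a usage percentage relative to the low and high watermarks.

    The exact boundary behaviour is an assumption made for this sketch.
    """
    if not 0 <= used_pct <= 100:
        raise ValueError(f"percentage out of range: {used_pct}")
    if used_pct < low:
        return "below-low"
    if used_pct > hi:
        return "above-hi"
    return "between"


def report(bricks=HOT_BRICK_USED_PCT):
    """Map each brick mount point to its watermark classification."""
    return {mount: classify_fullness(pct) for mount, pct in bricks.items()}


if __name__ == "__main__":
    for mount, state in report().items():
        print(f"{mount}: {state}")
```

With the figures posted here, both hot tier bricks come out well under the low watermark, so the "No space left on device" in tierd.log does not look like a simple case of the hot tier filling past cluster.watermark-hi. This does not identify the actual cause; it only rules out the most obvious reading of the error.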