Alan Orth
2019-May-07 13:12 UTC
[Gluster-users] "No space left on device" during rebalance with failed brick on Gluster 4.1.7
Dear list,

We are using a Distributed-Replicate volume with replica 2 on Gluster 4.1.7 on CentOS 7. One of our nodes died recently and we will add new nodes and bricks to replace it soon. In preparation for the maintenance I wanted to rebalance the volume to make the disk thrashing less intense when we add/remove bricks, but after eight hours of scanning I see millions of "failures" in the rebalance status. The volume rebalance log shows many errors like:

[2019-05-07 06:06:02.310843] E [MSGID: 109023] [dht-rebalance.c:2907:gf_defrag_migrate_single_file] 0-data-dht: migrate-data failed for /ilri/miseq/MiSeq2/MiSeq2Output2018/180912_M03021_0002_000000000-BVM95/Thumbnail_Images/L001/C174.1/s_1_2103_c.jpg [No space left on device]

The bricks on the healthy nodes all have 1.5TB of free space, so I'm not sure what this error means. Could it be because one of the replicas is unavailable? I saw a similar bug report¹ about that. I've started a simple fix-layout without data migration and it is working fine.

Thank you,

¹ https://access.redhat.com/solutions/456333

--
Alan Orth
alan.orth at gmail.com
https://picturingjordan.com
https://englishbulgaria.net
https://mjanja.ch
"In heaven all the interesting people are missing." –Friedrich Nietzsche
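A minimal sketch of the fix-layout step described above, assuming the volume is named "data" (inferred from the 0-data-dht prefix in the log line; adjust to your own volume name):

    # Recalculate DHT directory layouts without migrating any file data
    gluster volume rebalance data fix-layout start

    # Check per-node progress of the fix-layout scan
    gluster volume rebalance data status

Fix-layout only rewrites the directory layout ranges so that newly created files hash to the right bricks; existing files stay where they are, which is consistent with it succeeding here while data migration fails.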
Alan Orth
2019-May-07 13:51 UTC
[Gluster-users] "No space left on device" during rebalance with failed brick on Gluster 4.1.7
Dear list,

After looking at my rebalance log more closely I saw another message that helped me solve the problem:

[2019-05-06 22:46:01.074035] W [MSGID: 0] [dht-rebalance.c:1075:__dht_check_free_space] 0-data-dht: Write will cross min-free-disk for file - /ilri/miseq/MiSeq1/MiSeq1Output_2014/140624_M01601_0035_000000000-A6L82/Data/TileStatus/TileStatusL1T1114.tpl on subvol - data-replicate-1. Looking for new subvol

I had not set cluster.min-free-disk, but it appears that its default value is 10%, and my bricks are at about 97% capacity, so the "No space left on device" error in my previous message makes sense. I reduced cluster.min-free-disk to 2% and restarted the data rebalance, and now I see that it is already rebalancing files. The issue is solved. Sorry about that!

Thank you,

--
Alan Orth
alan.orth at gmail.com
https://picturingjordan.com
https://englishbulgaria.net
https://mjanja.ch
"In heaven all the interesting people are missing." –Friedrich Nietzsche
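For completeness, a minimal sketch of the fix described above, again assuming the volume is named "data":

    # Show the effective value; unset options report their default (10%)
    gluster volume get data cluster.min-free-disk

    # Lower the reserve so rebalance may write to bricks that are ~97% full
    gluster volume set data cluster.min-free-disk 2%

    # Restart the full rebalance and confirm files are now migrating
    gluster volume rebalance data start
    gluster volume rebalance data status

cluster.min-free-disk is a safety margin: DHT avoids placing or migrating files onto a subvolume whose free space would drop below the threshold, so on nearly full bricks the default 10% can block a rebalance entirely.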