Matthew Benstead
2019-Jul-24 16:43 UTC
[Gluster-users] Brick missing trusted.glusterfs.dht xattr
So looking more closely at the trusted.glusterfs.dht attributes from the bricks it looks like they cover the entire range... and there is no range left for gluster07.

The first 6 bricks range from 0x00000000 to 0xffffffff - so... is there a way to re-calculate what the dht values should be? Each of the bricks should have a gap

Gluster05 00000000 -> 2aaaaaa9
Gluster06 2aaaaaaa -> 55555553
Gluster01 55555554 -> 7ffffffd
Gluster02 7ffffffe -> aaaaaaa7
Gluster03 aaaaaaa8 -> d5555551
Gluster04 d5555552 -> ffffffff
Gluster07 None

If we split the range into 7 servers that would be a gap of about 0x24924924 for each server.

Now in terms of the gluster07 brick, about 2 years ago the RAID array the brick was stored on became corrupted. I ran the remove-brick force command, then provisioned a new server, ran the add-brick command and then restored the missing files from backup by copying them back to the main gluster mount (not the brick).

It looks like prior to that event this was the layout - which would make sense given the equal size of the 7 bricks:

gluster02.pcic.uvic.ca | SUCCESS | rc=0 >>
# file: /mnt/raid6-storage/storage
trusted.glusterfs.dht=0x000000010000000048bfff206d1ffe5f

gluster05.pcic.uvic.ca | SUCCESS | rc=0 >>
# file: /mnt/raid6-storage/storage
trusted.glusterfs.dht=0x0000000100000000b5dffce0da3ffc1f

gluster04.pcic.uvic.ca | SUCCESS | rc=0 >>
# file: /mnt/raid6-storage/storage
trusted.glusterfs.dht=0x0000000100000000917ffda0b5dffcdf

gluster03.pcic.uvic.ca | SUCCESS | rc=0 >>
# file: /mnt/raid6-storage/storage
trusted.glusterfs.dht=0x00000001000000006d1ffe60917ffd9f

gluster01.pcic.uvic.ca | SUCCESS | rc=0 >>
# file: /mnt/raid6-storage/storage
trusted.glusterfs.dht=0x0000000100000000245fffe048bfff1f

gluster07.pcic.uvic.ca | SUCCESS | rc=0 >>
# file: /mnt/raid6-storage/storage
trusted.glusterfs.dht=0x000000010000000000000000245fffdf

gluster06.pcic.uvic.ca | SUCCESS | rc=0 >>
# file: /mnt/raid6-storage/storage
trusted.glusterfs.dht=0x0000000100000000da3ffc20ffffffff

Which yields the following:

00000000 -> 245fffdf Gluster07
245fffe0 -> 48bfff1f Gluster01
48bfff20 -> 6d1ffe5f Gluster02
6d1ffe60 -> 917ffd9f Gluster03
917ffda0 -> b5dffcdf Gluster04
b5dffce0 -> da3ffc1f Gluster05
da3ffc20 -> ffffffff Gluster06

Is there some way to get back to this?

Thanks,
 -Matthew

--
Matthew Benstead
System Administrator
Pacific Climate Impacts Consortium <https://pacificclimate.org/>
University of Victoria, UH1
PO Box 1800, STN CSC
Victoria, BC, V8W 2Y2
Phone: +1-250-721-8432
Email: matthewb at uvic.ca

On 7/18/19 7:20 AM, Matthew Benstead wrote:
> Hi Nithya,
>
> No - it was added about a year and a half ago. I have tried
> re-mounting the volume on the server, but it didn't add the attr:
>
> [root@gluster07 ~]# umount /storage/
> [root@gluster07 ~]# cat /etc/fstab | grep "/storage"
> 10.0.231.56:/storage /storage glusterfs defaults,log-level=WARNING,backupvolfile-server=10.0.231.51 0 0
> [root@gluster07 ~]# mount /storage/
> [root@gluster07 ~]# df -h /storage/
> Filesystem            Size  Used Avail Use% Mounted on
> 10.0.231.56:/storage  255T  194T   62T  77% /storage
> [root@gluster07 ~]# getfattr --absolute-names -m . -d -e hex /mnt/raid6-storage/storage/
> # file: /mnt/raid6-storage/storage/
> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000
> trusted.gfid=0x00000000000000000000000000000001
> trusted.glusterfs.6f95525a-94d7-4174-bac4-e1a18fe010a2.xtime=0x5d307baa00023ec0
> trusted.glusterfs.quota.dirty=0x3000
> trusted.glusterfs.quota.size.2=0x00001b71d5279e000000000000763e32000000000005cd53
> trusted.glusterfs.volume-id=0x6f95525a94d74174bac4e1a18fe010a2
>
> Thanks,
>  -Matthew
>
> On 7/17/19 10:04 PM, Nithya Balachandran wrote:
>> Hi Matthew,
>>
>> Was this node/brick added to the volume recently? If yes, try
>> mounting the volume on a fresh mount point - that should create the
>> xattr on this as well.
>>
>> Regards,
>> Nithya
>>
>> On Wed, 17 Jul 2019 at 21:01, Matthew Benstead <matthewb at uvic.ca> wrote:
>>
>>     Hello,
>>
>>     I've just noticed one brick in my 7 node distribute volume is missing
>>     the trusted.glusterfs.dht xattr...? How can I fix this?
>>
>>     I'm running glusterfs-5.3-2.el7.x86_64 on CentOS 7.
>>
>>     All of the other nodes are fine, but gluster07 from the list
>>     below does not have the attribute.
>>
>>     $ ansible -i hosts gluster-servers[0:6] ... -m shell -a "getfattr -m . --absolute-names -n trusted.glusterfs.dht -e hex /mnt/raid6-storage/storage"
>>     ...
>>     gluster05 | SUCCESS | rc=0 >>
>>     # file: /mnt/raid6-storage/storage
>>     trusted.glusterfs.dht=0x0000000100000000000000002aaaaaa9
>>
>>     gluster03 | SUCCESS | rc=0 >>
>>     # file: /mnt/raid6-storage/storage
>>     trusted.glusterfs.dht=0x0000000100000000aaaaaaa8d5555551
>>
>>     gluster04 | SUCCESS | rc=0 >>
>>     # file: /mnt/raid6-storage/storage
>>     trusted.glusterfs.dht=0x0000000100000000d5555552ffffffff
>>
>>     gluster06 | SUCCESS | rc=0 >>
>>     # file: /mnt/raid6-storage/storage
>>     trusted.glusterfs.dht=0x00000001000000002aaaaaaa55555553
>>
>>     gluster02 | SUCCESS | rc=0 >>
>>     # file: /mnt/raid6-storage/storage
>>     trusted.glusterfs.dht=0x00000001000000007ffffffeaaaaaaa7
>>
>>     gluster07 | FAILED | rc=1 >>
>>     /mnt/raid6-storage/storage: trusted.glusterfs.dht: No such attribute
>>     non-zero return code
>>
>>     gluster01 | SUCCESS | rc=0 >>
>>     # file: /mnt/raid6-storage/storage
>>     trusted.glusterfs.dht=0x0000000100000000555555547ffffffd
>>
>>     Here are all of the attr's from the brick:
>>
>>     [root@gluster07 ~]# getfattr --absolute-names -m . -d -e hex /mnt/raid6-storage/storage/
>>     # file: /mnt/raid6-storage/storage/
>>     security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000
>>     trusted.gfid=0x00000000000000000000000000000001
>>     trusted.glusterfs.6f95525a-94d7-4174-bac4-e1a18fe010a2.xtime=0x5d2dee800001fdf9
>>     trusted.glusterfs.quota.dirty=0x3000
>>     trusted.glusterfs.quota.size.2=0x00001b69498a1400000000000076332e000000000005cd03
>>     trusted.glusterfs.volume-id=0x6f95525a94d74174bac4e1a18fe010a2
>>
>>     And here is the volume information:
>>
>>     [root@gluster07 ~]# gluster volume info storage
>>
>>     Volume Name: storage
>>     Type: Distribute
>>     Volume ID: 6f95525a-94d7-4174-bac4-e1a18fe010a2
>>     Status: Started
>>     Snapshot Count: 0
>>     Number of Bricks: 7
>>     Transport-type: tcp
>>     Bricks:
>>     Brick1: 10.0.231.50:/mnt/raid6-storage/storage
>>     Brick2: 10.0.231.51:/mnt/raid6-storage/storage
>>     Brick3: 10.0.231.52:/mnt/raid6-storage/storage
>>     Brick4: 10.0.231.53:/mnt/raid6-storage/storage
>>     Brick5: 10.0.231.54:/mnt/raid6-storage/storage
>>     Brick6: 10.0.231.55:/mnt/raid6-storage/storage
>>     Brick7: 10.0.231.56:/mnt/raid6-storage/storage
>>     Options Reconfigured:
>>     changelog.changelog: on
>>     features.quota-deem-statfs: on
>>     features.read-only: off
>>     features.inode-quota: on
>>     features.quota: on
>>     performance.readdir-ahead: on
>>     nfs.disable: on
>>     geo-replication.indexing: on
>>     geo-replication.ignore-pid-check: on
>>     transport.address-family: inet
>>
>>     Thanks,
>>      -Matthew
>>     _______________________________________________
>>     Gluster-users mailing list
>>     Gluster-users at gluster.org
>>     https://lists.gluster.org/mailman/listinfo/gluster-users
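The range bookkeeping done by hand in the message above can be sketched in Python. This is only an illustration, not a Gluster tool: it assumes, as the message does, that the last two 32-bit big-endian words of trusted.glusterfs.dht are the start and stop of the brick's hash range, and it treats the first two words as an opaque header. The names `parse_dht_xattr`, `check_coverage`, `even_chunk` and `CURRENT_LAYOUT` are made up for this sketch; the hex values are the ones posted in the thread.

```python
from typing import Dict, List, Optional, Tuple

HASH_MAX = 0xFFFFFFFF  # top of the 32-bit hash space used in the thread


def parse_dht_xattr(value: str) -> Tuple[int, int]:
    """Return (start, stop) from a hex trusted.glusterfs.dht value.

    Assumption: the value is 16 bytes and the last two big-endian
    32-bit words are the range start and stop.
    """
    digits = value[2:] if value.lower().startswith("0x") else value
    raw = bytes.fromhex(digits)
    if len(raw) != 16:
        raise ValueError(f"expected 16 bytes, got {len(raw)}")
    start = int.from_bytes(raw[8:12], "big")
    stop = int.from_bytes(raw[12:16], "big")
    return start, stop


def check_coverage(layout: Dict[str, Optional[str]]) -> List[str]:
    """List bricks without the xattr plus any gaps or overlaps in the ranges."""
    problems: List[str] = []
    ranges = []
    for brick, value in layout.items():
        if value is None:
            problems.append(f"{brick}: no trusted.glusterfs.dht xattr")
            continue
        start, stop = parse_dht_xattr(value)
        ranges.append((start, stop, brick))
    ranges.sort()
    expected = 0  # next hash value that should be covered
    for start, stop, brick in ranges:
        if start > expected:
            problems.append(f"gap before {brick}: {expected:08x} -> {start - 1:08x}")
        elif start < expected:
            problems.append(f"overlap at {brick}: starts at {start:08x}")
        expected = max(expected, stop + 1)
    if expected <= HASH_MAX:
        problems.append(f"uncovered tail: {expected:08x} -> {HASH_MAX:08x}")
    return problems


def even_chunk(bricks: int) -> int:
    """Size of each range if the hash space were divided equally (plain arithmetic)."""
    return (HASH_MAX + 1) // bricks


# Root-directory values as posted in the thread; gluster07 has no xattr.
CURRENT_LAYOUT: Dict[str, Optional[str]] = {
    "gluster05": "0x0000000100000000000000002aaaaaa9",
    "gluster06": "0x00000001000000002aaaaaaa55555553",
    "gluster01": "0x0000000100000000555555547ffffffd",
    "gluster02": "0x00000001000000007ffffffeaaaaaaa7",
    "gluster03": "0x0000000100000000aaaaaaa8d5555551",
    "gluster04": "0x0000000100000000d5555552ffffffff",
    "gluster07": None,
}

for line in check_coverage(CURRENT_LAYOUT):
    print(line)
print(f"equal split over 7 bricks: 0x{even_chunk(7):08x}")
```

The equal-split figure is only the plain division mentioned in the message; the older seven-brick layout quoted there does not match it exactly (its first range is 0x245fffe0 wide), so the numbers Gluster itself would assign should not be inferred from this arithmetic.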
Nithya Balachandran
2019-Jul-25 04:30 UTC
[Gluster-users] Brick missing trusted.glusterfs.dht xattr
On Wed, 24 Jul 2019 at 22:12, Matthew Benstead <matthewb at uvic.ca> wrote:

> So looking more closely at the trusted.glusterfs.dht attributes from the
> bricks it looks like they cover the entire range... and there is no range
> left for gluster07.
>
> The first 6 bricks range from 0x00000000 to 0xffffffff - so... is there a
> way to re-calculate what the dht values should be? Each of the bricks
> should have a gap
>
> Gluster05 00000000 -> 2aaaaaa9
> Gluster06 2aaaaaaa -> 55555553
> Gluster01 55555554 -> 7ffffffd
> Gluster02 7ffffffe -> aaaaaaa7
> Gluster03 aaaaaaa8 -> d5555551
> Gluster04 d5555552 -> ffffffff
> Gluster07 None
>
> If we split the range into 7 servers that would be a gap of about
> 0x24924924 for each server.
>
> Now in terms of the gluster07 brick, about 2 years ago the RAID array the
> brick was stored on became corrupted. I ran the remove-brick force command,
> then provisioned a new server, ran the add-brick command and then restored
> the missing files from backup by copying them back to the main gluster
> mount (not the brick).

Did you run a rebalance after performing the add-brick? Without a rebalance/fix-layout, the layout for existing directories on the volume will not be updated to use the new brick as well.

That the layout does not include the new brick in the root dir is not in itself a problem. Do you create a lot of files directly in the root of the volume? If yes, you might want to run a rebalance. Otherwise, if you mostly create files in newly added directories, you can probably ignore this. You can check the layout for directories on the volume and see if they incorporate brick7.

I would expect a lookup on the root to have set an xattr on the brick with an empty layout range. The fact that the xattr does not exist at all on the brick is what I am looking into.

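The check suggested in the reply, looking at a directory on each brick to see whether its layout incorporates the brick, could be scripted roughly as follows. This is a hedged sketch rather than anything Gluster ships: `classify_layout` is a hypothetical helper, it only reads `getfattr -e hex` output of the form shown in this thread, and treating start = stop = 0 as the "empty layout range" is an assumption about how such a range would appear on disk.

```python
import re
from typing import Optional, Tuple

# Matches a full 16-byte trusted.glusterfs.dht line in `getfattr -e hex` output.
DHT_LINE = re.compile(r"^trusted\.glusterfs\.dht=0x([0-9a-fA-F]{32})$", re.MULTILINE)


def layout_from_getfattr(output: str) -> Optional[Tuple[int, int]]:
    """Return (start, stop) if the output carries the dht xattr, else None."""
    match = DHT_LINE.search(output)
    if match is None:
        return None
    raw = bytes.fromhex(match.group(1))
    return int.from_bytes(raw[8:12], "big"), int.from_bytes(raw[12:16], "big")


def classify_layout(output: str) -> str:
    """Classify one brick's view of a directory as missing, empty or assigned."""
    layout = layout_from_getfattr(output)
    if layout is None:
        return "missing"   # no xattr at all, as on gluster07's brick root
    start, stop = layout
    if start == 0 and stop == 0:
        return "empty"     # assumption: an empty range is stored as all zeros
    return "assigned"


# Sample inputs copied from the thread (shortened to the relevant lines).
GLUSTER07_ROOT = """# file: /mnt/raid6-storage/storage/
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.volume-id=0x6f95525a94d74174bac4e1a18fe010a2
"""

GLUSTER05_ROOT = """# file: /mnt/raid6-storage/storage
trusted.glusterfs.dht=0x0000000100000000000000002aaaaaa9
"""

print("gluster07:", classify_layout(GLUSTER07_ROOT))
print("gluster05:", classify_layout(GLUSTER05_ROOT))
```

Run against the output for a few subdirectories on each brick, a result of "assigned" on gluster07 would indicate that those directories already use the new brick, while "missing" or "empty" everywhere would point toward running the fix-layout/rebalance mentioned in the reply.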