Nag Pavan Chilakam
2017-Feb-08 06:11 UTC
[Gluster-users] Input/output error - would not heal
"gluster volume info" and "gluster vol status" would help in us debug faster. However, coming to gfid mismatch, yes the file "abbreviations.log" (I assume the other brick copy also to be " abbreviations.log" and not "breviations.log" ....typo mistake?) is in gfid mismatch leading to IO error(gfid splitbrain) Resolving data and metadata splitbrains are not recommended to be done from backend brick. But in case of a GFID splitbrain(like in file abbreviations.log), the only method available is resolving from backend brick You can read more about this in http://gluster.readthedocs.io/en/latest/Troubleshooting/split-brain/?highlight=gfid (Fixing Directory entry split-brain section) (There is a bug already existing to resolve gfid splitbrain using CLI ) thanks, nagpavan ----- Original Message ----- From: "lejeczek" <peljasz at yahoo.co.uk> To: "Nag Pavan Chilakam" <nchilaka at redhat.com> Cc: gluster-users at gluster.org Sent: Tuesday, 7 February, 2017 10:53:07 PM Subject: Re: [Gluster-users] Input/output error - would not heal On 07/02/17 12:50, Nag Pavan Chilakam wrote:> Hi, > Can you help us with more information on the volume, like volume status and volume info > One reason of "transport endpoint error" is the brick could be down > > Also, i see that the syntax used for healing is wrong. > You need to use as below: > gluster v heal <vname> split-brain source-brick <brick path> <filename considering brick path as /> > > In yourcase if brick path is "/G-store/1" and the file to be healed is "that_file" , then use below syntax (in this case i am considering "that_file" lying under the brick path directly" > > gluster volume heal USER-HOME split-brain source-brick 10.5.6.100:/G-store/1 /that_filethat was that, my copy-paste typo, it does not heal. Interestingly, that file is not reported by heal. I've replied to - GFID Mismatch - Automatic Correction ? 
- I think my problem is similar, here is a file the heal actually sees:

$ gluster vol heal USER-HOME info
Brick 10.5.6.100:/__.aLocalStorages/3/0-GLUSTERs/0-USER.HOME/aUser/.vim.backup/.bash_profile.swp
Status: Connected
Number of entries: 1

Brick 10.5.6.49:/__.aLocalStorages/3/0-GLUSTERs/0-USER.HOME/aUser/.vim.backup/.bash_profile.swp
Status: Connected
Number of entries: 1

I'm copying+pasting what I said in that reply to that thread:
...

yep, I'm seeing the same:
as follows:
3]$ getfattr -d -m . -e hex .
# file: .
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.USER-HOME-client-2=0x000000000000000000000000
trusted.afr.USER-HOME-client-3=0x000000000000000000000000
trusted.afr.USER-HOME-client-5=0x000000000000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0x06341b521ba94ab7938eca57f7a1824f
trusted.glusterfs.9e4ed9b7-373a-413b-bc82-b6f978e82ec4.xtime=0x5898e0cf000dd2fe
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri.1=0x00701c90fcb11200fffffef6f08c798e0000006a99819205
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.size.1=0x00701c90fcb11200fffffef6f08c798e0000006a99819205
3]$ getfattr -d -m . -e hex .vim.backup
# file: .vim.backup
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.USER-HOME-client-3=0x000000000000000000000000
trusted.gfid=0x0b3a223955534de89086679a4dce8156
trusted.glusterfs.9e4ed9b7-373a-413b-bc82-b6f978e82ec4.xtime=0x5898621c0005d720
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.quota.06341b52-1ba9-4ab7-938e-ca57f7a1824f.contri.1=0x000000000000040000000000000000020000000000000001
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.size.1=0x000000000000040000000000000000020000000000000001
3]$ getfattr -d -m . -e hex .vim.backup/.bash_profile.swp
# file: .vim.backup/.bash_profile.swp
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.USER-HOME-client-0=0x000000010000000100000000
trusted.afr.USER-HOME-client-5=0x000000010000000100000000
trusted.gfid=0xc2693670fc6d4fed953f21dcb77a02cf
trusted.glusterfs.9e4ed9b7-373a-413b-bc82-b6f978e82ec4.xtime=0x5896043c000baa55
trusted.glusterfs.quota.0b3a2239-5553-4de8-9086-679a4dce8156.contri.1=0x00000000000000000000000000000001
trusted.pgfid.0b3a2239-5553-4de8-9086-679a4dce8156=0x00000001

2]$ getfattr -d -m . -e hex .
# file: .
security.selinux=0x73797374656d5f753a6f626a6563745f723a64656661756c745f743a733000
trusted.afr.USER-HOME-client-1=0x000000000000000000000000
trusted.afr.USER-HOME-client-2=0x000000000000000000000000
trusted.afr.USER-HOME-client-3=0x000000000000000000000000
trusted.afr.USER-HOME-client-5=0x000000000000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0x06341b521ba94ab7938eca57f7a1824f
trusted.glusterfs.9e4ed9b7-373a-413b-bc82-b6f978e82ec4.xtime=0x5898e0d000016f82
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri.1=0xa5e66200a7a45000cb96fbf7d6336229fae7152d8851097b
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.size.1=0xa5e66200a7a45000cb96fbf7d6336229fae7152d8851097b
2]$ getfattr -d -m . -e hex .vim.backup
# file: .vim.backup
security.selinux=0x73797374656d5f753a6f626a6563745f723a64656661756c745f743a733000
trusted.afr.USER-HOME-client-3=0x000000000000000000000000
trusted.gfid=0x0b3a223955534de89086679a4dce8156
trusted.glusterfs.9e4ed9b7-373a-413b-bc82-b6f978e82ec4.xtime=0x5898621b000855fe
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.quota.06341b52-1ba9-4ab7-938e-ca57f7a1824f.contri.1=0x000000000000040000000000000000020000000000000001
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.size.1=0x000000000000040000000000000000020000000000000001
2]$ getfattr -d -m . -e hex .vim.backup/.bash_profile.swp
# file: .vim.backup/.bash_profile.swp
security.selinux=0x73797374656d5f753a6f626a6563745f723a64656661756c745f743a733000
trusted.afr.USER-HOME-client-5=0x000000010000000100000000
trusted.afr.USER-HOME-client-6=0x000000010000000100000000
trusted.gfid=0x8a5b6e4ad18a49d0bae920c9cf8673a5
trusted.glusterfs.9e4ed9b7-373a-413b-bc82-b6f978e82ec4.xtime=0x5896041400058191
trusted.glusterfs.quota.0b3a2239-5553-4de8-9086-679a4dce8156.contri.1=0x00000000000000000000000000000001
trusted.pgfid.0b3a2239-5553-4de8-9086-679a4dce8156=0x00000001

and the log bit:

GFID mismatch for <gfid:335bf026-68bd-4bf4-9cba-63b65b12c0b1>/abbreviations.xlsx 6e9a7fa1-bfbe-4a59-ad06-a78ee1625649 on USER-HOME-client-6 and 773b7ea3-31cf-4b24-94f0-0b61b573b082 on USER-HOME-client-0

most importantly, is there a workaround for the problem, as of now? Before the bug, if it's such, gets fixed.
b.w.
L.
-- end of paste

but I have a few more files which also report I/O errors and heal does NOT even mention them:
on the brick that is a "master" (samba was sharing to the users)

# file: abbreviations.log
security.selinux=0x73797374656d5f753a6f626a6563745f723a64656661756c745f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x0200000000000000589081fd00060376
trusted.gfid=0x773b7ea331cf4b2494f00b61b573b082
trusted.glusterfs.quota.335bf026-68bd-4bf4-9cba-63b65b12c0b1.contri.1=0x0000000000002a000000000000000001
trusted.pgfid.335bf026-68bd-4bf4-9cba-63b65b12c0b1=0x00000001

on the "slave" brick, which was not serving files (certainly not that file) to any users:

# file: bbreviations.log
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x0200000000000000588c958a000b67ea
trusted.gfid=0x6e9a7fa1bfbe4a59ad06a78ee1625649
trusted.glusterfs.quota.335bf026-68bd-4bf4-9cba-63b65b12c0b1.contri.1=0x0000000000002a000000000000000001
trusted.pgfid.335bf026-68bd-4bf4-9cba-63b65b12c0b1=0x00000001

Question that probably was answered many times: is it OK to tamper with (remove in my case) files directly from bricks?
many thanks,
L.

> regards,
> nag pavan
>
> ----- Original Message -----
> From: "lejeczek" <peljasz at yahoo.co.uk>
> To: gluster-users at gluster.org
> Sent: Tuesday, 7 February, 2017 2:00:51 AM
> Subject: [Gluster-users] Input/output error - would not heal
>
> hi all
>
> I'm hitting such problem:
>
> $ gluster vol heal USER-HOME split-brain source-brick 10.5.6.100:/G-store/1
> Healing gfid:8a5b6e4a-d18a-49d0-bae9-20c9cf8673a5 failed:Transport endpoint is not connected.
> Status: Connected
> Number of healed entries: 0
>
> $ gluster vol heal USER-HOME split-brain source-brick 10.5.6.100:/G-store/1/that_file
> Lookup failed on /that_file:Input/output error
> Volume heal failed.
>
> v3.9. it's a two-brick volume, was three but removed one I think a few hours before the problem was first noticed.
> what to do now?
> many thanks,
> L
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
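
The reply above says a GFID split-brain has to be resolved on the backend brick. As a rough illustration (not part of the original thread), here is a minimal Python sketch of how a trusted.gfid value from the getfattr output maps to the entry GlusterFS keeps under the brick's .glusterfs directory (a hard link for regular files). The function names are hypothetical helpers, not Gluster APIs, and the brick root /G-store/1 is only the example path used earlier in the thread.

```python
import uuid
from pathlib import PurePosixPath


def gfid_from_xattr(hex_value):
    """Convert a trusted.gfid value printed by `getfattr -e hex` to a UUID."""
    digits = hex_value.lower()
    if digits.startswith("0x"):
        digits = digits[2:]
    return uuid.UUID(hex=digits)


def gfid_backend_path(brick_root, gfid):
    """Build <brick>/.glusterfs/<hex 0-1>/<hex 2-3>/<gfid> for a given GFID."""
    text = str(gfid)
    return PurePosixPath(brick_root) / ".glusterfs" / text[0:2] / text[2:4] / text


# Values taken from the abbreviations.log dumps in the message above.
master_gfid = gfid_from_xattr("0x773b7ea331cf4b2494f00b61b573b082")
slave_gfid = gfid_from_xattr("0x6e9a7fa1bfbe4a59ad06a78ee1625649")

print(master_gfid)
print(slave_gfid)
# Assumed brick root, for illustration only.
print(gfid_backend_path("/G-store/1", slave_gfid))
```

The two UUIDs printed are the same ones that appear in the "GFID mismatch" log line. The split-brain document linked above describes what to do with the file and its .glusterfs entry on the brick whose copy is being discarded; check it before removing anything from a brick.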
On 08/02/17 06:11, Nag Pavan Chilakam wrote:
> "gluster volume info" and "gluster vol status" would help us debug faster.
>
> However, coming to the gfid mismatch: yes, the file "abbreviations.log" (I assume the other brick copy is also "abbreviations.log" and not "breviations.log" ... a typo?) has a gfid mismatch, which leads to the I/O error (gfid split-brain).
> Resolving data and metadata split-brains from the backend brick is not recommended.
> But in the case of a GFID split-brain (as with the file abbreviations.log), the only method available is to resolve it from the backend brick.
> You can read more about this at http://gluster.readthedocs.io/en/latest/Troubleshooting/split-brain/?highlight=gfid (the "Fixing Directory entry split-brain" section).
> (There is already a bug open for resolving gfid split-brain using the CLI.)

I've read that doc, however I'm not sure what to do with bits that are not mentioned in that doc. Which is: when some xattr does not exist on one copy but does on the other, like:

3]$ getfattr -d -m . -e hex .vim.backup/.bash_profile.swp
# file: .vim.backup/.bash_profile.swp
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.USER-HOME-client-0=0x000000010000000100000000
trusted.afr.USER-HOME-client-5=0x000000010000000100000000

2]$ getfattr -d -m . -e hex .vim.backup/.bash_profile.swp
# file: .vim.backup/.bash_profile.swp
security.selinux=0x73797374656d5f753a6f626a6563745f723a64656661756c745f743a733000
trusted.afr.USER-HOME-client-5=0x000000010000000100000000
trusted.afr.USER-HOME-client-6=0x000000010000000100000000

unless the doc talks about it and I've gone (temporarily) blind, but if it does not, it would be great to include more scenarios/cases there.
many thx.
L.
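
On the question of xattrs that exist on one copy but not the other, a small sketch (Python, not from the thread; helper names are hypothetical) that parses two getfattr dumps, lists the trusted.afr.* keys found on only one side, and splits a changelog value into its three 32-bit big-endian counters. The data/metadata/entry layout is the one described in the Gluster split-brain documentation. The sketch only reports differences; it does not decide which copy is the good one.

```python
import struct


def parse_getfattr(dump):
    """Turn `getfattr -d -e hex` output into a {name: hex_value} dict."""
    attrs = {}
    for line in dump.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the "# file: ..." header
        name, sep, value = line.partition("=")
        if sep:
            attrs[name] = value
    return attrs


def decode_afr(value):
    """Split a 12-byte trusted.afr.* value into its pending-operation counters."""
    raw = bytes.fromhex(value[2:] if value.startswith("0x") else value)
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


def afr_keys_only_in(a, b):
    """trusted.afr.* names present in dump `a` but missing from dump `b`."""
    return sorted(k for k in a if k.startswith("trusted.afr.") and k not in b)


# The two dumps quoted in the message above (prompts "3]$" and "2]$").
FIRST_COPY = """\
# file: .vim.backup/.bash_profile.swp
trusted.afr.USER-HOME-client-0=0x000000010000000100000000
trusted.afr.USER-HOME-client-5=0x000000010000000100000000
"""
SECOND_COPY = """\
# file: .vim.backup/.bash_profile.swp
trusted.afr.USER-HOME-client-5=0x000000010000000100000000
trusted.afr.USER-HOME-client-6=0x000000010000000100000000
"""

first = parse_getfattr(FIRST_COPY)
second = parse_getfattr(SECOND_COPY)
print("only on first copy: ", afr_keys_only_in(first, second))
print("only on second copy:", afr_keys_only_in(second, first))
for name, value in sorted(first.items()):
    print(name, decode_afr(value))
```

With the values from the thread, each changelog decodes to one pending data operation and one pending metadata operation, with no pending entry operations.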