Hi,

i think my btrfs volume is hosed.... it mounts okay, but iostat shows /dev/sdg at 100% load. dmesg shows lots of 'parent transid verify failed on x wanted y found z'. then after a while i can't read from it (access to the filesystem freezes).

the machine had crashed (prob from some other process), and upon reboot i've been experiencing this problem since.

can anyone provide any guidance on how to proceed?

cheers,

Yee.

$ sudo /usr/local/bin/btrfs-show
failed to read /dev/sr0
Label: none  uuid: ea7ea0b3-bc42-4b0c-9173-346df61d4454
    Total devices 3 FS bytes used 3.56TB
    devid 3 size 1.82TB used 0.00 path /dev/sde
    devid 1 size 1.82TB used 1.82TB path /dev/sdf
    devid 2 size 1.82TB used 1.82TB path /dev/sdg

Btrfs v0.19-16-g075587c

$ sudo /usr/local/bin/btrfsck /dev/sdf
failed to read /dev/sr0
parent transid verify failed on 2703873638400 wanted 9074 found 9016
parent transid verify failed on 2703884750848 wanted 9074 found 9055
parent transid verify failed on 2703884763136 wanted 9074 found 9060
parent transid verify failed on 2703883599872 wanted 9074 found 9034
parent transid verify failed on 2703920717824 wanted 9066 found 7543
parent transid verify failed on 2703912325120 wanted 9066 found 7543
parent transid verify failed on 2703912034304 wanted 9066 found 7543
parent transid verify failed on 2703881900032 wanted 9071 found 9060
parent transid verify failed on 2703881793536 wanted 9069 found 9057
bad block 2703860367360
Extent back ref already exists for 2703873536000 parent 0 root 2
bad block 2703860621312
bad block 2703861547008
Extent back ref already exists for 2703876689920 parent 0 root 2
Extent back ref already exists for 2703881900032 parent 0 root 2
Extent back ref already exists for 2703879290880 parent 0 root 2
Extent back ref already exists for 2703873753088 parent 0 root 2
parent transid verify failed on 2703921885184 wanted 9066 found 7543
parent transid verify failed on 2703921889280 wanted 9066 found 7543
parent transid verify failed on
2703879036928 wanted 9069 found 9061 parent transid verify failed on 2703881867264 wanted 9075 found 9065 parent transid verify failed on 2703873536000 wanted 9074 found 9062 parent transid verify failed on 2703883190272 wanted 9075 found 9061 parent transid verify failed on 2703869997056 wanted 9073 found 9060 parent transid verify failed on 2703922012160 wanted 9066 found 7543 parent transid verify failed on 2703921975296 wanted 9066 found 7543 parent transid verify failed on 2703867707392 wanted 9071 found 9060 parent transid verify failed on 2703922679808 wanted 9066 found 7543 parent transid verify failed on 2703922032640 wanted 9066 found 7543 parent transid verify failed on 2703881891840 wanted 9075 found 9057 parent transid verify failed on 2703882297344 wanted 9075 found 9061 parent transid verify failed on 2703884488704 wanted 9074 found 9057 parent transid verify failed on 2703884353536 wanted 9074 found 9057 parent transid verify failed on 2703884365824 wanted 9074 found 9055 parent transid verify failed on 2703921500160 wanted 9066 found 7543 parent transid verify failed on 2703883177984 wanted 9075 found 9061 parent transid verify failed on 2703921487872 wanted 9066 found 7543 parent transid verify failed on 2703922683904 wanted 9066 found 7543 parent transid verify failed on 2703873753088 wanted 9074 found 9062 parent transid verify failed on 2703874314240 wanted 9074 found 9056 Extent back ref already exists for 2703865823232 parent 0 root 2 Extent back ref already exists for 2703866810368 parent 0 root 2 Extent back ref already exists for 2703866986496 parent 0 root 2 Extent back ref already exists for 2703867031552 parent 0 root 2 Extent back ref already exists for 2703867625472 parent 0 root 2 Extent back ref already exists for 2703867609088 parent 0 root 2 Extent back ref already exists for 2703868829696 parent 0 root 2 Extent back ref already exists for 2703869734912 parent 0 root 2 Extent back ref already exists for 2703870255104 parent 0 root 
2 Extent back ref already exists for 2703870562304 parent 0 root 2 Extent back ref already exists for 2703871201280 parent 0 root 2 Extent back ref already exists for 2703871168512 parent 0 root 2 Extent back ref already exists for 2703873040384 parent 0 root 2 Extent back ref already exists for 2703872610304 parent 0 root 2 Extent back ref already exists for 2703874686976 parent 0 root 2 Extent back ref already exists for 2703873318912 parent 0 root 2 Extent back ref already exists for 2703873740800 parent 0 root 2 Extent back ref already exists for 2703874465792 parent 0 root 2 Extent back ref already exists for 2703876370432 parent 0 root 2 Extent back ref already exists for 2703877046272 parent 0 root 2 Extent back ref already exists for 2703877050368 parent 0 root 2 Extent back ref already exists for 2703878647808 parent 0 root 2 Extent back ref already exists for 2703876407296 parent 0 root 2 Extent back ref already exists for 2703872782336 parent 0 root 2 Extent back ref already exists for 2703907266560 parent 0 root 2 Extent back ref already exists for 2703906869248 parent 0 root 2 Extent back ref already exists for 2703907241984 parent 0 root 2 Extent back ref already exists for 2703907553280 parent 0 root 2 Extent back ref already exists for 2703907942400 parent 0 root 2 Extent back ref already exists for 2703910154240 parent 0 root 2 Extent back ref already exists for 2703915515904 parent 0 root 2 Extent back ref already exists for 2703916965888 parent 0 root 2 Extent back ref already exists for 2703875280896 parent 0 root 2 Extent back ref already exists for 2703878635520 parent 0 root 2 Extent back ref already exists for 2221635985408 parent 0 root 2 Extent back ref already exists for 2703883841536 parent 0 root 2 Extent back ref already exists for 2703882489856 parent 0 root 2 Extent back ref already exists for 2703883186176 parent 0 root 2 Extent back ref already exists for 2221711962112 parent 0 root 2 parent transid verify failed on 2703875964928 
wanted 9066 found 9064 parent transid verify failed on 2703920701440 wanted 9066 found 7543 parent transid verify failed on 2703921225728 wanted 9066 found 7543 parent transid verify failed on 2703919247360 wanted 9066 found 7543 parent transid verify failed on 2703921467392 wanted 9066 found 7543 parent transid verify failed on 2703919116288 wanted 9066 found 7543 parent transid verify failed on 2703920193536 wanted 9066 found 7543 leaf parent key incorrect 2703862099968 bad block 2703862099968 parent transid verify failed on 2703869194240 wanted 9069 found 9062 parent transid verify failed on 2703872065536 wanted 9075 found 9060 leaf parent key incorrect 2703865634816 bad block 2703865634816 parent transid verify failed on 2703872434176 wanted 9077 found 9059 leaf parent key incorrect 2703868116992 bad block 2703868116992 leaf parent key incorrect 2703869460480 bad block 2703869460480 parent transid verify failed on 2703878242304 wanted 9075 found 9065 leaf parent key incorrect 2703871660032 bad block 2703871660032 leaf parent key incorrect 2703872061440 bad block 2703872061440 bad block 2703873073152 parent transid verify failed on 2703873613824 wanted 9077 found 9025 bad block 2703873536000 bad block 2703876689920 leaf parent key incorrect 2703877709824 bad block 2703877709824 parent transid verify failed on 2703897231360 wanted 9077 found 9061 parent transid verify failed on 2703901822976 wanted 9077 found 9061 parent transid verify failed on 2703879938048 wanted 9075 found 9065 leaf parent key incorrect 2703879299072 bad block 2703879299072 bad block 2703881900032 leaf parent key incorrect 2703882805248 bad block 2703882805248 Extent back ref already exists for 2703885160448 parent 0 root 2 leaf parent key incorrect 2703883829248 bad block 2703883829248 parent transid verify failed on 2703878213632 wanted 9077 found 9061 bad block 2703896338432 Extent back ref already exists for 531120128 parent 0 root 2 Extent back ref already exists for 3624028745728 parent 
0 root 2 Extent back ref already exists for 458403840 parent 0 root 2 Extent back ref already exists for 3624039575552 parent 0 root 2 Extent back ref already exists for 2221892575232 parent 0 root 2 Extent back ref already exists for 538480640 parent 0 root 2 Extent back ref already exists for 2221926707200 parent 0 root 2 Extent back ref already exists for 2221926719488 parent 0 root 2 Extent back ref already exists for 746985025536 parent 0 root 2 Extent back ref already exists for 2703867379712 parent 0 root 2 Extent back ref already exists for 2703877795840 parent 0 root 2 Extent back ref already exists for 3624023527424 parent 0 root 2 Extent back ref already exists for 3624023547904 parent 0 root 2 Extent back ref already exists for 3624029978624 parent 0 root 2 Extent back ref already exists for 2221998817280 parent 0 root 2 Extent back ref already exists for 747239817216 parent 0 root 2 Extent back ref already exists for 1497120432128 parent 0 root 2 Extent back ref already exists for 1497285292032 parent 0 root 2 Extent back ref already exists for 1497514807296 parent 0 root 2 Extent back ref already exists for 1497549565952 parent 0 root 2 Extent back ref already exists for 746363998208 parent 0 root 2 Extent back ref already exists for 2703878045696 parent 0 root 2 Extent back ref already exists for 2221998825472 parent 0 root 2 Extent back ref already exists for 3624204349440 parent 0 root 2 Extent back ref already exists for 484401152 parent 0 root 2 Extent back ref already exists for 2221929988096 parent 0 root 2 Extent back ref already exists for 707141632 parent 0 root 2 Extent back ref already exists for 2221930053632 parent 0 root 2 Extent back ref already exists for 2703875485696 parent 0 root 2 Extent back ref already exists for 3624161251328 parent 0 root 2 Extent back ref already exists for 3624024666112 parent 0 root 2 Extent back ref already exists for 165191680 parent 0 root 2 Extent back ref already exists for 3623966523392 parent 0 root 
2 Extent back ref already exists for 2221876412416 parent 0 root 2 Extent back ref already exists for 1496842756096 parent 0 root 2 Extent back ref already exists for 2221936676864 parent 0 root 2 Extent back ref already exists for 1497422680064 parent 0 root 2 Extent back ref already exists for 1497454501888 parent 0 root 2 Extent back ref already exists for 2221823078400 parent 0 root 2 Extent back ref already exists for 3624937074688 parent 0 root 2 Extent back ref already exists for 3624953167872 parent 0 root 2 Extent back ref already exists for 3624268865536 parent 0 root 2 Extent back ref already exists for 2221718986752 parent 0 root 2 Extent back ref already exists for 414621696 parent 0 root 2 Extent back ref already exists for 2221929848832 parent 0 root 2 Extent back ref already exists for 3624936488960 parent 0 root 2 Extent back ref already exists for 3623950848000 parent 0 root 2 Extent back ref already exists for 733777920 parent 0 root 2 Extent back ref already exists for 3624953176064 parent 0 root 2 Extent back ref already exists for 2221928071168 parent 0 root 2 Extent back ref already exists for 3624310071296 parent 0 root 2 Extent back ref already exists for 2221906374656 parent 0 root 2 Extent back ref already exists for 2221906382848 parent 0 root 2 Extent back ref already exists for 2703871188992 parent 0 root 2 Extent back ref already exists for 2703879311360 parent 0 root 2 Extent back ref already exists for 761036800 parent 0 root 2 Extent back ref already exists for 751378432 parent 0 root 2 Extent back ref already exists for 2221916528640 parent 0 root 2 parent transid verify failed on 2703899471872 wanted 9077 found 9061 parent transid verify failed on 2703876403200 wanted 9078 found 9055 parent transid verify failed on 2703880609792 wanted 9069 found 9065 parent transid verify failed on 2703904714752 wanted 9066 found 5091 leaf parent key incorrect 2703904018432 bad block 2703904018432 parent transid verify failed on 2703921881088 
wanted 9066 found 7543
parent transid verify failed on 2703883845632 wanted 9074 found 9061
parent transid verify failed on 2703887519744 wanted 9076 found 9056
btrfsck: disk-io.c:410: find_and_setup_root: Assertion `!(ret)' failed.
Aborted

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
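A note for readers trying to make sense of output like the above: the "wanted"/"found" pairs can be tallied to see how many distinct blocks are affected and how stale the found generations are. The following is only a rough sketch, assuming the message format stays exactly as printed by this version of btrfsck; the function name `summarize_transid_failures` is made up for illustration and is not part of any btrfs tool.

```python
import re
from collections import Counter

# Matches entries such as:
#   parent transid verify failed on 2703873638400 wanted 9074 found 9016
# finditer() is used so this also works on output that has lost its newlines.
TRANSID_RE = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)


def summarize_transid_failures(text):
    """Return (set of block addresses, Counter of found generations, largest wanted-found gap)."""
    blocks = set()
    found_generations = Counter()
    max_gap = 0
    for match in TRANSID_RE.finditer(text):
        block, wanted, found = (int(group) for group in match.groups())
        blocks.add(block)
        found_generations[found] += 1
        max_gap = max(max_gap, wanted - found)
    return blocks, found_generations, max_gap


# Three entries copied from the btrfsck output above.
sample = """\
parent transid verify failed on 2703873638400 wanted 9074 found 9016
parent transid verify failed on 2703920717824 wanted 9066 found 7543
parent transid verify failed on 2703912325120 wanted 9066 found 7543
"""

blocks, found_generations, max_gap = summarize_transid_failures(sample)
print(len(blocks), found_generations.most_common(1), max_gap)
# prints: 3 [(7543, 2)] 1523
```

A single found generation that dominates the tally (7543 in the output above) may hint that many of the stale blocks date from the same old transaction, but that is an interpretation, not something the tool reports.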
Yee-Ting Li <yee379 <at> gmail.com> writes:
>
> Hi,
>
> i think my btrfs volume is hosed.... it mounts okay, but iostat shows /dev/sdg at 100% load. dmesg shows lots
> of 'parent transid verify failed on x wanted y found z'. then after a while i can't read from it (access to the
> filesystem freezes).
>
> the machine had crashed (prob from some other process), and upon reboot i've been experiencing this problem since.
>
> can anyone provide any guidance on how to proceed?
>
> cheers,
>
> Yee.

I am also having the same problem with a slightly different setup. In my case I cannot mount the filesystem.

mount, btrfs-endio-met and kblockd/0 will all run continually until the system freezes up and requires a power cycle. I have both the kernel module and the tools checked out from git, so if you have any ideas for fixes I can build them and test them out.

Here is some information about my setup:

[root@solution ~]# uname -a
Linux solution.bcig 2.6.35-0.13.rc3.git2.fc14.x86_64 #1 SMP Mon Jun 28 19:27:35 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
[root@solution ~]#

[root@solution ~]# btrfs-show
Label: store  uuid: 4ba1cc6b-e12a-454a-a064-f4019312c063
    Total devices 7 FS bytes used 1.15TB
    devid 1 size 931.51GB used 415.55GB path /dev/sdb
    devid 2 size 931.51GB used 518.50GB path /dev/sdc
    devid 3 size 931.51GB used 342.04GB path /dev/sdd
    devid 4 size 931.51GB used 523.54GB path /dev/sde
    devid 5 size 465.76GB used 402.54GB path /dev/sdf
    devid 6 size 465.76GB used 382.54GB path /dev/sdg
    devid 7 size 465.76GB used 367.54GB path /dev/sdh

Btrfs v0.19-16-g075587c-dirty
[root@solution ~]#

[root@solution ~]# tail -n 12 /var/log/messages
Jul 1 04:47:03 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
Jul 1 04:47:08 solution kernel: verify_parent_transid: 9244 callbacks suppressed
Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
[root@solution ~]#
On 1 Jul 2010, at 05:51, Daniel Kozlowski wrote:
> I am also having the same problem with a slightly different setup. In my case I
> cannot mount the filesystem. mount, btrfs-endio-met and kblockd/0 will all
> continually run until the system freezes up and requires a power cycle.

have you tried mounting with '-o degraded'?

having monitored the system for a while, i also think that it's in fact btrfs that's killing my system. i'm on ubuntu 10.04 with:

$ uname -a
Linux htpc 2.6.32-22-server #36-Ubuntu SMP Thu Jun 3 20:38:33 UTC 2010 x86_64 GNU/Linux

using the default kernel module, but with the tools built from git.

following the other thread 'Is there a more aggressive fixer than btrfsck?', i suspect that we'll just have to wait until some actual fsck operations are available for btrfs :(

on my system, it's btrfs-endio-met (only 1 out of 4) and btrfs-transacti (1 out of 2) that are taking up all the cpu/io wait cycles.

i wonder if it's only certain files on the array that are hosed; if that's the case, is there a way i can map the kernel messages to a real filename? i don't mind losing the odd file on this array, but i don't fancy copying it all over to somewhere else (yeah-yeah, up to date backups blah blah!) - i figured given the momentum btrfs was gaining it would be much more stable than this :(

Yee.
On Sat, Jun 26, 2010 at 03:15:04PM -0700, Yee-Ting Li wrote:
> Hi,
>
> i think my btrfs volume is hosed.... it mounts okay, but iostat shows /dev/sdg at 100% load. dmesg shows lots of 'parent transid verify failed on x wanted y found z'. then after a while i can't read from it (access to the filesystem freezes).
>
> the machine had crashed (prob from some other process), and upon reboot i've been experiencing this problem since.
>
> can anyone provide any guidance on how to proceed?

These are definitely corruptions, and they probably came from the crash. Can you tell me more about the crash? (Was it a power failure, what is the storage underneath, what are the write cache settings?) We don't expect these kinds of corruptions to happen.

Yan Zheng is making a lot of progress on btrfsck, but I don't think you'll want to be one of the first testers there. I can definitely help copy things off if you're having trouble accessing the FS.

-chris
On Thu, Jul 01, 2010 at 12:51:04PM +0000, Daniel Kozlowski wrote:
> Yee-Ting Li <yee379 <at> gmail.com> writes:
> >
> > Hi,
> >
> > i think my btrfs volume is hosed.... it mounts okay, but iostat shows /dev/sdg at 100% load. dmesg shows lots
> > of 'parent transid verify failed on x wanted y found z'. then after a while i can't read from it (access to the
> > filesystem freezes).
> >
> > the machine had crashed (prob from some other process), and upon reboot i've been experiencing this problem since.
> >
> > can anyone provide any guidance on how to proceed?
> >
> > cheers,
> >
> > Yee.
>
> I am also having the same problem with a slightly different setup. In my case I
> cannot mount the filesystem.

What is your hardware setup here, including write cache settings? Did you have crashes with 2.6.35-rc1 or rc2?

> mount, btrfs-endio-met and kblockd/0 will all
> continually run until the system freezes up and requires a power cycle. I have
> both the kernel module and the tools checked out from git, so if you have any
> ideas for fixes I can build them and test them out.
>
> Here is some information about my setup:
> [root@solution ~]# uname -a
> Linux solution.bcig 2.6.35-0.13.rc3.git2.fc14.x86_64 #1 SMP Mon Jun 28 19:27:35 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
> [root@solution ~]#
>
> [root@solution ~]# btrfs-show
> Label: store  uuid: 4ba1cc6b-e12a-454a-a064-f4019312c063
>     Total devices 7 FS bytes used 1.15TB
>     devid 1 size 931.51GB used 415.55GB path /dev/sdb
>     devid 2 size 931.51GB used 518.50GB path /dev/sdc
>     devid 3 size 931.51GB used 342.04GB path /dev/sdd
>     devid 4 size 931.51GB used 523.54GB path /dev/sde
>     devid 5 size 465.76GB used 402.54GB path /dev/sdf
>     devid 6 size 465.76GB used 382.54GB path /dev/sdg
>     devid 7 size 465.76GB used 367.54GB path /dev/sdh
>
> Btrfs v0.19-16-g075587c-dirty
> [root@solution ~]#
>
> [root@solution ~]# tail -n 12 /var/log/messages
> Jul 1 04:47:03 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
> Jul 1 04:47:08 solution kernel: verify_parent_transid: 9244 callbacks suppressed
> Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
> Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
> Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
> Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
> Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
> Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
> Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
> Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
> Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
> Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510

Looks like we're looping on a single block. What happens when you run dmesg -n1 to cut down on the console traffic?

If that doesn't help, we can change it to spit out a stack trace to figure out where the looping is happening. We should be erroring out instead of hitting it over and over again.

-chris
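For anyone wanting to check a larger log for the same pattern, here is a small sketch of how the "looping on a single block" observation could be confirmed from /var/log/messages. It assumes the kernel message format shown in the quoted log; the function name `looping_blocks` and the threshold of 5 repeats are arbitrary choices for illustration, not anything defined by btrfs.

```python
import re
from collections import Counter

FAIL_RE = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)
SUPPRESSED_RE = re.compile(r"verify_parent_transid: (\d+) callbacks suppressed")


def looping_blocks(log_text, threshold=5):
    """Return ({block: repeat count} for blocks seen >= threshold times, suppressed total).

    The suppressed total comes from the kernel's rate limiter. The log does
    not say which block those suppressed messages referred to, so they are
    reported separately instead of being added to any per-block count.
    """
    counts = Counter(int(m.group(1)) for m in FAIL_RE.finditer(log_text))
    suppressed = sum(int(m.group(1)) for m in SUPPRESSED_RE.finditer(log_text))
    repeated = {block: n for block, n in counts.items() if n >= threshold}
    return repeated, suppressed


# Rebuild the 12 quoted log lines: 1 failure, the rate-limit notice, 10 more failures.
failure = (
    "Jul 1 04:47:08 solution kernel: parent transid verify failed on "
    "1682196926464 wanted 285263 found 283510\n"
)
sample_log = (
    failure
    + "Jul 1 04:47:08 solution kernel: verify_parent_transid: 9244 callbacks suppressed\n"
    + failure * 10
)

repeated, suppressed = looping_blocks(sample_log)
print(repeated, suppressed)
# prints: {1682196926464: 11} 9244
```

If only one block address shows up with a high count, that is consistent with the same read being retried instead of the error being returned to the caller.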
On 6 Jul 2010, at 17:16, Chris Mason wrote:
> These are definitely corruptions, and they probably came from the crash.
> Can you tell me more about the crash? (Power failure, what is the
> storage underneath etc, what are the write cache settings). We don't
> expect these kinds of corruptions to happen.

i think what happened was that the power got pulled accidentally. at the time i had a drive (sde) on an external usb controller. the other two drives are internal on an nForce 730i chipset. they are all 2TB WD drives (a combination of EADS and EARS drives). according to hdparm, all the drives have write-caching on.

> Yan Zheng is making a lot of progress on btrfsck, but I don't think
> you'll want to be one of the first testers there. I can definitely help
> copy things off if you're having trouble accessing the FS.

i'm running rsyncs at the moment to get some of the data off. i can read the drive fine, but after a while (i guess when something tries to access the corrupt file) i get the dmesgs again, and high cpu on the two btrfs-transacti and btrfs-endio-met threads.

is there a way i can determine the actual filenames that may be corrupt?

also, i'm not using the /dev/sde drive (btrfs-show gives used 0.00TB) because i didn't do a balance after i installed it - is there a way i can degrade the array to recover that disk and keep the array with just two disks? then i will have enough storage to copy the 'good' files off :)

once i have a replica, i can test whatever code you'd like to throw at me :)

cheers,

Yee.
On Tue, Jul 6, 2010 at 8:19 PM, Chris Mason <chris.mason@oracle.com> wrote:
>> I am also having the same problem with a slightly different setup. In my case I
>> cannot mount the filesystem.
>
> What is your hardware setup here? Including write cache settings. Did
> you have crashes with 2.6.35-rc1 or rc2?

My setup is:

Eight hard drives
    four 1TB drives
    four 500GB drives
All drives are connected through a 3ware Inc 9550SX SATA-II RAID PCI-X card
The card is configured to export all drives, essentially acting as a SATA port multiplier (drives show up as sdb - sdi)
Drives are configured in btrfs raid0
Filesystem is mounted using:
    mount -t btrfs /dev/sdb /opt

I have been able to lock up the system on
    2.6.33.5-124.fc13.x86_64
    2.6.35-0.13.rc3.git2.fc14.x86_64
    2.6.35-0.23.rc3.git6.fc14.x86_64
and
    2.6.35-0.23.rc3.git6.fc14.x86_64 with a DKMS build of the btrfs module
    (Btrfs v0.19-16-g075587c-dirty)

If you would like me to pull out another version of the kernel or roll back specific commits from the kernel module, I can.

I have been able to get different responses from different versions:
    2.6.33.* - This will mount the volume but will hang shortly after mounting when reading data from the filesystem (ls /opt); it writes a bunch of transid verify failed messages and hangs on ls
    2.6.34.* - Will not mount at all; still gives the transid verify failed messages and hangs on mount

> Looks like we're looping on a single block. What happens when you
> dmesg -n1 to cut down on the console traffic?

Nothing changes. I still have endless repeats of

parent transid verify failed on 1682586464256 wanted 285114 found 11257

> If that doesn't help we can change it to spit a stack trace to figure
> out where the looping is happening. We should be erroring out instead
> of hitting it over and over again.

In my kernel noviceness I tried attaching gdb to btrfs-endio-met; however, apparently you can't attach gdb to a kernel thread like that. If you could assist me in obtaining a call trace, I will gladly attempt to resolve the matter.

Dan Kozlowski

--
S.D.G.
>> Looks like we're looping on a single block. What happens when you
>> dmesg -n1 to cut down on the console traffic?
>>
> Nothing changes. I still have endless repeats of
>
> parent transid verify failed on 1682586464256 wanted 285114 found 11257
>
>> If that doesn't help we can change it to spit a stack trace to figure
>> out where the looping is happening. We should be erroring out instead
>> of hitting it over and over again.
>
> In my kernel noviceness I tried attaching gdb to btrfs-endio-met;
> however, apparently you can't attach gdb to a kernel thread like that.
> If you could assist me in obtaining a call trace, I will gladly attempt
> to resolve the matter.

OK, I had some free time and decided to exercise my google-fu, and came up with this trace:

parent transid verify failed on 3241193205760 wanted 285287 found 281382
Pid: 2163, comm: mount Not tainted 2.6.35-0.23.rc3.git6.fc14.x86_64 #1
Call Trace:
 [<ffffffffa047c376>] verify_parent_transid+0xb7/0xfe [btrfs]
 [<ffffffffa047c4f2>] btrfs_buffer_uptodate+0x49/0x59 [btrfs]
 [<ffffffffa04686a2>] read_block_for_search+0x8f/0x289 [btrfs]
 [<ffffffffa046d554>] btrfs_search_slot+0x3ae/0x513 [btrfs]
 [<ffffffffa0470ece>] btrfs_read_block_groups+0x73/0x526 [btrfs]
 [<ffffffff8149b0a3>] ? _raw_spin_unlock+0x2b/0x2f
 [<ffffffffa0469f56>] ? btrfs_root_node+0x2a/0x32 [btrfs]
 [<ffffffffa047d287>] ? find_and_setup_root+0xab/0xbc [btrfs]
 [<ffffffffa04800eb>] open_ctree+0xf19/0x143a [btrfs]
 [<ffffffffa0467960>] btrfs_get_sb+0x1ce/0x40b [btrfs]
 [<ffffffff810e9cfd>] ? free_pages+0x49/0x4e
 [<ffffffff8112c9f9>] vfs_kern_mount+0xbd/0x19b
 [<ffffffff8112cb3f>] do_kern_mount+0x4d/0xed
 [<ffffffff81143742>] do_mount+0x776/0x7ed
 [<ffffffff81143841>] sys_mount+0x88/0xc2
 [<ffffffff81009c32>] system_call_fastpath+0x16/0x1b

> Dan Kozlowski
>
> --
> S.D.G.
>

--
S.D.G.
On 8 July 2010 01:21, Daniel Kozlowski <dan.kozlowski@gmail.com> wrote:
> On Tue, Jul 6, 2010 at 8:19 PM, Chris Mason <chris.mason@oracle.com> wrote:
>>> I am also having the same problem with a slightly different setup. In my case I
>>> cannot mount the filesystem.
>>
>> What is your hardware setup here? Including write cache settings. Did
>> you have crashes with 2.6.35-rc1 or rc2?
>
> My setup is:
>
> Eight hard drives
>     four 1TB drives
>     four 500GB drives
> All drives are connected through a 3ware Inc 9550SX SATA-II RAID PCI-X card
> The card is configured to export all drives, essentially acting as a
> SATA port multiplier (drives show up as sdb - sdi)
> Drives are configured in btrfs raid0
> Filesystem is mounted using:
>     mount -t btrfs /dev/sdb /opt
>
> I have been able to lock up the system on
>     2.6.33.5-124.fc13.x86_64
>     2.6.35-0.13.rc3.git2.fc14.x86_64
>     2.6.35-0.23.rc3.git6.fc14.x86_64
> and
>     2.6.35-0.23.rc3.git6.fc14.x86_64 with a DKMS build of the btrfs module
>     (Btrfs v0.19-16-g075587c-dirty)
>
> If you would like me to pull out another version of the kernel or roll
> back specific commits from the kernel module, I can.
>
> I have been able to get different responses from different versions:
>     2.6.33.* - This will mount the volume but will hang shortly after
>     mounting when reading data from the filesystem (ls /opt); it writes a
>     bunch of transid verify failed messages and hangs on ls
>     2.6.34.* - Will not mount at all; still gives the transid verify failed
>     messages and hangs on mount
>
>> Looks like we're looping on a single block. What happens when you
>> dmesg -n1 to cut down on the console traffic?
>>
> Nothing changes. I still have endless repeats of
>
> parent transid verify failed on 1682586464256 wanted 285114 found 11257
>
>> If that doesn't help we can change it to spit a stack trace to figure
>> out where the looping is happening. We should be erroring out instead
>> of hitting it over and over again.
>
> In my kernel noviceness I tried attaching gdb to btrfs-endio-met;
> however, apparently you can't attach gdb to a kernel thread like that.
> If you could assist me in obtaining a call trace, I will gladly attempt
> to resolve the matter.

For grabbing kernel backtraces:

$ sudo -s
# dmesg -c >/dev/null
# echo t >/proc/sysrq-trigger
# dmesg >backtraces.txt

(there are other ways with

The problem is that you'll be taking instantaneous snapshots, which may or may not be representative of the main looping, but over a few shots they should be.

Thanks,
Daniel
--
Daniel J Blueman
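Since any single snapshot may not be representative, one way to combine several of them is to count how many snapshots each function appears in. The sketch below is only an illustration, assuming the frame format of 2.6.35-era traces such as `[<ffffffffa047c376>] verify_parent_transid+0xb7/0xfe [btrfs]`; the function name `count_frames` and the choice to skip frames marked with "?" (which the kernel prints for addresses it could not confirm as part of the call chain) are my own assumptions.

```python
import re
from collections import Counter

# One stack frame: address, optional "? " marker, symbol+offset/size, optional [module].
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] (\? )?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?"
)


def count_frames(snapshots, module="btrfs", include_unreliable=False):
    """Count, per function, how many snapshots contain it at least once."""
    counts = Counter()
    for text in snapshots:
        seen = set()
        for match in FRAME_RE.finditer(text):
            unreliable, function, frame_module = match.groups()
            if frame_module != module:
                continue  # frame belongs to the core kernel or another module
            if unreliable and not include_unreliable:
                continue  # "?" frames may be stale stack contents
            seen.add(function)
        counts.update(seen)
    return counts


# Two shortened example snapshots built from frames in the trace posted earlier.
snapshot_a = """\
 [<ffffffffa047c376>] verify_parent_transid+0xb7/0xfe [btrfs]
 [<ffffffffa046d554>] btrfs_search_slot+0x3ae/0x513 [btrfs]
 [<ffffffffa0469f56>] ? btrfs_root_node+0x2a/0x32 [btrfs]
 [<ffffffff8112c9f9>] vfs_kern_mount+0xbd/0x19b
"""
snapshot_b = """\
 [<ffffffffa047c376>] verify_parent_transid+0xb7/0xfe [btrfs]
 [<ffffffff81143841>] sys_mount+0x88/0xc2
"""

for function, hits in count_frames([snapshot_a, snapshot_b]).most_common():
    print(hits, function)
```

In practice each snapshot would be the contents of one backtraces.txt capture, and functions that appear in most or all captures are the likeliest candidates for where the loop sits.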
so after leaving the array for a while, with the disk churning away for a few days, it stopped. i copied some files off the disk (everything seems okay) and decided to unmount and run btrfsck again - this time i get a different error:

$ sudo /usr/local/bin/btrfsck /dev/sdf
failed to read /dev/sr0
parent transid verify failed on 2703919247360 wanted 9066 found 7543
parent transid verify failed on 2703914500096 wanted 9066 found 7543
parent transid verify failed on 2703873781760 wanted 9074 found 9022
parent transid verify failed on 2703877693440 wanted 9070 found 9062
parent transid verify failed on 2703921868800 wanted 9066 found 7543
parent transid verify failed on 2703922647040 wanted 9066 found 7543
parent transid verify failed on 2703919247360 wanted 9066 found 7543
parent transid verify failed on 2703919255552 wanted 9066 found 7543
parent transid verify failed on 2703917125632 wanted 9066 found 7543
parent transid verify failed on 2703879294976 wanted 9075 found 9055
parent transid verify failed on 2703883194368 wanted 9075 found 9057
parent transid verify failed on 2703922688000 wanted 9066 found 7543
parent transid verify failed on 2703873781760 wanted 9074 found 9022
parent transid verify failed on 2703877693440 wanted 9070 found 9062
parent transid verify failed on 2703921868800 wanted 9066 found 7543
parent transid verify failed on 2703922647040 wanted 9066 found 7543
parent transid verify failed on 2703919247360 wanted 9066 found 7543
parent transid verify failed on 2703919255552 wanted 9066 found 7543
bad block 2703873781760
Extent back ref already exists for 365342720 parent 0 root 2
Extent back ref already exists for 2221870616576 parent 0 root 2
Extent back ref already exists for 383959040 parent 0 root 2
Extent back ref already exists for 367714304 parent 0 root 2
Extent back ref already exists for 706744320 parent 0 root 2
Extent back ref already exists for 368672768 parent 0 root 2
Extent back ref already exists for 315338752 parent 0 root 2
Extent back ref already exists for 377356288 parent 0 root 2 Extent back ref already exists for 368914432 parent 0 root 2 Extent back ref already exists for 369807360 parent 0 root 2 Extent back ref already exists for 2221957713920 parent 0 root 2 Extent back ref already exists for 370139136 parent 0 root 2 Extent back ref already exists for 369811456 parent 0 root 2 Extent back ref already exists for 370122752 parent 0 root 2 Extent back ref already exists for 365936640 parent 0 root 2 Extent back ref already exists for 2221948424192 parent 0 root 2 Extent back ref already exists for 3624002596864 parent 0 root 2 Extent back ref already exists for 706789376 parent 0 root 2 Extent back ref already exists for 2703778734080 parent 0 root 2 Extent back ref already exists for 372252672 parent 0 root 2 Extent back ref already exists for 372109312 parent 0 root 2 Extent back ref already exists for 372989952 parent 0 root 2 Extent back ref already exists for 373657600 parent 0 root 2 Extent back ref already exists for 374521856 parent 0 root 2 Extent back ref already exists for 374628352 parent 0 root 2 Extent back ref already exists for 374976512 parent 0 root 2 Extent back ref already exists for 2221948403712 parent 0 root 2 Extent back ref already exists for 375586816 parent 0 root 2 Extent back ref already exists for 375906304 parent 0 root 2 Extent back ref already exists for 376639488 parent 0 root 2 Extent back ref already exists for 706818048 parent 0 root 2 Extent back ref already exists for 383778816 parent 0 root 2 Extent back ref already exists for 377626624 parent 0 root 2 leaf parent key incorrect 2703874203648 bad block 2703874203648 leaf 2222080487424 items 37 free space 1183 generation 10279 owner 2 fs uuid ea7ea0b3-bc42-4b0c-9173-346df61d4454 chunk uuid 886b0dfb-fa34-49c7-9ab0-2589603f8ae4 item 0 key (364388352 EXTENT_ITEM 4096) itemoff 3944 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200172044288) level 0 tree 
block backref root 7 item 1 key (364392448 EXTENT_ITEM 4096) itemoff 3893 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200220258304) level 0 tree block backref root 7 item 2 key (364396544 EXTENT_ITEM 4096) itemoff 3842 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200179384320) level 0 tree block backref root 7 item 3 key (364400640 EXTENT_ITEM 4096) itemoff 3791 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200220258304) level 0 tree block backref root 7 item 4 key (364404736 EXTENT_ITEM 4096) itemoff 3740 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200184492032) level 0 tree block backref root 7 item 5 key (364408832 EXTENT_ITEM 4096) itemoff 3689 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200183775232) level 0 tree block backref root 7 item 6 key (364412928 EXTENT_ITEM 4096) itemoff 3638 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200192163840) level 0 tree block backref root 7 item 7 key (364417024 EXTENT_ITEM 4096) itemoff 3587 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200189181952) level 0 tree block backref root 7 item 8 key (364421120 EXTENT_ITEM 4096) itemoff 3536 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200220258304) level 0 tree block backref root 7 item 9 key (364425216 EXTENT_ITEM 4096) itemoff 3485 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200198520832) level 0 tree block backref root 7 item 10 key (364429312 EXTENT_ITEM 4096) itemoff 3434 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200201732096) level 0 tree block backref root 7 item 11 key (364433408 EXTENT_ITEM 4096) itemoff 3383 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 
200208547840) level 0 tree block backref root 7 item 12 key (364437504 EXTENT_ITEM 4096) itemoff 3332 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200215756800) level 0 tree block backref root 7 item 13 key (364441600 EXTENT_ITEM 4096) itemoff 3281 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200219885568) level 0 tree block backref root 7 item 14 key (364445696 EXTENT_ITEM 4096) itemoff 3230 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200213725184) level 0 tree block backref root 7 item 15 key (364449792 EXTENT_ITEM 4096) itemoff 3179 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200208547840) level 0 tree block backref root 7 item 16 key (364453888 EXTENT_ITEM 4096) itemoff 3128 itemsize 51 extent refs 1 gen 8461 flags 2 tree block key (104423 1 0) level 0 tree block backref root 5 item 17 key (364462080 EXTENT_ITEM 4096) itemoff 3077 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200657076224) level 0 tree block backref root 7 item 18 key (364466176 EXTENT_ITEM 4096) itemoff 3026 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200663105536) level 0 tree block backref root 7 item 19 key (364470272 EXTENT_ITEM 4096) itemoff 2975 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200674902016) level 0 tree block backref root 7 item 20 key (364474368 EXTENT_ITEM 4096) itemoff 2924 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200666513408) level 0 tree block backref root 7 item 21 key (364478464 EXTENT_ITEM 4096) itemoff 2873 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200687484928) level 0 tree block backref root 7 item 22 key (364482560 EXTENT_ITEM 4096) itemoff 2822 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key 
(18446744073709551606 80 200670707712) level 0 tree block backref root 7 item 23 key (364486656 EXTENT_ITEM 4096) itemoff 2771 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200699871232) level 0 tree block backref root 7 item 24 key (364490752 EXTENT_ITEM 4096) itemoff 2720 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200674902016) level 0 tree block backref root 7 item 25 key (364494848 EXTENT_ITEM 4096) itemoff 2669 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200683290624) level 0 tree block backref root 7 item 26 key (364498944 EXTENT_ITEM 4096) itemoff 2618 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200708128768) level 0 tree block backref root 7 item 27 key (364503040 EXTENT_ITEM 4096) itemoff 2567 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200699871232) level 0 tree block backref root 7 item 28 key (364507136 EXTENT_ITEM 4096) itemoff 2516 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200720515072) level 0 tree block backref root 7 item 29 key (364511232 EXTENT_ITEM 4096) itemoff 2465 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200704000000) level 0 tree block backref root 7 item 30 key (364515328 EXTENT_ITEM 4096) itemoff 2414 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200712257536) level 0 tree block backref root 7 item 31 key (364519424 EXTENT_ITEM 4096) itemoff 2363 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200724643840) level 0 tree block backref root 7 item 32 key (364523520 EXTENT_ITEM 4096) itemoff 2312 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200666513408) level 0 tree block backref root 7 item 33 key (364527616 EXTENT_ITEM 4096) itemoff 2261 itemsize 51 extent 
refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200669659136) level 0 tree block backref root 7 item 34 key (364531712 EXTENT_ITEM 4096) itemoff 2210 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200670707712) level 0 tree block backref root 7 item 35 key (364535808 EXTENT_ITEM 4096) itemoff 2159 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200691613696) level 0 tree block backref root 7 item 36 key (364539904 EXTENT_ITEM 4096) itemoff 2108 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200682242048) level 0 tree block backref root 7 failed to find block number 364457984 Aborted

any ideas?
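The btrfsck output above repeats many "parent transid verify failed" lines, which makes it hard to see how many distinct blocks are involved. A rough sketch of a summarizer follows. It assumes only the message format visible in this thread; the names `summarize_transid_failures` and `blocks_by_found_generation` are hypothetical.

```python
import re
from collections import Counter

# Matches the message format seen in this thread's btrfsck/dmesg output.
TRANSID_RE = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)


def summarize_transid_failures(text):
    """Count how often each (block, wanted, found) triple is reported."""
    counts = Counter()
    for block, wanted, found in TRANSID_RE.findall(text):
        counts[(int(block), int(wanted), int(found))] += 1
    return counts


def blocks_by_found_generation(counts):
    """Group the distinct block addresses by the generation found on disk."""
    grouped = {}
    for block, _wanted, found in counts:
        grouped.setdefault(found, set()).add(block)
    return grouped
```

Feeding it a saved log (for example `summarize_transid_failures(open("btrfsck.log").read())`, where the file name is an assumption) shows whether the errors come from a handful of blocks reported repeatedly or from many different ones.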
On Sun, Jul 11, 2010 at 01:19:34AM -0700, Yee-Ting Li wrote:
> so after leaving the array for a while, with the disk churning away for a few days, it stopped. i copied some files off the disk (everything seems okay) and decided to unmount and run btrfsck again - this time i get a different error:
>
> $ sudo /usr/local/bin/btrfsck /dev/sdf

[ ... ick ... ]

> failed to find block number 364457984
> Aborted

Was this after a fresh mkfs? Clearly things are very corrupt on this original drive. It would be a good test case for Yan Zheng's new fsck code, but first I'd like to figure out if you're still seeing the old corruption or if you've started over.

-chris
On Wed, Jul 07, 2010 at 10:39:48PM -0400, Daniel Kozlowski wrote:
> >> Looks like we're looping on a single block.  What happens when you
> >> dmesg -n1 to cut down on the console traffic?
> >>
> > Nothing changes I still have endless repeats of
> >
> > parent transid verify failed on 1682586464256 wanted 285114 found 11257
> >
> >> If that doesn't help we can change it to spit a stack trace to figure
> >> out where the looping is happening.  We should be erroring out instead
> >> of hitting it over and over again.
> >
> > In my kernel noviceness i tried attaching gdb to the btrfs-endio-met,
> > however apparently you can't attach gdb to a kernel thread like that
> > If you could assist me in obtaining a call trace I will gladly attempt
> > to resolve the matter.
>
> Ok I had some free time and decided to exercise my googlefoo and came
> up with this trace
>
> parent transid verify failed on 3241193205760 wanted 285287 found 281382
> Pid: 2163, comm: mount Not tainted 2.6.35-0.23.rc3.git6.fc14.x86_64 #1
> Call Trace:
> [<ffffffffa047c376>] verify_parent_transid+0xb7/0xfe [btrfs]
> [<ffffffffa047c4f2>] btrfs_buffer_uptodate+0x49/0x59 [btrfs]
> [<ffffffffa04686a2>] read_block_for_search+0x8f/0x289 [btrfs]
> [<ffffffffa046d554>] btrfs_search_slot+0x3ae/0x513 [btrfs]
> [<ffffffffa0470ece>] btrfs_read_block_groups+0x73/0x526 [btrfs]
> [<ffffffff8149b0a3>] ? _raw_spin_unlock+0x2b/0x2f
> [<ffffffffa0469f56>] ? btrfs_root_node+0x2a/0x32 [btrfs]
> [<ffffffffa047d287>] ? find_and_setup_root+0xab/0xbc [btrfs]
> [<ffffffffa04800eb>] open_ctree+0xf19/0x143a [btrfs]
> [<ffffffffa0467960>] btrfs_get_sb+0x1ce/0x40b [btrfs]
> [<ffffffff810e9cfd>] ? free_pages+0x49/0x4e
> [<ffffffff8112c9f9>] vfs_kern_mount+0xbd/0x19b
> [<ffffffff8112cb3f>] do_kern_mount+0x4d/0xed
> [<ffffffff81143742>] do_mount+0x776/0x7ed
> [<ffffffff81143841>] sys_mount+0x88/0xc2
> [<ffffffff81009c32>] system_call_fastpath+0x16/0x1b

Ok, so we're never getting out of mount. A recent change to read_block_for_search is causing this problem. We're looping over and over again because it is returning -EAGAIN instead of -EIO.

Thanks for nailing this trace down, I'll get a fix in for the looping. I'm afraid it won't bring back the filesystem though, you'll end up failing in mount. Would you like some help copying the data off?

-chris
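To illustrate why returning -EAGAIN instead of -EIO turns a permanent failure into an endless loop, here is a toy model of a retrying caller. It is only a sketch of the general pattern and not the btrfs code: `read_block`, `search` and the `buggy` flag are hypothetical names, and the iteration cap exists only so the demonstration terminates.

```python
import errno


def read_block(block_is_corrupt, buggy):
    """Hypothetical stand-in for a block read; returns 0 or a negative errno."""
    if not block_is_corrupt:
        return 0
    # The buggy variant reports a permanent failure as "try again".
    return -errno.EAGAIN if buggy else -errno.EIO


def search(block_is_corrupt, buggy, max_iterations=1000):
    """Retry while the read asks for it; return (result, attempts)."""
    attempts = 0
    while attempts < max_iterations:
        attempts += 1
        ret = read_block(block_is_corrupt, buggy)
        if ret == -errno.EAGAIN:
            # The caller restarts the search, expecting the retry to succeed.
            continue
        # Success or a hard error such as -EIO ends the loop immediately.
        return ret, attempts
    # Only reached with the buggy variant: without the cap this never returns.
    return None, attempts
```

With `buggy=False` a corrupt block fails after one attempt with -EIO, so mount can error out. With `buggy=True` the same block is retried until the artificial cap is hit, which matches the repeated "parent transid verify failed" messages on a single block.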
On 11 Jul 2010, at 17:43, Chris Mason wrote:
> Was this after a fresh mkfs?  Clearly things are very corrupt on this
> original drive.  It would be a good test case for Yan Zheng's new fsck
> code, but first I'd like to figure out if you're still seeing the old
> corruption or if you've started over.

nope, same disk as before when the btrfsck exited with:

btrfsck: disk-io.c:410: find_and_setup_root: Assertion `!(ret)' failed.

the strange thing was that i'm pretty sure that btrfs crashed the system a couple of times (hung). after reboot the mounted drive would basically churn away for hours and spit out lots of the parent transid messages. but after a while it stops and everything seems fine again.

i don't mind losing files on the disk array, but it would be nice if it could tell me the actual filenames which are corrupt.

Yee.
On Tue, 06.07.10 20:16 Chris Mason <chris.mason@oracle.com> wrote:
> On Sat, Jun 26, 2010 at 03:15:04PM -0700, Yee-Ting Li wrote:
> > Hi,
> >
> > i think my btrfs volume is hosed.... it mounts okay, but iostat
> > shows /dev/sdg on 100% load. dmesg shows lots of 'parent transid
> > verify failed on x wanted y found z'. then after a while i can't
> > read from it (access to the filesystem freezes).
> >
> > the machine had crashed (prob from some other process), and upon
> > reboot i've been experience this problem since.
> >
> > can anyone provide any guidance in how to proceed?
>
> These are definitely corruptions, and they probably came from the
> crash.  Can you tell me more about the crash?  (Power failure, what is
> the storage underneath etc, what are the write cache settings).  We
> don't expect these kinds of corruptions to happen.
>
> Yan Zheng is making a lot of progress on btrfsck, but I don't think
> you'll want to be one of the first testers there.  I can definitely
> help copy things off if you're having trouble accessing the FS.
>
> -chris

Hello Chris,

sorry if I'm hijacking this thread. I have a similar problem, probably caused by a system crash due to faulty/badly timed memory DIMMs. The system suddenly hard-locked during write activity.

- kernel is 2.6.35
- btrfs on top of an md raid5, which looks healthy. Desktop SATA disks.

# cat /proc/mdstat|grep -A1 md0
md0 : active raid5 sdb1[0] sdd1[1] sdc1[2]
      2930271872 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

# btrfsck
usage: btrfsck dev
Btrfs v0.19-16-g075587c-dirty

# btrfsck /dev/md0
parent transid verify failed on 2419218964480 wanted 127839 found 127260
parent transid verify failed on 2419218964480 wanted 127839 found 127260
parent transid verify failed on 2419218915328 wanted 127839 found 127260
parent transid verify failed on 2419218915328 wanted 127839 found 127260
parent transid verify failed on 2419214266368 wanted 127839 found 127837
parent transid verify failed on 2419214266368 wanted 127839 found 127837
parent transid verify failed on 2419214266368 wanted 127839 found 127837
Segmentation fault

Mount loops endlessly, as described earlier in this thread.

If there is a way, I would really like some aid copying the data off. The backup is quite out of date, shame on me.

Best regards,
Thomas
On Wed, Aug 04, 2010 at 08:48:40PM +0200, Thomas Kuther wrote:
> Mount endlessly loops, like explained in this thread.
>
> If there is a way, I would really like some aid copying the data off.
> The backup is quite out of date, shame on me.

No problem, I'll get a test patch out in the morning.

-chris
On Wed, 04.08.10 21:30 Chris Mason <chris.mason@oracle.com> wrote:
> On Wed, Aug 04, 2010 at 08:48:40PM +0200, Thomas Kuther wrote:
> > Mount endlessly loops, like explained in this thread.
> >
> > If there is a way, I would really like some aid copying the data
> > off. The backup is quite out of date, shame on me.
>
> No problem, I'll get a test patch out in the morning.
>
> -chris

Hi Chris,

did you find the time to get that patch done in the meantime? I'm willing to test. It seems more people are hitting this error after power outages, suspending or similar.

Thanks in advance.
~Thomas