Sebastian Ochmann
2013-Nov-28 20:36 UTC
2 errors when scrubbing - but I don't know what they mean
Hello everyone,

when I scrubbed one of my btrfs volumes today, the result of the scrub was:

    total bytes scrubbed: 1.27TB with 2 errors
    error details: super=2
    corrected errors: 0, uncorrectable errors: 0, unverified errors: 0

and dmesg said:

    btrfs: bdev /dev/mapper/tray errs: wr 0, rd 0, flush 0, corrupt 0, gen 1
    btrfs: bdev /dev/mapper/tray errs: wr 0, rd 0, flush 0, corrupt 0, gen 2

Can someone please enlighten me what these errors mean (especially the "super" and "gen" values)? As an additional info: The drive is sometimes used in a machine with kernel 3.11.6 and sometimes with 3.12.0, could this swapping explain the problem somehow?

Best regards
Sebastian

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Duncan
2013-Nov-29 01:10 UTC
Re: 2 errors when scrubbing - but I don't know what they mean
Sebastian Ochmann posted on Thu, 28 Nov 2013 21:36:32 +0100 as excerpted:

> when I scrubbed one of my btrfs volumes today, the result of the scrub
> was:
>
> total bytes scrubbed: 1.27TB with 2 errors
> error details: super=2
> corrected errors: 0, uncorrectable errors: 0, unverified errors: 0
>
> and dmesg said:
>
> btrfs: bdev /dev/mapper/tray errs: wr 0, rd 0, flush 0, corrupt 0, gen 1
> btrfs: bdev /dev/mapper/tray errs: wr 0, rd 0, flush 0, corrupt 0, gen 2
>
> Can someone please enlighten me what these errors mean (especially the
> "super" and "gen" values)? As an additional info: The drive is sometimes
> used in a machine with kernel 3.11.6 and sometimes with 3.12.0, could
> this swapping explain the problem somehow?

[Just an admin using/testing btrfs here; not a dev.]

Super=superblock. I really can't say what errors registered as superblock errors might mean, as I've never seen them here and haven't chanced across an explanation on-list or on the wiki. But were I seeing that here, my approach would be to try the scrub again and hope the errors were fixed (tho I should mention that I'm on SSD with multiple independent, rather small btrfs partitions, so scrubs take a couple minutes for my larger partitions, not the hours you're likely to see with multi-TB spinning rust, so rerunning a scrub is trivial, /here/!).

If that didn't catch them, then I'd try btrfsck (without --repair) and see if it had any further information to offer. (Repair is a further step that I'd only take if necessary -- making sure I had a good backup first!)

There's also btrfs-show-super, which should be safe as it's read-only, simply displaying a lot of information, much of which probably won't make much sense except to a btrfs dev/expert (it's beyond me).

As for the dmesg output you quoted, if you compare your syslog times for the same messages, I suspect you'll find they were printed at filesystem mount time, NOT during the scrub, and are thus not directly related.

What the dmesg output IS directly related to is the output of btrfs device stat. The first thing to note about it is that the errors reported are cumulative, only being reset if its -z option is used. Thus, stats let you track whether the number of errors is rising, but unless you reset stats (using btrfs dev stat -z) after your last scrub, they'll still reflect historical errors that have already been corrected -- errors reported at mount time and by device stat reflect historical status and do NOT necessarily reflect *CURRENT* errors.

As with the superblock errors, I've not actually seen generation errors here, so I don't know whether they're the superblock errors scrub is reporting or are different. Similarly, I don't know what fixes them.

What I /have/ seen here are read_ and write_io_errs (as reported by stat, simply wr/rd as reported by the kernel at mount time), due to bad shutdowns (well, suspend-to-ram that didn't resume properly). I know scrub can and does recover those, provided it has a second copy to recover from, as it does here since (with the exception of /boot) all my btrfs filesystems are btrfs raid1 mode, both data and metadata, across two SSDs.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
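The per-device counters in the kernel log lines quoted above (wr, rd, flush, corrupt, gen) can be pulled out with a few lines of shell. This is only a minimal sketch, not part of btrfs-progs: `errs_field` is a hypothetical helper name, and the line format is assumed to match the dmesg excerpt quoted in this thread.

```shell
#!/bin/sh
# Extract one counter from a kernel log line of the form
#   btrfs: bdev <dev> errs: wr 0, rd 0, flush 0, corrupt 0, gen 2
# (format assumed from the dmesg excerpt in this thread).
# $1 = the log line, $2 = counter name (wr, rd, flush, corrupt or gen)
errs_field() {
    printf '%s\n' "$1" | sed -n "s/.*[ ,]$2 \([0-9][0-9]*\).*/\1/p"
}

# Example with the second line from the original report.
line='btrfs: bdev /dev/mapper/tray errs: wr 0, rd 0, flush 0, corrupt 0, gen 2'
echo "generation errors so far: $(errs_field "$line" gen)"
```

For the persistent counters Duncan refers to, `btrfs device stats <mountpoint>` prints them and the `-z` option resets them to zero after printing.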
Wang Shilong
2013-Nov-29 05:51 UTC
Re: 2 errors when scrubbing - but I don't know what they mean
Hi,

On 11/29/2013 04:36 AM, Sebastian Ochmann wrote:
> Hello everyone,
>
> when I scrubbed one of my btrfs volumes today, the result of the scrub
> was:
>
> total bytes scrubbed: 1.27TB with 2 errors
> error details: super=2
> corrected errors: 0, uncorrectable errors: 0, unverified errors: 0

Here a super error means a superblock checksum mismatch. Scrub only reports superblock errors; it doesn't try to fix them. Maybe this is just a read error. In any case, the superblocks will be rewritten when the next transaction is committed.

Thanks,
Wang
Sebastian Ochmann
2013-Nov-30 11:31 UTC
Re: 2 errors when scrubbing - but I don't know what they mean
Hello,

thank you for your input. I didn't know that btrfs keeps the error counters over mounts/reboots, but that's nice.

I'm still trying to figure out how such a generation error may occur in the first place. One thing I noticed looking at the btrfs code is that the generation error counter will only get incremented in the actual scrubbing code (either in "scrub_checksum_super" or in "scrub_handle_errored_block", both in scrub.c - please correct me if I'm wrong, I'm not a btrfs dev).

Also, the dmesg errors I saw were not there at boot time, but appeared about 10 minutes after boot, which was about the time when I started the scrub, so I'm pretty sure that it was the scrub that detected the errors.

The question remains what can cause superblock/gen errors. Sure, it could be "some" read error, but I'd really like to make sure that it's not a systematic error. I wasn't able to reproduce it yet though.

Best
Sebastian
Shilong Wang
2013-Dec-01 01:16 UTC
Fwd: 2 errors when scrubbing - but I don't know what they mean
cc: linux-btrfs

---------- Forwarded message ----------
From: Shilong Wang <wangshilong1991@gmail.com>
Date: 2013/12/1
Subject: Re: 2 errors when scrubbing - but I don't know what they mean
To: Sebastian Ochmann <ochmann@informatik.uni-bonn.de>

Hello Sebastian,

2013/11/30 Sebastian Ochmann <ochmann@informatik.uni-bonn.de>:
> Hello,
>
> thank you for your input. I didn't know that btrfs keeps the error counters
> over mounts/reboots, but that's nice.
>
> I'm still trying to figure out how such a generation error may occur in the
> first place. One thing I noticed looking at the btrfs code is that the
> generation error counter will only get incremented in the actual scrubbing
> code (either in "scrub_checksum_super" or in "scrub_handle_errored_block",
> both in scrub.c - please correct me if I'm wrong, I'm not a btrfs dev).

Right. Scrub reads the superblock with a bio rather than through the page cache. This means the superblock is reread from disk. If a checksum mismatch happens, it can be for one of the following reasons:

1. Some read error happens while scrubbing, although the superblocks are actually good.
2. During the last transaction, while the superblocks were being written to disk, some silent corruption happened.
3. Some unexpected operation wrote data to the superblocks directly, for example something like 'dd if=/dev/zero of=/dev/ seek=65536 count=4k'.

Actually, at boot time the superblock should be fine, because its checksum is verified when we try to use it; if the checksum mismatches, we refuse to mount. After mounting, the superblocks should be cached in memory until you unmount the filesystem.

So the ideal case is that your disk is fine and the superblocks are rewritten during the next transaction, so that after the next unmount you can mount the filesystem successfully!

However, if you see such superblock checksum mismatches very often during scrub, there may be something wrong with the disk!

> Also, the dmesg errors I saw were not there at boot time, but about 10
> minutes after boot which was about the time when I started the scrub so I'm
> pretty sure that it was the scrub that detected the errors.
>
> The question remains what can cause superblock/gen errors. Sure it could be
> "some" read error, but I'd really like to make sure that it's not a
> systematic error. I wasn't able to reproduce it yet though.

You can reproduce this by doing 'dd if=/dev/zero of=/dev/sd* seek=65536 count=4k' before btrfs scrubbing.

Thanks,
Wang
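To make the dd example above concrete, it helps to work out where the superblock copies live and where a given seek= value actually lands. The sketch below is only arithmetic and touches no device. It assumes the usual btrfs layout (primary superblock at 64 KiB, mirror N at 16 KiB << (12 * N)) and dd's default block size of 512 bytes; `sb_offset` and `dd_seek_bytes` are hypothetical helper names.

```shell
#!/bin/sh
# Byte offset of btrfs superblock copy N (0 = primary, 1 and 2 = mirrors).
# Assumption: primary at 64 KiB, mirror N at 16 KiB << (12 * N).
sb_offset() {
    if [ "$1" -eq 0 ]; then
        echo $((64 * 1024))
    else
        echo $(( (16 * 1024) << (12 * $1) ))
    fi
}

# Byte offset at which dd starts writing: seek count times block size.
# $1 = seek= value, $2 = bs= value in bytes (dd defaults to 512).
dd_seek_bytes() {
    echo $(( $1 * $2 ))
}

for n in 0 1 2; do
    echo "superblock copy $n starts at byte $(sb_offset "$n")"
done
echo "dd seek=65536 bs=512 starts at byte $(dd_seek_bytes 65536 512)"
echo "dd seek=65536 bs=1   starts at byte $(dd_seek_bytes 65536 1)"
```

Under these assumptions, seek=65536 with the default block size starts at 32 MiB, which is none of the superblock offsets; reaching the primary copy at byte 65536 would need bs=1 with seek=65536, or seek=128 with the default block size. Mirrors beyond the end of a small device do not exist, so a 19 GB image would carry only copies 0 and 1.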
Sebastian Ochmann
2013-Dec-01 20:45 UTC
Re: 2 errors when scrubbing - but I don't know what they mean
Hello,

> However, if you find such superblocks checksum mismatch very often
> during scrub, it maybe
> there are something wrong with disk!

I'm sorry, but I don't think there's a problem with my disks, because today I was able to trigger the errors that increment the "gen" error counter during scrub on a completely different machine and drive. I basically performed some I/O operations on a drive and scrubbed at the same time, over and over again, until I actually saw "super" errors during scrub. But the error is really hard to trigger. It seems to me like a race condition somewhere.

So I went a step further and tried to create a repro for this. It seems like I can now trigger the errors once every few minutes with the method described below, but sometimes it really takes a long time until the error pops up, so be patient when trying this...

For the repro:

I'm using a btrfs image in RAM for this for two reasons: I can scrub quickly over and over again, and I can rule out hard drive errors. My machine has 32 GB of RAM, so that comes in handy here - if you try this on a physical drive, make sure to adjust some parameters, if necessary.

Create a tmpfs and a testing image, format as btrfs:

$ mkdir btrfstest
$ cd btrfstest/
$ mkdir tmp
$ mount -t tmpfs -o size=20G none tmp
$ dd if=/dev/zero of=tmp/vol bs=1G count=19
$ mkfs.btrfs tmp/vol
$ mkdir mnt
$ mount -o commit=1 tmp/vol mnt

Note the "commit=1" mount option. It's not strictly necessary, but I have the feeling it helps with triggering the problem...

So now we have a 19 GB btrfs filesystem in RAM, mounted in "mnt". What I did for performing some artificial I/O operations is to rm and cp a linux source tree over and over again. Suppose you have an unpacked linux source tree available in the "/somewhere/linux" directory (and you're using bash). We'll spawn some loops that keep the filesystem busy:

$ while true; do rm -fr mnt/a; sleep 1.0; cp -R /somewhere/linux mnt/a; sleep 1.0; done
$ while true; do rm -fr mnt/b; sleep 1.1; cp -R /somewhere/linux mnt/b; sleep 1.1; done
$ while true; do rm -fr mnt/c; sleep 1.2; cp -R /somewhere/linux mnt/c; sleep 1.2; done

Now that the filesystem is busy, we'll also scrub it repeatedly (without backgrounding, -B):

$ while true; do btrfs scrub start -B mnt; sleep 0.5; done

On my machine and in RAM, each scrub takes 0-1 second and the "total bytes scrubbed" should fluctuate (seems to be especially true with commit=1, but not sure). Get a beverage of your choice and wait.

(about 10 minutes later)

When I was writing this repro it took about 10 minutes until scrub said:

total bytes scrubbed: 1.20GB with 2 errors
error details: super=2
corrected errors: 0, uncorrectable errors: 0, unverified errors: 0

and in dmesg:

[15282.155170] btrfs: bdev /dev/loop0 errs: wr 0, rd 0, flush 0, corrupt 0, gen 1
[15282.155176] btrfs: bdev /dev/loop0 errs: wr 0, rd 0, flush 0, corrupt 0, gen 2

After that, scrub is happy again and will continue normally until the same errors happen again after a few hundred scrubs or so.

So all in all, it seems the error can be triggered using normal I/O operations and scrubbing at the right moments, even with a btrfs image in RAM, where no hard drive error is possible.

Hope anyone can reproduce this and maybe debug it.

Best regards
Sebastian
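Because the error may take hundreds of scrubs to appear, it can help to let the scrub loop stop by itself on the first run that reports superblock errors. The following is a minimal sketch along those lines, not Sebastian's original loop: `super_errors` and `scrub_until_super_error` are hypothetical helper names, and the parser assumes the scrub output format quoted in this thread (an "error details: super=N" line that is present only when errors were found).

```shell
#!/bin/sh
# Read `btrfs scrub start -B` output on stdin and print the super= count,
# or 0 if no "super=" field is present (assumed to mean no such errors).
super_errors() {
    n=$(sed -n 's/.*super=\([0-9][0-9]*\).*/\1/p' | head -n 1)
    echo "${n:-0}"
}

# Scrub the mount point given as $1 in the foreground, over and over,
# until one run reports superblock errors; then print that run's output
# and a timestamp for matching against dmesg.
scrub_until_super_error() {
    while :; do
        out=$(btrfs scrub start -B "$1" 2>&1)
        if [ "$(printf '%s\n' "$out" | super_errors)" -gt 0 ]; then
            printf '%s\n' "$out"
            date
            return 0
        fi
        sleep 0.5
    done
}
```

With the repro above running, this would be started as `scrub_until_super_error mnt` in place of the plain scrub loop.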
Wang Shilong
2013-Dec-02 01:30 UTC
Re: 2 errors when scrubbing - but I don't know what they mean
On 12/02/2013 04:45 AM, Sebastian Ochmann wrote:
> Hope anyone can reproduce this and maybe debug it.

Let me have a look at this.

Thanks,
Wang
Wang Shilong
2013-Dec-02 01:53 UTC
Re: 2 errors when scrubbing - but I don't know what they mean
On 12/02/2013 09:30 AM, Wang Shilong wrote:
> On 12/02/2013 04:45 AM, Sebastian Ochmann wrote:
>> Hope anyone can reproduce this and maybe debug it.
>
> Let me have a look at this.

It seems this is a generation mismatch, not a checksum mismatch. The story is that `tree log sync` currently flushes only the first superblock, which causes a superblock generation mismatch when scrub checks the other two superblocks. I will send a patch to fix this issue. Thanks for reporting!

Thanks,
Wang
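The mismatch described above can in principle be observed from userspace by comparing the generation field across the superblock copies. The sketch below only parses dump text read from stdin; it is not Wang's patch. It assumes the dump prints one line per copy whose first word is "generation" followed by the value, as btrfs-show-super does, and `distinct_generations` and `generations_agree` are hypothetical helper names.

```shell
#!/bin/sh
# Read superblock dump text on stdin and print each distinct value of the
# "generation" field. Lines such as chunk_root_generation are ignored
# because only an exact first word of "generation" is matched.
distinct_generations() {
    awk '$1 == "generation" { print $2 }' | sort -u
}

# Succeed (exit 0) if every dumped copy carries the same generation,
# fail (exit 1) if at least two different values are seen.
generations_agree() {
    awk '$1 == "generation" { seen[$2] = 1 }
         END { n = 0; for (g in seen) n++; exit (n > 1) }'
}
```

Assuming the `-a` option of btrfs-show-super dumps all copies, a check would look like `btrfs-show-super -a /dev/XXX | generations_agree`, with /dev/XXX standing in for the real device. On a mounted, busy filesystem the copies are rewritten all the time, so a disagreement seen this way is only a hint.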
Wang Shilong
2013-Dec-02 09:21 UTC
Re: 2 errors when scrubbing - but I don't know what they mean
Hi Sebastian,

On 12/02/2013 04:45 AM, Sebastian Ochmann wrote:
> Hello,
>
> > However, if you find such superblocks checksum mismatch very often
> > during scrub, it maybe
> > there are something wrong with disk!
>
> I'm sorry, but I don't think there's a problem with my disks because I
> was able to trigger the errors that increment the "gen" error counter
> during scrub on a completely different machine and drive today. I
> basically performed some I/O operations on a drive and scrubbed at the
> same time over and over again until I actually saw "super" errors
> during scrub. But the error is really hard to trigger. It seems to me
> like a race condition somewhere.

I am sorry, I tried to reproduce the problem following the steps you described, but it hasn't shown up yet (I have run it for more than 6 hours). :-(

I took a careful look at the code. A superblock generation mismatch can only be detected in scrub_checksum_super(). The mismatch is reported when:

superblock's generation != last_trans_committed

The value of last_trans_committed is modified in only one place (when committing a transaction). However, when committing a transaction, btrfs_scrub_pause() is called before last_trans_committed is changed, which makes it impossible for scrubbing and writing the superblocks to happen at the same time. Otherwise, I must be missing something important here. :-)

Would you please try btrfs-next and see whether the problem still exists in that branch:

https://git.kernel.org/cgit/linux/kernel/git/josef/btrfs-next.git/

Thanks,
Wang