thr3ads.net - Btrfs devel - oops at mount [May 2013]

Papp Tamas

2013-May-30 11:17 UTC

oops at mount

hi All,

I'm new on the list.

System:
Distributor ID:	Ubuntu
Description:	Ubuntu 13.04
Release:	13.04
Codename:	raring

Linux ctu 3.8.0-19-generic #30-Ubuntu SMP Wed May 1 16:35:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

The symptom is the same with Saucy 3.9 kernel.

ii  btrfs-tools  0.20~git20130524~650e656-0daily13~raring1  amd64  Checksumming Copy on Write Filesystem utilities


I also tried btrfs-tools v0.19 before with no luck.


$ btrfsck --repair /dev/sda1
enabling repair mode
parent transid verify failed on 430612480 wanted 81016 found 81011
parent transid verify failed on 430612480 wanted 81016 found 81011
parent transid verify failed on 430612480 wanted 81016 found 81011
parent transid verify failed on 430612480 wanted 81016 found 81011
Ignoring transid failure
Checking filesystem on /dev/sda1
UUID: deed1ffb-27bb-4555-b5ce-8a3c8ee5612c
checking extents
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
checking csums
checking root refs
found 67570520064 bytes used err is 0
total csum bytes: 65168792
total tree bytes: 789745664
total fs tree bytes: 651145216
total extent tree bytes: 50372608
btree space waste bytes: 192929190
file data blocks allocated: 80764424192
  referenced 69347667968
Btrfs v0.20-rc1


If I mount it, I get an oops message. The machine is not completely frozen,
but I have to reboot it to be able to use it again.


[   69.257107] btrfsck[2703]: segfault at 7ff069802710 ip 00007ff063ceecbd sp 00007fff9bb5db70 error 4 in libc-2.17.so[7ff063c6f000+1be000]
[  480.799981] device fsid deed1ffb-27bb-4555-b5ce-8a3c8ee5612c devid 1 transid 81010 /dev/sda1
[  480.802507] btrfs: disk space caching is enabled
[  480.851534] Btrfs detected SSD devices, enabling SSD mode
[  480.863245] btrfs bad tree block start 0 413601792
[  480.863320] btrfs bad tree block start 0 413601792
[  480.863389] ------------[ cut here ]------------
[  480.863426] Kernel BUG at ffffffffa03d3b6a [verbose debug info unavailable]
[  480.863459] invalid opcode: 0000 [#1] SMP
[  480.863490] Modules linked in: ip6table_filter(F) ip6_tables(F) xt_state(F) ipt_REJECT(F) xt_CHECKSUM(F) iptable_mangle(F) xt_tcpudp(F) iptable_filter(F) ipt_MASQUERADE(F) iptable_nat(F) nf_conntrack_ipv4(F) nf_defrag_ipv4(F) nf_nat_ipv4(F) nf_nat(F) nf_conntrack(F) ip_tables(F) x_tables(F) bridge(F) stp(F) llc(F) pci_stub vboxpci(OF) vboxnetadp(OF) vboxnetflt(OF) vboxdrv(OF) rfcomm bnep snd_hda_codec_hdmi snd_hda_codec_idt binfmt_misc(F) qcserial usb_wwan usbserial pata_pcmcia arc4(F) hid_generic coretemp kvm_intel iwldvm kvm mac80211 ghash_clmulni_intel(F) aesni_intel(F) aes_x86_64(F) xts(F) lrw(F) gf128mul(F) ablk_helper(F) cryptd(F) usbhid hid joydev(F) tpm_infineon hp_wmi sparse_keymap uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev pcmcia microcode(F) btusb bluetooth psmouse(F) serio_raw(F) intel_ips btrfs(F) tpm_tis libcrc32c(F) zlib_deflate(F) sdhci_pci snd_hda_intel sdhci snd_hda_codec snd_hwdep(F) snd_pcm(F) firewire_ohci snd_page_alloc(F) firewire_core snd_seq_midi(F) snd_seq_midi_event(F) crc_itu_t(F) yenta_socket pcmcia_rsrc i915 pcmcia_core snd_rawmidi(F) drm_kms_helper snd_seq(F) hp_accel drm lis3lv02d snd_seq_device(F) input_polldev snd_timer(F) wmi iwlwifi snd(F) video(F) mac_hid cfg80211 lpc_ich i2c_algo_bit mei e1000e(F) soundcore(F) lp(F) parport(F) ahci(F) libahci(F)
[  480.864322] CPU 3
[  480.864338] Pid: 5550, comm: mount Tainted: GF          O 3.8.0-19-generic #30-Ubuntu Hewlett-Packard HP EliteBook 2540p/7008
[  480.864386] RIP: 0010:[<ffffffffa03d3b6a>]  [<ffffffffa03d3b6a>] btrfs_recover_log_trees+0x23a/0x390 [btrfs]
[  480.864474] RSP: 0018:ffff88012ad41b40  EFLAGS: 00010282
[  480.864499] RAX: 00000000fffffffb RBX: ffff88018b91c000 RCX: 00000001801c001b
[  480.864531] RDX: 00000001801c001c RSI: 00000000801c001b RDI: ffff8801b20b3900
[  480.864563] RBP: ffff88012ad41bf0 R08: 0000000000000000 R09: 0000000000000001
[  480.864594] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88014fc0a5a0
[  480.864625] R13: ffff88011d2f0e40 R14: ffff88018b91a800 R15: ffff8801ab3ea000
[  480.864656] FS:  00007fb531818840(0000) GS:ffff8801bbcc0000(0000) knlGS:0000000000000000
[  480.864693] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  480.864718] CR2: 00000000006a5000 CR3: 000000016800b000 CR4: 00000000000007e0
[  480.864750] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  480.864781] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  480.864813] Process mount (pid: 5550, threadinfo ffff88012ad40000, task ffff880128522e80)
[  480.864847] Stack:
[  480.864860]  ffff8801b0e5ce40 ffff88012ad41b98 fffffffa00000000 ffffff84ffffffff
[  480.864905]  fffffaffffffffff 010684ffffffffff 0106000000000000 ff84000000000000
[  480.864947]  faffffffffffffff 84ffffffffffffff 0000000000000106 0000000000000000
[  480.864990] Call Trace:
[  480.865019]  [<ffffffffa03d1a10>] ? fixup_inode_link_counts+0x150/0x150 [btrfs]
[  480.865061]  [<ffffffffa0398a2c>] open_ctree+0x171c/0x1da0 [btrfs]
[  480.865095]  [<ffffffff81331461>] ? disk_name+0x61/0xc0
[  480.865126]  [<ffffffffa0371a83>] btrfs_mount+0x613/0x750 [btrfs]
[  480.865160]  [<ffffffff81197c43>] mount_fs+0x43/0x1b0
[  480.865187]  [<ffffffff811b2457>] ? alloc_vfsmnt+0xd7/0x1b0
[  480.865214]  [<ffffffff811b25e4>] vfs_kern_mount+0x74/0x110
[  480.865240]  [<ffffffff811b495f>] do_mount+0x21f/0xac0
[  480.865270]  [<ffffffff8114a46b>] ? strndup_user+0x5b/0x80
[  480.865296]  [<ffffffff811b528e>] sys_mount+0x8e/0xe0
[  480.865323]  [<ffffffff816d37dd>] system_call_fastpath+0x1a/0x1f
[  480.865352] Code: ef e8 bb 9e ff ff 85 c0 75 21 83 7d b8 02 0f 85 ad fe ff ff 48 8b 75 c0 4c 89 e2 4c 89 ef e8 5e dd ff ff 85 c0 0f 84 96 fe ff ff <0f> 0b 0f 1f 40 00 4c 89 e7 e8 b8 13 fa ff 44 8b 5d b4 45 85 db
[  480.865680] RIP  [<ffffffffa03d3b6a>] btrfs_recover_log_trees+0x23a/0x390 [btrfs]
[  480.865743]  RSP <ffff88012ad41b40>
[  481.887687] ---[ end trace 6d9b536c1234c5bc ]---



The storage is an Intel X18-M/X25-M/X25-V G2 SSD, and a similar error occurred
a couple of weeks ago.
It's a root partition with 3 subvolumes. For now I am using a secondary system
on the drive.

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0020   100   100   000    Old_age   Offline      -       0
  4 Start_Stop_Count        0x0030   100   100   000    Old_age   Offline      -       0
  5 Reallocated_Sector_Ct   0x0032   100   100   000    Old_age   Always       -       4
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       9718
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       3532
192 Unsafe_Shutdown_Count   0x0032   100   100   000    Old_age   Always       -       486
225 Host_Writes_32MiB       0x0030   200   200   000    Old_age   Offline      -       172848
226 Workld_Media_Wear_Indic 0x0032   100   100   000    Old_age   Always       -       1823
227 Workld_Host_Reads_Perc  0x0032   100   100   000    Old_age   Always       -       4
228 Workload_Minutes        0x0032   100   100   000    Old_age   Always       -       1156268736
232 Available_Reservd_Space 0x0033   099   099   010    Pre-fail  Always       -       0
233 Media_Wearout_Indicator 0x0032   098   098   000    Old_age   Always       -       0
184 End-to-End_Error        0x0033   100   100   099    Pre-fail  Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Reserved (0x80)     Completed without error       00%      8925         -
# 2  Reserved (0x18)     Completed without error       00%      8921         -
# 3  Vendor (0xb8)       Completed without error       00%      8920         -
# 4  Reserved (0x30)     Completed without error       00%      8065         -
# 5  Vendor (0xd0)       Completed without error       00%      3530         -
# 6  Offline             Completed without error       00%        38         -

Note: selective self-test log revision number (0) not 1 implies that no selective self-test has ever been run
SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
     1        0        0  Not_testing
     2        0        0  Not_testing
     3        0        0  Not_testing
     4        0        0  Not_testing
     5        0        0  Not_testing


Is it the SSD or rather a bug?


Thanks,
tamas
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Josef Bacik

2013-May-30 12:32 UTC

Re: oops at mount

On Thu, May 30, 2013 at 05:17:06AM -0600, Papp Tamas wrote:
> hi All,
> 
> I'm new on the list.
> 
> System:
> Distributor ID:  Ubuntu
> Description:     Ubuntu 13.04
> Release:         13.04
> Codename:        raring
> 
> Linux ctu 3.8.0-19-generic #30-Ubuntu SMP Wed May 1 16:35:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
> 
> The symptom is the same with Saucy 3.9 kernel.

Can you try btrfs-next?

git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next.git

If it's still not fixed, please file a bug at bugzilla.kernel.org and make sure
the component is set to btrfs.  Thanks,

Josef

Stefan Behrens

2013-May-30 12:55 UTC

Re: oops at mount

On Thu, 30 May 2013 08:32:35 -0400, Josef Bacik wrote:
> On Thu, May 30, 2013 at 05:17:06AM -0600, Papp Tamas wrote:
>> hi All,
>>
>> I'm new on the list.
>>
>> System:
>> Distributor ID:  Ubuntu
>> Description:     Ubuntu 13.04
>> Release:         13.04
>> Codename:        raring
>>
>> Linux ctu 3.8.0-19-generic #30-Ubuntu SMP Wed May 1 16:35:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
>>
>> The symptom is the same with Saucy 3.9 kernel.
> 
> Can you try btrfs-next?
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next.git
> 
> If it's still not fixed, please file a bug at bugzilla.kernel.org and make sure
> the component is set to btrfs.  Thanks,

Papp is using an Intel X18-M/X25-M/X25-V G2 SSD. At least with an Intel
X25 SSD that identifies itself as "INTEL SSDSA2M080" and with one with
the ID "INTEL SSDSA2M040", I've tested whether they honor the flush
request. These two SSDs don't; they ignore it. If you cut the
power after a flush request completes, the data that was written before
the flush request is gone; the write cache was _not_ flushed.

You can only disable the write cache during/after every boot with
"hdparm -W 0 /dev/sd..." (which reduces the SSD's write speed to about
4 MB/s), or avoid such SSDs, or prepare to restore from backup
occasionally.
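
Since the setting has to be reapplied at every boot, it could be scripted
from a boot hook. A minimal sketch, assuming hdparm is installed and the
script runs as root; the function names and the device list are hypothetical,
and by default it only prints the commands instead of running them:

```python
import subprocess


def write_cache_off_cmd(device):
    """Return the argv that asks the drive to disable its write cache."""
    return ["hdparm", "-W", "0", device]


def disable_write_cache(devices, dry_run=True):
    """Build (and optionally run) one hdparm call per device."""
    cmds = [write_cache_off_cmd(dev) for dev in devices]
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # needs root privileges
    return cmds


if __name__ == "__main__":
    # "/dev/sda" is a placeholder; list the affected SSDs here.
    disable_write_cache(["/dev/sda"])
```

Keeping dry_run as the default makes it easy to check the generated commands
before letting the script touch a real drive.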


Chris Mason

2013-May-30 14:03 UTC

Re: oops at mount

Quoting Stefan Behrens (2013-05-30 08:55:58)
> On Thu, 30 May 2013 08:32:35 -0400, Josef Bacik wrote:
> > On Thu, May 30, 2013 at 05:17:06AM -0600, Papp Tamas wrote:
> >> hi All,
> >>
> >> I'm new on the list.
> >>
> >> System:
> >> Distributor ID:  Ubuntu
> >> Description:     Ubuntu 13.04
> >> Release:         13.04
> >> Codename:        raring
> >>
> >> Linux ctu 3.8.0-19-generic #30-Ubuntu SMP Wed May 1 16:35:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
> >>
> >> The symptom is the same with Saucy 3.9 kernel.
> > 
> > Can you try btrfs-next?
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next.git
> > 
> > If it's still not fixed, please file a bug at bugzilla.kernel.org and make sure
> > the component is set to btrfs.  Thanks,
> 
> Papp is using an Intel X18-M/X25-M/X25-V G2 SSD. At least with an Intel
> X25 SSD that identifies itself as "INTEL SSDSA2M080" and with one with
> the ID "INTEL SSDSA2M040", I've tested whether they honor the flush
> request. These two SSDs don't; they ignore it. If you cut the
> power after a flush request completes, the data that was written before
> the flush request is gone; the write cache was _not_ flushed.
> 
> You can only disable the write cache during/after every boot with
> "hdparm -W 0 /dev/sd..." (which reduces the SSD's write speed to about
> 4 MB/s), or avoid such SSDs, or prepare to restore from backup
> occasionally.

Hi Stefan,

How did you verify this?  I'm sure Intel will want to hear about it if
we can reproduce on all filesystems.

-chris


Stefan Behrens

2013-May-30 14:59 UTC

Re: oops at mount

On Thu, 30 May 2013 10:03:29 -0400, Chris Mason wrote:
> Quoting Stefan Behrens (2013-05-30 08:55:58)
>> Papp is using an Intel X18-M/X25-M/X25-V G2 SSD. At least with an Intel
>> X25 SSD that identifies itself as "INTEL SSDSA2M080" and with one with
>> the ID "INTEL SSDSA2M040", I've tested whether they honor the flush
>> request. These two SSDs don't; they ignore it. If you cut the
>> power after a flush request completes, the data that was written before
>> the flush request is gone; the write cache was _not_ flushed.
>>
>> You can only disable the write cache during/after every boot with
>> "hdparm -W 0 /dev/sd..." (which reduces the SSD's write speed to about
>> 4 MB/s), or avoid such SSDs, or prepare to restore from backup
>> occasionally.
> 
> Hi Stefan,
> 
> How did you verify this?  I'm sure Intel will want to hear about it if
> we can reproduce on all filesystems.
> 
> -chris

We have written a kernel module that (among other things) is able to
write 4KB blocks of random data at random locations on an SSD, and in a
second step to read and verify that data.

The test procedure to check SSDs is:
1. Write 4KB blocks of random data to random locations on the disk. Send
a submit_bio(REQ_FLUSH) after each 4KB block. Log the completion of the
write request and of the flush request together with the result value.
2. Somewhere in the middle of operation, switch off all power, drive
presence and SAS data pins between the SSD and the SATA host controller.
3. Wait some time, afterwards enable the connection between the SSD and
the host controller again.
4. Read back the 4KB blocks of random data at random locations using the
same seed value that was used to generate the contents and location when
the blocks were written. Verify the data, log whether the verification
succeeded or failed.
5. Compare the log of the write and flush request completion with the
one of the read and verify process.

SSDs that honor the flush request don't cause verify errors for blocks
where the write bio and the flush bio completed successfully. Those two
Intel SSDs that I mentioned failed this test. Other Intel SSD types
passed the test.
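
The logic of this procedure can be sketched in user space against a
simulated drive. This is only an illustration and not the kernel module
itself: SimulatedDisk, plan and run_test are made-up names, the drive model
(a small volatile cache that writes back its oldest entry when full) is an
assumption, and locations are kept distinct to keep the check simple.

```python
import random

BLOCK = 4096  # 4KB blocks, as in the procedure


def plan(seed, count, nblocks):
    """Regenerate the same (lba, data) sequence from the seed alone."""
    rng = random.Random(seed)
    lbas = rng.sample(range(nblocks), count)  # distinct locations (simplification)
    return [(lba, rng.getrandbits(8 * BLOCK).to_bytes(BLOCK, "little"))
            for lba in lbas]


class SimulatedDisk:
    """Toy drive: a volatile write cache in front of persistent media."""

    def __init__(self, honors_flush, cache_slots=8):
        self.honors_flush = honors_flush
        self.cache_slots = cache_slots
        self.media = {}  # survives a power cut
        self.cache = {}  # lost on a power cut

    def write(self, lba, data):
        self.cache[lba] = data
        if len(self.cache) > self.cache_slots:
            oldest = next(iter(self.cache))  # write back the oldest entry
            self.media[oldest] = self.cache.pop(oldest)
        return 0

    def flush(self):
        if self.honors_flush:
            self.media.update(self.cache)
            self.cache.clear()
        return 0  # a drive that ignores flushes still reports success

    def power_cut(self):
        self.cache.clear()

    def read(self, lba):
        if lba in self.cache:
            return self.cache[lba]
        return self.media.get(lba, bytes(BLOCK))


def run_test(disk, seed=1234, count=64, cut_after=40, nblocks=1 << 20):
    """Return the indices of acknowledged writes that fail verification."""
    acked = []  # steps 1 and 5: log writes whose write and flush both succeeded
    for i, (lba, data) in enumerate(plan(seed, count, nblocks)[:cut_after]):
        if disk.write(lba, data) == 0 and disk.flush() == 0:
            acked.append(i)
    disk.power_cut()  # steps 2 and 3: cut power, then reconnect
    expected = plan(seed, count, nblocks)  # step 4: same seed, same blocks
    return [i for i in acked if disk.read(expected[i][0]) != expected[i][1]]


if __name__ == "__main__":
    print("honest drive:", run_test(SimulatedDisk(honors_flush=True)))
    print("ignores flush:", run_test(SimulatedDisk(honors_flush=False)))
```

In this model the honest drive reports no verify errors, while the drive
that ignores flushes loses exactly the acknowledged writes that were still
sitting in its cache when the power was cut.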

Maybe a firmware update would fix this issue; I suppose it will, but I
have never tried it. My intention was not to blame the SSD manufacturer;
in fact, I like their SSDs very much and buy and use them frequently. I
just wanted to spare Josef the headache of questioning the Btrfs
implementation. The issue that Papp described looks just like a power
failure in conjunction with a storage device that ignores flush requests.


Chris Mason

2013-May-30 16:37 UTC

Re: oops at mount

Quoting Stefan Behrens (2013-05-30 10:59:59)
> On Thu, 30 May 2013 10:03:29 -0400, Chris Mason wrote:
> > Quoting Stefan Behrens (2013-05-30 08:55:58)
> >> Papp is using an Intel X18-M/X25-M/X25-V G2 SSD. At least with an Intel
> >> X25 SSD that identifies itself as "INTEL SSDSA2M080" and with one with
> >> the ID "INTEL SSDSA2M040", I've tested whether they honor the flush
> >> request. These two SSDs don't; they ignore it. If you cut the
> >> power after a flush request completes, the data that was written before
> >> the flush request is gone; the write cache was _not_ flushed.
> >>
> >> You can only disable the write cache during/after every boot with
> >> "hdparm -W 0 /dev/sd..." (which reduces the SSD's write speed to about
> >> 4 MB/s), or avoid such SSDs, or prepare to restore from backup
> >> occasionally.
> > 
> > Hi Stefan,
> > 
> > How did you verify this?  I'm sure Intel will want to hear about it if
> > we can reproduce on all filesystems.
> > 
> > -chris
> 
> We have written a kernel module that (among other things) is able to
> write 4KB blocks of random data at random locations on an SSD, and in a
> second step to read and verify that data.
> 
> The test procedure to check SSDs is:
> 1. Write 4KB blocks of random data to random locations on the disk. Send
> a submit_bio(REQ_FLUSH) after each 4KB block. Log the completion of the
> write request and of the flush request together with the result value.
> 2. Somewhere in the middle of operation, switch off all power, drive
> presence and SAS data pins between the SSD and the SATA host controller.
> 3. Wait some time, afterwards enable the connection between the SSD and
> the host controller again.
> 4. Read back the 4KB blocks of random data at random locations using the
> same seed value that was used to generate the contents and location when
> the blocks were written. Verify the data, log whether the verification
> succeeded or failed.
> 5. Compare the log of the write and flush request completion with the
> one of the read and verify process.
> 
> SSDs that honor the flush request don't cause verify errors for blocks
> where the write bio and the flush bio completed successfully. Those two
> Intel SSDs that I mentioned failed this test. Other Intel SSD types
> passed the test.
> 
> Maybe a firmware update would fix this issue; I suppose it will, but I
> have never tried it. My intention was not to blame the SSD manufacturer;
> in fact, I like their SSDs very much and buy and use them frequently. I
> just wanted to spare Josef the headache of questioning the Btrfs
> implementation. The issue that Papp described looks just like a power
> failure in conjunction with a storage device that ignores flush requests.

It's definitely useful information.  The gen2's did have some problems
(mine failed as well), but I didn't realize how bad the power-cut
handling was.

-chris


Stefan Behrens

2013-May-30 20:08 UTC

Re: oops at mount

On 05/30/2013 13:17, Papp Tamas wrote:
> hi All,
>
> Is it the SSD or rather a bug?

Try the procedures that are described in the wiki:

https://btrfs.wiki.kernel.org/index.php/Problem_FAQ#I_can.27t_mount_my_filesystem.2C_and_I_get_a_kernel_oops.21

https://btrfs.wiki.kernel.org/index.php/Problem_FAQ#My_filesystem_won.27t_mount_and_none_of_the_above_helped._Is_there_any_hope_for_my_data.3F

And if the wiki recommends updating the kernel and the btrfs progs,
make sure to follow the advice :)


Papp Tamas

2013-May-31 14:55 UTC

Re: oops at mount

On 05/30/2013 02:32 PM, Josef Bacik wrote:
> On Thu, May 30, 2013 at 05:17:06AM -0600, Papp Tamas wrote:
>> hi All,
>>
>> I'm new on the list.
>>
>> System:
>> Distributor ID:  Ubuntu
>> Description:     Ubuntu 13.04
>> Release:         13.04
>> Codename:        raring
>>
>> Linux ctu 3.8.0-19-generic #30-Ubuntu SMP Wed May 1 16:35:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
>>
>> The symptom is the same with Saucy 3.9 kernel.
>
> Can you try btrfs-next?
>
> git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next.git
>
> If it's still not fixed, please file a bug at bugzilla.kernel.org and make sure
> the component is set to btrfs.  Thanks,

I did:

https://bugzilla.kernel.org/show_bug.cgi?id=59141

I doubt it will be helpful, as unfortunately I'm not familiar with
kernel bug reports.


Thank you,
tamas

Papp Tamas

2013-Jun-03 11:56 UTC

Re: oops at mount

On 05/30/2013 02:55 PM, Stefan Behrens wrote:
> On Thu, 30 May 2013 08:32:35 -0400, Josef Bacik wrote:
>> On Thu, May 30, 2013 at 05:17:06AM -0600, Papp Tamas wrote:
>>> hi All,
>>>
>>> I'm new on the list.
>>>
>>> System:
>>> Distributor ID:  Ubuntu
>>> Description:     Ubuntu 13.04
>>> Release:         13.04
>>> Codename:        raring
>>>
>>> Linux ctu 3.8.0-19-generic #30-Ubuntu SMP Wed May 1 16:35:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>> The symptom is the same with Saucy 3.9 kernel.
>>
>> Can you try btrfs-next?
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next.git
>>
>> If it's still not fixed, please file a bug at bugzilla.kernel.org and make sure
>> the component is set to btrfs.  Thanks,
>
> Papp is using an Intel X18-M/X25-M/X25-V G2 SSD. At least with an Intel
> X25 SSD that identifies itself as "INTEL SSDSA2M080" and with one with
> the ID "INTEL SSDSA2M040", I've tested whether they honor the flush
> request. These two SSDs don't; they ignore it. If you cut the
> power after a flush request completes, the data that was written before
> the flush request is gone; the write cache was _not_ flushed.
>
> You can only disable the write cache during/after every boot with
> "hdparm -W 0 /dev/sd..." (which reduces the SSD's write speed to about
> 4 MB/s), or avoid such SSDs, or prepare to restore from backup
> occasionally.

Basically it means it's not safe to use this SSD?
I used it for 2 years with ext4 without any issue before I switched to
btrfs (on the root partition). In the meantime btrfs was also quite
stable on my /data partition.

After I reinstalled the system with btrfs, this issue happened two times.
But anyway, I thought CoW should be able to handle this kind of issue by
design. Am I wrong?


Thanks,
tamas


Hugo Mills

2013-Jun-03 12:13 UTC

Re: oops at mount

On Mon, Jun 03, 2013 at 01:56:10PM +0200, Papp Tamas wrote:
> On 05/30/2013 02:55 PM, Stefan Behrens wrote:
> >
> >On Thu, 30 May 2013 08:32:35 -0400, Josef Bacik wrote:
> >>On Thu, May 30, 2013 at 05:17:06AM -0600, Papp Tamas wrote:
> >>>hi All,
> >>>
> >>>I'm new on the list.
> >>>
> >>>System:
> >>>Distributor ID:  Ubuntu
> >>>Description:     Ubuntu 13.04
> >>>Release:         13.04
> >>>Codename:        raring
> >>>
> >>>Linux ctu 3.8.0-19-generic #30-Ubuntu SMP Wed May 1 16:35:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
> >>>
> >>>The symptom is the same with Saucy 3.9 kernel.
> >>
> >>Can you try btrfs-next?
> >>
> >>git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next.git
> >>
> >>If it's still not fixed, please file a bug at bugzilla.kernel.org and make sure
> >>the component is set to btrfs.  Thanks,
> >
> >Papp is using an Intel X18-M/X25-M/X25-V G2 SSD. At least with an Intel
> >X25 SSD that identifies itself as "INTEL SSDSA2M080" and with one with
> >the ID "INTEL SSDSA2M040", I've tested whether they honor the flush
> >request. These two SSDs don't; they ignore it. If you cut the
> >power after a flush request completes, the data that was written before
> >the flush request is gone; the write cache was _not_ flushed.
> >
> >You can only disable the write cache during/after every boot with
> >"hdparm -W 0 /dev/sd..." (which reduces the SSD's write speed to about
> >4 MB/s), or avoid such SSDs, or prepare to restore from backup
> >occasionally.
> 
> Basically it means it's not safe to use this SSD?

   Correct.

> I used it for 2 years with ext4 without any issue before I switched to
> btrfs (on the root partition). In the meantime btrfs was also quite
> stable on my /data partition.
> 
> After I reinstalled the system with btrfs, this issue happened two times.
> But anyway, I thought CoW should be able to handle this kind of issue by
> design. Am I wrong?

   CoW writes out everything that's going to be changed first, and
finally writes one piece of data which points to the new version of
the data. *Provided* you can guarantee that the final piece of data
(the superblock) gets written only after everything else has made it
to permanent storage, then everything is good.

   However, most hardware (and most operating systems) reorder the
data which is being sent to the disk, for performance reasons. This is
fine, as long as you can enforce the dependency in some way -- this is
what barriers/flushes do: they say "ensure that all of this is fully
written out to real permanent storage before you try to write the
superblock".

   If the hardware ignores flushes or barriers, there's no mechanism
for ensuring that the data is fully consistent, because you may find
that the superblock gets reordered to be written before some of the
other writes to the device. If that happens and then the power gets
cut before the rest of the data can be written, you have a corrupt
filesystem.
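
   The ordering argument above can be made concrete with a toy model.
This is a minimal sketch and not how btrfs or any real drive is
implemented: Disk, commit, mount and crash_scenario are made-up names,
and the drive is assumed to write back its cached writes in the worst
possible order.

```python
class Disk:
    """Toy disk with a volatile cache that may persist writes in any order."""

    def __init__(self, honors_flush):
        self.honors_flush = honors_flush
        self.media = {}   # what survives a power cut
        self.cache = []   # pending (key, value) writes, volatile

    def write(self, key, value):
        self.cache.append((key, value))

    def flush(self):
        if self.honors_flush:
            self.settle()
        # A drive that ignores flushes reports success without doing anything.

    def settle(self):
        """Persist every pending write (e.g. after a long idle period)."""
        for key, value in self.cache:
            self.media[key] = value
        self.cache.clear()

    def destage_one_reordered(self):
        """Persist the most recent pending write first (worst-case reordering)."""
        if self.cache:
            key, value = self.cache.pop()
            self.media[key] = value

    def power_cut(self):
        self.cache.clear()


def commit(disk, generation, payload):
    block = f"tree@{generation}"
    disk.write(block, payload)   # 1. the new copy goes to a fresh location
    disk.flush()                 # 2. barrier: the data must be durable first
    disk.write("super", block)   # 3. only now repoint the superblock
    disk.flush()


def mount(disk):
    block = disk.media.get("super")
    if block not in disk.media:
        raise RuntimeError(f"superblock points at missing block {block}")
    return disk.media[block]


def crash_scenario(honors_flush):
    disk = Disk(honors_flush)
    commit(disk, 1, "old data")
    disk.settle()                 # the first commit fully reaches the media
    commit(disk, 2, "new data")   # both flushes report success
    disk.destage_one_reordered()  # the drive writes back what it likes first
    disk.power_cut()
    return mount(disk)


if __name__ == "__main__":
    print(crash_scenario(honors_flush=True))
    try:
        crash_scenario(honors_flush=False)
    except RuntimeError as err:
        print("mount failed:", err)
```

   With a drive that honors flushes, the mount finds the new data. With
one that ignores them, the superblock reaches the media before the block
it points to, and the mount fails even though the old data is still
physically on the disk.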

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
         --- In theory, theory and practice are the same. In ---
                      practice,  they're different.
