Hi,

this showed up today. I had to do a hard reboot as the graceful one hung on sync().

------------[ cut here ]------------
kernel BUG at fs/btrfs/delayed-inode.c:1466!
invalid opcode: 0000 [#1] SMP
CPU 10
Modules linked in: btrfs zlib_deflate lzo_compress ipmi_devintf autofs4 be2iscsi iscsi_boot_sysfs ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i libcxgbi iw_cxgb3 ib_core cxgb3 libiscsi_tcp libiscsi scsi_transport_iscsi dm_mirror dm_region_hash dm_log dm_multipath video battery acpi_pad acpi_ipmi ac parport usbhid evdev acpi_power_meter radeon ttm drm_kms_helper drm hwmon backlight i2c_algo_bit ipmi_si bnx2x ipmi_msghandler i2c_core hpwdt hpilo psmouse mdio uhci_hcd ehci_hcd

Pid: 1488, comm: mips-wrs-linux- Tainted: G W 3.2.7 #2 HP ProLiant BL460c G6
RIP: 0010:[<ffffffffa03559f9>] [<ffffffffa03559f9>] btrfs_delete_delayed_dir_index+0xe9/0x157 [btrfs]
RSP: 0018:ffff8805d5b43ba8 EFLAGS: 00010286
RAX: 00000000ffffffe4 RBX: ffff8805f0a57c00 RCX: 0000000002b06e0a
RDX: 00000000ffffffe4 RSI: 0000000000018000 RDI: ffff880bee4c8150
RBP: 00000000fffffff4 R08: ffffea002fd582c0 R09: 0000001a00000000
R10: 0000000000000001 R11: ffff880bf58c7800 R12: ffff8801059cf0e0
R13: ffff8801059cf128 R14: ffff8805d5b43bb8 R15: ffff880bf58c7800
FS: 0000000000000000(0000) GS:ffff880c1fca0000(0063) knlGS:00000000f77566c0
CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 000000000807b3b9 CR3: 0000000953a76000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process mips-wrs-linux- (pid: 1488, threadinfo ffff8805d5b42000, task ffff8805b4acec00)
Stack:
 0000000000000085 ffff880bf0984870 000000000033bef7 0000000000008560
 0000000000000000 00000000000007bf ffff880bf0984870 00000000000009d9
 0000000000000000 ffff880bf560b1b0 ffff8805c29a5e88 000000000033bef7
Call Trace:
 [<ffffffffa03210e5>] ? __btrfs_unlink_inode+0x172/0x25e [btrfs]
 [<ffffffffa032158c>] ? btrfs_rename+0x38b/0x55b [btrfs]
 [<ffffffff81124e8c>] ? vfs_rename+0x202/0x382
 [<ffffffff81127312>] ? sys_renameat+0x169/0x1e5
 [<ffffffff81051164>] ? do_page_fault+0x39b/0x3af
 [<ffffffff8112c35e>] ? dput+0x3f/0x126
 [<ffffffff8111ee28>] ? vfs_fstatat+0x57/0x60
 [<ffffffff81057176>] ? sys32_lstat64+0x20/0x29
 [<ffffffff810a72f2>] ? audit_syscall_entry+0x176/0x1a1
 [<ffffffff813e6c30>] ? sysenter_dispatch+0x7/0x2e
Code: 4c 89 ef e8 ad d6 08 e1 eb 74 fc 48 8d 78 18 4c 89 f6 48 89 c2 a5 a5 a5 a5 a4 4c 89 fe 48 8b 7c 24 08 e8 58 e9 ff ff 85 c0 74 04 <0f> 0b eb fe 4c 89 ef e8 5c d6 08 e1 ba 02 00 00 00 48 89 de 4c
RIP [<ffffffffa03559f9>] btrfs_delete_delayed_dir_index+0xe9/0x157 [btrfs]
 RSP <ffff8805d5b43ba8>
---[ end trace cc1045e0105e2505 ]---

-Jacek
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Mar 08, 2012 at 01:10:45PM +0100, Jacek Luczak wrote:
> kernel BUG at fs/btrfs/delayed-inode.c:1466!

1461         ret = btrfs_delayed_item_reserve_metadata(trans, root, item);
1462         /*
1463          * we have reserved enough space when we start a new transaction,
1464          * so reserving metadata failure is impossible.
1465          */
1466         BUG_ON(ret);

> RAX: 00000000ffffffe4

ENOSPC

> [<ffffffffa03210e5>] ? __btrfs_unlink_inode+0x172/0x25e [btrfs]
> [<ffffffffa032158c>] ? btrfs_rename+0x38b/0x55b [btrfs]

rename reserves 20 blocks, but seems that's not enough. I've never seen
a crash report in rename, and according to the stacktrace there's
nothing suspicious (like selinux related).

david
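The step from the RAX value to ENOSPC can be checked by hand: kernel functions return negative errno values, and an `int` return value sits in the low 32 bits of RAX on x86-64. Below is a minimal sketch of that decoding. The helper name `decode_errno` is made up for illustration and is not a kernel or library function.

```python
import errno
import os


def decode_errno(reg_value: int) -> str:
    """Interpret the low 32 bits of a register dump as a negative errno."""
    low32 = reg_value & 0xFFFFFFFF
    # Reinterpret the 32-bit pattern as a signed two's-complement integer.
    signed = low32 - (1 << 32) if low32 & 0x80000000 else low32
    if signed >= 0:
        raise ValueError(f"{signed} is not a negative errno value")
    code = -signed
    name = errno.errorcode.get(code, "unknown")
    return f"{name} ({signed}): {os.strerror(code)}"


if __name__ == "__main__":
    # RAX from the oops above
    print(decode_errno(0x00000000FFFFFFE4))
```

0xffffffe4 read as a signed 32-bit integer is -28, and errno 28 is ENOSPC, which is the value `BUG_ON(ret)` tripped over.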
2012/3/8 David Sterba <dave@jikos.cz>:
> On Thu, Mar 08, 2012 at 01:10:45PM +0100, Jacek Luczak wrote:
>> kernel BUG at fs/btrfs/delayed-inode.c:1466!
>
> 1461         ret = btrfs_delayed_item_reserve_metadata(trans, root, item);
> 1462         /*
> 1463          * we have reserved enough space when we start a new transaction,
> 1464          * so reserving metadata failure is impossible.
> 1465          */
> 1466         BUG_ON(ret);
>
>> RAX: 00000000ffffffe4
>
> ENOSPC
>
>> [<ffffffffa03210e5>] ? __btrfs_unlink_inode+0x172/0x25e [btrfs]
>> [<ffffffffa032158c>] ? btrfs_rename+0x38b/0x55b [btrfs]
>
> rename reserves 20 blocks, but seems that's not enough. I've never seen
> a crash report in rename, and according to the stacktrace there's
> nothing suspicious (like selinux related).

There were quite a lot of things happening on the system at that time. I
can't really tell what could have triggered this.

Complete logs: http://91.234.146.107/~difrost/logs/tampere_log.gz

-Jacek
On 03/09/2012 03:35 AM, Jacek Luczak wrote:
> 2012/3/8 David Sterba <dave@jikos.cz>:
>> On Thu, Mar 08, 2012 at 01:10:45PM +0100, Jacek Luczak wrote:
>>> kernel BUG at fs/btrfs/delayed-inode.c:1466!
>> 1461         ret = btrfs_delayed_item_reserve_metadata(trans, root, item);
>> 1462         /*
>> 1463          * we have reserved enough space when we start a new transaction,
>> 1464          * so reserving metadata failure is impossible.
>> 1465          */
>> 1466         BUG_ON(ret);
>>
>>> RAX: 00000000ffffffe4
>> ENOSPC
>>
>>> [<ffffffffa03210e5>] ? __btrfs_unlink_inode+0x172/0x25e [btrfs]
>>> [<ffffffffa032158c>] ? btrfs_rename+0x38b/0x55b [btrfs]
>> rename reserves 20 blocks, but seems that's not enough. I've never seen
>> a crash report in rename, and according to the stacktrace there's
>> nothing suspicious (like selinux related).
>
> There were quite many things happening in the system at that time.
> Can't really tell what could trigger this.
>
> Complete logs: http://91.234.146.107/~difrost/logs/tampere_log.gz
>

Hi Jacek,

So are these warnings based on the latest upstream of btrfs?

thanks,
liubo

> -Jacek
On 09/03/12 12:31, Liu Bo wrote:
> So are these warnings based on the latest upstream of btrfs?

Looks like it was 3.2.7, his oops said:

Pid: 1488, comm: mips-wrs-linux- Tainted: G W 3.2.7 #2 HP

--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
2012/3/9 Chris Samuel <chris@csamuel.org>:
> On 09/03/12 12:31, Liu Bo wrote:
>
>> So are these warnings based on the latest upstream of btrfs?
>
> Looks like it was 3.2.7, his oops said:
>
> Pid: 1488, comm: mips-wrs-linux- Tainted: G W 3.2.7 #2 HP

Yep, that's 3.2.7. I can't upgrade to the latest upstream right now. As soon
as the btrfs CI tests finish I can move to the latest upstream version.

-Jacek
On Fri, Mar 09, 2012 at 09:31:25AM +0800, Liu Bo wrote:
> > There were quite many things happening in the system at that time.
> > Can't really tell what could trigger this.
> >
> > Complete logs: http://91.234.146.107/~difrost/logs/tampere_log.gz
> >
> So are these warnings based on the latest upstream of btrfs?

No, from the log it's 3.2.7, so it does not contain the patch

"increase the global block reserve estimates"

the log shows lots of messages

Mar 8 13:36:37 kernel: use_block_rsv: 7 callbacks suppressed
Mar 8 13:36:37 kernel: ------------[ cut here ]------------
Mar 8 13:36:37 kernel: WARNING: at fs/btrfs/extent-tree.c:5985 btrfs_alloc_free_block+0xc5/0x292 [btrfs]()
...
Mar 8 13:36:37 kernel: [<ffffffffa031161f>] ? btrfs_alloc_free_block+0xc5/0x292 [btrfs]
Mar 8 13:36:37 kernel: [<ffffffff8106901d>] ? warn_slowpath_common+0x78/0x8d
Mar 8 13:36:37 kernel: [<ffffffffa031161f>] ? btrfs_alloc_free_block+0xc5/0x292 [btrfs]
Mar 8 13:36:37 kernel: [<ffffffffa032bbad>] ? btrfs_item_offset+0x2c/0x61 [btrfs]
Mar 8 13:36:37 kernel: [<ffffffffa03007dc>] ? leaf_space_used+0x86/0xb5 [btrfs]
Mar 8 13:36:39 kernel: [<ffffffffa0304a7a>] ? split_leaf+0x2d9/0x632 [btrfs]
Mar 8 13:36:39 kernel: [<ffffffffa0306261>] ? btrfs_search_slot+0x6c1/0x75a [btrfs]
Mar 8 13:36:40 kernel: [<ffffffffa0306e6e>] ? btrfs_insert_empty_items+0x61/0xab [btrfs]
Mar 8 13:36:41 kernel: [<ffffffffa032207c>] ? insert_inline_extent+0xea/0x2d4 [btrfs]
Mar 8 13:36:41 kernel: [<ffffffffa0322389>] ? cow_file_range_inline+0x123/0x163 [btrfs]
...

so there is probably another problem in reservations and it just blew up during
the unlink call.

david
2012/3/9 David Sterba <dave@jikos.cz>:
> On Fri, Mar 09, 2012 at 09:31:25AM +0800, Liu Bo wrote:
>> > There were quite many things happening in the system at that time.
>> > Can't really tell what could trigger this.
>> >
>> > Complete logs: http://91.234.146.107/~difrost/logs/tampere_log.gz
>> >
>> So are these warnings based on the latest upstream of btrfs?
>
> No, from the log it's 3.2.7, so it does not contain the patch
>
> "increase the global block reserve estimates"
>
> the log shows lots of messages
>
> Mar 8 13:36:37 kernel: use_block_rsv: 7 callbacks suppressed
> Mar 8 13:36:37 kernel: ------------[ cut here ]------------
> Mar 8 13:36:37 kernel: WARNING: at fs/btrfs/extent-tree.c:5985 btrfs_alloc_free_block+0xc5/0x292 [btrfs]()
> ...
> Mar 8 13:36:37 kernel: [<ffffffffa031161f>] ? btrfs_alloc_free_block+0xc5/0x292 [btrfs]
> Mar 8 13:36:37 kernel: [<ffffffff8106901d>] ? warn_slowpath_common+0x78/0x8d
> Mar 8 13:36:37 kernel: [<ffffffffa031161f>] ? btrfs_alloc_free_block+0xc5/0x292 [btrfs]
> Mar 8 13:36:37 kernel: [<ffffffffa032bbad>] ? btrfs_item_offset+0x2c/0x61 [btrfs]
> Mar 8 13:36:37 kernel: [<ffffffffa03007dc>] ? leaf_space_used+0x86/0xb5 [btrfs]
> Mar 8 13:36:39 kernel: [<ffffffffa0304a7a>] ? split_leaf+0x2d9/0x632 [btrfs]
> Mar 8 13:36:39 kernel: [<ffffffffa0306261>] ? btrfs_search_slot+0x6c1/0x75a [btrfs]
> Mar 8 13:36:40 kernel: [<ffffffffa0306e6e>] ? btrfs_insert_empty_items+0x61/0xab [btrfs]
> Mar 8 13:36:41 kernel: [<ffffffffa032207c>] ? insert_inline_extent+0xea/0x2d4 [btrfs]
> Mar 8 13:36:41 kernel: [<ffffffffa0322389>] ? cow_file_range_inline+0x123/0x163 [btrfs]
> ...

For this one I've also created a report [1].

>
> so there is probably other problem in reservations and it just blew up during
> the unlink call.

Could be, as this came up after a longer time of throwing the above WARN_ON.

I'm now cloning the Linus tree. Let's see if both pop up there.

-Jacek

[1] http://www.spinics.net/lists/linux-btrfs/msg15404.html
On Fri, Mar 09, 2012 at 12:08:12PM +0100, Jacek Luczak wrote:
> For this one I've created also a report [1].
> >
> > so there is probably other problem in reservations and it just blew up during
> > the unlink call.
>
> Could be as this came up after a longer time of throwing above WARN_ON.
>
> I'm now cloning the Linus tree. Lets see if both will pop up on there.

The 3.3-rc6 should help in one case, with

http://thread.gmane.org/gmane.comp.file-systems.btrfs/15268

but I was able to reproduce the WARN_ON even with this patch, didn't get
to debugging it again yet.

david
2012/3/9 David Sterba <dave@jikos.cz>:
> On Fri, Mar 09, 2012 at 12:08:12PM +0100, Jacek Luczak wrote:
>> For this one I've created also a report [1].
>> >
>> > so there is probably other problem in reservations and it just blew up during
>> > the unlink call.
>>
>> Could be as this came up after a longer time of throwing above WARN_ON.
>>
>> I'm now cloning the Linus tree. Lets see if both will pop up on there.
>
> The 3.3-rc6 should help in one case, with
>
> http://thread.gmane.org/gmane.comp.file-systems.btrfs/15268
>
> but I was able to reproduce the WARN_ON even with this patch, didn't get
> to debugging it again yet.

Those two issues go together. After a longer period of WARN_ONs the BUG_ON
hit again.

I'm still running 3.2.7; the clone is done, so I will switch to upstream now.

-Jacek
2012/3/9 Jacek Luczak <difrost.kernel@gmail.com>:
> 2012/3/9 David Sterba <dave@jikos.cz>:
>> On Fri, Mar 09, 2012 at 12:08:12PM +0100, Jacek Luczak wrote:
>>> For this one I've created also a report [1].
>>> >
>>> > so there is probably other problem in reservations and it just blew up during
>>> > the unlink call.
>>>
>>> Could be as this came up after a longer time of throwing above WARN_ON.
>>>
>>> I'm now cloning the Linus tree. Lets see if both will pop up on there.
>>
>> The 3.3-rc6 should help in one case, with
>>
>> http://thread.gmane.org/gmane.comp.file-systems.btrfs/15268
>>
>> but I was able to reproduce the WARN_ON even with this patch, didn't get
>> to debugging it again yet.
>
> Those two issues go inline. After a longer while of WARN_ON the BUG_ON
> hit again.

One more observation: the host is running builds from a CI system. After the
BUG_ON popped up, all builds take 50% more time to complete.

I also see that the load average stalls at a value of 7 even when the host
is completely idle.

-Jacek
On Fri, Mar 09, 2012 at 03:33:24PM +0100, Jacek Luczak wrote:
> > Those two issues go inline. After a longer while of WARN_ON the BUG_ON
> > hit again.
>
> One more observation. Host is running builds from CI system. After
> BUG_ON pop up all builds take 50% more time to complete.

After a BUG_ON the system is in an inconsistent state and will
misbehave.

> Also I see that current load average stall at value of 7 even if host
> is completely idle.

That's probably because some process is stuck in a D-state after the
crash.

david
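On Linux, tasks in uninterruptible sleep (state D) count toward the load average just like runnable ones, so a handful of stuck processes would explain a constant load on an idle host. A rough way to confirm that guess is to scan `/proc` for processes in state D. The sketch below is Linux-only and looks at whole processes, not individual threads; the function names are illustrative, not taken from any tool mentioned in the thread.

```python
import glob
from typing import List, Tuple


def parse_state(stat_line: str) -> Tuple[int, str, str]:
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate it via the first '(' and the last ')'.
    open_paren = stat_line.index("(")
    close_paren = stat_line.rindex(")")
    pid = int(stat_line[:open_paren])
    comm = stat_line[open_paren + 1:close_paren]
    state = stat_line[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks() -> List[Tuple[int, str]]:
    """List (pid, comm) for every process currently in state D."""
    stuck = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as fh:
                pid, comm, state = parse_state(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were scanning
        if state == "D":
            stuck.append((pid, comm))
    return stuck


if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```

If the same PIDs show up on every run while the host is idle, they are the likely source of the stalled load average.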
2012/3/9 David Sterba <dave@jikos.cz>:
> On Fri, Mar 09, 2012 at 03:33:24PM +0100, Jacek Luczak wrote:
>> > Those two issues go inline. After a longer while of WARN_ON the BUG_ON
>> > hit again.
>>
>> One more observation. Host is running builds from CI system. After
>> BUG_ON pop up all builds take 50% more time to complete.
>
> After a BUG_ON the system is in an inconsistent state and will
> misbehave.
>
>> Also I see that current load average stall at value of 7 even if host
>> is completely idle.
>
> That's probably because some process is stuck in a D-state after the
> crash.
>

Now, when I tried to bring the CI back, I had to clean up the broken builds
from the BUG_ON run. The svn up fails:

open("/btrfs/project1/.svn/lock", O_WRONLY|O_CREAT|O_EXCL, 0666) = -1 ENOSPC (No space left on device)

-Jacek
2012/3/9 Jacek Luczak <difrost.kernel@gmail.com>:
> 2012/3/9 David Sterba <dave@jikos.cz>:
>> On Fri, Mar 09, 2012 at 03:33:24PM +0100, Jacek Luczak wrote:
>>> > Those two issues go inline. After a longer while of WARN_ON the BUG_ON
>>> > hit again.
>>>
>>> One more observation. Host is running builds from CI system. After
>>> BUG_ON pop up all builds take 50% more time to complete.
>>
>> After a BUG_ON the system is in an inconsistent state and will
>> misbehave.
>>
>>> Also I see that current load average stall at value of 7 even if host
>>> is completely idle.
>>
>> That's probably because some process is stuck in a D-state after the
>> crash.
>>
>
> Now when I've tried to bring CI back I had to clean the broken builds
> from BUG_ON run. The svn up fails:
>
> open("/btrfs/project1/.svn/lock", O_WRONLY|O_CREAT|O_EXCL, 0666) = -1
> ENOSPC (No space left on device)
>

btrfsck shows a lot of:

root 5 inode 52651143 errors 400
root 5 inode 52651144 errors 400
root 5 inode 52651163 errors 400
root 5 inode 52651164 errors 400
root 5 inode 52651165 errors 400
found 142180204544 bytes used err is 1
total csum bytes: 0
total tree bytes: 7636135936
total fs tree bytes: 7371886592
btree space waste bytes: 2119907598
file data blocks allocated: 134544068608
 referenced 134544068608

-Jacek
2012/3/9 David Sterba <dave@jikos.cz>:
> On Fri, Mar 09, 2012 at 12:08:12PM +0100, Jacek Luczak wrote:
>> For this one I've created also a report [1].
>> >
>> > so there is probably other problem in reservations and it just blew up during
>> > the unlink call.
>>
>> Could be as this came up after a longer time of throwing above WARN_ON.
>>
>> I'm now cloning the Linus tree. Lets see if both will pop up on there.
>
> The 3.3-rc6 should help in one case, with
>
> http://thread.gmane.org/gmane.comp.file-systems.btrfs/15268
>
> but I was able to reproduce the WARN_ON even with this patch, didn't get
> to debugging it again yet.
>

The story so far looks like this:

1) kernel 3.2.7:
 - the BUG_ON triggers after the CI env (doing builds) has been running for
   a longer while. This has already been reproduced twice.
 - the WARN_ON spams heavily, even after the BUG_ON has popped up.
 - possible relation between the WARN_ON and the BUG_ON.

2) A *regression* in 3.3.0-rc6-00197-g9f8050c
 - completely unusable as it reports ENOSPC
 - to reproduce, mount the volume and issue:
   # CNT=1 ; while [ $CNT -lt 10000 ] ; do rm -f /btrfs/dd ; ! touch /btrfs/dd && echo "$CNT" && break ; CNT=$(( $CNT + 1 )) ; done
   On my host this shows:
   # CNT=1 ; while [ $CNT -lt 10000 ] ; do rm -f /btrfs/dd ; ! touch /btrfs/dd && echo "$CNT" && break ; CNT=$(( $CNT + 1 )) ; done
   touch: cannot touch `/btrfs/dd': No space left on device
   423
 - remount to reset:
   # CNT=1 ; while [ $CNT -lt 10000 ] ; do rm -f /btrfs/dd ; ! touch /btrfs/dd && echo "$CNT" && break ; CNT=$(( $CNT + 1 )) ; done
   touch: cannot touch `/btrfs/dd': No space left on device
   1
   # umount /btrfs/
   # mount -t btrfs /dev/vg00/btrfs /btrfs/ -o noatime,nodatacow,defaults
   # CNT=1 ; while [ $CNT -lt 10000 ] ; do rm -f /btrfs/dd ; ! touch /btrfs/dd && echo "$CNT" && break ; CNT=$(( $CNT + 1 )) ; done
   touch: cannot touch `/btrfs/dd': No space left on device
   423
 - bisected down to 5500cdb (Btrfs: increase the global block reserve
   estimates). After reverting this one, Linus master works for me again.

-Jacek
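For anyone who wants to script the reproducer (for example to drive a bisect), the delete-and-recreate loop can be sketched as below. This is an approximation of the one-liner above, not a replacement for it: `first_enospc` is a hypothetical helper name, the file is created with `os.open` instead of calling `touch`, and any unlink error other than a missing file is raised instead of being printed and skipped as `rm -f` would do.

```python
import errno
import os
from typing import Optional


def first_enospc(path: str, max_iterations: int = 10000) -> Optional[int]:
    """Repeatedly delete and recreate `path`.

    Returns the 1-based iteration at which creating the file first fails
    with ENOSPC, or None if that never happens within the loop.
    """
    for count in range(1, max_iterations):
        try:
            os.unlink(path)
        except FileNotFoundError:
            pass  # same as rm -f on a missing file
        try:
            fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o666)
        except OSError as exc:
            if exc.errno == errno.ENOSPC:
                return count
            raise
        os.close(fd)
    return None


if __name__ == "__main__":
    # /btrfs/dd is the path used in the report above.
    print(first_enospc("/btrfs/dd"))
```

Going by the report, a run against the affected volume on 3.3.0-rc6-00197-g9f8050c would be expected to return a count in the hundreds (423 in the output above), and `None` on a kernel without the problem.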
2012/3/10 Jacek Luczak <difrost.kernel@gmail.com>:
> 2) A *regression* in 3.3.0-rc6-00197-g9f8050c
> - completely unusable as reports ENOSPC
> - to reproduce, mount volume and issue:
> # CNT=1 ; while [ $CNT -lt 10000 ] ; do rm -f /btrfs/dd ; ! touch /btrfs/dd && echo "$CNT" && break ; CNT=$(( $CNT + 1 )) ; done
> On my host this shows:
> # CNT=1 ; while [ $CNT -lt 10000 ] ; do rm -f /btrfs/dd ; ! touch /btrfs/dd && echo "$CNT" && break ; CNT=$(( $CNT + 1 )) ; done
> touch: cannot touch `/btrfs/dd': No space left on device
> 423
> - remount to reset:
> # CNT=1 ; while [ $CNT -lt 10000 ] ; do rm -f /btrfs/dd ; ! touch /btrfs/dd && echo "$CNT" && break ; CNT=$(( $CNT + 1 )) ; done
> touch: cannot touch `/btrfs/dd': No space left on device
> 1
> # umount /btrfs/
> # mount -t btrfs /dev/vg00/btrfs /btrfs/ -o noatime,nodatacow,defaults
> # CNT=1 ; while [ $CNT -lt 10000 ] ; do rm -f /btrfs/dd ; ! touch /btrfs/dd && echo "$CNT" && break ; CNT=$(( $CNT + 1 )) ; done
> touch: cannot touch `/btrfs/dd': No space left on device
> 423
> - bisected down to 5500cdb (Btrfs: increase the global block reserve
> estimates). After reverting this one Linus master works for me again.

This patch is included in 3.3-rc7. Do you plan to keep it or revert it?

-jacek
2012/3/10 Jacek Luczak <difrost.kernel@gmail.com>:
> 2012/3/9 David Sterba <dave@jikos.cz>:
>> On Fri, Mar 09, 2012 at 12:08:12PM +0100, Jacek Luczak wrote:
>>> For this one I've created also a report [1].
>>> >
>>> > so there is probably other problem in reservations and it just blew up during
>>> > the unlink call.
>>>
>>> Could be as this came up after a longer time of throwing above WARN_ON.
>>>
>>> I'm now cloning the Linus tree. Lets see if both will pop up on there.
>>
>> The 3.3-rc6 should help in one case, with
>>
>> http://thread.gmane.org/gmane.comp.file-systems.btrfs/15268
>>
>> but I was able to reproduce the WARN_ON even with this patch, didn't get
>> to debugging it again yet.
>>
>
>
> The story so far looks like this:
> 1) kernel 3.2.7:
> - on the BUG_ON triggers after a longer while of CI env (doing builds)
> running. This has been already reproduced twice.
> - WARN_ON spams heavily, even after BUG_ON pop up.
> - possible relation between WARN_ON and BUG_ON.

The WARN_ON still pops up in 3.3.0-rc6-00197-g9f8050c, but it did not trigger
the BUG_ON after ~300 occurrences.

> 2) A *regression* in 3.3.0-rc6-00197-g9f8050c
> - completely unusable as reports ENOSPC
> - to reproduce, mount volume and issue:
> # CNT=1 ; while [ $CNT -lt 10000 ] ; do rm -f /btrfs/dd ; ! touch /btrfs/dd && echo "$CNT" && break ; CNT=$(( $CNT + 1 )) ; done
> On my host this shows:
> # CNT=1 ; while [ $CNT -lt 10000 ] ; do rm -f /btrfs/dd ; ! touch /btrfs/dd && echo "$CNT" && break ; CNT=$(( $CNT + 1 )) ; done
> touch: cannot touch `/btrfs/dd': No space left on device
> 423
> - remount to reset:
> # CNT=1 ; while [ $CNT -lt 10000 ] ; do rm -f /btrfs/dd ; ! touch /btrfs/dd && echo "$CNT" && break ; CNT=$(( $CNT + 1 )) ; done
> touch: cannot touch `/btrfs/dd': No space left on device
> 1
> # umount /btrfs/
> # mount -t btrfs /dev/vg00/btrfs /btrfs/ -o noatime,nodatacow,defaults
> # CNT=1 ; while [ $CNT -lt 10000 ] ; do rm -f /btrfs/dd ; ! touch /btrfs/dd && echo "$CNT" && break ; CNT=$(( $CNT + 1 )) ; done
> touch: cannot touch `/btrfs/dd': No space left on device
> 423
> - bisected down to 5500cdb (Btrfs: increase the global block reserve
> estimates). After reverting this one Linus master works for me again.

With the above patch reverted, after a longer run I got ENOSPC again:

1) # df -hP /btrfs
   Filesystem              Size  Used Avail Use% Mounted on
   /dev/mapper/vg00-btrfs  195G  179G   11G  95% /btrfs
2) # rm -f /btrfs/dd
   rm: cannot remove `/btrfs/dd': No space left on device
3) strace
   unlink("/btrfs/dd") = -1 ENOSPC (No space left on device)
4) last message from kernel (except WARN_ONs):
   btrfs: fail to dirty inode 116882385 error -28

I remounted the volume and after that I was able to remove the dd file
from the volume. In dmesg there's a bunch of new WARN_ONs:

------------[ cut here ]------------
WARNING: at fs/btrfs/extent-tree.c:4185 btrfs_free_block_groups+0x17d/0x2b8 [btrfs]()
Hardware name: ProLiant BL460c G6
Modules linked in: btrfs zlib_deflate lzo_compress ipmi_devintf autofs4 be2iscsi iscsi_boot_sysfs ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i libcxgbi iw_cxgb3 ib_core cxgb3 libiscsi_tcp libiscsi scsi_transport_iscsi dm_mirror dm_region_hash dm_log dm_multipath video battery acpi_pad acpi_ipmi ac parport usbhid evdev acpi_power_meter radeon ttm drm_kms_helper drm hwmon ipmi_si bnx2x backlight i2c_algo_bit ipmi_msghandler i2c_core hpilo mdio hpwdt psmouse uhci_hcd ehci_hcd
Pid: 9518, comm: umount Tainted: G W 3.3.0-rc6-00197-g9f8050c-dirty #1
Call Trace:
 [<ffffffff8105caca>] ? print_oops_end_marker+0x9/0x20
 [<ffffffffa031a8a7>] ? btrfs_free_block_groups+0x17d/0x2b8 [btrfs]
 [<ffffffff8105cc92>] ? warn_slowpath_common+0x78/0x8d
 [<ffffffffa031a8a7>] ? btrfs_free_block_groups+0x17d/0x2b8 [btrfs]
 [<ffffffffa0327473>] ? close_ctree+0x1e1/0x380 [btrfs]
 [<ffffffff811320c2>] ? dispose_list+0x27/0x31
 [<ffffffff8113248c>] ? evict_inodes+0xc5/0xcc
 [<ffffffff8112098c>] ? generic_shutdown_super+0x4d/0xc1
 [<ffffffff81120a67>] ? kill_anon_super+0x9/0x11
 [<ffffffffa030a3aa>] ? btrfs_kill_super+0xd/0x73 [btrfs]
 [<ffffffff81120c81>] ? deactivate_locked_super+0x2f/0x5f
 [<ffffffff81135d5f>] ? sys_umount+0x2c1/0x30b
 [<ffffffff813ef0f9>] ? system_call_fastpath+0x16/0x1b
---[ end trace fd6da849e53b77dd ]---
------------[ cut here ]------------
WARNING: at fs/btrfs/extent-tree.c:4186 btrfs_free_block_groups+0x198/0x2b8 [btrfs]()
Hardware name: ProLiant BL460c G6
Modules linked in: btrfs zlib_deflate lzo_compress ipmi_devintf autofs4 be2iscsi iscsi_boot_sysfs ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i libcxgbi iw_cxgb3 ib_core cxgb3 libiscsi_tcp libiscsi scsi_transport_iscsi dm_mirror dm_region_hash dm_log dm_multipath video battery acpi_pad acpi_ipmi ac parport usbhid evdev acpi_power_meter radeon ttm drm_kms_helper drm hwmon ipmi_si bnx2x backlight i2c_algo_bit ipmi_msghandler i2c_core hpilo mdio hpwdt psmouse uhci_hcd ehci_hcd
Pid: 9518, comm: umount Tainted: G W 3.3.0-rc6-00197-g9f8050c-dirty #1
Call Trace:
 [<ffffffff8105caca>] ? print_oops_end_marker+0x9/0x20
 [<ffffffffa031a8c2>] ? btrfs_free_block_groups+0x198/0x2b8 [btrfs]
 [<ffffffff8105cc92>] ? warn_slowpath_common+0x78/0x8d
 [<ffffffffa031a8c2>] ? btrfs_free_block_groups+0x198/0x2b8 [btrfs]
 [<ffffffffa0327473>] ? close_ctree+0x1e1/0x380 [btrfs]
 [<ffffffff811320c2>] ? dispose_list+0x27/0x31
 [<ffffffff8113248c>] ? evict_inodes+0xc5/0xcc
 [<ffffffff8112098c>] ? generic_shutdown_super+0x4d/0xc1
 [<ffffffff81120a67>] ? kill_anon_super+0x9/0x11
 [<ffffffffa030a3aa>] ? btrfs_kill_super+0xd/0x73 [btrfs]
 [<ffffffff81120c81>] ? deactivate_locked_super+0x2f/0x5f
 [<ffffffff81135d5f>] ? sys_umount+0x2c1/0x30b
 [<ffffffff813ef0f9>] ? system_call_fastpath+0x16/0x1b
---[ end trace fd6da849e53b77de ]---
------------[ cut here ]------------
WARNING: at fs/btrfs/extent-tree.c:4187 btrfs_free_block_groups+0x1b3/0x2b8 [btrfs]()
Hardware name: ProLiant BL460c G6
Modules linked in: btrfs zlib_deflate lzo_compress ipmi_devintf autofs4 be2iscsi iscsi_boot_sysfs ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i libcxgbi iw_cxgb3 ib_core cxgb3 libiscsi_tcp libiscsi scsi_transport_iscsi dm_mirror dm_region_hash dm_log dm_multipath video battery acpi_pad acpi_ipmi ac parport usbhid evdev acpi_power_meter radeon ttm drm_kms_helper drm hwmon ipmi_si bnx2x backlight i2c_algo_bit ipmi_msghandler i2c_core hpilo mdio hpwdt psmouse uhci_hcd ehci_hcd
Pid: 9518, comm: umount Tainted: G W 3.3.0-rc6-00197-g9f8050c-dirty #1
Call Trace:
 [<ffffffff8105caca>] ? print_oops_end_marker+0x9/0x20
 [<ffffffffa031a8dd>] ? btrfs_free_block_groups+0x1b3/0x2b8 [btrfs]
 [<ffffffff8105cc92>] ? warn_slowpath_common+0x78/0x8d
 [<ffffffffa031a8dd>] ? btrfs_free_block_groups+0x1b3/0x2b8 [btrfs]
 [<ffffffffa0327473>] ? close_ctree+0x1e1/0x380 [btrfs]
 [<ffffffff811320c2>] ? dispose_list+0x27/0x31
 [<ffffffff8113248c>] ? evict_inodes+0xc5/0xcc
 [<ffffffff8112098c>] ? generic_shutdown_super+0x4d/0xc1
 [<ffffffff81120a67>] ? kill_anon_super+0x9/0x11
 [<ffffffffa030a3aa>] ? btrfs_kill_super+0xd/0x73 [btrfs]
 [<ffffffff81120c81>] ? deactivate_locked_super+0x2f/0x5f
 [<ffffffff81135d5f>] ? sys_umount+0x2c1/0x30b
 [<ffffffff813ef0f9>] ? system_call_fastpath+0x16/0x1b
---[ end trace fd6da849e53b77df ]---
------------[ cut here ]------------
WARNING: at fs/btrfs/extent-tree.c:7454 btrfs_free_block_groups+0x256/0x2b8 [btrfs]()
Hardware name: ProLiant BL460c G6
Modules linked in: btrfs zlib_deflate lzo_compress ipmi_devintf autofs4 be2iscsi iscsi_boot_sysfs ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i libcxgbi iw_cxgb3 ib_core cxgb3 libiscsi_tcp libiscsi scsi_transport_iscsi dm_mirror dm_region_hash dm_log dm_multipath video battery acpi_pad acpi_ipmi ac parport usbhid evdev acpi_power_meter radeon ttm drm_kms_helper drm hwmon ipmi_si bnx2x backlight i2c_algo_bit ipmi_msghandler i2c_core hpilo mdio hpwdt psmouse uhci_hcd ehci_hcd
Pid: 9518, comm: umount Tainted: G W 3.3.0-rc6-00197-g9f8050c-dirty #1
Call Trace:
 [<ffffffff8105caca>] ? print_oops_end_marker+0x9/0x20
 [<ffffffffa031a980>] ? btrfs_free_block_groups+0x256/0x2b8 [btrfs]
 [<ffffffff8105cc92>] ? warn_slowpath_common+0x78/0x8d
 [<ffffffffa031a980>] ? btrfs_free_block_groups+0x256/0x2b8 [btrfs]
 [<ffffffffa0327473>] ? close_ctree+0x1e1/0x380 [btrfs]
 [<ffffffff811320c2>] ? dispose_list+0x27/0x31
 [<ffffffff8113248c>] ? evict_inodes+0xc5/0xcc
 [<ffffffff8112098c>] ? generic_shutdown_super+0x4d/0xc1
 [<ffffffff81120a67>] ? kill_anon_super+0x9/0x11
 [<ffffffffa030a3aa>] ? btrfs_kill_super+0xd/0x73 [btrfs]
 [<ffffffff81120c81>] ? deactivate_locked_super+0x2f/0x5f
 [<ffffffff81135d5f>] ? sys_umount+0x2c1/0x30b
 [<ffffffff813ef0f9>] ? system_call_fastpath+0x16/0x1b
---[ end trace fd6da849e53b77e0 ]---
space_info 4 has 3043549184 free, is not full
space_info total=11953766400, used=8901763072, pinned=0, reserved=0, may_use=121643008, readonly=8454144
device fsid b70500f5-3ec6-4a39-9b9a-adad0e8a0346 devid 1 transid 54268 /dev/vg00/btrfs
btrfs: setting nodatacow

-Jacek
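The two space_info lines at the end of that dump agree with each other: the "free" figure is total minus used, pinned, reserved and readonly, with may_use left out. That formula is inferred from the numbers in the log, not quoted from the kernel source, so treat it as an assumption. The helper names below are likewise only illustrative.

```python
import re
from typing import Dict


def parse_space_info(line: str) -> Dict[str, int]:
    """Turn 'total=..., used=..., ...' into a dict of byte counts."""
    return {key: int(val) for key, val in re.findall(r"(\w+)=(\d+)", line)}


def free_bytes(info: Dict[str, int]) -> int:
    # may_use is deliberately not subtracted; with that choice the result
    # matches the "has ... free" figure printed in the log.
    return (info["total"] - info["used"] - info["pinned"]
            - info["reserved"] - info["readonly"])


if __name__ == "__main__":
    line = ("space_info total=11953766400, used=8901763072, pinned=0, "
            "reserved=0, may_use=121643008, readonly=8454144")
    info = parse_space_info(line)
    print(free_bytes(info))                    # 3043549184, as in the log
    print(free_bytes(info) - info["may_use"])  # free after outstanding reservations
```

Even after subtracting may_use, roughly 2.9 GB remain in this space_info, which fits the picture of ENOSPC being returned on a volume that is not actually full.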
On Mon, 12 Mar 2012 15:21:49 +0100, Jacek Luczak <difrost.kernel@gmail.com> wrote:
> > 2) A *regression* in 3.3.0-rc6-00197-g9f8050c
> > - completely unusable as reports ENOSPC
> > - to reproduce, mount volume and issue:
> > # CNT=1 ; while [ $CNT -lt 10000 ] ; do rm -f /btrfs/dd ; ! touch /btrfs/dd && echo "$CNT" && break ; CNT=$(( $CNT + 1 )) ; done
> > On my host this shows:
> > # CNT=1 ; while [ $CNT -lt 10000 ] ; do rm -f /btrfs/dd ; ! touch /btrfs/dd && echo "$CNT" && break ; CNT=$(( $CNT + 1 )) ; done
> > touch: cannot touch `/btrfs/dd': No space left on device
> > 423
> > - remount to reset:
> > # CNT=1 ; while [ $CNT -lt 10000 ] ; do rm -f /btrfs/dd ; ! touch /btrfs/dd && echo "$CNT" && break ; CNT=$(( $CNT + 1 )) ; done
> > touch: cannot touch `/btrfs/dd': No space left on device
> > 1
> > # umount /btrfs/
> > # mount -t btrfs /dev/vg00/btrfs /btrfs/ -o noatime,nodatacow,defaults
> > # CNT=1 ; while [ $CNT -lt 10000 ] ; do rm -f /btrfs/dd ; ! touch /btrfs/dd && echo "$CNT" && break ; CNT=$(( $CNT + 1 )) ; done
> > touch: cannot touch `/btrfs/dd': No space left on device
> > 423
> > - bisected down to 5500cdb (Btrfs: increase the global block reserve
> > estimates). After reverting this one Linus master works for me
> > again.
>
> With above patch reverted after a longer run I've got ENOSPC again:
> 1) # df -hP /btrfs
> Filesystem              Size  Used Avail Use% Mounted on
> /dev/mapper/vg00-btrfs  195G  179G   11G  95% /btrfs
> 2) # rm -f /btrfs/dd
> rm: cannot remove `/btrfs/dd': No space left on device
> 3) strace
> unlink("/btrfs/dd") = -1 ENOSPC (No space left on device)
> 4) last message from kernel (except WARN_ONs):
> btrfs: fail to dirty inode 116882385
>
> I've remounted volume and after that I've been able to remove dd file
> from volume. In dmesg there's bunch on new WARN_ONs:
> [...]
> space_info 4 has 3043549184 free, is not full
> space_info total=11953766400, used=8901763072, pinned=0, reserved=0,
> may_use=121643008, readonly=8454144
> device fsid b70500f5-3ec6-4a39-9b9a-adad0e8a0346 devid 1 transid 54268
> /dev/vg00/btrfs
> btrfs: setting nodatacow

That's similar to my experiences.
Without commit 5500cdb the ENOSPC errors happen much later, but they still happen. I've seen error -28 only a few times, and never those backtraces.

regards,
Johannes
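
For reference, the numbers in the quoted space_info dump can be checked by hand. Below is a minimal sketch, assuming that "free" is total minus used, pinned, reserved and readonly (an assumption that happens to match the quoted figures; may_use appears not to be subtracted). The helper name parse_space_info is hypothetical and not part of btrfs or any tool. It also confirms that errno 28, the "error -28" mentioned above, is ENOSPC.

```python
import errno
import re


def parse_space_info(text):
    """Collect the key=value counters from a space_info dump line (hypothetical helper)."""
    return {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", text)}


# Counters copied verbatim from the quoted dmesg output.
dump = (
    "space_info total=11953766400, used=8901763072, pinned=0, reserved=0, "
    "may_use=121643008, readonly=8454144"
)
info = parse_space_info(dump)

# Assumed formula: may_use is deliberately left out, since including it
# would not reproduce the "has 3043549184 free" figure from the log.
free = (
    info["total"]
    - info["used"]
    - info["pinned"]
    - info["reserved"]
    - info["readonly"]
)

print(free)                 # compare with "space_info 4 has 3043549184 free"
print(errno.errorcode[28])  # symbolic name of error 28
```

So roughly 3 GB is reported as free in that space_info while ENOSPC is still being returned.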