Hiya,

I've got a server using btrfs to implement a backup system.
Basically, every night, for a few machines, I rsync (and other
methods) their file systems into one btrfs subvolume each and
then snapshot it.

On that server, the btrfs fs is on three 3TB drives, mounted with
compress-force. Every week, I rsync the most recent snapshot of
a selection of subvolumes onto an encrypted (LUKS) external hard
drive (3TB as well).

Now, on a few occasions (actually, most of the time), when I
rsynced the data (about 2.5TB) onto the external drive, the
system would crash after some time with "Out of memory and no
killable process". Basically, something in the kernel was
allocating the whole memory, then the OOM killer mass-killed
everybody and the machine crashed.

That was with Ubuntu and 2.6.38. I then moved to Debian and
2.6.39 and thought the problem was fixed, but it just happened
again with 3.0.0-rc5 while rsyncing onto an initially empty
btrfs fs.

I'm going to resume the rsync, and it's likely to happen again.
Is there anything simple (as I've got very little time to look
into this) I could do to help debug the issue? (I'm not 100%
sure it's btrfs's fault, but that's the most likely culprit.)

For a start, I'll switch the console to serial and watch
/proc/vmstat. Anything else I could do?

Note that this server has never crashed when doing a lot of IO
at the same time in a lot of subvolumes with remote hosts. It's
only when copying data to that external drive on LUKS that it
seems to crash.

Cheers,
Stephane
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
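For reference, the nightly cycle described above can be sketched roughly as follows. This is a dry run (the commands are collected and printed, not executed): `POOL`, the host names `alpha`/`beta`, and the `current`/date layout are illustrative assumptions, not the poster's exact setup.

```shell
#!/bin/sh
# Dry-run sketch: rsync each machine into its own btrfs subvolume,
# then freeze tonight's state as a read-only snapshot.
POOL=/mnt/backup             # assumption: mount point of the 3-drive btrfs fs
DATE=$(date +%Y-%m-%d)
plan=""
for host in alpha beta; do   # hypothetical host names
  plan="$plan
rsync -aHAX --delete root@$host:/ $POOL/$host/current/
btrfs subvolume snapshot -r $POOL/$host/current $POOL/$host/$DATE"
done
echo "$plan"                 # print the plan instead of running it
```

Dropping the `plan` accumulation and running the two commands directly (as root, on a real btrfs mount) would perform one backup cycle.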
On Sun, Jul 3, 2011 at 1:09 PM, Stephane Chazelas
<stephane_chazelas@yahoo.fr> wrote:
> Hiya,
>
> I've got a server using btrfs to implement a backup system.
> Basically, every night, for a few machines, I rsync (and other
> methods) their file systems into one btrfs subvolume each and
> then snapshot it.
>
> On that server, the btrfs fs is on three 3TB drives, mounted with
> compress-force. Every week, I rsync the most recent snapshot of
> a selection of subvolumes onto an encrypted (LUKS) external hard
> drive (3TB as well).
>
> Now, on a few occasions (actually, most of the time), when I
> rsynced the data (about 2.5TB) onto the external drive, the
> system would crash after some time with "Out of memory and no
> killable process". Basically, something in the kernel was
> allocating the whole memory, then the OOM killer mass-killed
> everybody and the machine crashed.
>
> That was with Ubuntu and 2.6.38. I then moved to Debian and
> 2.6.39 and thought the problem was fixed, but it just happened
> again with 3.0.0-rc5 while rsyncing onto an initially empty
> btrfs fs.
>
> I'm going to resume the rsync, and it's likely to happen again.
> Is there anything simple (as I've got very little time to look
> into this) I could do to help debug the issue? (I'm not 100%
> sure it's btrfs's fault, but that's the most likely culprit.)
>
> For a start, I'll switch the console to serial and watch
> /proc/vmstat. Anything else I could do?
>
> Note that this server has never crashed when doing a lot of IO
> at the same time in a lot of subvolumes with remote hosts. It's
> only when copying data to that external drive on LUKS that it
> seems to crash.

Look at the output of slabtop (should be installed by default, procps
package), before rsync for comparison, and during.
2011-07-03 13:38:57 -0600, cwillu:
> On Sun, Jul 3, 2011 at 1:09 PM, Stephane Chazelas
> <stephane_chazelas@yahoo.fr> wrote:
[...]
> > Now, on a few occasions (actually, most of the time), when I
> > rsynced the data (about 2.5TB) onto the external drive, the
> > system would crash after some time with "Out of memory and no
> > killable process". Basically, something in the kernel was
> > allocating the whole memory, then the OOM killer mass-killed
> > everybody and the machine crashed.
[...]
> Look at the output of slabtop (should be installed by default, procps
> package), before rsync for comparison, and during.

Hi,

so, no crash this time, but at the end of the rsync there's a whole
chunk of memory (about 3.5G) that is no longer available to processes.
As suggested by Carey, I watched /proc/slabinfo during the rsync (see
below for a report of the most significant caches over time).

Does that mean that if the system had had less than 3G of RAM, it
would have crashed?

I tried to reclaim the space, without success. I had a process
allocate as much memory as it could. Then I unmounted the btrfs fs
that rsync was copying onto (the one on LUKS): btrfs_inode_cache
hardly changed. Then I tried to unmount the source (the one on 3 hard
disks and plenty of subvolumes). umount hung and the FS disappeared
from /proc/mounts. Here is the backtrace:

[169270.268005] umount          D ffff880145ebe770     0 24079   1290 0x00000004
[169270.268005]  ffff880145ebe770 0000000000000086 0000000000000000 ffffffff8160b020
[169270.268005]  0000000000012840 ffff880123bc7fd8 ffff880123bc7fd8 0000000000012840
[169270.268005]  ffff880145ebe770 ffff880123bc6010 00007fffac84f4a8 0000000100000000
[169270.268005] Call Trace:
[169270.268005]  [<ffffffff81337d50>] ? rwsem_down_failed_common+0xda/0x10e
[169270.268005]  [<ffffffff811aca63>] ? call_rwsem_down_write_failed+0x13/0x20
[169270.268005]  [<ffffffff813376c7>] ? down_write+0x25/0x27
[169270.268005]  [<ffffffff810ff6eb>] ? deactivate_super+0x30/0x3d
[169270.268005]  [<ffffffff81114135>] ? sys_umount+0x2ea/0x315
[169270.268005]  [<ffffffff8133d412>] ? system_call_fastpath+0x16/0x1b

iostat shows nothing being written to the drives.

       extent_map  delayed_node  btrfs_inode_cache  btrfs_free_space_cache
       (in bytes)
09:40 131560 0 0 128
09:50 131560 0 2000 128
rsync started at 9:52
10:00 15832608 87963264 1465830000 325440
10:10 386056 85071168 1444656000 832960
10:20 237600 33988032 1549769000 1355584
10:30 23300288 145209600 487274000 2237056
10:40 24667104 148610016 506492000 2304448
10:50 22479776 139655808 511893000 2382464
11:00 4137672 1041696 4592000 2425344
11:10 5498592 3211776 12742000 2443648
11:20 4567904 1408896 9599000 2452608
11:30 2276736 1309824 4685000 2453696
11:40 2225696 421920 1987000 2455424
11:50 1971552 90432 383000 2466176
12:00 1761672 74016 327000 2469760
12:10 1939608 85824 401000 2473536
12:20 2136288 121824 551000 2479168
12:30 2367288 135648 619000 2486016
12:40 1380984 181152 911000 2485696
12:50 1053272 202176 1027000 2483712
13:00 1938200 219744 1152000 2491712
13:10 2037112 223776 1204000 2494528
13:20 1775664 249120 1244000 2497216
13:30 1704560 366048 1732000 2500608
13:40 1433344 468864 2460000 2501824
13:50 8553248 20505888 67332000 2503168
14:00 12682208 34351200 141968000 2494208
14:10 18488800 50282784 177803000 2500544
14:20 19435592 46767744 163582000 2505920
14:30 18734936 44863488 156501000 2507200
14:40 21865184 46053504 160185000 2484928
14:50 24457664 46473120 162446000 2499200
15:00 24401344 47700576 166723000 2502784
15:10 31390304 63426240 221179000 2521472
15:20 34594560 61365600 214243000 2524160
15:30 33836704 60934752 212695000 2526400
15:40 33358776 60598944 211455000 2528320
15:50 34909952 62583840 218492000 2526272
16:00 44326656 65875392 230123000 2529792
16:10 45840608 66373632 232114000 2532736
16:20 47848064 66577536 232048000 2535872
16:30 48013152 68888160 240651000 2536128
16:40 47594184 67766976 236241000 2536576
16:50 48144184 67739904 236120000 2542144
17:00 48005848 67639392 235298000 2544000
17:10 48253920 67661280 235376000 2537216
17:20 48857952 67612032 234995000 2536000
17:30 48514752 67611168 234986000 2535488
17:40 48436872 67609728 234924000 2534528
17:50 48902216 67765248 235654000 2542400
18:00 49055160 67763520 236020000 2542912
18:10 48749712 67727520 235740000 2550464
18:20 48631088 67705344 235557000 2553280
18:30 49101096 67777344 235713000 2555520
18:40 48609264 67782816 235601000 2558912
18:50 48480080 67808160 235595000 2561792
19:00 48661184 67830912 235680000 2563776
19:10 49004296 67826880 235609000 2565056
19:20 48530592 67819968 235510000 2565440
19:30 47806352 67827456 235904000 2569536
19:40 48563064 67806432 235902000 2572224
19:50 49107960 67845600 239902000 2577472
20:00 48902744 67828320 237062000 2577600
20:10 48921752 67897728 236337000 2578880
20:20 48713016 67927680 236237000 2579520
20:30 47901216 67961376 236228000 2580416
20:40 55052096 75482784 263292000 2586944
20:50 64494496 81030528 286325000 2594368
21:00 65381536 80284608 279964000 2598464
21:10 68131712 80154720 279335000 2598336
21:20 73455712 79982208 278654000 2586048
21:30 78164064 81536832 284003000 2604544
21:40 79013088 86614848 303401000 2615744
21:50 84289568 94296384 328520000 2626560
22:00 85835464 92268576 322902000 2627264
22:10 84338584 91186272 317134000 2627520
22:20 84768728 91129248 316551000 2625664
22:30 84280856 91147968 316611000 2634880
22:40 84117704 91176192 316714000 2637312
22:50 84217760 91232928 316791000 2640640
23:00 84345976 91288224 318467000 2642880
23:10 84431864 91296864 318490000 2643648
23:20 84579968 91302912 318509000 2644160
23:30 84381792 91284480 318288000 2644800
23:40 84323624 91298016 318010000 2645440
23:50 84220488 91296864 317799000 2648640
00:00 84217496 91293408 317572000 2651008
00:10 84178072 91276416 317235000 2653952
00:20 84208432 91290816 317230000 2656384
00:30 84245216 91312992 317310000 2657088
00:40 84235360 91314144 317064000 2657856
00:50 84151760 91332864 317116000 2658176
01:00 84146040 91344096 317184000 2658880
01:10 84119288 91358784 317238000 2659200
01:20 84158624 91373760 317277000 2663808
01:30 84247680 91378368 317292000 2665984
01:40 84209576 91393344 317344000 2669504
01:50 84286400 91410048 317402000 2660224
02:00 84312976 91406592 317390000 2656512
02:10 84228496 91423008 317442000 2665408
02:20 84065256 91506816 317703000 2668800
02:30 84093152 91589184 318030000 2671552
02:40 84038768 91686528 318343000 2672000
02:50 84134424 91713600 318460000 2680832
03:00 84183792 91728576 318508000 2684288
03:10 84259296 91728576 318505000 2688000
03:20 84237296 91743552 318557000 2690752
03:30 84212832 91759968 318609000 2691968
03:40 84150088 91778688 318661000 2692224
03:50 84133104 91738656 318552000 2692928
04:00 84129672 91728576 318533000 2693760
04:10 84178600 91711296 318446000 2701888
04:20 84164872 91726272 318497000 2704384
04:30 84231752 91717632 318467000 2708864
04:40 84260000 91711872 318447000 2714880
04:50 84261672 91707552 318432000 2714304
05:00 84204472 91709856 318435000 2714240
05:10 84128616 91732032 318487000 2715136
05:20 84111984 91736640 318555000 2715904
05:30 84145512 91742112 318552000 2710656
05:40 84111192 91740672 318561000 2719552
05:50 84169624 91755648 318600000 2722304
06:00 84047480 91791360 318704000 2724800
06:10 84107496 91874304 319036000 2727488
06:20 84129672 91968768 319336000 2727808
06:30 84144192 92054592 319610000 2728192
06:40 84245304 92050560 320572000 2728896
06:50 84257800 92036160 320522000 2729472
07:00 84299600 92050560 320572000 2733376
07:10 84148504 92037312 320521000 2736832
07:20 84120960 92013696 320427000 2739200
07:30 84097816 92000160 320142000 2742464
07:40 84112512 91984896 319954000 2743040
07:50 84197960 91976832 319751000 2744000
08:00 84180448 91979424 319679000 2744640
08:10 84199720 91983456 319626000 2746496
08:20 84152200 91989792 319615000 2745664
08:30 84309808 92004768 319653000 2755456
08:40 84288424 92022048 319683000 2758208
08:50 84162672 92040192 319675000 2761664
09:00 84128616 92099232 319874000 2765312
09:10 84160208 92192544 320177000 2767424
09:20 84299952 92249280 320383000 2765760
09:30 84265808 92241504 320329000 2765888
09:40 84335416 92254176 320314000 2766208
09:50 84231488 92255328 320331000 2771648
10:00 84155016 92266848 320366000 2773568
10:10 84077664 92279232 320418000 2776320
10:20 84151584 92290752 320484000 2774080
10:30 84245128 92305728 320509000 2768768
10:40 84226032 92284992 320442000 2761408
10:50 84265192 92292192 320462000 2765056
11:00 84308136 92285856 320440000 2770176
11:10 84273816 92281248 320408000 2771008
11:20 84210016 92292480 320460000 2775872
11:30 84137152 92278656 320408000 2780096
11:40 84097288 92301696 320476000 2785536
11:50 83999344 92389824 320791000 2791296
12:00 84162936 92478240 321127000 2791360
12:10 84116208 92541600 321512000 2768960
12:20 84727896 92524608 321464000 2763008
12:30 84868520 92582496 321797000 2772096
12:40 85114216 92604960 322010000 2774848
12:50 85450024 92558304 321832000 2778112
13:00 86018064 92632320 322221000 2780032
13:10 85731536 92576448 321966000 2781248
13:20 86218352 92672928 322370000 2781120
13:30 86143728 92668032 322394000 2781568
13:40 86400688 92681280 322335000 2781760
13:50 85993512 92759616 322693000 2785536
14:00 86842976 92859264 323995000 2788288
14:10 86913024 92849184 323884000 2790784
14:20 89733600 93937248 327207000 2794240
14:30 93420800 95194080 331519000 2795008
14:40 90696496 94583808 329243000 2795136
14:50 89758240 94404096 328002000 2795328
15:00 89516944 94354272 327777000 2796160
15:10 89768096 94390848 327930000 2798656
15:20 92897376 109204704 382426000 2803008
15:30 99914144 130731552 456126000 2770880
15:40 100707464 124378848 431912000 2775168
15:50 100372888 124369920 431846000 2777984
16:00 100679128 124373664 431858000 2779776
16:10 107601120 136025856 474697000 2796608
16:20 114131072 145027584 505458000 2804160
16:30 117556032 148438944 516472000 2810880
16:40 119440640 152107200 530502000 2778112
16:50 122959232 154380672 539360000 2790144
17:00 124928496 158689440 552207000 2800768
17:10 126891776 158847552 552512000 2812416
17:20 127855024 158224320 550382000 2809216
17:30 131414624 175564224 612382000 2820224
17:40 134469632 181457280 631190000 2832448
17:50 134728000 180476928 627692000 2834752
18:00 135065304 180194976 626369000 2837248
18:10 135048936 180101664 625861000 2839744
18:20 135131216 180012960 625274000 2840640
18:30 135184016 179980416 625136000 2840768
18:40 134980208 179988192 625164000 2844864
18:50 144367608 230463072 809997000 2841984
19:00 143008624 207547488 720874000 2845760
19:10 142775160 207562464 720707000 2849664
19:20 143572088 207624672 724979000 2858752
19:30 142798480 207610848 721979000 2863040
19:40 18940240 68677632 388814000 2844544
19:50 24506944 80140032 284238000 2876288
20:00 33779328 109852704 368152000 2896640
20:10 39744760 130029408 450371000 2927872
20:20 63298752 218945376 794323000 2962112
20:30 99855008 347968224 1220066000 3093824
20:40 133762112 431508096 1498229000 3230208
20:50 136770304 441884448 1540560000 3235072
21:00 143755744 462537504 1604335000 3251712
21:10 151189984 475643232 1652439000 3262528
21:20 157106400 480849408 1671900000 3251520
21:30 163313216 487090656 1690416000 3273728
21:40 169934336 498551040 1731005000 3275776
21:50 175835968 506492352 1761175000 3292672
22:00 182476800 516803040 1795376000 3299200
22:10 190756544 532979712 1856787000 3310016
22:20 197719808 539955936 1877669000 3319104
22:30 202812896 546294528 1910587000 3322944
22:40 209630080 552107520 1916452000 3333760
22:50 214403200 558634752 1941940000 3338688
23:00 220982784 567669024 1980128000 3340224
23:10 226360992 574801632 1999039000 3352768
23:20 232831104 583352640 2026452000 3333440
23:30 238333216 588961152 2049620000 3366656
23:40 245980416 600810912 2087720000 3379520
23:50 267191584 669404448 2325403000 3403520
00:00 267194048 668599200 2321551000 3405824
end 00:06 (I think)
00:10 267309152 668577888 2321454000 3408704
00:20 267289792 668578752 2321461000 3409600
00:30 267462272 668578752 2321461000 3411456
00:40 267234880 668578752 2321461000 3411840
00:50 267410880 668579616 2321468000 3416448
01:00 267192640 668595744 2321646000 3418048
01:10 267192640 668595744 2321646000 3418048
01:20 267192640 668595744 2321646000 3418048
01:30 267192640 668595744 2321646000 3418048
01:40 267192640 668595744 2321646000 3418048

Don't know if that's relevant, but I got another one of those:

Jul  4 12:50:51 ------------[ cut here ]------------
Jul  4 12:50:51 kernel BUG at /media/data/mattems/src/linux-2.6-3.0.0~rc5/debian/build/source_amd64_none/fs/btrfs/inode.c:1583!
Jul  4 12:50:51 invalid opcode: 0000 [#1] SMP
Jul  4 12:50:51 CPU 0
Jul  4 12:50:51 Modules linked in: lm85 dme1737 hwmon_vid coretemp ipmi_si ipmi_msghandler sha256_generic cryptd aes_x86_64 aes_generic cbc fuse dm_crypt snd_pcm snd_timer snd sg soundcore i3200_edac snd_page_alloc sr_mod processor tpm_tis i2c_i801 pl2303 pcspkr thermal_sys i2c_core tpm edac_core tpm_bios cdrom usbserial container evdev psmouse button serio_raw ext4 mbcache jbd2 crc16 dm_mod nbd btrfs zlib_deflate crc32c libcrc32c ums_cypress sd_mod crc_t10dif usb_storage uas uhci_hcd ahci libahci ehci_hcd libata e1000e usbcore scsi_mod [last unloaded: i2c_dev]
Jul  4 12:50:51
Jul  4 12:50:51 Pid: 764, comm: btrfs-fixup-0 Not tainted 3.0.0-rc5-amd64 #1 empty empty/Tyan Tank GT20 B5211
Jul  4 12:50:51 RIP: 0010:[<ffffffffa00820f4>]  [<ffffffffa00820f4>] btrfs_writepage_fixup_worker+0xdb/0x120 [btrfs]
Jul  4 12:50:51 RSP: 0018:ffff880147ffdde0  EFLAGS: 00010246
Jul  4 12:50:51 RAX: 0000000000000000 RBX: ffffea0004648098 RCX: ffffffffffffffff
Jul  4 12:50:51 RDX: 0000000000000000 RSI: 0000000005854000 RDI: ffff8800073f18d0
Jul  4 12:50:51 RBP: 0000000005854000 R08: 0000000000000008 R09: ffff880147ffdd98
Jul  4 12:50:51 R10: dead000000200200 R11: dead000000100100 R12: ffff8800073f19a8
Jul  4 12:50:51 R13: 0000000000000000 R14: ffff880147ffddf8 R15: 0000000005854fff
Jul  4 12:50:51 FS:  0000000000000000(0000) GS:ffff88014fc00000(0000) knlGS:0000000000000000
Jul  4 12:50:51 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jul  4 12:50:51 CR2: 00000000f75c6000 CR3: 000000014780f000 CR4: 00000000000006f0
Jul  4 12:50:51 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul  4 12:50:51 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jul  4 12:50:51 Process btrfs-fixup-0 (pid: 764, threadinfo ffff880147ffc000, task ffff880145d962c0)
Jul  4 12:50:51 Stack:
Jul  4 12:50:51  ffff8800073f1840 ffff8800040b9ec0 ffffffffffffffff ffff8800c45ef5a8
Jul  4 12:50:51  0000000000000000 ffff880147c13440 ffff880147c13490 ffff880147c13458
Jul  4 12:50:51  ffff880147c13448 ffff880147c13480 ffff880147c13468 ffffffffa009d9fa
Jul  4 12:50:51 Call Trace:
Jul  4 12:50:51  [<ffffffffa009d9fa>] ? worker_loop+0x186/0x4a1 [btrfs]
Jul  4 12:50:51  [<ffffffff813369ca>] ? schedule+0x5ed/0x61a
Jul  4 12:50:51  [<ffffffffa009d874>] ? btrfs_queue_worker+0x24a/0x24a [btrfs]
Jul  4 12:50:51  [<ffffffffa009d874>] ? btrfs_queue_worker+0x24a/0x24a [btrfs]
Jul  4 12:50:51  [<ffffffff8105faed>] ? kthread+0x7a/0x82
Jul  4 12:50:51  [<ffffffff8133e524>] ? kernel_thread_helper+0x4/0x10
Jul  4 12:50:51  [<ffffffff8105fa73>] ? kthread_worker_fn+0x147/0x147
Jul  4 12:50:51  [<ffffffff8133e520>] ? gs_change+0x13/0x13
Jul  4 12:50:51 Code: 41 b8 50 00 00 00 4c 89 f1 e8 d5 3b 01 00 48 89 df e8 fb 3c 03 e1 ba 01 00 00 00 4c 89 ee 4c 89 e7 e8 ce 05 01 00 e9 4e ff ff ff <0f> 0b eb fe 48 8b 3c 24 41 b8 50 00 00 00 4c 89 f1 4c 89 fa 48
Jul  4 12:50:51 RIP  [<ffffffffa00820f4>] btrfs_writepage_fixup_worker+0xdb/0x120 [btrfs]
Jul  4 12:50:51 RSP <ffff880147ffdde0>
Jul  4 12:50:51 ---[ end trace 3b143d6e02d6e847 ]---

Best regards,
Stephane
2011-07-06 09:11:11 +0100, Stephane Chazelas:
[...]
> extent_map delayed_node btrfs_inode_cache btrfs_free_space_cache
> (in bytes)
[...]
> 01:00 267192640 668595744 2321646000 3418048
> 01:10 267192640 668595744 2321646000 3418048
> 01:20 267192640 668595744 2321646000 3418048
> 01:30 267192640 668595744 2321646000 3418048
> 01:40 267192640 668595744 2321646000 3418048
[...]

I've just come across
http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable.git;a=commit;h=4b9465cb9e3859186eefa1ca3b990a5849386320

GIT> author    Chris Mason <chris.mason@oracle.com>
GIT>           Fri, 3 Jun 2011 13:36:29 +0000 (09:36 -0400)
GIT> committer Chris Mason <chris.mason@oracle.com>
GIT>           Sat, 4 Jun 2011 12:03:47 +0000 (08:03 -0400)
GIT> commit    4b9465cb9e3859186eefa1ca3b990a5849386320
GIT> tree      8fc06452fb75e52f6c1c2e2253c2ff6700e622fd
GIT> parent    e7786c3ae517b2c433edc91714e86be770e9f1ce
GIT>
GIT> Btrfs: add mount -o inode_cache
GIT>
GIT> This makes the inode map cache default to off until we
GIT> fix the overflow problem when the free space crcs don't fit
GIT> inside a single page.

I would have thought that would have disabled that btrfs_inode_cache,
and I can see that patch is in 3.0.0-rc5 (I'm not mounting with -o
inode_cache). So why those 2.2GiB in btrfs_inode_cache above?

--
Stephane
Stephane Chazelas wrote:
> 2011-07-06 09:11:11 +0100, Stephane Chazelas:
> [...]
> > extent_map delayed_node btrfs_inode_cache btrfs_free_space_cache
> > (in bytes)
> [...]
> > 01:00 267192640 668595744 2321646000 3418048
> > 01:10 267192640 668595744 2321646000 3418048
> > 01:20 267192640 668595744 2321646000 3418048
> > 01:30 267192640 668595744 2321646000 3418048
> > 01:40 267192640 668595744 2321646000 3418048
> [...]
>
> I've just come across
> http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable.git;a=commit;h=4b9465cb9e3859186eefa1ca3b990a5849386320
>
> GIT> author    Chris Mason <chris.mason@oracle.com>
> GIT>           Fri, 3 Jun 2011 13:36:29 +0000 (09:36 -0400)
> GIT> committer Chris Mason <chris.mason@oracle.com>
> GIT>           Sat, 4 Jun 2011 12:03:47 +0000 (08:03 -0400)
> GIT> commit    4b9465cb9e3859186eefa1ca3b990a5849386320
> GIT> tree      8fc06452fb75e52f6c1c2e2253c2ff6700e622fd
> GIT> parent    e7786c3ae517b2c433edc91714e86be770e9f1ce
> GIT>
> GIT> Btrfs: add mount -o inode_cache
> GIT>
> GIT> This makes the inode map cache default to off until we
> GIT> fix the overflow problem when the free space crcs don't fit
> GIT> inside a single page.
>
> I would have thought that would have disabled that
> btrfs_inode_cache. And I can see that patch is in 3.0.0-rc5 (I'm
> not mounting with -o inode_cache). So why those 2.2GiB in
> btrfs_inode_cache above?

This should be irrelevant to your problem. btrfs_inode_cache is a slab
cache for in-memory inodes (struct btrfs_inode), while the ino cache
is a cache whose entries are ranges of free inode numbers, and it
currently won't be enabled unless you mount with the inode_cache
option.
2011-07-07 16:20:20 +0800, Li Zefan:
[...]
> btrfs_inode_cache is a slab cache for in-memory inodes (struct
> btrfs_inode).
[...]

Thanks Li. If that's a cache, the system should be able to reuse that
space when it's low on memory, shouldn't it? Under what conditions
could that not be done? (As in my case, where the OOM killer was
invoked to free memory rather than that cache memory being reclaimed.)

Best regards,
Stephane
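One quick way to probe that question is to explicitly ask the kernel to shed reclaimable dentry/inode slab objects and see whether the btrfs object counts fall. The commands below are printed as a dry run rather than executed (they need root, and drop_caches is a blunt instrument); this is a debugging sketch, not part of the thread's original exchange.

```shell
#!/bin/sh
# Dry run: print (not run) the commands for forcing slab reclaim.
cmds=$(cat <<'EOF'
sync
echo 2 > /proc/sys/vm/drop_caches   # 2 = reclaim dentry/inode slabs; 3 also drops the page cache
grep -E '^(btrfs_inode_cache|extent_map|delayed_node)' /proc/slabinfo
EOF
)
echo "$cmds"
```

If the object counts do not drop after this, the objects are still referenced rather than merely cached, which would be consistent with the OOM killer being unable to reclaim them.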
2011-07-06 09:11:11 +0100, Stephane Chazelas:> 2011-07-03 13:38:57 -0600, cwillu: > > On Sun, Jul 3, 2011 at 1:09 PM, Stephane Chazelas > > <stephane_chazelas@yahoo.fr> wrote: > [...] > > > Now, on a few occasions (actually, most of the time), when I > > > rsynced the data (about 2.5TB) onto the external drive, the > > > system would crash after some time with "Out of memory and no > > > killable process". Basically, something in kernel was allocating > > > the whole memory, then oom mass killed everybody and crash. > [...] > > Look at the output of slabtop (should be installed by default, procfs > > package), before rsync for comparison, and during. > > Hi, > > so, no crash this time[...] Another attempt, again onto an empty drive, this time with 3.0.0-rc6. Exact same scenario. slabinfo: extent_map delayed_node btrfs_inode_cache btrfs_free_space_cache (in bytes) 2011-07-07_10:40 143264 29952 0 120832 2011-07-07_10:50 16088160 59593248 228140000 196352 2011-07-07_11:00 19549728 83101824 258900000 475776 2011-07-07_11:10 17695040 52438464 163416000 758976 2011-07-07_11:20 16723168 55321344 180820000 1095040 2011-07-07_11:30 15468640 46624032 166568000 1344256 2011-07-07_11:40 14651648 43962048 143916000 1944640 2011-07-07_11:50 9010144 2620800 9732000 2231616 2011-07-07_12:00 4224352 3646656 9224000 2473280 2011-07-07_12:10 2528416 456768 1984000 2511040 2011-07-07_12:20 2400640 426816 1452000 2590336 2011-07-07_12:30 2241888 336960 1052000 2590336 2011-07-07_12:40 4669632 10224864 36804000 2662080 2011-07-07_12:50 4386976 4320576 8032000 2662080 2011-07-07_13:00 4282432 1793376 5860000 2662080 2011-07-07_13:10 4270816 1231776 2848000 2662080 2011-07-07_13:20 4255328 1141920 2348000 2662080 2011-07-07_13:30 4255328 1134432 2232000 2662080 2011-07-07_13:40 4255328 1130688 2192000 2662080 2011-07-07_13:50 4255328 1130688 2172000 2662080 2011-07-07_14:00 4239840 1130688 2168000 2665856 2011-07-07_14:10 4212736 7259616 28420000 2669632 2011-07-07_14:20 7008320 9936576 
23176000 2673408 2011-07-07_14:30 6915392 9929088 22512000 2677184 2011-07-07_14:40 7047040 9929088 25392000 2680960 2011-07-07_14:50 8537760 9929088 26368000 2707392 2011-07-07_15:00 10469888 12804480 40116000 2718720 2011-07-07_15:10 13195776 14028768 47080000 2726272 2011-07-07_15:20 13571360 13946400 45748000 2730048 2011-07-07_15:30 19580704 16200288 54596000 2737600 2011-07-07_15:40 19449056 16192800 52412000 2737600 2011-07-07_15:50 19445184 16192800 52412000 2737600 2011-07-07_16:00 19425824 19450080 66480000 2741376 2011-07-07_16:10 24858240 25994592 91076000 2756480 2011-07-07_16:20 24246464 25930944 77448000 2760256 2011-07-07_16:30 25477760 35144928 131312000 2767808 2011-07-07_16:40 40625024 85512960 326444000 2767808 2011-07-07_16:50 53247744 102390912 346560000 2767808 2011-07-07_17:00 63465952 113390784 388416000 2786688 2011-07-07_17:10 10020736 13523328 24960000 2805568 2011-07-07_17:20 8971424 9345024 28900000 2809344 2011-07-07_17:30 9257952 31408416 78040000 2805568 2011-07-07_17:40 9257952 31378464 76812000 2805568 2011-07-07_17:50 9257952 31374720 75828000 2805568 2011-07-07_18:00 6857312 13725504 21940000 2805568 2011-07-07_18:10 6605632 10707840 37816000 2813120 2011-07-07_18:20 10768032 17252352 54032000 2820672 2011-07-07_18:30 21737408 74595456 210492000 2824448 2011-07-07_18:40 14554848 16967808 41232000 2832000 2011-07-07_18:50 6594016 10281024 30876000 2832000 2011-07-07_19:00 6322976 11467872 40804000 2835776 2011-07-07_19:10 7798208 12227904 46520000 2843328 2011-07-07_19:20 9927808 13279968 46960000 2850880 2011-07-07_19:30 10237568 13272480 46644000 2854656 2011-07-07_19:40 17160704 16368768 56740000 2865984 2011-07-07_19:50 26039200 29105856 101828000 2881088 2011-07-07_20:00 27878400 33988032 115528000 2881088 2011-07-07_20:10 30604288 39151008 125960000 2881088 2011-07-07_20:20 31339968 39049920 125476000 2881088 2011-07-07_20:30 31297376 39042432 127928000 2881088 2011-07-07_20:40 31390304 39038688 129048000 2881088 
2011-07-07_20:50 31390304 39038688 127280000 2888640 2011-07-07_21:00 31390304 39038688 127188000 2892416 2011-07-07_21:10 31738784 39038688 127160000 2892416 2011-07-07_21:20 40299776 49342176 173700000 2896192 2011-07-07_21:30 46041952 58904352 206176000 2899968 2011-07-07_21:40 54223488 61832160 216108000 2907520 2011-07-07_21:50 60728448 71034912 245064000 2915072 2011-07-07_22:00 67353440 70619328 240420000 2933952 2011-07-07_22:10 66517088 70619328 237640000 2933952 2011-07-07_22:20 66513216 70619328 237636000 2937728 2011-07-07_22:30 66505472 70619328 237632000 2937728 2011-07-07_22:40 66505472 70619328 237620000 2941504 2011-07-07_22:50 69936064 73266336 248200000 2945280 2011-07-07_23:00 69912832 73255104 246616000 2945280 2011-07-07_23:10 69912832 73247616 246596000 2949056 2011-07-07_23:20 69905088 73247616 249108000 2949056 2011-07-07_23:30 75043232 82573920 286540000 2949056 2011-07-07_23:40 76739168 82543968 274260000 2949056 2011-07-07_23:50 76638496 82543968 273060000 2949056 2011-07-08_00:00 76553312 82543968 278432000 2952832 2011-07-08_00:10 76708192 82543968 275812000 2956608 2011-07-08_00:20 80262688 84898944 293508000 2964160 2011-07-08_00:30 80549216 84895200 288516000 2971712 2011-07-08_00:40 86473376 105022944 361956000 2975488 2011-07-08_00:50 94937568 127599264 423384000 2986816 2011-07-08_01:00 94763328 127599264 405932000 2994368 2011-07-08_01:10 96285024 127599264 411020000 3001920 2011-07-08_01:20 97957728 127599264 417404000 3005696 2011-07-08_01:30 102999072 127599264 443520000 3009472 2011-07-08_01:40 108803200 132541344 463332000 3009472 2011-07-08_01:50 114324672 137056608 476084000 3009472 2011-07-08_02:00 114808672 137487168 478348000 3013248 2011-07-08_02:10 114897728 137505888 478428000 3013248 2011-07-08_02:20 118192800 140991552 491152000 3020800 2011-07-08_02:30 121724064 145641600 506520000 3035904 2011-07-08_02:40 127969600 152350848 529720000 3043456 2011-07-08_02:50 128864032 152324640 528980000 3043456 
2011-07-08_03:00 134788192 158322528 553792000 3043456 2011-07-08_03:10 138338816 163425600 568052000 3043456 2011-07-08_03:20 141993984 166450752 579660000 3043456 2011-07-08_03:30 147368320 169322400 589688000 3047232 2011-07-08_03:40 153311840 173545632 602736000 3058560 2011-07-08_03:50 153346688 173508192 602076000 3062336 2011-07-08_04:00 157904032 182403936 634212000 3069888 2011-07-08_04:10 156808256 181778688 628016000 3069888 2011-07-08_04:20 160231104 183463488 637672000 3069888 2011-07-08_04:30 161733440 183399840 633880000 3069888 2011-07-08_04:40 161621152 183396096 633388000 3069888 2011-07-08_04:50 161474016 183396096 633376000 3069888 2011-07-08_05:00 161466272 183396096 633364000 3069888 2011-07-08_05:10 161466272 183396096 637660000 3069888 2011-07-08_05:20 161656000 183396096 635752000 3069888 2011-07-08_05:30 161640512 183396096 635752000 3069888 2011-07-08_05:40 161799264 183396096 636468000 3069888 2011-07-08_05:50 162298752 183396096 636828000 3077440 2011-07-08_06:00 162434272 183422304 637084000 3077440 2011-07-08_06:10 162434272 183414816 637068000 3081216 2011-07-08_06:20 162546560 183478464 637316000 3084992 2011-07-08_06:30 162527200 183568320 637432000 3084992 2011-07-08_06:40 162860192 183635712 637748000 3088768 2011-07-08_06:50 162879552 183950208 638912000 3092544 2011-07-08_07:00 162879552 184234752 639944000 3092544 2011-07-08_07:10 163181568 184635360 641692000 3096320 2011-07-08_07:20 165977152 193875552 673360000 3103872 2011-07-08_07:30 165946176 194493312 661292000 3103872 2011-07-08_07:40 167742784 194493312 668704000 3107648 2011-07-08_07:50 168219040 194493312 668896000 3111424 2011-07-08_08:00 171901312 200775744 697404000 3115200 2011-07-08_08:10 188078528 273997152 894600000 3115200 2011-07-08_08:20 188004960 273813696 891580000 3115200 2011-07-08_08:30 187780384 283016448 1327376000 3205824 2011-07-08_08:40 187633248 273832416 941688000 3254912 2011-07-08_08:50 187807488 273832416 891768000 3254912 2011-07-08_09:00 
187950752 273832416 891768000 3258688 2011-07-08_09:10 188117248 273832416 891768000 3258688 2011-07-08_09:20 189166560 273832416 891748000 3266240 2011-07-08_09:30 195218496 273832416 891732000 3266240 2011-07-08_09:40 195795424 273832416 892996000 3270016 2011-07-08_09:50 200387616 273832416 892544000 3277568 2011-07-08_10:00 199044032 273832416 892284000 3277568 2011-07-08_10:10 199020800 273832416 892284000 3277568 2011-07-08_10:20 199202784 273832416 892280000 3281344 2011-07-08_10:30 199342176 273832416 902376000 3285120 2011-07-08_10:40 204933344 273832416 903660000 3296448 2011-07-08_10:50 209788832 273832416 909948000 3300224 2011-07-08_11:00 214113856 273832416 915596000 3307776 2011-07-08_11:10 218504704 281043360 964020000 3315328 2011-07-08_11:20 218961600 281039616 957752000 3315328 2011-07-08_11:30 245693888 397912320 1370312000 3315328 2011-07-08_11:40 318092544 635757408 2201932000 3387072 2011-07-08_11:50 390467968 877170528 3032756000 3451264 2011-07-08_11:54:30 408050720 935236224 3246892000 3466368 2011-07-08_11:54:50 408817376 938669472 3248144000 3470144 2011-07-08_11:55:00 409192960 938793024 3256076000 3470144 After which the oom killer killed everybody. 
Just before that, we had:

==> /proc/slabinfo <==
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
fuse_request 0 0 608 6 1 : tunables 54 27 8 : slabdata 0 0 0
fuse_inode 0 0 768 5 1 : tunables 54 27 8 : slabdata 0 0 0
dm_crypt_io 416 625 152 25 1 : tunables 120 60 8 : slabdata 25 25 324
ext4_groupinfo_4k 80 84 136 28 1 : tunables 120 60 8 : slabdata 3 3 0
ext4_inode_cache 209 448 912 4 1 : tunables 54 27 8 : slabdata 112 112 0
ext4_xattr 0 0 88 44 1 : tunables 120 60 8 : slabdata 0 0 0
ext4_free_data 0 0 56 67 1 : tunables 120 60 8 : slabdata 0 0 0
ext4_allocation_context 0 0 136 28 1 : tunables 120 60 8 : slabdata 0 0 0
ext4_prealloc_space 3 37 104 37 1 : tunables 120 60 8 : slabdata 1 1 0
ext4_system_zone 0 0 40 92 1 : tunables 120 60 8 : slabdata 0 0 0
ext4_io_end 0 0 1128 3 1 : tunables 24 12 8 : slabdata 0 0 0
ext4_io_page 0 0 16 202 1 : tunables 120 60 8 : slabdata 0 0 0
jbd2_inode 51 77 48 77 1 : tunables 120 60 8 : slabdata 1 1 0
jbd2_journal_handle 0 0 24 144 1 : tunables 120 60 8 : slabdata 0 0 0
jbd2_journal_head 0 0 112 34 1 : tunables 120 60 8 : slabdata 0 0 0
jbd2_revoke_table 2 202 16 202 1 : tunables 120 60 8 : slabdata 1 1 0
jbd2_revoke_record 0 0 32 112 1 : tunables 120 60 8 : slabdata 0 0 0
kcopyd_job 0 0 3384 2 2 : tunables 24 12 8 : slabdata 0 0 0
io 0 0 64 59 1 : tunables 120 60 8 : slabdata 0 0 0
dm_uevent 0 0 2608 3 2 : tunables 24 12 8 : slabdata 0 0 0
dm_rq_clone_bio_info 0 0 16 202 1 : tunables 120 60 8 : slabdata 0 0 0
dm_rq_target_io 0 0 400 10 1 : tunables 54 27 8 : slabdata 0 0 0
dm_target_io 1596 3168 24 144 1 : tunables 120 60 8 : slabdata 22 22 420
dm_io 1612 2668 40 92 1 : tunables 120 60 8 : slabdata 29 29 480
delayed_node 3259698 3259698 288 13 1 : tunables 54 27 8 : slabdata 250746 250746 0
extent_map 4649920 4649920 88 44 1 : tunables 120 60 8 : slabdata 105680 105680 0
extent_buffers 7248 14520 128 30 1 : tunables 120 60 8 : slabdata 484 484 60
extent_state 3150 7740 128 30 1 : tunables 120 60 8 : slabdata 258 258 0
btrfs_free_space_cache 54194 54221 64 59 1 : tunables 120 60 8 : slabdata 919 919 0
btrfs_path_cache 135 135 144 27 1 : tunables 120 60 8 : slabdata 5 5 0
btrfs_transaction_cache 16 16 240 16 1 : tunables 120 60 8 : slabdata 1 1 0
btrfs_trans_handle_cache 111 159 72 53 1 : tunables 120 60 8 : slabdata 3 3 0
btrfs_inode_cache 3260160 3260160 1000 4 1 : tunables 54 27 8 : slabdata 815040 815040 27
sd_ext_cdb 2 112 32 112 1 : tunables 120 60 8 : slabdata 1 1 0
uhci_urb_priv 32 67 56 67 1 : tunables 120 60 8 : slabdata 1 1 0
scsi_sense_cache 210 210 128 30 1 : tunables 120 60 8 : slabdata 7 7 0
scsi_cmd_cache 210 210 256 15 1 : tunables 120 60 8 : slabdata 14 14 60
sgpool-128 19 19 4096 1 1 : tunables 24 12 8 : slabdata 19 19 0
sgpool-64 8 8 2048 2 1 : tunables 24 12 8 : slabdata 4 4 0
sgpool-32 40 40 1024 4 1 : tunables 54 27 8 : slabdata 10 10 0
sgpool-16 79 80 512 8 1 : tunables 54 27 8 : slabdata 10 10 0
sgpool-8 180 180 256 15 1 : tunables 120 60 8 : slabdata 12 12 60
scsi_data_buffer 0 0 24 144 1 : tunables 120 60 8 : slabdata 0 0 0
fib6_nodes 5 59 64 59 1 : tunables 120 60 8 : slabdata 1 1 0
ip6_dst_cache 4 20 384 10 1 : tunables 54 27 8 : slabdata 2 2 0
ndisc_cache 1 12 320 12 1 : tunables 54 27 8 : slabdata 1 1 0
ip6_mrt_cache 0 0 128 30 1 : tunables 120 60 8 : slabdata 0 0 0
RAWv6 5 8 1024 4 1 : tunables 54 27 8 : slabdata 2 2 0
UDPLITEv6 0 0 1024 4 1 : tunables 54 27 8 : slabdata 0 0 0
UDPv6 0 0 1024 4 1 : tunables 54 27 8 : slabdata 0 0 0
tw_sock_TCPv6 0 0 320 12 1 : tunables 54 27 8 : slabdata 0 0 0
request_sock_TCPv6 0 0 192 20 1 : tunables 120 60 8 : slabdata 0 0 0
TCPv6 3 4 1856 2 1 : tunables 24 12 8 : slabdata 2 2 0
flow_cache 0 0 96 40 1 : tunables 120 60 8 : slabdata 0 0 0
btree_node 0 0 128 30 1 : tunables 120 60 8 : slabdata 0 0 0
cfq_io_context 85 120 128 30 1 : tunables 120 60 8 : slabdata 4 4 0
cfq_queue 133 136 232 17 1 : tunables 120 60 8 : slabdata 8 8 0
bsg_cmd 0 0 312 12 1 : tunables 54 27 8 : slabdata 0 0 0
mqueue_inode_cache 1 4 896 4 1 : tunables 54 27 8 : slabdata 1 1 0
hugetlbfs_inode_cache 1 6 608 6 1 : tunables 54 27 8 : slabdata 1 1 0
dquot 0 0 256 15 1 : tunables 120 60 8 : slabdata 0 0 0
kioctx 0 0 384 10 1 : tunables 54 27 8 : slabdata 0 0 0
kiocb 0 0 256 15 1 : tunables 120 60 8 : slabdata 0 0 0
fanotify_response_event 0 0 32 112 1 : tunables 120 60 8 : slabdata 0 0 0
fsnotify_mark 0 0 128 30 1 : tunables 120 60 8 : slabdata 0 0 0
inotify_event_private_data 0 0 32 112 1 : tunables 120 60 8 : slabdata 0 0 0
inotify_inode_mark 11 28 136 28 1 : tunables 120 60 8 : slabdata 1 1 0
dnotify_mark 0 0 136 28 1 : tunables 120 60 8 : slabdata 0 0 0
dnotify_struct 0 0 32 112 1 : tunables 120 60 8 : slabdata 0 0 0
fasync_cache 0 0 48 77 1 : tunables 120 60 8 : slabdata 0 0 0
khugepaged_mm_slot 0 0 40 92 1 : tunables 120 60 8 : slabdata 0 0 0
ksm_mm_slot 0 0 48 77 1 : tunables 120 60 8 : slabdata 0 0 0
ksm_stable_node 0 0 40 92 1 : tunables 120 60 8 : slabdata 0 0 0
ksm_rmap_item 0 0 64 59 1 : tunables 120 60 8 : slabdata 0 0 0
pid_namespace 0 0 2112 3 2 : tunables 24 12 8 : slabdata 0 0 0
user_namespace 0 0 1072 7 2 : tunables 24 12 8 : slabdata 0 0 0
nsproxy 0 0 48 77 1 : tunables 120 60 8 : slabdata 0 0 0
posix_timers_cache 0 0 176 22 1 : tunables 120 60 8 : slabdata 0 0 0
uid_cache 2 30 128 30 1 : tunables 120 60 8 : slabdata 1 1 0
UNIX 16 27 832 9 2 : tunables 54 27 8 : slabdata 3 3 0
ip_mrt_cache 0 0 128 30 1 : tunables 120 60 8 : slabdata 0 0 0
UDP-Lite 0 0 832 9 2 : tunables 54 27 8 : slabdata 0 0 0
tcp_bind_bucket 13 59 64 59 1 : tunables 120 60 8 : slabdata 1 1 0
inet_peer_cache 11 20 192 20 1 : tunables 120 60 8 : slabdata 1 1 0
secpath_cache 0 0 64 59 1 : tunables 120 60 8 : slabdata 0 0 0
xfrm_dst_cache 0 0 512 8 1 : tunables 54 27 8 : slabdata 0 0 0
ip_fib_trie 8 67 56 67 1 : tunables 120 60 8 : slabdata 1 1 0
ip_fib_alias 9 77 48 77 1 : tunables 120 60 8 : slabdata 1 1 0
ip_dst_cache 173 195 256 15 1 : tunables 120 60 8 : slabdata 13 13 0
arp_cache 10 24 320 12 1 : tunables 54 27 8 : slabdata 2 2 0
PING 0 0 832 9 2 : tunables 54 27 8 : slabdata 0 0 0
RAW 3 9 832 9 2 : tunables 54 27 8 : slabdata 1 1 0
UDP 1 9 832 9 2 : tunables 54 27 8 : slabdata 1 1 0
tw_sock_TCP 0 0 256 15 1 : tunables 120 60 8 : slabdata 0 0 0
request_sock_TCP 0 0 128 30 1 : tunables 120 60 8 : slabdata 0 0 0
TCP 15 20 1728 4 2 : tunables 24 12 8 : slabdata 5 5 0
eventpoll_pwq 0 0 72 53 1 : tunables 120 60 8 : slabdata 0 0 0
eventpoll_epi 0 0 128 30 1 : tunables 120 60 8 : slabdata 0 0 0
blkdev_integrity 0 0 112 34 1 : tunables 120 60 8 : slabdata 0 0 0
blkdev_queue 25 28 1720 4 2 : tunables 24 12 8 : slabdata 7 7 0
blkdev_requests 345 539 360 11 1 : tunables 54 27 8 : slabdata 49 49 189
blkdev_ioc 67 159 72 53 1 : tunables 120 60 8 : slabdata 3 3 0
fsnotify_event_holder 0 0 24 144 1 : tunables 120 60 8 : slabdata 0 0 0
fsnotify_event 1 34 112 34 1 : tunables 120 60 8 : slabdata 1 1 0
bio-0 680 1280 192 20 1 : tunables 120 60 8 : slabdata 64 64 420
biovec-256 110 110 4096 1 1 : tunables 24 12 8 : slabdata 110 110 12
biovec-128 142 208 2048 2 1 : tunables 24 12 8 : slabdata 93 104 96
biovec-64 24 24 1024 4 1 : tunables 54 27 8 : slabdata 6 6 0
biovec-16 109 240 256 15 1 : tunables 120 60 8 : slabdata 16 16 7
bip-256 2 2 4224 1 2 : tunables 8 4 0 : slabdata 2 2 0
bip-128 0 0 2176 3 2 : tunables 24 12 8 : slabdata 0 0 0
bip-64 0 0 1152 7 2 : tunables 24 12 8 : slabdata 0 0 0
bip-16 0 0 384 10 1 : tunables 54 27 8 : slabdata 0 0 0
bip-4 0 0 192 20 1 : tunables 120 60 8 : slabdata 0 0 0
bip-1 0 0 128 30 1 : tunables 120 60 8 : slabdata 0 0 0
sock_inode_cache 54 66 640 6 1 : tunables 54 27 8 : slabdata 11 11 0
skbuff_fclone_cache 5 7 512 7 1 : tunables 54 27 8 : slabdata 1 1 0
skbuff_head_cache 591 690 256 15 1 : tunables 120 60 8 : slabdata 46 46 120
file_lock_cache 1 22 176 22 1 : tunables 120 60 8 : slabdata 1 1 0
net_namespace 0 0 2624 3 2 : tunables 24 12 8 : slabdata 0 0 0
shmem_inode_cache 787 805 816 5 1 : tunables 54 27 8 : slabdata 161 161 0
Acpi-Operand 1113 1272 72 53 1 : tunables 120 60 8 : slabdata 24 24 0
Acpi-ParseExt 0 0 72 53 1 : tunables 120 60 8 : slabdata 0 0 0
Acpi-Parse 0 0 48 77 1 : tunables 120 60 8 : slabdata 0 0 0
Acpi-State 0 0 80 48 1 : tunables 120 60 8 : slabdata 0 0 0
Acpi-Namespace 765 828 40 92 1 : tunables 120 60 8 : slabdata 9 9 0
task_delay_info 134 238 112 34 1 : tunables 120 60 8 : slabdata 7 7 0
taskstats 1 12 328 12 1 : tunables 54 27 8 : slabdata 1 1 0
proc_inode_cache 133 258 656 6 1 : tunables 54 27 8 : slabdata 41 43 128
sigqueue 1 24 160 24 1 : tunables 120 60 8 : slabdata 1 1 0
bdev_cache 30 32 832 4 1 : tunables 54 27 8 : slabdata 8 8 0
sysfs_dir_cache 13442 13536 80 48 1 : tunables 120 60 8 : slabdata 282 282 0
mnt_cache 24 30 256 15 1 : tunables 120 60 8 : slabdata 2 2 0
filp 1044 1260 192 20 1 : tunables 120 60 8 : slabdata 63 63 300
inode_cache 1497 1680 592 6 1 : tunables 54 27 8 : slabdata 280 280 0
dentry 4420 6140 192 20 1 : tunables 120 60 8 : slabdata 307 307 0
names_cache 27 27 4096 1 1 : tunables 24 12 8 : slabdata 27 27 0
key_jar 1 20 192 20 1 : tunables 120 60 8 : slabdata 1 1 0
buffer_head 164 333 104 37 1 : tunables 120 60 8 : slabdata 9 9 0
vm_area_struct 2283 2552 176 22 1 : tunables 120 60 8 : slabdata 116 116 0
mm_struct 64 64 960 4 1 : tunables 54 27 8 : slabdata 16 16 0
fs_cache 54 177 64 59 1 : tunables 120 60 8 : slabdata 3 3 0
files_cache 55 88 704 11 2 : tunables 54 27 8 : slabdata 8 8 0
signal_cache 127 147 1088 7 2 : tunables 24 12 8 : slabdata 21 21 0
sighand_cache 115 129 2112 3 2 : tunables 24 12 8 : slabdata 43 43 0
task_xstate 67 80 512 8 1 : tunables 54 27 8 : slabdata 10 10 0
task_struct 129 144 1776 4 2 : tunables 24 12 8 : slabdata 36 36 0
cred_jar 312 400 192 20 1 : tunables 120 60 8 : slabdata 20 20 0
anon_vma_chain 1785 3619 48 77 1 : tunables 120 60 8 : slabdata 47 47 60
anon_vma 1217 1711 64 59 1 : tunables 120 60 8 : slabdata 29 29 0
pid 136 210 128 30 1 : tunables 120 60 8 : slabdata 7 7 0
shared_policy_node 0 0 48 77 1 : tunables 120 60 8 : slabdata 0 0 0
numa_policy 0 0 24 144 1 : tunables 120 60 8 : slabdata 0 0 0
radix_tree_node 54739 82677 560 7 1 : tunables 54 27 8 : slabdata 11811 11811 0
idr_layer_cache 231 252 544 7 1 : tunables 54 27 8 : slabdata 36 36 0
size-4194304(DMA) 0 0 4194304 1 1024 : tunables 1 1 0 : slabdata 0 0 0
size-4194304 0 0 4194304 1 1024 : tunables 1 1 0 : slabdata 0 0 0
size-2097152(DMA) 0 0 2097152 1 512 : tunables 1 1 0 : slabdata 0 0 0
size-2097152 0 0 2097152 1 512 : tunables 1 1 0 : slabdata 0 0 0
size-1048576(DMA) 0 0 1048576 1 256 : tunables 1 1 0 : slabdata 0 0 0
size-1048576 0 0 1048576 1 256 : tunables 1 1 0 : slabdata 0 0 0
size-524288(DMA) 0 0 524288 1 128 : tunables 1 1 0 : slabdata 0 0 0
size-524288 0 0 524288 1 128 : tunables 1 1 0 : slabdata 0 0 0
size-262144(DMA) 0 0 262144 1 64 : tunables 1 1 0 : slabdata 0 0 0
size-262144 0 0 262144 1 64 : tunables 1 1 0 : slabdata 0 0 0
size-131072(DMA) 0 0 131072 1 32 : tunables 8 4 0 : slabdata 0 0 0
size-131072 0 0 131072 1 32 : tunables 8 4 0 : slabdata 0 0 0
size-65536(DMA) 0 0 65536 1 16 : tunables 8 4 0 : slabdata 0 0 0
size-65536 3 3 65536 1 16 : tunables 8 4 0 : slabdata 3 3 0
size-32768(DMA) 0 0 32768 1 8 : tunables 8 4 0 : slabdata 0 0 0
size-32768 0 0 32768 1 8 : tunables 8 4 0 : slabdata 0 0 0
size-16384(DMA) 0 0 16384 1 4 : tunables 8 4 0 : slabdata 0 0 0
size-16384 14 14 16384 1 4 : tunables 8 4 0 : slabdata 14 14 0
size-8192(DMA) 0 0 8192 1 2 : tunables 8 4 0 : slabdata 0 0 0
size-8192 24 32 8192 1 2 : tunables 8 4 0 : slabdata 24 32 0
size-4096(DMA) 0 0 4096 1 1 : tunables 24 12 8 : slabdata 0 0 0
size-4096 969 969 4096 1 1 : tunables 24 12 8 : slabdata 969 969 0
size-2048(DMA) 0 0 2048 2 1 : tunables 24 12 8 : slabdata 0 0 0
size-2048 630 630 2048 2 1 : tunables 24 12 8 : slabdata 315 315 0
size-1024(DMA) 0 0 1024 4 1 : tunables 54 27 8 : slabdata 0 0 0
size-1024 964 964 1024 4 1 : tunables 54 27 8 : slabdata 241 241 0
size-512(DMA) 0 0 512 8 1 : tunables 54 27 8 : slabdata 0 0 0
size-512 700 712 512 8 1 : tunables 54 27 8 : slabdata 89 89 135
size-256(DMA) 0 0 256 15 1 : tunables 120 60 8 : slabdata 0 0 0
size-256 3213 3660 256 15 1 : tunables 120 60 8 : slabdata 244 244 120
size-192(DMA) 0 0 192 20 1 : tunables 120 60 8 : slabdata 0 0 0
size-192 1576 2100 192 20 1 : tunables 120 60 8 : slabdata 105 105 420
size-128(DMA) 0 0 128 30 1 : tunables 120 60 8 : slabdata 0 0 0
size-64(DMA) 0 0 64 59 1 : tunables 120 60 8 : slabdata 0 0 0
size-64 9046 18408 64 59 1 : tunables 120 60 8 : slabdata 312 312 120
size-32(DMA) 0 0 32 112 1 : tunables 120 60 8 : slabdata 0 0 0
size-128 3990 5790 128 30 1 : tunables 120 60 8 : slabdata 193 193 120
size-32 7846 8064 32 112 1 : tunables 120 60 8 : slabdata 72 72 60
kmem_cache 193 193 4224 1 2 : tunables 8 4 0 : slabdata 193 193 0

==> /proc/vmstat <==
nr_free_pages 50700
nr_inactive_anon 2970
nr_active_anon 1007
nr_inactive_file 11130
nr_active_file 7157
nr_unevictable 0
nr_mlock 0
nr_anon_pages 2938
nr_mapped 363
nr_file_pages 19588
nr_dirty 6664
nr_writeback 0
nr_slab_reclaimable 1186454
nr_slab_unreclaimable 4799
nr_page_table_pages 1006
nr_kernel_stack 117
nr_unstable 0
nr_bounce 0
nr_vmscan_write 443674146
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 0
nr_dirtied 1979672411
nr_written 394329592
numa_hit 1442613360
numa_miss 0
numa_foreign 0
numa_interleave 14467
numa_local 1442613360
numa_other 0
nr_anon_transparent_hugepages 0
nr_dirty_threshold 14593
nr_dirty_background_threshold 7296
pgpgin 1166783994
pgpgout 1189551831
pswpin 561247
pswpout 527463
pgalloc_dma 1
pgalloc_dma32 817965569
pgalloc_normal 624704577
pgalloc_movable 0
pgfree 1461887065
pgactivate 35670010
pgdeactivate 11575599
pgfault 15571083
pgmajfault 159505
pgrefill_dma 0
pgrefill_dma32 3822790
pgrefill_normal 6083339
pgrefill_movable 0
pgsteal_dma 0
pgsteal_dma32 451541278
pgsteal_normal 317158520
pgsteal_movable 0
pgscan_kswapd_dma 0
pgscan_kswapd_dma32 456928166
pgscan_kswapd_normal 1030237072
pgscan_kswapd_movable 0
pgscan_direct_dma 0
pgscan_direct_dma32 5453588
pgscan_direct_normal 975367
pgscan_direct_movable 0
zone_reclaim_failed 0
pginodesteal 27622
slabs_scanned 49472512
kswapd_steal 763108137
kswapd_inodesteal 41027944
kswapd_low_wmark_hit_quickly 1660
kswapd_high_wmark_hit_quickly 212270
kswapd_skip_congestion_wait 9497
pageoutrun 6256294
allocstall 43035
pgrotated 2512196
compact_blocks_moved 472304
compact_pages_moved 6499644
compact_pagemigrate_failed 1402982
compact_stall 6988
compact_fail 2641
compact_success 4347
htlb_buddy_alloc_success 0
htlb_buddy_alloc_fail 0
unevictable_pgs_culled 26833
unevictable_pgs_scanned 0
unevictable_pgs_rescued 28012
unevictable_pgs_mlocked 28646
unevictable_pgs_munlocked 28629
unevictable_pgs_cleared 0
unevictable_pgs_stranded 0
unevictable_pgs_mlockfreed 0
thp_fault_alloc 0
thp_fault_fallback 0
thp_collapse_alloc 0
thp_collapse_alloc_failed 0
thp_split 0

Got another "invalid opcode" in btrfs-fixup-0. That was in the
middle of transferring a 3.6GB file. I've got one and only one of
those per boot when doing that rsync.

When the btrfs_inode_cache grew dramatically before the crash, it
was in the middle of transferring an area that had hundreds of
thousands of little files.

Any idea anyone?

Best regards,
Stephane
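For anyone following along, the slabinfo dump above can be reduced to a slabtop-style ranking with a one-liner. This is a hedged sketch (the `slab_rank` helper name is mine, not from the thread) that scores each cache as num_objs × objsize, which is exactly how the delayed_node / extent_map / btrfs_inode_cache growth stands out:

```shell
# Rank slab caches by approximate footprint in bytes (num_objs * objsize,
# columns 3 and 4 of /proc/slabinfo). NR > 2 skips the two header lines;
# %.0f avoids 32-bit overflow in some awk implementations.
slab_rank() {
  awk 'NR > 2 { printf "%14.0f  %s\n", $3 * $4, $1 }' "${1:-/proc/slabinfo}" |
    sort -rn | head -n 5
}
```

Run against the dump above, btrfs_inode_cache alone accounts for roughly 3.26 GB (3260160 objects × 1000 bytes), with delayed_node (~0.94 GB) and extent_map (~0.41 GB) behind it, consistent with the growth curve logged before the OOM.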
Excerpts from Stephane Chazelas's message of 2011-07-08 08:44:29 -0400:
> 2011-07-06 09:11:11 +0100, Stephane Chazelas:
> > 2011-07-03 13:38:57 -0600, cwillu:
> > > On Sun, Jul 3, 2011 at 1:09 PM, Stephane Chazelas
> > > <stephane_chazelas@yahoo.fr> wrote:
> > [...]
> > > > Now, on a few occasions (actually, most of the time), when I
> > > > rsynced the data (about 2.5TB) onto the external drive, the
> > > > system would crash after some time with "Out of memory and no
> > > > killable process". Basically, something in kernel was allocating
> > > > the whole memory, then oom mass killed everybody and crash.
> > [...]
> > > Look at the output of slabtop (should be installed by default, procfs
> > > package), before rsync for comparison, and during.
> >
> > Hi,
> >
> > so, no crash this time
> [...]
> Another attempt, again onto an empty drive, this time with
> 3.0.0-rc6. Exact same scenario.
>
> slabinfo:
[...]
> Got another "invalid opcode" in btrfs-fixup-0. That was in the
> middle of transferring a 3.6GB file. I've got one and only one of
> those per boot when doing that rsync.

So the invalid opcode in btrfs-fixup-0 is the big problem. We're
either failing to write because we weren't able to allocate memory (and
not dealing with it properly) or there is a bigger problem.

Does the btrfs-fixup-0 oops come before or after the ooms?

Please send along any oops output during the run. Only the first
(earliest) oops matters.

I would do two things. First, I'd turn off compress-force. There's no
explicit reason for this, it just seems like the most likely place for
a bug.

If the oops comes after the oom messages, please grab sysrq-t and
sysrq-w output (you'll need some way to log the console if you do
this). What we really need to see is what kswapd is doing. That's
usually where the inodes get freed.
-chris
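Chris's sysrq-t / sysrq-w suggestion boils down to two writes to /proc/sysrq-trigger. A hedged sketch follows; the `sysrq_dump` wrapper and its parameters are mine, parameterized only so it can be exercised against scratch files rather than the real /proc entries (which need root):

```shell
# Ask the kernel to dump all task stacks (t) and blocked tasks (w).
# The output goes to the kernel ring buffer, so with a serial console
# configured it reaches the capture log. Run as root against the real
# /proc paths (the defaults below).
sysrq_dump() {
  ctl="${1:-/proc/sys/kernel/sysrq}"
  trig="${2:-/proc/sysrq-trigger}"
  echo 1 > "$ctl"   # enable all sysrq functions
  echo t > "$trig"  # dump every task's state and stack
  echo w > "$trig"  # dump tasks in uninterruptible (blocked) state
}
```

With the console on serial as planned, the sysrq-w output in particular should show where kswapd is stuck when the slab caches stop shrinking.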
2011-07-08 11:06:08 -0400, Chris Mason:
[...]
> So the invalid opcode in btrfs-fixup-0 is the big problem. We're
> either failing to write because we weren't able to allocate memory (and
> not dealing with it properly) or there is a bigger problem.
>
> Does the btrfs-fixup-0 oops come before or after the ooms?

Hi Chris, thanks for looking into this.

It comes long before. Hours before there's any problem. So it
seems unrelated.

> Please send along any oops output during the run. Only the first
> (earliest) oops matters.

There's always only one in between two reboots. I've sent two
already, but here they are:

Jul 1 18:25:57 ------------[ cut here ]------------
Jul 1 18:25:57 kernel BUG at /media/data/mattems/src/linux-2.6-3.0.0~rc5/debian/build/source_amd64_none/fs/btrfs/inode.c:1583!
Jul 1 18:25:57 invalid opcode: 0000 [#1] SMP
Jul 1 18:25:57 CPU 1
Jul 1 18:25:57 Modules linked in: sha256_generic cryptd aes_x86_64 aes_generic cbc dm_crypt fuse snd_pcm psmouse tpm_tis tpm i2c_i801 snd_timer snd soundcore snd_page_alloc i3200_edac tpm_bios serio_raw evdev pcspkr processor button thermal_sys i2c_core container edac_core sg sr_mod cdrom ext4 mbcache jbd2 crc16 dm_mod nbd btrfs zlib_deflate crc32c libcrc32c ums_cypress usb_storage sd_mod crc_t10dif uas uhci_hcd ahci libahci libata ehci_hcd e1000e scsi_mod usbcore [last unloaded: scsi_wait_scan]
Jul 1 18:25:57
Jul 1 18:25:57 Pid: 747, comm: btrfs-fixup-0 Not tainted 3.0.0-rc5-amd64 #1 empty empty/Tyan Tank GT20 B5211
Jul 1 18:25:57 RIP: 0010:[<ffffffffa014b0f4>]  [<ffffffffa014b0f4>] btrfs_writepage_fixup_worker+0xdb/0x120 [btrfs]
Jul 1 18:25:57 RSP: 0018:ffff8801483ffde0  EFLAGS: 00010246
Jul 1 18:25:57 RAX: 0000000000000000 RBX: ffffea000496a430 RCX: ffffffffffffffff
Jul 1 18:25:57 RDX: 0000000000000000 RSI: 0000000006849000 RDI: ffff880071c1fcb8
Jul 1 18:25:57 RBP: 0000000006849000 R08: 0000000000000008 R09: ffff8801483ffd98
Jul 1 18:25:57 R10: dead000000200200 R11: dead000000100100 R12: ffff880071c1fd90
Jul 1 18:25:57 R13: 0000000000000000 R14: ffff8801483ffdf8 R15: 0000000006849fff
Jul 1 18:25:57 FS:  0000000000000000(0000) GS:ffff88014fd00000(0000) knlGS:0000000000000000
Jul 1 18:25:57 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jul 1 18:25:57 CR2: 00000000f7596000 CR3: 000000013def9000 CR4: 00000000000006e0
Jul 1 18:25:57 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul 1 18:25:57 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jul 1 18:25:57 Process btrfs-fixup-0 (pid: 747, threadinfo ffff8801483fe000, task ffff88014672efa0)
Jul 1 18:25:57 Stack:
Jul 1 18:25:57  ffff880071c1fc28 ffff8800c70165c0 ffffffffffffffff ffff88011e61ca28
Jul 1 18:25:57  0000000000000000 ffff880146ef41c0 ffff880146ef4210 ffff880146ef41d8
Jul 1 18:25:57  ffff880146ef41c8 ffff880146ef4200 ffff880146ef41e8 ffffffffa01669fa
Jul 1 18:25:57 Call Trace:
Jul 1 18:25:57  [<ffffffffa01669fa>] ? worker_loop+0x186/0x4a1 [btrfs]
Jul 1 18:25:57  [<ffffffff813369ca>] ? schedule+0x5ed/0x61a
Jul 1 18:25:57  [<ffffffffa0166874>] ? btrfs_queue_worker+0x24a/0x24a [btrfs]
Jul 1 18:25:57  [<ffffffffa0166874>] ? btrfs_queue_worker+0x24a/0x24a [btrfs]
Jul 1 18:25:57  [<ffffffff8105faed>] ? kthread+0x7a/0x82
Jul 1 18:25:57  [<ffffffff8133e524>] ? kernel_thread_helper+0x4/0x10
Jul 1 18:25:57  [<ffffffff8105fa73>] ? kthread_worker_fn+0x147/0x147
Jul 1 18:25:57  [<ffffffff8133e520>] ? gs_change+0x13/0x13
Jul 1 18:25:57 Code: 41 b8 50 00 00 00 4c 89 f1 e8 d5 3b 01 00 48 89 df e8 fb ac f6 e0 ba 01 00 00 00 4c 89 ee 4c 89 e7 e8 ce 05 01 00 e9 4e ff ff ff <0f> 0b eb fe 48 8b 3c 24 41 b8 50 00 00 00 4c 89 f1 4c 89 fa 48
Jul 1 18:25:57 RIP  [<ffffffffa014b0f4>] btrfs_writepage_fixup_worker+0xdb/0x120 [btrfs]
Jul 1 18:25:57 RSP <ffff8801483ffde0>
Jul 1 18:25:57 ---[ end trace 9744d33381de3d04 ]---

Jul 4 12:50:51 ------------[ cut here ]------------
Jul 4 12:50:51 kernel BUG at /media/data/mattems/src/linux-2.6-3.0.0~rc5/debian/build/source_amd64_none/fs/btrfs/inode.c:1583!
Jul 4 12:50:51 invalid opcode: 0000 [#1] SMP
Jul 4 12:50:51 CPU 0
Jul 4 12:50:51 Modules linked in: lm85 dme1737 hwmon_vid coretemp ipmi_si ipmi_msghandler sha256_generic cryptd aes_x86_64 aes_generic cbc fuse dm_crypt snd_pcm snd_timer snd sg soundcore i3200_edac snd_page_alloc sr_mod processor tpm_tis i2c_i801 pl2303 pcspkr thermal_sys i2c_core tpm edac_core tpm_bios cdrom usbserial container evdev psmouse button serio_raw ext4 mbcache jbd2 crc16 dm_mod nbd btrfs zlib_deflate crc32c libcrc32c ums_cypress sd_mod crc_t10dif usb_storage uas uhci_hcd ahci libahci ehci_hcd libata e1000e usbcore scsi_mod [last unloaded: i2c_dev]
Jul 4 12:50:51
Jul 4 12:50:51 Pid: 764, comm: btrfs-fixup-0 Not tainted 3.0.0-rc5-amd64 #1 empty empty/Tyan Tank GT20 B5211
Jul 4 12:50:51 RIP: 0010:[<ffffffffa00820f4>]  [<ffffffffa00820f4>] btrfs_writepage_fixup_worker+0xdb/0x120 [btrfs]
Jul 4 12:50:51 RSP: 0018:ffff880147ffdde0  EFLAGS: 00010246
Jul 4 12:50:51 RAX: 0000000000000000 RBX: ffffea0004648098 RCX: ffffffffffffffff
Jul 4 12:50:51 RDX: 0000000000000000 RSI: 0000000005854000 RDI: ffff8800073f18d0
Jul 4 12:50:51 RBP: 0000000005854000 R08: 0000000000000008 R09: ffff880147ffdd98
Jul 4 12:50:51 R10: dead000000200200 R11: dead000000100100 R12: ffff8800073f19a8
Jul 4 12:50:51 R13: 0000000000000000 R14: ffff880147ffddf8 R15: 0000000005854fff
Jul 4 12:50:51 FS:  0000000000000000(0000) GS:ffff88014fc00000(0000) knlGS:0000000000000000
Jul 4 12:50:51 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jul 4 12:50:51 CR2: 00000000f75c6000 CR3: 000000014780f000 CR4: 00000000000006f0
Jul 4 12:50:51 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul 4 12:50:51 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jul 4 12:50:51 Process btrfs-fixup-0 (pid: 764, threadinfo ffff880147ffc000, task ffff880145d962c0)
Jul 4 12:50:51 Stack:
Jul 4 12:50:51  ffff8800073f1840 ffff8800040b9ec0 ffffffffffffffff ffff8800c45ef5a8
Jul 4 12:50:51  0000000000000000 ffff880147c13440 ffff880147c13490 ffff880147c13458
Jul 4 12:50:51  ffff880147c13448 ffff880147c13480 ffff880147c13468 ffffffffa009d9fa
Jul 4 12:50:51 Call Trace:
Jul 4 12:50:51  [<ffffffffa009d9fa>] ? worker_loop+0x186/0x4a1 [btrfs]
Jul 4 12:50:51  [<ffffffff813369ca>] ? schedule+0x5ed/0x61a
Jul 4 12:50:51  [<ffffffffa009d874>] ? btrfs_queue_worker+0x24a/0x24a [btrfs]
Jul 4 12:50:51  [<ffffffffa009d874>] ? btrfs_queue_worker+0x24a/0x24a [btrfs]
Jul 4 12:50:51  [<ffffffff8105faed>] ? kthread+0x7a/0x82
Jul 4 12:50:51  [<ffffffff8133e524>] ? kernel_thread_helper+0x4/0x10
Jul 4 12:50:51  [<ffffffff8105fa73>] ? kthread_worker_fn+0x147/0x147
Jul 4 12:50:51  [<ffffffff8133e520>] ? gs_change+0x13/0x13
Jul 4 12:50:51 Code: 41 b8 50 00 00 00 4c 89 f1 e8 d5 3b 01 00 48 89 df e8 fb 3c 03 e1 ba 01 00 00 00 4c 89 ee 4c 89 e7 e8 ce 05 01 00 e9 4e ff ff ff <0f> 0b eb fe 48 8b 3c 24 41 b8 50 00 00 00 4c 89 f1 4c 89 fa 48
Jul 4 12:50:51 RIP  [<ffffffffa00820f4>] btrfs_writepage_fixup_worker+0xdb/0x120 [btrfs]
Jul 4 12:50:51 RSP <ffff880147ffdde0>
Jul 4 12:50:51 ---[ end trace 3b143d6e02d6e847 ]---

Jul 7 12:48:54 ------------[ cut here ]------------
Jul 7 12:48:55 kernel BUG at /media/data/mattems/src/linux-2.6-3.0.0~rc6/debian/build/source_amd64_none/fs/btrfs/inode.c:1583!
Jul 7 12:48:55 invalid opcode: 0000 [#1] SMP
Jul 7 12:48:55 CPU 0
Jul 7 12:48:55 Modules linked in: sha256_generic cryptd aes_x86_64 aes_generic cbc fuse lm85 dme1737 hwmon_vid coretemp ipmi_si ipmi_msghandler dm_crypt snd_pcm snd_timer snd soundcore i2c_i801 i2c_core psmouse snd_page_alloc tpm_tis tpm tpm_bios pcspkr i3200_edac pl2303 processor button serio_raw evdev edac_core container usbserial thermal_sys ext4 mbcache jbd2 crc16 dm_mod nbd btrfs zlib_deflate crc32c libcrc32c sg sr_mod cdrom ums_cypress sd_mod crc_t10dif usb_storage uas uhci_hcd ahci libahci libata ehci_hcd e1000e scsi_mod usbcore [last unloaded: scsi_wait_scan]
Jul 7 12:48:55
Jul 7 12:48:55 Pid: 800, comm: btrfs-fixup-0 Not tainted 3.0.0-rc6-amd64 #1 empty empty/Tyan Tank GT20 B5211
Jul 7 12:48:55 RIP: 0010:[<ffffffffa00bf0f4>]  [<ffffffffa00bf0f4>] btrfs_writepage_fixup_worker+0xdb/0x120 [btrfs]
Jul 7 12:48:55 RSP: 0018:ffff880147ce1de0  EFLAGS: 00010246
Jul 7 12:48:55 RAX: 0000000000000000 RBX: ffffea00047058e8 RCX: ffffffffffffffff
Jul 7 12:48:55 RDX: 0000000000000000 RSI: 0000000003393000 RDI: ffff880020086100
Jul 7 12:48:55 RBP: 0000000003393000 R08: 0000000000000008 R09: ffff880147ce1d98
Jul 7 12:48:55 R10: dead000000200200 R11: dead000000100100 R12: ffff8800200861d8
Jul 7 12:48:55 R13: 0000000000000000 R14: ffff880147ce1df8 R15: 0000000003393fff
Jul 7 12:48:55 FS:  0000000000000000(0000) GS:ffff88014fc00000(0000) knlGS:0000000000000000
Jul 7 12:48:55 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jul 7 12:48:55 CR2: 00000000f7594000 CR3: 0000000145b0b000 CR4: 00000000000006f0
Jul 7 12:48:55 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul 7 12:48:55 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jul 7 12:48:55 Process btrfs-fixup-0 (pid: 800, threadinfo ffff880147ce0000, task ffff880147ca6f20)
Jul 7 12:48:55 Stack:
Jul 7 12:48:55  ffff880020086070 ffff88013e0be6c0 ffffffffffffffff ffff88004e72bb28
Jul 7 12:48:55  0000000000000000 ffff880146a001c0 ffff880146a00210 ffff880146a001d8
Jul 7 12:48:55  ffff880146a001c8 ffff880146a00200 ffff880146a001e8 ffffffffa00da9fa
Jul 7 12:48:55 Call Trace:
Jul 7 12:48:55  [<ffffffffa00da9fa>] ? worker_loop+0x186/0x4a1 [btrfs]
Jul 7 12:48:55  [<ffffffff81336a3a>] ? schedule+0x5ed/0x61a
Jul 7 12:48:55  [<ffffffffa00da874>] ? btrfs_queue_worker+0x24a/0x24a [btrfs]
Jul 7 12:48:55  [<ffffffffa00da874>] ? btrfs_queue_worker+0x24a/0x24a [btrfs]
Jul 7 12:48:55  [<ffffffff8105fb2d>] ? kthread+0x7a/0x82
Jul 7 12:48:55  [<ffffffff8133e564>] ? kernel_thread_helper+0x4/0x10
Jul 7 12:48:55  [<ffffffff8105fab3>] ? kthread_worker_fn+0x147/0x147
Jul 7 12:48:55  [<ffffffff8133e560>] ? gs_change+0x13/0x13
Jul 7 12:48:55 Code: 41 b8 50 00 00 00 4c 89 f1 e8 d5 3b 01 00 48 89 df e8 3b 6d ff e0 ba 01 00 00 00 4c 89 ee 4c 89 e7 e8 ce 05 01 00 e9 4e ff ff ff <0f> 0b eb fe 48 8b 3c 24 41 b8 50 00 00 00 4c 89 f1 4c 89 fa 48
Jul 7 12:48:55 RIP  [<ffffffffa00bf0f4>] btrfs_writepage_fixup_worker+0xdb/0x120 [btrfs]
Jul 7 12:48:55 RSP <ffff880147ce1de0>
Jul 7 12:48:55 ---[ end trace 2c2801fb34de9335 ]---

> I would do two things. First, I'd turn off compress-force. There's no
> explicit reason for this, it just seems like the most likely place for
> a bug.
[...]

On the target FS, you mean? I can try that next week if I can
manage to get hold of another big enough hard drive.

Note that last time I was trying to reclaim the used memory (see my
earlier message), I was able to unmount the target FS (which didn't
free any memory) but not the source one. And sysrq-t for umount showed:

umount          D ffff880145ebe770     0 24079   1290 0x00000004
 ffff880145ebe770 0000000000000086 0000000000000000 ffffffff8160b020
 0000000000012840 ffff880123bc7fd8 ffff880123bc7fd8 0000000000012840
 ffff880145ebe770 ffff880123bc6010 00007fffac84f4a8 0000000100000000
Call Trace:
 [<ffffffff81337d50>] ? rwsem_down_failed_common+0xda/0x10e
 [<ffffffff811aca63>] ? call_rwsem_down_write_failed+0x13/0x20
 [<ffffffff813376c7>] ? down_write+0x25/0x27
 [<ffffffff810ff6eb>] ? deactivate_super+0x30/0x3d
 [<ffffffff81114135>] ? sys_umount+0x2ea/0x315
 [<ffffffff8133d412>] ? system_call_fastpath+0x16/0x1b

I've resumed the rsync now, got another invalid opcode, and memory
usage is going up again, but like last time, the rsync is likely to
end before it causes a crash.

--
Stephane
2011-07-08 16:41:23 +0100, Stephane Chazelas:
> 2011-07-08 11:06:08 -0400, Chris Mason:
> [...]
> > So the invalid opcode in btrfs-fixup-0 is the big problem. We're
> > either failing to write because we weren't able to allocate memory (and
> > not dealing with it properly) or there is a bigger problem.
> >
> > Does the btrfs-fixup-0 oops come before or after the ooms?
>
> Hi Chris, thanks for looking into this.
>
> It comes long before. Hours before there's any problem. So it
> seems unrelated.

Though every time I had the issue, there had been such an "invalid
opcode" before. But also, I only had both the "invalid opcode" and
the memory issue when doing that rsync onto the external hard drive.

> > Please send along any oops output during the run. Only the first
> > (earliest) oops matters.
>
> There's always only one in between two reboots. I've sent two
> already, but here they are:
[...]

I dug up the traces from before I switched to debian (thinking getting
a newer kernel would improve matters) in case it helps:

Jun 4 18:12:58 ------------[ cut here ]------------
Jun 4 18:12:58 kernel BUG at /build/buildd/linux-2.6.38/fs/btrfs/inode.c:1555!
Jun 4 18:12:58 invalid opcode: 0000 [#2] SMP
Jun 4 18:12:58 last sysfs file: /sys/devices/virtual/block/dm-2/dm/name
Jun 4 18:12:58 CPU 0
Jun 4 18:12:58 Modules linked in: sha256_generic cryptd aes_x86_64 aes_generic dm_crypt psmouse serio_raw xgifb(C+) i3200_edac edac_core nbd btrfs zlib_deflate libcrc32c xenbus_probe_frontend ums_cypress usb_storage uas e1000e ahci libahci
Jun 4 18:12:58
Jun 4 18:12:58 Pid: 416, comm: btrfs-fixup-0 Tainted: G      D  C  2.6.38-7-server #35-Ubuntu empty empty/Tyan Tank GT20 B5211
Jun 4 18:12:58 RIP: 0010:[<ffffffffa0099765>]  [<ffffffffa0099765>] btrfs_writepage_fixup_worker+0x145/0x150 [btrfs]
Jun 4 18:12:58 RSP: 0018:ffff88003cfddde0  EFLAGS: 00010246
Jun 4 18:12:58 RAX: 0000000000000000 RBX: ffffea000004ca88 RCX: 0000000000000000
Jun 4 18:12:58 RDX: ffff88003cfddd98 RSI: ffffffffffffffff RDI: ffff8800152088b0
Jun 4 18:12:58 RBP: ffff88003cfdde30 R08: ffffe8ffffc09988 R09: ffff88003cfddd98
Jun 4 18:12:58 R10: 0000000000000000 R11: 0000000000000000 R12: 00000000010ec000
Jun 4 18:12:58 R13: ffff880015208988 R14: 0000000000000000 R15: 00000000010ecfff
Jun 4 18:12:58 FS:  0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
Jun 4 18:12:58 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 4 18:12:58 CR2: 0000000000e73fe8 CR3: 0000000030fcc000 CR4: 00000000000006f0
Jun 4 18:12:58 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 4 18:12:58 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun 4 18:12:58 Process btrfs-fixup-0 (pid: 416, threadinfo ffff88003cfdc000, task ffff880036912dc0)
Jun 4 18:12:58 Stack:
Jun 4 18:12:58  ffff880039c4e120 ffff880015208820 ffff88003cfdde90 ffff880032da4b80
Jun 4 18:12:58  ffff88003cfdde30 ffff88003ce915a0 ffff88003cfdde90 ffff88003cfdde80
Jun 4 18:12:58  ffff880036912dc0 ffff88003ce915f0 ffff88003cfddee0 ffffffffa00c34f4
Jun 4 18:12:58 Call Trace:
Jun 4 18:12:58  [<ffffffffa00c34f4>] worker_loop+0xa4/0x3a0 [btrfs]
Jun 4 18:12:58  [<ffffffffa00c3450>] ? worker_loop+0x0/0x3a0 [btrfs]
Jun 4 18:12:58  [<ffffffff81087116>] kthread+0x96/0xa0
Jun 4 18:12:58  [<ffffffff8100cde4>] kernel_thread_helper+0x4/0x10
Jun 4 18:12:58  [<ffffffff81087080>] ? kthread+0x0/0xa0
Jun 4 18:12:58  [<ffffffff8100cde0>] ? kernel_thread_helper+0x0/0x10
Jun 4 18:12:58 Code: 1f 80 00 00 00 00 48 8b 7d b8 48 8d 4d c8 41 b8 50 00 00 00 4c 89 fa 4c 89 e6 e8 37 d1 01 00 eb b6 48 89 df e8 8d 1a 07 e1 eb 9a <0f> 0b 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 57 41 56 41 55
Jun 4 18:12:58 RIP  [<ffffffffa0099765>] btrfs_writepage_fixup_worker+0x145/0x150 [btrfs]
Jun 4 18:12:58 RSP <ffff88003cfddde0>
Jun 4 18:12:58 ---[ end trace e5cf15794ff3ebdb ]---

And:

Jun 5 00:58:10 BUG: Bad page state in process rsync  pfn:1bfdf
Jun 5 00:58:10 page:ffffea000061f8c8 count:0 mapcount:0 mapping: (null) index:0x2300
Jun 5 00:58:10 page flags: 0x100000000000010(dirty)
Jun 5 00:58:10 Pid: 1584, comm: rsync Tainted: G      D  C  2.6.38-7-server #35-Ubuntu
Jun 5 00:58:10 Call Trace:
Jun 5 00:58:10  [<ffffffff8111250b>] ? dump_page+0x9b/0xd0
Jun 5 00:58:10  [<ffffffff8111260c>] ? bad_page+0xcc/0x120
Jun 5 00:58:10  [<ffffffff81112905>] ? prep_new_page+0x1a5/0x1b0
Jun 5 00:58:10  [<ffffffff815d755e>] ? _raw_spin_lock+0xe/0x20
Jun 5 00:58:10  [<ffffffffa00b7391>] ? test_range_bit+0x111/0x150 [btrfs]
Jun 5 00:58:10  [<ffffffff81112b74>] ? get_page_from_freelist+0x264/0x650
Jun 5 00:58:10  [<ffffffffa0073cce>] ? generic_bin_search.clone.42+0x19e/0x200 [btrfs]
Jun 5 00:58:10  [<ffffffff81113778>] ? __alloc_pages_nodemask+0x118/0x830
Jun 5 00:58:10  [<ffffffffa0073cce>] ? generic_bin_search.clone.42+0x19e/0x200 [btrfs]
Jun 5 00:58:10  [<ffffffff815d755e>] ? _raw_spin_lock+0xe/0x20
Jun 5 00:58:10  [<ffffffff811541d2>] ? get_partial_node+0x92/0xb0
Jun 5 00:58:10  [<ffffffffa00d45bd>] ? btrfs_submit_compressed_read+0x15d/0x4e0 [btrfs]
Jun 5 00:58:10  [<ffffffff81149325>] ? alloc_pages_current+0xa5/0x110
Jun 5 00:58:10  [<ffffffffa00d4625>] ? btrfs_submit_compressed_read+0x1c5/0x4e0 [btrfs]
Jun 5 00:58:10  [<ffffffffa0097e81>] ? btrfs_submit_bio_hook+0x151/0x160 [btrfs]
Jun 5 00:58:10  [<ffffffffa009a118>] ? btrfs_get_extent+0x528/0x8e0 [btrfs]
Jun 5 00:58:10  [<ffffffffa00b4bfa>] ? submit_one_bio+0x6a/0xa0 [btrfs]
Jun 5 00:58:10  [<ffffffffa00b7a12>] ? submit_extent_page.clone.24+0x112/0x1b0 [btrfs]
Jun 5 00:58:10  [<ffffffffa00b7f96>] ? __extent_read_full_page+0x496/0x650 [btrfs]
Jun 5 00:58:10  [<ffffffffa00b7420>] ? end_bio_extent_readpage+0x0/0x250 [btrfs]
Jun 5 00:58:10  [<ffffffffa0099bf0>] ? btrfs_get_extent+0x0/0x8e0 [btrfs]
Jun 5 00:58:10  [<ffffffffa00b8e72>] ? extent_readpages+0xc2/0x100 [btrfs]
Jun 5 00:58:10  [<ffffffffa0099bf0>] ? btrfs_get_extent+0x0/0x8e0 [btrfs]
Jun 5 00:58:10  [<ffffffffa0098c0f>] ? btrfs_readpages+0x1f/0x30 [btrfs]
Jun 5 00:58:10  [<ffffffff811168bb>] ? __do_page_cache_readahead+0x14b/0x220
Jun 5 00:58:10  [<ffffffff81116cf1>] ? ra_submit+0x21/0x30
Jun 5 00:58:10  [<ffffffff81116e15>] ? ondemand_readahead+0x115/0x230
Jun 5 00:58:10  [<ffffffff8110afd4>] ? file_read_actor+0xd4/0x170
Jun 5 00:58:10  [<ffffffff81117021>] ? page_cache_sync_readahead+0x31/0x50
Jun 5 00:58:10  [<ffffffff8110c3be>] ? do_generic_file_read.clone.23+0x2be/0x450
Jun 5 00:58:10  [<ffffffff8110d00a>] ? generic_file_aio_read+0x1ca/0x240
Jun 5 00:58:10  [<ffffffff81164b52>] ? do_sync_read+0xd2/0x110
Jun 5 00:58:10  [<ffffffff81278ea3>] ? security_file_permission+0x93/0xb0
Jun 5 00:58:10  [<ffffffff81164e71>] ? rw_verify_area+0x61/0xf0
Jun 5 00:58:10  [<ffffffff81165333>] ? vfs_read+0xc3/0x180
Jun 5 00:58:10  [<ffffffff81165441>] ? sys_read+0x51/0x90
Jun 5 00:58:10  [<ffffffff8100bfc2>] ? system_call_fastpath+0x16/0x1b

Then the first oom kill came at 07:33.

That "bad page state" is the only occurrence. With that same kernel, I
had the "invalid opcode" + "oom kill" before that without that "bad
page state".
--
Stephane
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Excerpts from Stephane Chazelas's message of 2011-07-08 11:41:23 -0400:
> 2011-07-08 11:06:08 -0400, Chris Mason:
> [...]
> > So the invalid opcode in btrfs-fixup-0 is the big problem.  We're
> > either failing to write because we weren't able to allocate memory (and
> > not dealing with it properly) or there is a bigger problem.
> >
> > Does the btrfs-fixup-0 oops come before or after the ooms?
>
> Hi Chris, thanks for looking into this.
>
> It comes long before. Hours before there's any problem. So it
> seems unrelated.

It could be the cause of the problem.  We're calling BUG() with the page
locked, which means that page will never ever be freed, and since this
worker thread is gone, it could be messing up various parts of the
reclaim code.

But, this worker thread isn't supposed to get called very often...it's
really catching a corner of a corner of a strange race.  So we do need
to get a better understanding of why and how often it triggers.

You described this workload as rsync, is there anything else running?

I'd definitely try without -o compress_force.

-chris
Excerpts from Stephane Chazelas's message of 2011-07-08 12:11:03 -0400:
> 2011-07-08 16:41:23 +0100, Stephane Chazelas:
> > 2011-07-08 11:06:08 -0400, Chris Mason:
> > [...]
> > > So the invalid opcode in btrfs-fixup-0 is the big problem.  We're
> > > either failing to write because we weren't able to allocate memory (and
> > > not dealing with it properly) or there is a bigger problem.
> > >
> > > Does the btrfs-fixup-0 oops come before or after the ooms?
> >
> > Hi Chris, thanks for looking into this.
> >
> > It comes long before. Hours before there's any problem. So it
> > seems unrelated.
>
> Though every time I had the issue, there had been such an
> "invalid opcode" before. But also, I only had both the "invalid
> opcode" and memory issue when doing that rsync onto external
> hard drive.
>
> > > Please send along any oops output during the run.  Only the first
> > > (earliest) oops matters.
> >
> > There's always only one in between two reboots. I've sent two
> > already, but here they are:
> [...]
>
> I dug up the traces from before I switched to debian (thinking
> getting a newer kernel would improve matters) in case it helps:
>
> And:
>
> Jun  5 00:58:10 BUG: Bad page state in process rsync  pfn:1bfdf
> Jun  5 00:58:10 page:ffffea000061f8c8 count:0 mapcount:0 mapping:          (null) index:0x2300
> Jun  5 00:58:10 page flags: 0x100000000000010(dirty)
> Jun  5 00:58:10 Pid: 1584, comm: rsync Tainted: G  D   C 2.6.38-7-server #35-Ubuntu
> Jun  5 00:58:10 Call Trace:

Ok, this one is really interesting.  Did you get this after another oops
or was it after a reboot?

How easily can you recompile your kernel with more debugging flags?
That should help narrow it down.  I'm looking for CONFIG_SLAB_DEBUG (or
slub) and CONFIG_DEBUG_PAGEALLOC

-chris
2011-07-08 12:17:54 -0400, Chris Mason:
[...]
> > Jun  5 00:58:10 BUG: Bad page state in process rsync  pfn:1bfdf
> > Jun  5 00:58:10 page:ffffea000061f8c8 count:0 mapcount:0 mapping:          (null) index:0x2300
> > Jun  5 00:58:10 page flags: 0x100000000000010(dirty)
> > Jun  5 00:58:10 Pid: 1584, comm: rsync Tainted: G  D   C 2.6.38-7-server #35-Ubuntu
> > Jun  5 00:58:10 Call Trace:
>
> Ok, this one is really interesting.  Did you get this after another oops
> or was it after a reboot?

After the oops above (a few hours after, though). The two reports were
together with nothing in between in the kern.log. That was the only
occurrence, though.

> How easily can you recompile your kernel with more debugging flags?
> That should help narrow it down.  I'm looking for CONFIG_SLAB_DEBUG (or
> slub) and CONFIG_DEBUG_PAGEALLOC
[...]

I can try next week.

--
Stephane
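[Editorial note: the debug options discussed above can be checked against the running kernel before rebuilding. A minimal sketch; the config file path is an assumption (Debian/Ubuntu keep it under /boot), and note the actual Kconfig symbol for slab debugging is spelled CONFIG_DEBUG_SLAB:]

```shell
# Check whether the running kernel was already built with the slab/page
# debug options (config location varies by distro; /proc/config.gz is an
# alternative when CONFIG_IKCONFIG_PROC is enabled).
grep -E 'CONFIG_(DEBUG_SLAB|DEBUG_SLAB_LEAK|SLUB_DEBUG|DEBUG_PAGEALLOC)' \
    "/boot/config-$(uname -r)"

# With a SLUB kernel, full runtime poisoning/tracking can also be turned
# on at boot, without a rebuild, via the kernel command line:
#   slub_debug=FZPU
```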
2011-07-08 12:15:08 -0400, Chris Mason:
[...]
> You described this workload as rsync, is there anything else running?
[...]

Nope. Nothing else. And at least initially, that was onto an empty
drive, so a basic copy:

rsync --archive --xattrs --hard-links --numeric-ids --sparse --acls

Cheers,
Stephane
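[Editorial note: the full invocation as described would look roughly like this; the source and destination paths are hypothetical, only the flags come from the report:]

```shell
# Sketch of the weekly transfer: archive mode preserves permissions,
# ownership, timestamps and symlinks; the extra flags carry xattrs,
# hardlinks, raw numeric uid/gid, sparse files and POSIX ACLs across.
SRC=/backup/staging/      # read-only snapshots selected for copying (assumed)
DST=/mnt/external/        # LUKS-backed btrfs mounted with compression (assumed)

rsync --archive --xattrs --hard-links --numeric-ids --sparse --acls \
    "$SRC" "$DST"
```

The trailing slash on the source makes rsync copy the directory's contents rather than the directory itself.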
2011-07-08 12:15:08 -0400, Chris Mason:
[...]
> I'd definitely try without -o compress_force.
[...]

Just started that over the night.

I'm running a "dstat -df" at the same time and I'm seeing a
substantial amount of disk writes on the disks that hold the
source FS (and I'm rsyncing from read-only snapshot subvolumes
in case you're thinking of atimes) almost more than onto the
drive holding the target FS!?

--dsk/sda-- --dsk/sdb-- --dsk/sdc-- --dsk/sdd--
 read  writ: read  writ: read  writ: read  writ
1000k    0 : 580k    0 :2176k    0 :   0     0
1192k    0 :  76k    0 : 988k    0 :   0     0
 436k    0 : 364k    0 :1984k    0 :   0    33M
 824k    0 : 812k    0 :4276k    0 :   0     0
3004k    0 :2868k    0 :5488k    0 :   0     0
 796k  564k: 640k   25M:2284k 4600k:   0     0
 108k    0 :  68k   23M: 648k   35M:   0     0
1712k   12k:1644k   12k:2476k 7864k:   0     0
 732k    0 : 620k    0 :3192k    0 :   0     0
1148k    0 :1116k    0 :3532k    0 :   0    19M
1392k    0 :1380k    0 :4416k    0 :   0  7056k
1336k    0 :1212k    0 :6664k    0 :   0  3148k
 820k    0 : 784k    0 :4528k    0 :   0    48M
1336k    0 :1340k    0 :3964k    0 :   0  8996k
1528k    0 :1280k    0 :2908k    0 :   0     0
1352k    0 :1216k    0 :3880k    0 :   0     0
 864k    0 : 888k    0 :1684k    0 :   0     0
1268k    0 :1208k    0 :3072k    0 :   0     0

(the source FS is on sda4+sdb+sdc, the target on sdd; sda also
has an ext4 FS)

How can that be? Some garbage collection, background defrag or
something like that? But then, if I stop the rsync, those writes
stop as well.

--
Stephane
Excerpts from Stephane Chazelas's message of 2011-07-08 16:04:12 -0400:
> 2011-07-08 12:15:08 -0400, Chris Mason:
> [...]
> > I'd definitely try without -o compress_force.
> [...]
>
> Just started that over the night.
>
> I'm running a "dstat -df" at the same time and I'm seeing a
> substantial amount of disk writes on the disks that hold the
> source FS (and I'm rsyncing from read-only snapshot subvolumes
> in case you're thinking of atimes) almost more than onto the
> drive holding the target FS!?

These are probably atime updates.  You can completely disable them with
mount -o noatime.

-chris
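[Editorial note: ruling out atime updates doesn't require a remount cycle; a sketch, with an assumed mount point:]

```shell
# Disable atime updates on an already-mounted filesystem in place
# (mount point is hypothetical):
mount -o remount,noatime /mnt/backups

# Verify the active options afterwards:
grep /mnt/backups /proc/mounts
```

noatime suppresses all access-time writes, including the read-triggered inode updates that relatime still allows once per day or when atime falls behind mtime.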
2011-07-08 16:12:28 -0400, Chris Mason:
[...]
> > I'm running a "dstat -df" at the same time and I'm seeing a
> > substantial amount of disk writes on the disks that hold the
> > source FS (and I'm rsyncing from read-only snapshot subvolumes
> > in case you're thinking of atimes) almost more than onto the
> > drive holding the target FS!?
>
> These are probably atime updates.  You can completely disable them with
> mount -o noatime.
[...]

I don't think it is. First, as I said, I'm rsyncing from
read-only snapshots (and I could see atimes were not updated)
and nothing else is running. Then, looking at the traces this
morning, there was a lot written in the first minutes, and then
it calmed down. That's every 5 minutes (in megabytes written to
the hard disks making up the source FS):

976 *|**|**
1786 *|****|****
2486 *|******|******
3139 *|********|********
3720 *|*********|*********
4226 *|**********|**********
4756 *|************|***********
5134 **|************|***********
5573 ***|*************|************
5991 ****|**************|************
6423 ****|***************|*************
6580 ****|****************|*************
6598 ****|****************|*************
6602 ****|****************|*************
6671 ****|****************|*************
6774 ****|****************|**************
6887 ****|****************|**************
6928 ****|****************|**************
6943 ****|*****************|**************
6954 ****|*****************|**************
6969 ****|*****************|**************
6982 ****|*****************|**************
6998 ****|*****************|**************
7016 ****|*****************|**************
7106 ****|*****************|**************
7143 ****|*****************|**************
7184 ****|*****************|***************
7216 ****|*****************|***************
7244 ****|*****************|***************
7315 ****|*****************|***************
7347 ****|******************|***************
7358 ****|******************|***************
7418 ****|******************|***************
7432 ****|******************|***************
7438 ****|******************|***************
7438 ****|******************|***************
7462 ****|******************|***************
7483 ****|******************|***************
7528 ****|******************|***************
7529 ****|******************|***************
7618 ****|******************|****************
7869 ****|*******************|****************
8325 *****|********************|******************
8419 *****|********************|******************
8533 *****|*********************|******************
8587 *****|*********************|******************
8731 *****|*********************|*******************
8830 *****|*********************|*******************
8869 *****|*********************|*******************
8881 *****|**********************|*******************
8903 *****|**********************|*******************
8974 *****|**********************|*******************
9044 *****|**********************|*******************
9100 *****|**********************|********************
9492 *****|***********************|*********************
9564 *****|***********************|*********************
9640 *****|************************|*********************
9684 *****|************************|*********************
9756 *****|************************|*********************
9803 *****|************************|*********************
9836 *****|************************|*********************
9845 *****|************************|**********************
9881 *****|************************|**********************
9989 *****|************************|**********************
10083 *****|*************************|**********************
10172 *****|*************************|**********************
10230 *****|*************************|**********************
10250 *****|*************************|***********************
10271 *****|*************************|***********************
10291 *****|*************************|***********************
10306 *****|*************************|***********************
10314 *****|*************************|***********************
10330 *****|*************************|***********************
10395 *****|*************************|***********************
10468 *****|**************************|***********************
10641 *****|**************************|***********************
10728 *****|**************************|************************
10818 *****|**************************|************************
10854 *****|***************************|************************
10855 *****|***************************|************************
10867 *****|***************************|************************
10876 *****|***************************|************************
10937 *****|***************************|************************
10937 *****|***************************|************************
10939 *****|***************************|************************
10950 *****|***************************|************************
10979 *****|***************************|************************
11108 *****|***************************|*************************
11125 *****|***************************|*************************
11161 *****|***************************|*************************
11235 *****|****************************|*************************
11270 *****|****************************|*************************
11284 *****|****************************|*************************
11298 *****|****************************|*************************
11465 *****|****************************|**************************
11608 *****|****************************|**************************
11655 *****|*****************************|**************************
11690 *****|*****************************|**************************
11776 *****|*****************************|**************************
11829 *****|*****************************|***************************
11915 *****|*****************************|***************************
11988 *****|*****************************|***************************
12040 *****|******************************|***************************
12068 *****|******************************|***************************
12080 *****|******************************|***************************
12139 *****|******************************|***************************
12216 *****|******************************|***************************
12305 *****|******************************|****************************
12405 *****|******************************|****************************
12423 *****|*******************************|****************************
12509 *****|*******************************|****************************
12568 *****|*******************************|****************************
12601 *****|*******************************|****************************
12669 *****|*******************************|*****************************
12690 *****|*******************************|*****************************
12799 *****|*******************************|*****************************
12802 *****|*******************************|*****************************
12856 *****|********************************|*****************************
12875 *****|********************************|*****************************
12891 *****|********************************|*****************************
12904 *****|********************************|*****************************
12932 *****|********************************|*****************************
12946 *****|********************************|*****************************
12971 *****|********************************|*****************************
12998 *****|********************************|*****************************
13014 *****|********************************|*****************************
13028 *****|********************************|*****************************
13036 *****|********************************|******************************
13055 *****|********************************|******************************
13064 *****|********************************|******************************
13074 *****|********************************|******************************
13124 *****|********************************|******************************
13167 *****|********************************|******************************
13178 *****|********************************|******************************

# btrfs fi show /dev/sda4
Label: none  uuid: b3ce8b16-970e-4ba8-b9d2-4c7de270d0f1
	Total devices 3 FS bytes used 3.71TB
	devid    2 size 2.73TB used 1.30TB path /dev/sdb
	devid    1 size 2.70TB used 1.27TB path /dev/sda4
	devid    3 size 2.73TB used 1.27TB path /dev/sdc

I don't know why it would write that much less to /dev/sda4.

--
Stephane
Stephane Chazelas
2011-Jul-09 07:42 UTC
A lot of writing to FS only read (Was: Memory leak?)
2011-07-09 08:09:55 +0100, Stephane Chazelas:
> 2011-07-08 16:12:28 -0400, Chris Mason:
> [...]
> > > I'm running a "dstat -df" at the same time and I'm seeing a
> > > substantial amount of disk writes on the disks that hold the
> > > source FS (and I'm rsyncing from read-only snapshot subvolumes
> > > in case you're thinking of atimes) almost more than onto the
> > > drive holding the target FS!?
[...]

One thing I didn't mention is that before doing the rsync, I
deleted a few snapshot volumes (those were read-only snapshots of
read-only snapshots) and recreated them, and those are the ones
I'm rsyncing from (basically, I prepare an area to be rsynced,
made of the latest snapshots of a few subvolumes).

--
Stephane
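[Editorial note: the staging step described above would look roughly like this; subvolume paths and names are hypothetical, not from the thread:]

```shell
# Rebuild the staging area from the latest snapshot of each subvolume.
# Deleting a subvolume and creating a read-only snapshot both cause real
# metadata writes on the source FS, which could account for some of the
# write traffic seen during the copy.
btrfs subvolume delete /backup/staging/hostA

btrfs subvolume snapshot -r \
    /backup/hostA/snap-latest /backup/staging/hostA
```

Subvolume deletion in btrfs is also cleaned up asynchronously in the background, so its I/O can show up well after the delete command returns.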
2011-07-08 11:06:08 -0400, Chris Mason:
[...]
> I would do two things.  First, I'd turn off compress_force.  There's no
> explicit reason for this, it just seems like the most likely place for
> a bug.
[...]

I couldn't reproduce it with compress_force turned off: the
inode_cache reached 600MB but never stayed high.

Unfortunately, not using compress_force is not an option for me
though.

--
Stephane
2011-07-08 12:17:54 -0400, Chris Mason:
[...]
> How easily can you recompile your kernel with more debugging flags?
> That should help narrow it down.  I'm looking for CONFIG_SLAB_DEBUG (or
> slub) and CONFIG_DEBUG_PAGEALLOC
[...]

I tried that (with CONFIG_DEBUG_SLAB_LEAK as well) but it made no
difference whatsoever.

--
Stephane
On Sat, Jul 9, 2011 at 11:09 AM, Stephane Chazelas
<stephane_chazelas@yahoo.fr> wrote:
> 2011-07-08 11:06:08 -0400, Chris Mason:
> [...]
> > I would do two things.  First, I'd turn off compress_force.  There's no
> > explicit reason for this, it just seems like the most likely place for
> > a bug.
> [...]
>
> I couldn't reproduce it with compress_force turned off, the
> inode_cache reached 600MB but never stayed high.
>
> Not using compress_force is not an option for me though
> unfortunately.

Disabling compression doesn't decompress everything that's already
compressed.
2011-07-09 13:25:00 -0600, cwillu:
> On Sat, Jul 9, 2011 at 11:09 AM, Stephane Chazelas
> <stephane_chazelas@yahoo.fr> wrote:
> > 2011-07-08 11:06:08 -0400, Chris Mason:
> > [...]
> > > I would do two things.  First, I'd turn off compress_force.  There's no
> > > explicit reason for this, it just seems like the most likely place for
> > > a bug.
> > [...]
> >
> > I couldn't reproduce it with compress_force turned off, the
> > inode_cache reached 600MB but never stayed high.
> >
> > Not using compress_force is not an option for me though
> > unfortunately.
>
> Disabling compression doesn't decompress everything that's already
> compressed.

Yes. But the very issue here is that I get this problem when I
copy data onto an empty drive. If I don't enable compression
there, it simply doesn't fit. In this very case, support for
compression is the main reason why I use btrfs here.

--
Stephane
2011-07-09 08:09:55 +0100, Stephane Chazelas:
> 2011-07-08 16:12:28 -0400, Chris Mason:
> [...]
> > > I'm running a "dstat -df" at the same time and I'm seeing a
> > > substantial amount of disk writes on the disks that hold the
> > > source FS (and I'm rsyncing from read-only snapshot subvolumes
> > > in case you're thinking of atimes) almost more than onto the
> > > drive holding the target FS!?
> >
> > These are probably atime updates.  You can completely disable them with
> > mount -o noatime.
> [...]
>
> I don't think it is. First, as I said, I'm rsyncing from
> read-only snapshots (and I could see atimes were not updated)
> and nothing else is running. Then, looking at the traces this
> morning, there was a lot written in the first minutes, and then
> it calmed down.
[...]

How embarrassing, sorry. In that instance I wasn't rsyncing from
the right place, so I was indeed copying non-readonly volumes
before copying read-only ones. So those writes were probably due
to atimes after all.

--
Stephane
Excerpts from Stephane Chazelas's message of 2011-07-09 16:36:50 -0400:
> 2011-07-09 13:25:00 -0600, cwillu:
> > On Sat, Jul 9, 2011 at 11:09 AM, Stephane Chazelas
> > <stephane_chazelas@yahoo.fr> wrote:
> > > 2011-07-08 11:06:08 -0400, Chris Mason:
> > > [...]
> > > > I would do two things.  First, I'd turn off compress_force.  There's no
> > > > explicit reason for this, it just seems like the most likely place for
> > > > a bug.
> > > [...]
> > >
> > > I couldn't reproduce it with compress_force turned off, the
> > > inode_cache reached 600MB but never stayed high.
> > >
> > > Not using compress_force is not an option for me though
> > > unfortunately.
> >
> > Disabling compression doesn't decompress everything that's already
> > compressed.
>
> Yes. But the very issue here is that I get this problem when I
> copy data onto an empty drive. If I don't enable compression
> there, it simply doesn't fit. In this very case, support for
> compression is the main reason why I use btrfs here.

Great, we're on the right track.  Does it trigger with mount -o compress
instead of mount -o compress_force?  This is very likely to be a bug in
compress_force, so the extra verification will help.

-chris
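[Editorial note: the two modes being compared differ in one behavior: plain compress gives up on extents that don't compress well, while compress-force always runs the compressor. A sketch of the comparison, with an assumed device and mount point; note the option is spelled with a hyphen on the mount command line:]

```shell
# Test case Chris asked for: adaptive compression only
mount -o compress /dev/sdd /mnt/external

# The original configuration: compress everything unconditionally
# mount -o compress-force /dev/sdd /mnt/external
```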
2011-07-10 08:44:34 -0400, Chris Mason:
[...]
> Great, we're on the right track.  Does it trigger with mount -o compress
> instead of mount -o compress_force?
[...]

It does trigger. I get that same "invalid opcode".

BTW, I tried with CONFIG_SLUB and slub_debug, and got no more
useful information than with SLAB_DEBUG.

I'm trying now without dmcrypt. After that I won't have much
bandwidth for testing.

--
Stephane
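[Editorial note: since the leak shows up as btrfs_inode_cache growth in the slabinfo dumps further down, a small helper like this can track it without hand-reading /proc/slabinfo. The field layout assumed here is the standard one (name, active_objs, num_objs, objsize, ...); the function name is made up:]

```shell
# Print active objects, total objects, and approximate memory held by
# the btrfs inode cache, computed as num_objs * objsize.
slab_usage() {
  awk '/^btrfs_inode_cache / {
    printf "%s active=%s total=%s approx_mem=%.1fMB\n",
           $1, $2, $3, $3 * $4 / 1048576
  }' "$1"
}

# On a live system (readable by root only on hardened kernels);
# run periodically, e.g. under: watch -n 60
if [ -r /proc/slabinfo ]; then
  slab_usage /proc/slabinfo
fi
```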
2011-07-10 19:37:28 +0100, Stephane Chazelas:
> 2011-07-10 08:44:34 -0400, Chris Mason:
> [...]
> > Great, we're on the right track.  Does it trigger with mount -o compress
> > instead of mount -o compress_force?
> [...]
>
> It does trigger. I get that same "invalid opcode".
>
> BTW, I tried with CONFIG_SLUB and slub_debug and no more useful
> information than with SLAB_DEBUG.
>
> I'm trying now without dmcrypt. Then I won't have much bandwidth
> for testing.
[...]

Same without dmcrypt.

So to sum up, BUG() is reached in the btrfs-fixup thread when doing:

- an rsync (though, back when on ubuntu and 2.6.38, I also got at
  least one occurrence using bsdtar | bsdtar)
- of a large amount of data (with a large number of files), though the
  bug occurs quite early, probably after having transferred about
  50-100GB
- with the source FS being btrfs with compress-force, on 3 devices
  (one of which slightly shorter than the others), with a lot of
  subvolumes and snapshots (I'm now copying from read-only snapshots,
  but it happened with RW ones as well)
- to a newly created btrfs FS
- on one device (/dev/sdd or dmcrypt)
- mounted with compress or compress-force
- noatime on either source or dest doesn't make a difference (wrt the
  occurrence of the fixup BUG())
- I can't reproduce it when the dest is not mounted with compress
- besides that BUG():
  - kernel memory is being used up (mostly in btrfs_inode_cache) and
    can't be reclaimed (leading to a crash with oom killing everybody)
  - the target FS can be unmounted, but that does not reclaim memory.
    However, the *source* FS (which is not the one we tried with and
    without compress) cannot be unmounted (umount hangs, see another
    email for its stack trace)
- the only way to get out of there is a reboot with sysrq-b
- it happens with 2.6.38, 2.6.39, 3.0.0rc6
- CONFIG_SLAB_DEBUG, CONFIG_DEBUG_PAGEALLOC, CONFIG_DEBUG_SLAB_LEAK,
  slub_debug don't tell us anything useful (there's more info in
  /proc/slabinfo when CONFIG_SLAB_DEBUG is on, see below)
- it happens with CONFIG_SLUB as well

slabinfo sampled every 60-70 seconds or so, including the
"globalstats":

slabinfo - version: 2.1 (statistics)
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail> : globalstat <listallocs> <maxobjs> <grown> <reaped> <error> <maxfreeable> <nodeallocs> <remotefrees> <alienoverflow> : cpustat <allochit> <allocmiss> <freehit> <freemiss>
btrfs_inode_cache 77610 77610 4096 1 1 : tunables 24 12 8 : slabdata 77610 77610 0 : globalstat 77610 77610 77610 0 0 0 0 0 0 : cpustat 104 77609 98 5
btrfs_inode_cache 165696 165696 4096 1 1 : tunables 24 12 8 : slabdata 165696 165696 0 : globalstat 174592 166889 173117 0 0 37 8 0 0 : cpustat 14375 174178 21198 1659
btrfs_inode_cache 173906 173906 4096 1 1 : tunables 24 12 8 : slabdata 173906 173906 0 : globalstat 231342 196133 228848 8 0 37 8 0 0 : cpustat 24914 230649 75318 6338
btrfs_inode_cache 201190 201190 4096 1 1 : tunables 24 12 8 : slabdata 201190 201190 0 : globalstat 338963 201190 331454 8 0 38 11 0 0 : cpustat 53954 335583 173512 14834
btrfs_inode_cache 224106 224143 4096 1 1 : tunables 24 12 8 : slabdata 224106 224143 96 : globalstat 453173 267189 442101 8 0 38 13 0 0 : cpustat 77063 448023 277242 23875
btrfs_inode_cache 126520 126520 4096 1 1 : tunables 24 12 8 : slabdata 126520 126520 0 : globalstat 486327 267189 472461 32 0 38 13 0 0 : cpustat 96675 479904 414073 35992
btrfs_inode_cache 144723 144723 4096 1 1 : tunables 24 12 8 : slabdata 144723 144723 0 : globalstat 537600 267189 521248 32 0 38 15 0 0 : cpustat 114446 530048 459922 39849
btrfs_inode_cache 176590 176590 4096 1 1 : tunables 24 12 8 : slabdata 176590 176590 0 : globalstat 626027 267189 605212 32 0 38 35 0 0 : cpustat 142336 616188 535659 46275
btrfs_inode_cache 225715 225752 4096 1 1 : tunables 24 12 8 : slabdata 225715 225752 96 : globalstat 766387 267189 739439 32 0 38 60 0 0 : cpustat 181607 753564 653165 56404
btrfs_inode_cache 179039 179076 4096 1 1 : tunables 24 12 8 : slabdata 179039 179076 84 : globalstat 821296 267189 793315 48 0 38 60 0 0 : cpustat 189640 808027 753396 65349
btrfs_inode_cache 139572 139609 4096 1 1 : tunables 24 12 8 : slabdata 139572 139609 96 : globalstat 890513 267189 858553 56 0 38 60 0 0 : cpustat 214964 875265 874796 75992
btrfs_inode_cache 122064 122101 4096 1 1 : tunables 24 12 8 : slabdata 122064 122101 96 : globalstat 936515 267189 903015 72 0 38 66 0 0 : cpustat 230345 920877 947006 82282
btrfs_inode_cache 136431 136468 4096 1 1 : tunables 24 12 8 : slabdata 136431 136468 96 : globalstat 1001394 267189 963758 88 0 38 67 0 0 : cpustat 256274 983686 1015526 88131
btrfs_inode_cache 111428 111465 4096 1 1 : tunables 24 12 8 : slabdata 111428 111465 96 : globalstat 1047868 267189 1007092 88 0 38 67 0 0 : cpustat 278315 1028765 1100279 95502
btrfs_inode_cache 143308 143308 4096 1 1 : tunables 24 12 8 : slabdata 143308 143308 0 : globalstat 1122732 267189 1078759 88 0 38 67 0 0 : cpustat 298923 1102028 1157303 100340
btrfs_inode_cache 108761 108798 4096 1 1 : tunables 24 12 8 : slabdata 108761 108798 96 : globalstat 1180806 267189 1133334 104 0 38 67 0 0 : cpustat 319122 1158203 1259376 109306
btrfs_inode_cache 144115 144152 4096 1 1 : tunables 24 12 8 : slabdata 144115 144152 84 : globalstat 1259003 267189 1208854 104 0 38 76 0 0 : cpustat 338025 1235163 1315124 114040
btrfs_inode_cache 138492 138529 4096 1 1 : tunables 24 12 8 : slabdata 138492 138529 36 : globalstat 1316265 267189 1264558 120 0 38 76 0 0 : cpustat 352376 1291923 1385679 120174
btrfs_inode_cache 125898 125898 4096 1 1 : tunables 24 12 8 : slabdata 125898 125898 0 : globalstat 1379943 267189 1323790 120 0 38 76 0 0 : cpustat 377547 1353189 1476746 128097
btrfs_inode_cache 143074 143111 4096 1 1 : tunables 24 12 8 : slabdata 143074 143111 72 : globalstat 1460219 267189 1399985 120 0 38 84 0 0 : cpustat 403813 1431515 1557263 135070
btrfs_inode_cache 140767 140804 4096 1 1 : tunables 24 12 8 : slabdata 140767 140804 96 : globalstat 1530279 267189 1466920 120 0 38 89 0 0 : cpustat 426902 1500267 1643922 142597
btrfs_inode_cache 122322 122359 4096 1 1 : tunables 24 12 8 : slabdata 122322 122359 96 : globalstat 1580762 267189 1514531 120 0 38 89 0 0 : cpustat 450503 1549707 1728127 149872
btrfs_inode_cache 167295 167332 4096 1 1 : tunables 24 12 8 : slabdata 167295 167332 72 : globalstat 1685301 267189 1613571 120 0 38 91 0 0 : cpustat 488967 1651795 1816243 157318
btrfs_inode_cache 129692 129729 4096 1 1 : tunables 24 12 8 : slabdata 129692 129729 84 : globalstat 1752216 267189 1677321 120 0 38 105 0 0 : cpustat 511333 1717315 1931582 167479
btrfs_inode_cache 133930 133967 4096 1 1 : tunables 24 12 8 : slabdata 133930 133967 84 : globalstat 1815698 267189 1736104 136 0 38 105 0 0 : cpustat 542451 1778567 2012803 174401
btrfs_inode_cache 115171 115208 4096 1 1 : tunables 24 12 8 : slabdata 115171 115208 0 : globalstat 1863438 267189 1780512 144 0 38 110 0 0 : cpustat 566046 1824830 2094266 181449
btrfs_inode_cache 149671 149671 4096 1 1 : tunables 24 12 8 : slabdata 149671 149671 0 : globalstat 1936523 267189 1850671 152 0 38 111 0 0 : cpustat 589089 1896761 2150041 186138
btrfs_inode_cache 117604 117641 4096 1 1 : tunables 24 12 8 : slabdata 117604 117641 96 : globalstat 1985533 267189 1896777 152 0 38 118 0 0 : cpustat 609612 1944469 2242380 194207
btrfs_inode_cache 123673 123710 4096 1 1 : tunables 24 12 8 : slabdata 123673 123710 96 : globalstat 2061550 267189 1969892 152 0 38 118 0 0 : cpustat 628820 2019061 2323136 201208
btrfs_inode_cache 106536 106536 4096 1 1 : tunables 24 12 8 : slabdata 106536 106536 0 : globalstat 2123063 267189 2027404 152 0 38
128 0 0 : cpustat 652365 2078445 2415076 209202 btrfs_inode_cache 123158 123195 4096 1 1 : tunables 24 12 8 : slabdata 123158 123195 96 : globalstat 2192496 267189 2091447 152 0 38 129 0 0 : cpustat 684828 2145107 2491235 215680 btrfs_inode_cache 111398 111435 4096 1 1 : tunables 24 12 8 : slabdata 111398 111435 96 : globalstat 2241767 267189 2137427 160 0 38 130 0 0 : cpustat 706721 2192791 2566095 222131 btrfs_inode_cache 116173 116210 4096 1 1 : tunables 24 12 8 : slabdata 116173 116210 48 : globalstat 2322912 267189 2214790 176 0 38 132 0 0 : cpustat 728839 2271902 2654784 229845 btrfs_inode_cache 105739 105739 4096 1 1 : tunables 24 12 8 : slabdata 105739 105739 0 : globalstat 2375200 267189 2264543 200 0 38 133 0 0 : cpustat 750581 2323318 2731690 236469 btrfs_inode_cache 99891 99928 4096 1 1 : tunables 24 12 8 : slabdata 99891 99928 96 : globalstat 2428901 267189 2316644 200 0 38 144 0 0 : cpustat 767974 2376721 2802354 242572 btrfs_inode_cache 109472 109509 4096 1 1 : tunables 24 12 8 : slabdata 109472 109509 96 : globalstat 2481915 267189 2367557 200 0 38 156 0 0 : cpustat 783035 2428814 2855341 247151 btrfs_inode_cache 106089 106126 4096 1 1 : tunables 24 12 8 : slabdata 106089 106126 84 : globalstat 2539989 267189 2422799 200 0 38 156 0 0 : cpustat 805498 2485800 2931573 253725 btrfs_inode_cache 104047 104084 4096 1 1 : tunables 24 12 8 : slabdata 104047 104084 96 : globalstat 2595552 267189 2475236 200 0 38 163 0 0 : cpustat 826719 2539916 3002833 259878 btrfs_inode_cache 126659 126696 4096 1 1 : tunables 24 12 8 : slabdata 126659 126696 96 : globalstat 2674478 267189 2550480 200 0 38 165 0 0 : cpustat 851791 2617102 3076213 266133 btrfs_inode_cache 94805 94842 4096 1 1 : tunables 24 12 8 : slabdata 94805 94842 96 : globalstat 2716094 267189 2589363 208 0 38 171 0 0 : cpustat 869515 2657400 3158839 273386 btrfs_inode_cache 136610 136610 4096 1 1 : tunables 24 12 8 : slabdata 136610 136610 72 : globalstat 2808055 267189 2675923 208 0 38 175 0 0 : cpustat 
902598 2746580 3233055 279621 btrfs_inode_cache 57696 57733 4096 1 1 : tunables 24 12 8 : slabdata 57696 57733 96 : globalstat 2830215 267189 2697211 320 0 38 176 0 0 : cpustat 908085 2768282 3330428 288354 btrfs_inode_cache 18159 18196 4096 1 1 : tunables 24 12 8 : slabdata 18159 18196 96 : globalstat 2830274 267189 2697233 496 0 38 176 0 0 : cpustat 908209 2768316 3366842 291639 btrfs_inode_cache 41004 41004 4096 1 1 : tunables 24 12 8 : slabdata 41004 41004 0 : globalstat 2857575 267189 2723805 496 0 38 179 0 0 : cpustat 914262 2795352 3376245 292389 btrfs_inode_cache 93730 93767 4096 1 1 : tunables 24 12 8 : slabdata 93730 93767 84 : globalstat 2921315 267189 2786290 504 0 38 179 0 0 : cpustat 929670 2858965 3400706 294314 btrfs_inode_cache 57785 57822 4096 1 1 : tunables 24 12 8 : slabdata 57785 57822 96 : globalstat 2948640 267189 2812472 600 0 38 179 0 0 : cpustat 937340 2885746 3465355 300056 btrfs_inode_cache 19440 19477 4096 1 1 : tunables 24 12 8 : slabdata 19440 19477 96 : globalstat 2948640 267189 2812472 776 0 38 179 0 0 : cpustat 937340 2885746 3500561 303227 btrfs_inode_cache 13020 13057 4096 1 1 : tunables 24 12 8 : slabdata 13020 13057 96 : globalstat 2948640 267189 2812472 952 0 38 179 0 0 : cpustat 937340 2885746 3506433 303739 btrfs_inode_cache 9951 9981 4096 1 1 : tunables 24 12 8 : slabdata 9951 9981 76 : globalstat 2948640 267189 2812472 1128 0 38 179 0 0 : cpustat 937340 2885746 3509252 303970 btrfs_inode_cache 8508 8537 4096 1 1 : tunables 24 12 8 : slabdata 8508 8537 36 : globalstat 2948640 267189 2812472 1304 0 38 179 0 0 : cpustat 937342 2885746 3510539 304077 btrfs_inode_cache 8077 8101 4096 1 1 : tunables 24 12 8 : slabdata 8077 8101 96 : globalstat 2948640 267189 2812472 1469 0 38 179 0 0 : cpustat 937342 2885746 3511014 304114 btrfs_inode_cache 6656 6693 4096 1 1 : tunables 24 12 8 : slabdata 6656 6693 96 : globalstat 2948664 267189 2812472 1643 0 38 179 0 0 : cpustat 937385 2885751 3512371 304225 btrfs_inode_cache 6747 6781 4096 1 
1 : tunables 24 12 8 : slabdata 6747 6781 76 : globalstat 2949480 267189 2813259 1798 0 38 179 0 0 : cpustat 937488 2886550 3513094 304280 btrfs_inode_cache 6257 6294 4096 1 1 : tunables 24 12 8 : slabdata 6257 6294 96 : globalstat 2949480 267189 2813259 1899 0 38 179 0 0 : cpustat 937488 2886550 3513575 304318 btrfs_inode_cache 4848 4848 4096 1 1 : tunables 24 12 8 : slabdata 4848 4848 0 : globalstat 2949480 267189 2813259 2033 0 38 179 0 0 : cpustat 937488 2886550 3514773 304417 btrfs_inode_cache 4839 4839 4096 1 1 : tunables 24 12 8 : slabdata 4839 4839 0 : globalstat 2949480 267189 2813259 2042 0 38 179 0 0 : cpustat 937488 2886550 3514782 304417 btrfs_inode_cache 4806 4814 4096 1 1 : tunables 24 12 8 : slabdata 4806 4814 6 : globalstat 2949482 267189 2813261 2069 0 38 179 0 0 : cpustat 937488 2886552 3514827 304418 btrfs_inode_cache 4490 4497 4096 1 1 : tunables 24 12 8 : slabdata 4490 4497 0 : globalstat 2949482 267189 2813261 2236 0 38 179 0 0 : cpustat 937494 2886556 3515125 304435 btrfs_inode_cache 4054 4085 4096 1 1 : tunables 24 12 8 : slabdata 4054 4085 56 : globalstat 2949482 267189 2813261 2430 0 38 179 0 0 : cpustat 937495 2886559 3515591 304466 btrfs_inode_cache 3972 3989 4096 1 1 : tunables 24 12 8 : slabdata 3972 3989 64 : globalstat 2949482 267189 2813261 2499 0 38 179 0 0 : cpustat 937495 2886559 3515690 304473 btrfs_inode_cache 3698 3727 4096 1 1 : tunables 24 12 8 : slabdata 3698 3727 36 : globalstat 2949487 267189 2813266 2600 0 38 179 0 0 : cpustat 937496 2886564 3515912 304488 btrfs_inode_cache 3189 3189 4096 1 1 : tunables 24 12 8 : slabdata 3189 3189 0 : globalstat 2949488 267189 2813266 2784 0 38 179 0 0 : cpustat 937496 2886568 3516353 304522 btrfs_inode_cache 3193 3193 4096 1 1 : tunables 24 12 8 : slabdata 3193 3193 0 : globalstat 2949492 267189 2813270 2784 0 38 179 0 0 : cpustat 937496 2886572 3516353 304522 btrfs_inode_cache 2861 2861 4096 1 1 : tunables 24 12 8 : slabdata 2861 2861 0 : globalstat 2949494 267189 2813272 2917 0 38 
179 0 0 : cpustat 937496 2886574 3516664 304545 btrfs_inode_cache 2706 2706 4096 1 1 : tunables 24 12 8 : slabdata 2706 2706 0 : globalstat 2949494 267189 2813272 3021 0 38 179 0 0 : cpustat 937496 2886574 3516809 304555 btrfs_inode_cache 2709 2709 4096 1 1 : tunables 24 12 8 : slabdata 2709 2709 0 : globalstat 2949497 267189 2813275 3021 0 38 179 0 0 : cpustat 937496 2886577 3516809 304555 btrfs_inode_cache 2610 2610 4096 1 1 : tunables 24 12 8 : slabdata 2610 2610 0 : globalstat 2949505 267189 2813283 3086 0 38 179 0 0 : cpustat 937496 2886585 3516909 304562 btrfs_inode_cache 2610 2610 4096 1 1 : tunables 24 12 8 : slabdata 2610 2610 0 : globalstat 2949505 267189 2813283 3086 0 38 179 0 0 : cpustat 937496 2886585 3516909 304562 btrfs_inode_cache 2502 2502 4096 1 1 : tunables 24 12 8 : slabdata 2502 2502 0 : globalstat 2949506 267189 2813284 3163 0 38 179 0 0 : cpustat 937496 2886586 3517010 304570 btrfs_inode_cache 2493 2496 4096 1 1 : tunables 24 12 8 : slabdata 2493 2496 6 : globalstat 2949508 267189 2813286 3171 0 38 179 0 0 : cpustat 937496 2886588 3517035 304571 btrfs_inode_cache 2350 2350 4096 1 1 : tunables 24 12 8 : slabdata 2350 2350 0 : globalstat 2949508 267189 2813286 3266 0 38 179 0 0 : cpustat 937496 2886588 3517154 304580 btrfs_inode_cache 2237 2249 4096 1 1 : tunables 24 12 8 : slabdata 2237 2249 1 : globalstat 2949510 267189 2813288 3330 0 38 179 0 0 : cpustat 937496 2886590 3517262 304588 btrfs_inode_cache 2238 2238 4096 1 1 : tunables 24 12 8 : slabdata 2238 2238 0 : globalstat 2949512 267189 2813290 3343 0 38 179 0 0 : cpustat 937496 2886592 3517262 304588 btrfs_inode_cache 2238 2238 4096 1 1 : tunables 24 12 8 : slabdata 2238 2238 0 : globalstat 2949512 267189 2813290 3343 0 38 179 0 0 : cpustat 937496 2886592 3517262 304588 btrfs_inode_cache 2585 2585 4096 1 1 : tunables 24 12 8 : slabdata 2585 2585 0 : globalstat 2950149 267189 2813886 3439 0 38 179 0 0 : cpustat 937538 2887195 3517538 304610 btrfs_inode_cache 43128 43165 4096 1 1 : 
tunables 24 12 8 : slabdata 43128 43165 96 : globalstat 2991028 267189 2854761 3447 0 38 179 0 0 : cpustat 938115 2928095 3518521 304676 btrfs_inode_cache 32454 32491 4096 1 1 : tunables 24 12 8 : slabdata 32454 32491 96 : globalstat 2992689 267189 2856322 3519 0 38 179 0 0 : cpustat 939186 2929734 3530831 305766 btrfs_inode_cache 17499 17536 4096 1 1 : tunables 24 12 8 : slabdata 17499 17536 96 : globalstat 2993106 267189 2856673 3679 0 38 179 0 0 : cpustat 940032 2930139 3545708 307087 ===> BUG() occurred here <== btrfs_inode_cache 15331 15368 4096 1 1 : tunables 24 12 8 : slabdata 15331 15368 96 : globalstat 2995554 267189 2859045 3839 0 38 179 0 0 : cpustat 940758 2932563 3550613 307502 btrfs_inode_cache 31639 31639 4096 1 1 : tunables 24 12 8 : slabdata 31639 31639 0 : globalstat 3014974 267189 2877988 3855 0 38 179 0 0 : cpustat 945598 2951863 3557779 308061 btrfs_inode_cache 33820 33857 4096 1 1 : tunables 24 12 8 : slabdata 33820 33857 96 : globalstat 3023329 267189 2886224 3919 0 38 179 0 0 : cpustat 948158 2960257 3565994 308729 btrfs_inode_cache 51992 51992 4096 1 1 : tunables 24 12 8 : slabdata 51992 51992 0 : globalstat 3055265 267189 2917703 3991 0 38 180 0 0 : cpustat 953344 2992092 3583268 310175 btrfs_inode_cache 57834 57871 4096 1 1 : tunables 24 12 8 : slabdata 57834 57871 96 : globalstat 3073251 267189 2935329 4079 0 38 180 0 0 : cpustat 959228 3010099 3600080 311533 btrfs_inode_cache 22835 22872 4096 1 1 : tunables 24 12 8 : slabdata 22835 22872 96 : globalstat 3073974 267189 2936003 4247 0 38 180 0 0 : cpustat 959857 3010802 3633418 314528 btrfs_inode_cache 12362 12399 4096 1 1 : tunables 24 12 8 : slabdata 12362 12399 96 : globalstat 3073974 267189 2936003 4423 0 38 180 0 0 : cpustat 960103 3010813 3643282 315396 btrfs_inode_cache 8133 8170 4096 1 1 : tunables 24 12 8 : slabdata 8133 8170 96 : globalstat 3074309 267189 2936309 4591 0 38 180 0 0 : cpustat 960590 3011150 3647944 315791 btrfs_inode_cache 6556 6585 4096 1 1 : tunables 24 12 8 : 
slabdata 6556 6585 84 : globalstat 3074598 267189 2936569 4751 0 38 180 0 0 : cpustat 961159 3011447 3650181 315967 btrfs_inode_cache 5780 5817 4096 1 1 : tunables 24 12 8 : slabdata 5780 5817 96 : globalstat 3075142 267189 2937027 4879 0 38 180 0 0 : cpustat 961655 3011946 3651838 316093 btrfs_inode_cache 4959 4960 4096 1 1 : tunables 24 12 8 : slabdata 4959 4960 0 : globalstat 3075142 267189 2937027 5055 0 38 180 0 0 : cpustat 961663 3011947 3652506 316145 btrfs_inode_cache 4674 4674 4096 1 1 : tunables 24 12 8 : slabdata 4674 4674 0 : globalstat 3075176 267189 2937027 5220 0 38 180 0 0 : cpustat 961682 3011952 3652815 316162 btrfs_inode_cache 4569 4580 4096 1 1 : tunables 24 12 8 : slabdata 4569 4580 0 : globalstat 3075448 267189 2937270 5341 0 38 180 0 0 : cpustat 961863 3012211 3653321 316196 btrfs_inode_cache 4895 4900 4096 1 1 : tunables 24 12 8 : slabdata 4895 4900 48 : globalstat 3075986 267189 2937735 5468 0 38 180 0 0 : cpustat 961990 3012691 3653642 316212 btrfs_inode_cache 2943 2966 4096 1 1 : tunables 24 12 8 : slabdata 2943 2966 6 : globalstat 3076416 267189 2938145 5628 0 38 180 0 0 : cpustat 962105 3013112 3655887 316400 btrfs_inode_cache 3694 3731 4096 1 1 : tunables 24 12 8 : slabdata 3694 3731 96 : globalstat 3077820 267189 2939533 5769 0 38 180 0 0 : cpustat 962156 3014505 3656631 316454 btrfs_inode_cache 3458 3458 4096 1 1 : tunables 24 12 8 : slabdata 3458 3458 0 : globalstat 3077826 267189 2939539 5894 0 38 180 0 0 : cpustat 962161 3014512 3656751 316464 btrfs_inode_cache 3208 3237 4096 1 1 : tunables 24 12 8 : slabdata 3208 3237 36 : globalstat 3077838 267189 2939539 6019 0 38 180 0 0 : cpustat 962167 3014516 3657029 316482 btrfs_inode_cache 3100 3113 4096 1 1 : tunables 24 12 8 : slabdata 3100 3113 0 : globalstat 3077866 267189 2939543 6130 0 38 180 0 0 : cpustat 962169 3014526 3657110 316486 btrfs_inode_cache 2967 2994 4096 1 1 : tunables 24 12 8 : slabdata 2967 2994 40 : globalstat 3077886 267189 2939545 6236 0 38 180 0 0 : cpustat 
962175 3014532 3657284 316496 btrfs_inode_cache 2840 2840 4096 1 1 : tunables 24 12 8 : slabdata 2840 2840 0 : globalstat 3077900 267189 2939547 6335 0 38 180 0 0 : cpustat 962178 3014535 3657372 316501 btrfs_inode_cache 2735 2735 4096 1 1 : tunables 24 12 8 : slabdata 2735 2735 0 : globalstat 3078008 267189 2939604 6445 0 38 180 0 0 : cpustat 962232 3014600 3657585 316512 btrfs_inode_cache 2374 2397 4096 1 1 : tunables 24 12 8 : slabdata 2374 2397 1 : globalstat 3078022 267189 2939606 6589 0 38 180 0 0 : cpustat 962237 3014607 3657935 316536 btrfs_inode_cache 2143 2172 4096 1 1 : tunables 24 12 8 : slabdata 2143 2172 36 : globalstat 3078125 267189 2939680 6707 0 38 180 0 0 : cpustat 962290 3014689 3658308 316564 btrfs_inode_cache 1522 1544 4096 1 1 : tunables 24 12 8 : slabdata 1522 1544 52 : globalstat 3078134 267189 2939680 6875 0 38 180 0 0 : cpustat 962298 3014693 3658918 316613 btrfs_inode_cache 1210 1239 4096 1 1 : tunables 24 12 8 : slabdata 1210 1239 24 : globalstat 3078138 267189 2939684 7007 0 38 180 0 0 : cpustat 962302 3014701 3659193 316634 btrfs_inode_cache 898 927 4096 1 1 : tunables 24 12 8 : slabdata 898 927 76 : globalstat 3078151 267189 2939684 7167 0 38 180 0 0 : cpustat 962338 3014709 3659562 316664 btrfs_inode_cache 714 714 4096 1 1 : tunables 24 12 8 : slabdata 714 714 12 : globalstat 3078157 267189 2939690 7268 0 38 180 0 0 : cpustat 962343 3014716 3659706 316672 btrfs_inode_cache 414 414 4096 1 1 : tunables 24 12 8 : slabdata 414 414 0 : globalstat 3078173 267189 2939694 7367 0 38 180 0 0 : cpustat 962344 3014723 3659959 316694 btrfs_inode_cache 424 424 4096 1 1 : tunables 24 12 8 : slabdata 424 424 0 : globalstat 3078183 267189 2939704 7367 0 38 180 0 0 : cpustat 962344 3014733 3659959 316694 btrfs_inode_cache 504 504 4096 1 1 : tunables 24 12 8 : slabdata 504 504 0 : globalstat 3078263 267189 2939784 7367 0 38 180 0 0 : cpustat 962344 3014813 3659959 316694 btrfs_inode_cache 528 528 4096 1 1 : tunables 24 12 8 : slabdata 528 528 0 : 
globalstat 3078287 267189 2939808 7367 0 38 180 0 0 : cpustat 962344 3014837 3659959 316694 btrfs_inode_cache 473 473 4096 1 1 : tunables 24 12 8 : slabdata 473 473 0 : globalstat 3078304 267189 2939820 7424 0 38 180 0 0 : cpustat 962350 3014851 3660029 316699 btrfs_inode_cache 493 493 4096 1 1 : tunables 24 12 8 : slabdata 493 493 0 : globalstat 3078324 267189 2939840 7424 0 38 180 0 0 : cpustat 962350 3014871 3660029 316699 btrfs_inode_cache 463 463 4096 1 1 : tunables 24 12 8 : slabdata 463 463 0 : globalstat 3078348 267189 2939852 7466 0 38 180 0 0 : cpustat 962352 3014885 3660073 316701 btrfs_inode_cache 483 483 4096 1 1 : tunables 24 12 8 : slabdata 483 483 0 : globalstat 3078368 267189 2939872 7466 0 38 180 0 0 : cpustat 962352 3014905 3660073 316701 btrfs_inode_cache 553 553 4096 1 1 : tunables 24 12 8 : slabdata 553 553 0 : globalstat 3078438 267189 2939942 7466 0 38 180 0 0 : cpustat 962352 3014975 3660073 316701 btrfs_inode_cache 537 537 4096 1 1 : tunables 24 12 8 : slabdata 537 537 0 : globalstat 3078486 267189 2939990 7527 0 38 180 0 0 : cpustat 962375 3015024 3660155 316707 btrfs_inode_cache 544 544 4096 1 1 : tunables 24 12 8 : slabdata 544 544 0 : globalstat 3078501 267189 2940004 7534 0 38 180 0 0 : cpustat 962376 3015039 3660164 316707 btrfs_inode_cache 568 568 4096 1 1 : tunables 24 12 8 : slabdata 568 568 0 : globalstat 3078525 267189 2940028 7534 0 38 180 0 0 : cpustat 962376 3015063 3660164 316707 btrfs_inode_cache 537 537 4096 1 1 : tunables 24 12 8 : slabdata 537 537 0 : globalstat 3078537 267189 2940040 7577 0 38 180 0 0 : cpustat 962380 3015075 3660209 316709 btrfs_inode_cache 531 531 4096 1 1 : tunables 24 12 8 : slabdata 531 531 0 : globalstat 3078563 267189 2940066 7609 0 38 180 0 0 : cpustat 962383 3015102 3660244 316710 btrfs_inode_cache 523 523 4096 1 1 : tunables 24 12 8 : slabdata 523 523 0 : globalstat 3078583 267189 2940082 7633 0 38 180 0 0 : cpustat 962385 3015120 3660271 316711 btrfs_inode_cache 535 535 4096 1 1 : tunables 24 
12 8 : slabdata 535 535 0 : globalstat 3078631 267189 2940130 7669 0 38 180 0 0 : cpustat 962388 3015169 3660309 316713 btrfs_inode_cache 548 548 4096 1 1 : tunables 24 12 8 : slabdata 548 548 0 : globalstat 3078699 267189 2940180 7706 0 38 180 0 0 : cpustat 962425 3015224 3660385 316716 btrfs_inode_cache 558 558 4096 1 1 : tunables 24 12 8 : slabdata 558 558 0 : globalstat 3078709 267189 2940190 7706 0 38 180 0 0 : cpustat 962425 3015234 3660385 316716 btrfs_inode_cache 568 568 4096 1 1 : tunables 24 12 8 : slabdata 568 568 0 : globalstat 3078719 267189 2940200 7706 0 38 180 0 0 : cpustat 962425 3015244 3660385 316716 btrfs_inode_cache 542 542 4096 1 1 : tunables 24 12 8 : slabdata 542 542 0 : globalstat 3078731 267189 2940208 7740 0 38 180 0 0 : cpustat 962426 3015253 3660420 316717 btrfs_inode_cache 524 524 4096 1 1 : tunables 24 12 8 : slabdata 524 524 0 : globalstat 3078743 267189 2940220 7770 0 38 180 0 0 : cpustat 962426 3015265 3660449 316718 btrfs_inode_cache 500 500 4096 1 1 : tunables 24 12 8 : slabdata 500 500 0 : globalstat 3078755 267189 2940228 7802 0 38 180 0 0 : cpustat 962427 3015275 3660485 316720 btrfs_inode_cache 467 467 4096 1 1 : tunables 24 12 8 : slabdata 467 467 0 : globalstat 3078759 267189 2940232 7839 0 38 180 0 0 : cpustat 962431 3015280 3660522 316722 btrfs_inode_cache 429 429 4096 1 1 : tunables 24 12 8 : slabdata 429 429 0 : globalstat 3078763 267189 2940236 7881 0 38 180 0 0 : cpustat 962434 3015285 3660566 316724 btrfs_inode_cache 404 404 4096 1 1 : tunables 24 12 8 : slabdata 404 404 0 : globalstat 3078778 267189 2940246 7916 0 38 180 0 0 : cpustat 962436 3015297 3660603 316726 btrfs_inode_cache 420 420 4096 1 1 : tunables 24 12 8 : slabdata 420 420 0 : globalstat 3078794 267189 2940262 7916 0 38 180 0 0 : cpustat 962436 3015313 3660603 316726 btrfs_inode_cache 545 545 4096 1 1 : tunables 24 12 8 : slabdata 545 545 0 : globalstat 3078968 267189 2940414 7943 0 38 180 0 0 : cpustat 962543 3015475 3660738 316735 btrfs_inode_cache 
520 520 4096 1 1 : tunables 24 12 8 : slabdata 520 520 0 : globalstat 3078972 267189 2940418 7972 0 38 180 0 0 : cpustat 962545 3015479 3660768 316736 btrfs_inode_cache 550 550 4096 1 1 : tunables 24 12 8 : slabdata 550 550 0 : globalstat 3079002 267189 2940448 7972 0 38 180 0 0 : cpustat 962545 3015509 3660768 316736 btrfs_inode_cache 567 567 4096 1 1 : tunables 24 12 8 : slabdata 567 567 0 : globalstat 3079057 267189 2940481 7988 0 38 180 0 0 : cpustat 962554 3015546 3660796 316737 btrfs_inode_cache 582 597 4096 1 1 : tunables 24 12 8 : slabdata 582 597 0 : globalstat 3079107 267189 2940519 7996 0 38 180 0 0 : cpustat 962563 3015587 3660837 316739 btrfs_inode_cache 574 574 4096 1 1 : tunables 24 12 8 : slabdata 574 574 0 : globalstat 3079107 267189 2940519 8019 0 38 180 0 0 : cpustat 962563 3015587 3660837 316739 btrfs_inode_cache 621 621 4096 1 1 : tunables 24 12 8 : slabdata 621 621 0 : globalstat 3079185 267189 2940597 8050 0 38 180 0 0 : cpustat 962564 3015666 3660869 316740 btrfs_inode_cache 614 614 4096 1 1 : tunables 24 12 8 : slabdata 614 614 24 : globalstat 3079193 267189 2940603 8063 0 38 180 0 0 : cpustat 962565 3015673 3660920 316742 btrfs_inode_cache 548 548 4096 1 1 : tunables 24 12 8 : slabdata 548 548 0 : globalstat 3079201 267189 2940605 8131 0 38 180 0 0 : cpustat 962566 3015676 3660951 316743 btrfs_inode_cache 532 532 4096 1 1 : tunables 24 12 8 : slabdata 532 532 0 : globalstat 3079209 267189 2940613 8155 0 38 180 0 0 : cpustat 962567 3015685 3660976 316744 btrfs_inode_cache 538 538 4096 1 1 : tunables 24 12 8 : slabdata 538 538 0 : globalstat 3079215 267189 2940619 8155 0 38 180 0 0 : cpustat 962567 3015691 3660976 316744 btrfs_inode_cache 523 523 4096 1 1 : tunables 24 12 8 : slabdata 523 523 0 : globalstat 3079233 267189 2940637 8188 0 38 180 0 0 : cpustat 962567 3015709 3661008 316745 btrfs_inode_cache 450 450 4096 1 1 : tunables 24 12 8 : slabdata 450 450 0 : globalstat 3079248 267189 2940645 8269 0 38 180 0 0 : cpustat 962573 3015721 
3661095 316749 btrfs_inode_cache 430 431 4096 1 1 : tunables 24 12 8 : slabdata 430 431 0 : globalstat 3079276 267189 2940663 8306 0 38 180 0 0 : cpustat 962576 3015742 3661137 316751 btrfs_inode_cache 452 452 4096 1 1 : tunables 24 12 8 : slabdata 452 452 0 : globalstat 3079298 267189 2940685 8307 0 38 180 0 0 : cpustat 962576 3015764 3661137 316751 btrfs_inode_cache 451 456 4096 1 1 : tunables 24 12 8 : slabdata 451 456 0 : globalstat 3079327 267189 2940705 8323 0 38 180 0 0 : cpustat 962596 3015786 3661178 316753 btrfs_inode_cache 8029 8029 4096 1 1 : tunables 24 12 8 : slabdata 8029 8029 0 : globalstat 3086904 267189 2948278 8323 0 38 181 0 0 : cpustat 962730 3023366 3661308 316759 btrfs_inode_cache 17518 17518 4096 1 1 : tunables 24 12 8 : slabdata 17518 17518 0 : globalstat 3096531 267189 2957868 8368 0 38 181 0 0 : cpustat 962826 3032963 3661505 316766 btrfs_inode_cache 28854 28854 4096 1 1 : tunables 24 12 8 : slabdata 28854 28854 0 : globalstat 3107867 267189 2969204 8368 0 38 181 0 0 : cpustat 964636 3044394 3663315 316861 btrfs_inode_cache 28291 28291 4096 1 1 : tunables 24 12 8 : slabdata 28291 28291 0 : globalstat 3111532 267189 2972799 8416 0 38 181 0 0 : cpustat 965771 3048049 3668262 317267 btrfs_inode_cache 22247 22284 4096 1 1 : tunables 24 12 8 : slabdata 22247 22284 96 : globalstat 3112396 267189 2973663 8552 0 38 181 0 0 : cpustat 966145 3048940 3675091 317863 btrfs_inode_cache 16030 16067 4096 1 1 : tunables 24 12 8 : slabdata 16030 16067 96 : globalstat 3112396 267189 2973663 8752 0 38 181 0 0 : cpustat 966150 3048942 3680817 318363 btrfs_inode_cache 12213 12250 4096 1 1 : tunables 24 12 8 : slabdata 12213 12250 96 : globalstat 3112396 267189 2973663 8920 0 38 181 0 0 : cpustat 966159 3048942 3684344 318662 btrfs_inode_cache 7999 8028 4096 1 1 : tunables 24 12 8 : slabdata 7999 8028 76 : globalstat 3112396 267189 2973663 9096 0 38 181 0 0 : cpustat 966164 3048942 3688211 318991 btrfs_inode_cache 5993 6030 4096 1 1 : tunables 24 12 8 : 
slabdata 5993 6030 96 : globalstat 3113173 267189 2974384 9232 0 38 181 0 0 : cpustat 966667 3049702 3691264 319242 btrfs_inode_cache 4797 4824 4096 1 1 : tunables 24 12 8 : slabdata 4797 4824 4 : globalstat 3113173 267189 2974384 9416 0 38 181 0 0 : cpustat 966684 3049706 3692276 319321 btrfs_inode_cache 4443 4472 4096 1 1 : tunables 24 12 8 : slabdata 4443 4472 28 : globalstat 3113188 267189 2974384 9592 0 38 181 0 0 : cpustat 966688 3049712 3692652 319341 btrfs_inode_cache 4365 4402 4096 1 1 : tunables 24 12 8 : slabdata 4365 4402 96 : globalstat 3114014 267189 2975171 9669 0 38 181 0 0 : cpustat 966797 3050510 3693639 319422 btrfs_inode_cache 3011 3040 4096 1 1 : tunables 24 12 8 : slabdata 3011 3040 18 : globalstat 3114026 267189 2975171 9821 0 38 181 0 0 : cpustat 966828 3050513 3694825 319523 btrfs_inode_cache 2186 2203 4096 1 1 : tunables 24 12 8 : slabdata 2186 2203 0 : globalstat 3114062 267189 2975173 9954 0 38 181 0 0 : cpustat 966893 3050525 3695645 319587 btrfs_inode_cache 2161 2166 4096 1 1 : tunables 24 12 8 : slabdata 2161 2166 0 : globalstat 3114230 267189 2975292 10070 0 38 181 0 0 : cpustat 966926 3050649 3695817 319597 btrfs_inode_cache 2402 2424 4096 1 1 : tunables 24 12 8 : slabdata 2402 2424 76 : globalstat 3114624 267189 2975651 10144 0 38 181 0 0 : cpustat 966969 3051015 3696049 319613 btrfs_inode_cache 1627 1627 4096 1 1 : tunables 24 12 8 : slabdata 1627 1627 0 : globalstat 3114645 267189 2975653 10312 0 38 181 0 0 : cpustat 966991 3051022 3696722 319665 btrfs_inode_cache 1612 1612 4096 1 1 : tunables 24 12 8 : slabdata 1612 1612 0 : globalstat 3114760 267189 2975756 10389 0 38 181 0 0 : cpustat 967005 3051127 3696847 319673 btrfs_inode_cache 3060 3097 4096 1 1 : tunables 24 12 8 : slabdata 3060 3097 96 : globalstat 3116672 267189 2977661 10435 0 38 181 0 0 : cpustat 967237 3053046 3697631 319727 btrfs_inode_cache 2862 2862 4096 1 1 : tunables 24 12 8 : slabdata 2862 2862 0 : globalstat 3116704 267189 2977668 10526 0 38 181 0 0 : cpustat 
967266 3053060 3697730 319734
btrfs_inode_cache 2971 2971 4096 1 1 : tunables 24 12 8 : slabdata 2971 2971 0 : globalstat 3116910 267189 2977861 10575 0 38 181 0 0 : cpustat 967268 3053255 3697813 319739
btrfs_inode_cache 4719 4755 4096 1 1 : tunables 24 12 8 : slabdata 4719 4755 96 : globalstat 3118780 267189 2979731 10653 0 38 181 0 0 : cpustat 967334 3055129 3698099 319756

-- 
Stephane
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
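As an aside for anyone eyeballing dumps like the one above: the memory pinned by a slab cache is roughly active_objs × objsize (columns 2 and 4 of the slabinfo 2.1 format). A minimal sketch; the helper name `slab_mem` is my own invention, and the sample line is taken from the dump:

```shell
# slab_mem CACHE: read slabinfo lines on stdin and print the MiB pinned
# by CACHE, computed as active_objs * objsize (slabinfo 2.1 columns 2 and 4).
slab_mem() {
    awk -v cache="$1" '$1 == cache {
        printf "%.0f\n", $2 * $4 / (1024 * 1024)
    }'
}

# One sample from the dump above: 224106 active 4096-byte inodes.
echo 'btrfs_inode_cache 224106 224143 4096 1 1 : tunables 24 12 8' \
    | slab_mem btrfs_inode_cache        # -> 875 (MiB)
```

Live usage would be `slab_mem btrfs_inode_cache < /proc/slabinfo`; at the peak values in the dump, the inode cache alone pins close to a gigabyte.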
Excerpts from Stephane Chazelas's message of 2011-07-11 05:01:21 -0400:
> 2011-07-10 19:37:28 +0100, Stephane Chazelas:
> > 2011-07-10 08:44:34 -0400, Chris Mason:
> > [...]
> > > Great, we're on the right track. Does it trigger with mount -o compress
> > > instead of mount -o compress_force?
> > [...]
> >
> > It does trigger. I get that same "invalid opcode".
> >
> > BTW, I tried with CONFIG_SLUB and slub_debug and no more useful
> > information than with SLAB_DEBUG.
> >
> > I'm trying now without dmcrypt. Then I won't have much bandwidth
> > for testing.
> [...]
>
> Same without dmcrypt. So to sum up, BUG() reached in btrfs-fixup
> thread when doing an

This is some fantastic debugging, thank you. I know you tested with
slab debugging turned on, did you have CONFIG_DEBUG_PAGEALLOC on as
well?

It's probably something to do with a specific file, but pulling that
file out without extra printks is going to be tricky. I'll see if I can
reproduce it here.

-chris
2011-07-11 11:00:19 -0400, Chris Mason:
> Excerpts from Stephane Chazelas's message of 2011-07-11 05:01:21 -0400:
> > 2011-07-10 19:37:28 +0100, Stephane Chazelas:
> > > 2011-07-10 08:44:34 -0400, Chris Mason:
> > > [...]
> > > > Great, we're on the right track. Does it trigger with mount -o compress
> > > > instead of mount -o compress_force?
> > > [...]
> > >
> > > It does trigger. I get that same "invalid opcode".
> > >
> > > BTW, I tried with CONFIG_SLUB and slub_debug and no more useful
> > > information than with SLAB_DEBUG.
> > >
> > > I'm trying now without dmcrypt. Then I won't have much bandwidth
> > > for testing.
> > [...]
> >
> > Same without dmcrypt. So to sum up, BUG() reached in btrfs-fixup
> > thread when doing an
[...]
> > - CONFIG_SLAB_DEBUG, CONFIG_DEBUG_PAGEALLOC,
> >   CONFIG_DEBUG_SLAB_LEAK, slub_debug don't tell us anything
> >   useful (there's more info in /proc/slabinfo when
> >   CONFIG_SLAB_DEBUG is on, see below)
[...]
> This is some fantastic debugging, thank you. I know you tested with
> slab debugging turned on, did you have CONFIG_DEBUG_PAGEALLOC on as
> well?

Yes when using SLAB, not when using SLUB.

> It's probably something to do with a specific file, but pulling that
> file out without extra printks is going to be tricky. I'll see if I can
> reproduce it here.
[...]

For one occurrence, I know what file was being transferred at the
time of the crash (looking at ctimes on the dest FS, see one of
my earlier emails). And after just checking on the latest BUG(),
it's a different one.

Also, when I resume the rsync (so it doesn't transfer the
already transferred files), it does BUG() again.

regards,
Stephane
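For readers trying to reproduce this, the debug knobs named in the thread map onto these Kconfig symbols. Note that the thread writes "CONFIG_SLAB_DEBUG" while the actual symbol is CONFIG_DEBUG_SLAB; the fragment below is illustrative, based on my reading of kernels of that era, not something posted in the thread:

```
# SLAB allocator with debugging
CONFIG_SLAB=y
CONFIG_DEBUG_SLAB=y          # what the thread calls CONFIG_SLAB_DEBUG
CONFIG_DEBUG_SLAB_LEAK=y     # adds /proc/slab_allocators
CONFIG_DEBUG_PAGEALLOC=y     # unmaps freed pages to catch use-after-free

# Alternatively, SLUB with runtime debugging via the slub_debug= boot option
# CONFIG_SLUB=y
# CONFIG_SLUB_DEBUG=y
```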
Excerpts from Stephane Chazelas's message of 2011-07-11 11:35:56 -0400:
> 2011-07-11 11:00:19 -0400, Chris Mason:
> > Excerpts from Stephane Chazelas's message of 2011-07-11 05:01:21 -0400:
> > > 2011-07-10 19:37:28 +0100, Stephane Chazelas:
> > > > 2011-07-10 08:44:34 -0400, Chris Mason:
> > > > [...]
> > > > > Great, we're on the right track. Does it trigger with mount -o compress
> > > > > instead of mount -o compress_force?
> > > > [...]
> > > >
> > > > It does trigger. I get that same "invalid opcode".
> > > >
> > > > BTW, I tried with CONFIG_SLUB and slub_debug and no more useful
> > > > information than with SLAB_DEBUG.
> > > >
> > > > I'm trying now without dmcrypt. Then I won't have much bandwidth
> > > > for testing.
> > > [...]
> > >
> > > Same without dmcrypt. So to sum up, BUG() reached in btrfs-fixup
> > > thread when doing an
> [...]
> > > - CONFIG_SLAB_DEBUG, CONFIG_DEBUG_PAGEALLOC,
> > >   CONFIG_DEBUG_SLAB_LEAK, slub_debug don't tell us anything
> > >   useful (there's more info in /proc/slabinfo when
> > >   CONFIG_SLAB_DEBUG is on, see below)
> [...]
> > This is some fantastic debugging, thank you. I know you tested with
> > slab debugging turned on, did you have CONFIG_DEBUG_PAGEALLOC on as
> > well?
>
> Yes when using SLAB, not when using SLUB.
>
> > It's probably something to do with a specific file, but pulling that
> > file out without extra printks is going to be tricky. I'll see if I can
> > reproduce it here.
> [...]
>
> For one occurrence, I know what file was being transferred at the
> time of the crash (looking at ctimes on the dest FS, see one of
> my earlier emails). And after just checking on the latest BUG(),
> it's a different one.
>
> Also, when I resume the rsync (so it doesn't transfer the
> already transferred files), it does BUG() again.

Ok, could you please send along the exact rsync command you were
running?
Thanks,
Chris
2011-07-11 12:25:51 -0400, Chris Mason:
[...]
> > Also, when I resume the rsync (so it doesn't transfer the
> > already transferred files), it does BUG() again.
>
> Ok, could you please send along the exact rsync command you were
> running?
[...]

I did earlier, but here it is again:

rsync --archive --xattrs --hard-links --numeric-ids --sparse --acls /src/ /dst/

Also with:

(cd /src && bsdtar cf - .) | pv | (cd /dst && bsdtar -xpSf - --numeric-owner)

-- 
Stephane
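To correlate a run of the command above with slab growth, a watcher along these lines can be left running next to it. This is only a sketch of the "sample /proc/slabinfo every 60-70 seconds" procedure described earlier in the thread; the function name, fallback message, and log path are my own choices:

```shell
# sample_slab CACHE COUNT INTERVAL: print COUNT timestamped samples of the
# CACHE line from /proc/slabinfo, INTERVAL seconds apart.
sample_slab() {
    cache=$1; count=$2; interval=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        # Epoch timestamp, then the slabinfo line (or a marker if unreadable).
        printf '%s ' "$(date +%s)"
        grep "^$cache " /proc/slabinfo 2>/dev/null || echo "$cache: not readable"
        i=$((i + 1))
        sleep "$interval"
    done
}

# e.g. alongside the rsync:  sample_slab btrfs_inode_cache 60 60 >> slab.log &
```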
2011-07-11 10:01:21 +0100, Stephane Chazelas:
[...]
> Same without dmcrypt. So to sum up, BUG() reached in btrfs-fixup
> thread when doing an
>
>  - rsync (though I also got (back when on ubuntu and 2.6.38) at
>    least one occurrence using bsdtar | bsdtar)
>  - of a large amount of data (with a large number of files),
>    though the bug occurs quite early, probably after having
>    transferred about 50-100GB
>  - the source FS being btrfs with compress-force on 3 devices
>    (one of which slightly shorter than the others) and a lot of
>    subvolumes and snapshots (I'm now copying from read-only
>    snapshots but that happened with RW ones as well)
>  - to a newly created btrfs fs
>  - on one device (/dev/sdd or dmcrypt)
>  - mounted with compress or compress-force.
>
>  - noatime on either source or dest doesn't make a difference
>    (wrt the occurrence of fixup BUG())
>  - can't reproduce it when dest is not mounted with compress
>  - beside that BUG(),
>    - kernel memory is being used up (mostly in
>      btrfs_inode_cache) and can't be reclaimed (leading to crash
>      with oom killing everybody)
>    - the target FS can be unmounted but that does not reclaim
>      memory. However the *source* FS (that is not the one we tried
>      with and without compress) cannot be unmounted (umount hangs,
>      see another email for its stack trace).
>    - Only way to get out of there is reboot with sysrq-b
>  - happens with 2.6.38, 2.6.39, 3.0.0rc6
>  - CONFIG_SLAB_DEBUG, CONFIG_DEBUG_PAGEALLOC,
>    CONFIG_DEBUG_SLAB_LEAK, slub_debug don't tell us anything
>    useful (there's more info in /proc/slabinfo when
>    CONFIG_SLAB_DEBUG is on, see below)
>  - happens with CONFIG_SLUB as well.
[...]

I don't know which of CONFIG_SLUB or noatime made the difference, but
in that setup with both enabled, I do get the BUG(), yet system memory
is not exhausted even after rsync goes over the section with millions
of files where it used to cause the oom crash.
The only remaining issue then is that I can't umount the source FS
(which causes problems at reboot). We could still have 2 or 3
different issues here for all we know. The situation is a lot more
comfortable for me now though.

--
Stephane
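Since the unreclaimable memory reported above shows up mostly in
btrfs_inode_cache, a small poller over /proc/slabinfo can put numbers on
the growth during a backup run. This is a hypothetical helper sketch,
not a tool used in the thread; the only assumption is the standard
slabinfo column layout (name, active_objs, num_objs, ...):

```python
#!/usr/bin/env python3
"""Poll /proc/slabinfo and report growth of one slab cache.

Hypothetical helper (not from the thread): useful for spotting
unreclaimable btrfs_inode_cache growth while an rsync runs.
"""
import sys
import time


def slab_active_objs(name, path="/proc/slabinfo"):
    """Return the active-object count for slab `name`, or None if absent."""
    with open(path) as f:
        for line in f:
            fields = line.split()
            # slabinfo format: name active_objs num_objs objsize ...
            if fields and fields[0] == name:
                return int(fields[1])
    return None


def watch(name="btrfs_inode_cache", interval=10):
    """Print the count and its delta every `interval` seconds, forever."""
    prev = None
    while True:
        n = slab_active_objs(name)
        if n is not None:
            delta = 0 if prev is None else n - prev
            print(f"{time.strftime('%H:%M:%S')} {name}: {n} objs ({delta:+d})")
            prev = n
        time.sleep(interval)


if __name__ == "__main__":
    # One-shot reading when run directly; call watch() for continuous logging.
    try:
        print(slab_active_objs(sys.argv[1] if len(sys.argv) > 1 else
                               "btrfs_inode_cache"))
    except OSError as e:
        print(f"cannot read /proc/slabinfo: {e}", file=sys.stderr)
```

Reading /proc/slabinfo typically needs root; run it in a second terminal
alongside the rsync and diff the logs between a good and a bad run.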
Still on my btrfs-based backup system. I still see one BUG() reached
in btrfs-fixup per boot, but no memory exhaustion anymore. There is
now however something new: write performance is down to a few bytes
per second.

I've got a few processes (rsync, patched ntfsclone, shells mostly)
writing to files at the same time on this server.

disk stats per second:

--dsk/sda-----dsk/sdb-----dsk/sdc--
 read  writ: read  writ: read  writ
 264k   44k: 193k   44k: 225k   42k
   0     0 :   0     0 :   0     0
   0     0 :   0     0 :   0     0
   0     0 :   0     0 :   0     0
   0    60k:   0     0 :   0     0
   0    12k:   0  1176k:   0  1164k
   0     0 :   0     0 :   0     0
   0     0 :   0     0 :   0     0
   0     0 :   0     0 :   0     0
   0    40k:   0     0 :8192B    0
   0     0 :   0     0 :   0     0
   0     0 :   0     0 :   0     0
   0     0 :   0     0 :   0     0
   0     0 :   0     0 :   0     0
   0     0 :   0     0 :4096B    0
   0     0 :   0     0 :   0     0
   0     0 :   0     0 :   0     0
   0     0 :   0     0 :   0     0
 324k    0 : 248k    0 : 548k    0
   0  4096B:   0     0 :   0     0
 352k    0 :   0     0 :   0     0
   0     0 :   0     0 :   0     0
   0     0 :   0     0 :4096B    0
   0     0 :   0     0 :   0     0
   0    80k:   0     0 :   0     0

rsync:

# strace -Ts0 -p 5015
Process 5015 attached - interrupt to quit
write(3, ""..., 1024) = 1024 <0.007700>
write(3, ""..., 1024) = 1024 <0.015822>
write(3, ""..., 1024) = 1024 <0.031853>
write(3, ""..., 1024) = 1024 <0.015881>
write(3, ""..., 1024) = 1024 <0.015911>
write(3, ""..., 1024) = 1024 <0.015796>
write(3, ""..., 1024) = 1024 <0.031946>
write(3, ""..., 1024) = 1024 <4.083854>
write(3, ""..., 1024) = 1024 <0.007925>
write(3, ""..., 1024) = 1024 <0.003776>
write(3, ""..., 1024) = 1024 <0.031862>
write(3, ""..., 1024) = 1024 <0.011807>
write(3, ""..., 1024) = 1024 <0.019742>
write(3, ""..., 1024) = 1024 <0.015857>
write(3, ""..., 1024) = 1024 <0.031833>
write(3, ""..., 1024) = 1024 <0.015789>
write(3, ""..., 1024) = 1024 <0.015926>
write(3, ""..., 1024) = 1024 <4.095967>
write(3, ""..., 1024) = 1024 <0.019798>
write(3, ""..., 1024) = 1024 <4.083682>
write(3, ""..., 1024) = 1024 <0.015398>
write(3, ""..., 1024) = 1024 <0.015951>
write(3, ""..., 1024) = 1024 <0.035837>
write(3, ""..., 1024) = 1024 <0.015962>
write(3, ""..., 1024) = 1024 <0.015909>
write(3, ""..., 1024) = 1024 <0.015967>
write(3, ""..., 48) = 48 <0.003782>
write(3, ""..., 1024) = 1024 <0.031802>
write(3, ""..., 1024) = 1024 <0.015811>
write(3, ""..., 1024) = 1024 <0.015944>
write(3, ""..., 1024) = 1024 <0.019810>
write(3, ""..., 1024) = 1024 <0.031948>

ntfsclone (patched to only write modified clusters):

# strace -Te write -p 4717
Process 4717 attached - interrupt to quit
write(1, " 65.16 percent completed\r", 25) = 25 <0.008996>
write(1, " 65.16 percent completed\r", 25) = 25 <0.743358>
write(1, " 65.16 percent completed\r", 25) = 25 <0.306582>
write(1, " 65.17 percent completed\r", 25) = 25 <4.082723>
write(1, " 65.17 percent completed\r", 25) = 25 <0.006402>
write(1, " 65.17 percent completed\r", 25) = 25 <0.012582>
write(1, " 65.17 percent completed\r", 25) = 25 <4.052504>
write(1, " 65.17 percent completed\r", 25) = 25 <0.012111>
write(1, " 65.17 percent completed\r", 25) = 25 <0.016001>
write(1, " 65.17 percent completed\r", 25) = 25 <4.028017>
write(1, " 65.18 percent completed\r", 25) = 25 <0.013365>
write(1, " 65.18 percent completed\r", 25) = 25 <0.003963>

(that's writing to a log file)

See how many write(2)s take 4 seconds. There is no such issue when
writing to an ext4 FS, and SMART status on all drives is OK. What else
could I look at?

Attached is a sysrq-t output.

--
Stephane
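A quick way to put numbers on stalls like the ones above is to bucket
the durations that `strace -T` appends in angle brackets to each
syscall. The script below is an illustrative helper, not part of the
original report; it assumes only the `<seconds>` suffix format shown in
the strace output:

```python
#!/usr/bin/env python3
"""Summarize per-syscall latencies from `strace -T` output.

Illustrative helper (not from the thread): feed it lines like
  write(3, ""..., 1024) = 1024 <4.083854>
and it buckets the <...> durations, making ~4s stalls obvious.
"""
import re
import sys
from collections import Counter

# `strace -T` appends the elapsed time as <seconds.fraction> at end of line.
DURATION = re.compile(r"<(\d+\.\d+)>\s*$")


def bucket(seconds):
    """Coarse latency bucket label for one syscall duration."""
    if seconds < 0.1:
        return "<100ms"
    if seconds < 1.0:
        return "100ms-1s"
    return ">=1s"


def summarize(lines):
    """Return (Counter of latency buckets, maximum duration seen)."""
    counts = Counter()
    worst = 0.0
    for line in lines:
        m = DURATION.search(line)
        if m:
            d = float(m.group(1))
            counts[bucket(d)] += 1
            worst = max(worst, d)
    return counts, worst


if __name__ == "__main__" and not sys.stdin.isatty():
    counts, worst = summarize(sys.stdin)
    for label in ("<100ms", "100ms-1s", ">=1s"):
        print(f"{label:>9}: {counts.get(label, 0)}")
    print(f"max: {worst:.3f}s")
```

Usage would be something like `strace -T -e write -p 5015 2>&1 | python3
latbuckets.py`; a sharply bimodal histogram (everything either fast or
right at 4s) is what the traces above suggest.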
2011-07-16 13:12:10 +0100, Stephane Chazelas:
[...]
> ntfsclone (patched to only write modified clusters):
>
> # strace -Te write -p 4717
> Process 4717 attached - interrupt to quit
> write(1, " 65.16 percent completed\r", 25) = 25 <0.008996>
> write(1, " 65.16 percent completed\r", 25) = 25 <0.743358>
> write(1, " 65.16 percent completed\r", 25) = 25 <0.306582>
> write(1, " 65.17 percent completed\r", 25) = 25 <4.082723>
> write(1, " 65.17 percent completed\r", 25) = 25 <0.006402>
> write(1, " 65.17 percent completed\r", 25) = 25 <0.012582>
> write(1, " 65.17 percent completed\r", 25) = 25 <4.052504>
> write(1, " 65.17 percent completed\r", 25) = 25 <0.012111>
> write(1, " 65.17 percent completed\r", 25) = 25 <0.016001>
> write(1, " 65.17 percent completed\r", 25) = 25 <4.028017>
> write(1, " 65.18 percent completed\r", 25) = 25 <0.013365>
> write(1, " 65.18 percent completed\r", 25) = 25 <0.003963>
>
> (that's writing to a log file)
>
> See how many write(2)s take 4 seconds.
[...]

top - 17:14:18 up 1 day,  9:20,  3 users,  load average: 1.00, 1.06, 1.11
Tasks: 146 total,   1 running, 145 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  0.0%sy,  0.0%ni, 25.0%id, 75.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   5094800k total,  4616412k used,   478388k free,  1420192k buffers
Swap:  4194300k total,     8720k used,  4185580k free,  2266240k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  P COMMAND
 2156 root      20   0     0    0    0 D    0  0.0   0:00.02 0 flush-btrfs-1
 6206 root      20   0 19112 1284  916 R    0  0.0   0:00.09 0 top
    1 root      20   0  8400  220  196 S    0  0.0   0:02.26 0 init

(all the other processes sleeping)

I suspect load 1 is for that flush-btrfs-1 in

[86372.445554] flush-btrfs-1   D ffff88014438da30     0  2156      2 0x00000000
[86372.445554]  ffff88014438da30 0000000000000046 ffffffff8100e366 ffff88014438a9a0
[86372.445554]  00000000000127c0 ffff880021501fd8 ffff880021501fd8 00000000000127c0
[86372.445554]  ffff88014438da30 ffff880021500010 ffffffff8100e366 ffffffff81066ec6
[86372.445554] Call Trace:
[86372.445554]  [<ffffffff8100e366>] ? read_tsc+0x5/0x16
[86372.445554]  [<ffffffff8100e366>] ? read_tsc+0x5/0x16
[86372.445554]  [<ffffffff81066ec6>] ? timekeeping_get_ns+0xd/0x2a
[86372.445554]  [<ffffffff810b607c>] ? __lock_page+0x63/0x63
[86372.445554]  [<ffffffff81339473>] ? io_schedule+0x84/0xc3
[86372.445554]  [<ffffffff811aa0aa>] ? radix_tree_gang_lookup_tag_slot+0x7a/0x9f
[86372.445554]  [<ffffffff810b6085>] ? sleep_on_page+0x9/0xd
[86372.445554]  [<ffffffff813399d5>] ? __wait_on_bit_lock+0x3c/0x85
[86372.445554]  [<ffffffff810b6076>] ? __lock_page+0x5d/0x63
[86372.445554]  [<ffffffff8105fff7>] ? autoremove_wake_function+0x2a/0x2a
[86372.445554]  [<ffffffffa0192784>] ? T.1090+0xba/0x234 [btrfs]
[86372.445554]  [<ffffffffa01929ee>] ? extent_writepages+0x40/0x56 [btrfs]
[86372.445554]  [<ffffffffa017e5f0>] ? btrfs_submit_direct+0x403/0x403 [btrfs]
[86372.445554]  [<ffffffff8111b29c>] ? writeback_single_inode+0xb8/0x1b8
[86372.445554]  [<ffffffff8111b5af>] ? writeback_sb_inodes+0xc2/0x13b
[86372.445554]  [<ffffffff8111b96e>] ? writeback_inodes_wb+0xfd/0x10f
[86372.445554]  [<ffffffff8111c088>] ? wb_writeback+0x213/0x330
[86372.445554]  [<ffffffff81052368>] ? lock_timer_base+0x25/0x49
[86372.445554]  [<ffffffff8111c312>] ? wb_do_writeback+0x16d/0x1fc
[86372.445554]  [<ffffffff81052824>] ? del_timer_sync+0x34/0x3e
[86372.445554]  [<ffffffff8111c464>] ? bdi_writeback_thread+0xc3/0x1ff
[86372.445554]  [<ffffffff8111c3a1>] ? wb_do_writeback+0x1fc/0x1fc
[86372.445554]  [<ffffffff8111c3a1>] ? wb_do_writeback+0x1fc/0x1fc
[86372.445554]  [<ffffffff8105fb91>] ? kthread+0x7a/0x82
[86372.445554]  [<ffffffff81340f24>] ? kernel_thread_helper+0x4/0x10
[86372.445554]  [<ffffffff8105fb17>] ? kthread_worker_fn+0x147/0x147
[86372.445554]  [<ffffffff81340f20>] ? gs_change+0x13/0x13

Also, if I run sync(1), it seems to never return.

[120348.788021] sync            D ffff88011b3e1bc0     0  6215   1789 0x00000000
[120348.788021]  ffff88011b3e1bc0 0000000000000082 0000000000000000 ffffffff8160b020
[120348.788021]  00000000000127c0 ffff8800b0f25fd8 ffff8800b0f25fd8 00000000000127c0
[120348.788021]  ffff88011b3e1bc0 ffff8800b0f24010 ffff88011b3e1bc0 000000014fc127c0
[120348.788021] Call Trace:
[120348.788021]  [<ffffffff8133989f>] ? schedule_timeout+0x2d/0xd7
[120348.788021]  [<ffffffff8111f2fa>] ? __sync_filesystem+0x74/0x74
[120348.788021]  [<ffffffff81339714>] ? wait_for_common+0xd1/0x14e
[120348.788021]  [<ffffffff810427f2>] ? try_to_wake_up+0x18c/0x18c
[120348.788021]  [<ffffffff8111f2fa>] ? __sync_filesystem+0x74/0x74
[120348.788021]  [<ffffffff8111f2fa>] ? __sync_filesystem+0x74/0x74
[120348.788021]  [<ffffffff8111bab6>] ? writeback_inodes_sb_nr+0x72/0x78
[120348.788021]  [<ffffffff8111f2d4>] ? __sync_filesystem+0x4e/0x74
[120348.788021]  [<ffffffff81100d0d>] ? iterate_supers+0x5e/0xab
[120348.788021]  [<ffffffff8111f337>] ? sys_sync+0x28/0x53
[120348.788021]  [<ffffffff8133fe12>] ? system_call_fastpath+0x16/0x1b

--
Stephane
2011-07-16 13:12:10 +0100, Stephane Chazelas:
> Still on my btrfs-based backup system. I still see one BUG()
> reached in btrfs-fixup per boot time, no memory exhaustion
> anymore. There is now however something new: write performance
> is down to a few bytes per second.
[...]

The condition that was causing that seems to have cleared by itself
this morning before 4am.

flush-btrfs-1 and sync are still in D state.

I can't really tell what cleared it. It could be when the first of
the rsyncs ended, as all the other ones (and the ntfsclones from nbd
devices) ended soon after.

Cheers,
Stephane
2011-07-17 10:17:37 +0100, Stephane Chazelas:
> 2011-07-16 13:12:10 +0100, Stephane Chazelas:
> > Still on my btrfs-based backup system. I still see one BUG()
> > reached in btrfs-fixup per boot time, no memory exhaustion
> > anymore. There is now however something new: write performance
> > is down to a few bytes per second.
> [...]
>
> The condition that was causing that seems to have cleared by
> itself this morning before 4am.
>
> flush-btrfs-1 and sync are still in D state.
>
> Can't really tell what cleared it. Could be when the first of
> the rsyncs ended as all the other ones (and ntfsclones from nbd
> devices) ended soon after
[...]

New nightly backup, and it's happening again. It started about 40
minutes after the start of the backup.

----system---- -net/total- ---procs--- --dsk/sda-----dsk/sdb-----dsk/sdc--
     time     | recv  send|run blk new| read  writ: read  writ: read  writ
17-07 20:19:18|   0     0 |0.0 0.0 0.0| 142k   31k: 119k   36k: 120k   33k
17-07 20:19:48|8087k  224k|1.2 5.3 0.1|2976k   98k: 793k  400k:2856k  375k
17-07 20:20:18|5174k  134k|0.8 4.6 0.9| 880k  179k: 830k  916k:1801k  825k
17-07 20:20:48|6634k  148k|1.3 4.9 0.2| 609k  101k:1259k   96k:2628k   98k
17-07 20:21:18|6725k  165k|0.7 5.8 0.0| 237k  442k: 975k  723k:1870k  644k
17-07 20:21:48|7100k  153k|0.7 5.4   0| 305k   83k:1124k  314k:2155k  274k
17-07 20:22:18|4440k  178k|0.5 5.3 0.0| 296k 1775B:2094k  240k:1663k  239k
17-07 20:22:48|8181k  220k|0.9 5.8   0| 360k  410B:1579k  196k:2065k  196k
17-07 20:23:18|8144k  228k|1.3 5.6   0| 348k   54k:1781k  216k:2213k  164k
17-07 20:23:48|5506k  185k|0.8 5.2 0.1| 307k    0 :2040k    0 :2166k    0
17-07 20:24:18|6260k  206k|1.0 5.4 0.1| 474k   78k:2034k  285k:2218k  207k
17-07 20:24:48|8420k  314k|1.5 5.4   0| 313k  363k:2367k  391k:2182k  124k
17-07 20:25:18|8367k  247k|0.9 5.1 0.2| 475k   77k:1797k   75k:2220k  410B
17-07 20:25:48|7511k  179k|1.0 4.7   0| 406k 7646B:1596k  145k:2397k  147k
17-07 20:26:18|7930k  162k|0.7 5.1   0| 991k  410B:1468k   26k:2186k   26k
17-07 20:26:48|7757k  176k|1.0 5.3   0|1884k   26k:1147k   58k:2761k   32k
[...]
17-07 20:57:18|6917k  120k|0.3 4.1   0|  56k  410B:  65k 4506B: 213k 4506B
17-07 20:57:48|5698k  103k|0.1 4.0   0|   0   410B:  27k 6007B: 590k 6007B
17-07 20:58:18|6582k  117k|0.2 4.0   0| 229k   20k: 195k  956B: 290k   21k
17-07 20:58:48|6048k  110k|0.6 4.0 0.1|  32k   21k:  81k  410B: 331k   21k
17-07 20:59:18|8057k  138k|0.6 4.1   0|  42k 5871B:  33k  410B:  35k 5871B
17-07 20:59:48|7369k  145k|0.5 4.1   0|  59k 3959B: 230k  410B: 532k 3959B
17-07 21:00:18|8189k  140k|0.7 4.0   0|  53k 6007B:  58k  410B:  40k 6007B
17-07 21:00:48|7596k  137k|0.3 4.2   0|  24k 6690B: 250k  410B:  15k 5734B
17-07 21:01:18|8448k  145k|0.7 4.2   0|  24k 1365B: 325k 6827B:  15k 7646B
17-07 21:01:48|6821k  119k|0.3 4.0   0|  17k  410B: 175k 3004B:  11k 3004B
17-07 21:02:18|3614k   66k|0.7 2.7   0|  39k  410B: 538k 4779B:  45k 4779B
17-07 21:02:48| 417k   14k|0.5 1.3 0.3| 106k 1638B: 209k 4779B:   0  4779B
17-07 21:03:18| 353k 7979B|0.8 1.2   0|   0  1229B: 449k 2867B:   0  2867B
17-07 21:03:48| 327k 8981B|1.1 1.2   0|   0   410B: 686k 4506B:  43k 4506B
[...]
18-07 11:02:48| 243k 4866B|0.0 1.2 0.1|   0  2458B:   0  3550B:   0  3550B
18-07 11:03:18| 274k 5506B|0.1 1.2 0.1|   0  1775B:   0  3550B:   0  3550B
18-07 11:03:48| 238k 4851B|0.1 1.2 0.0|   0  4369B:   0  3550B:   0  3550B
18-07 11:04:18| 243k 4999B|0.1 1.1 0.1|   0  4506B:   0  3550B:   0  3550B
18-07 11:04:48| 288k 6488B|0.1 1.1 0.4|   0  2458B:   0  3550B:   0  3550B

Because that's after the week-end, there's not much to write. What's
holding up 3 of the backups is actually writing log data like
"xx% Completed". Actively running at the moment are 1 rsync and 3
ntfsclone.
# strace -tt -s 2 -Te write -p 8771 -p 8567 -p 8856 -p 8403
Process 8771 attached - interrupt to quit
Process 8567 attached - interrupt to quit
Process 8856 attached - interrupt to quit
Process 8403 attached - interrupt to quit
[pid  8403] 11:12:26.539830 write(4, "es"..., 1024 <unfinished ...>
[pid  8771] 11:12:26.540417 write(4, "hb"..., 4096 <unfinished ...>
[pid  8567] 11:12:26.555211 write(1, " 3"..., 25 <unfinished ...>
[pid  8856] 11:12:26.593232 write(1, " 6"..., 25 <unfinished ...>
[pid  8403] 11:12:30.635257 <... write resumed> ) = 1024 <4.095271>
[pid  8403] 11:12:30.635309 write(4, "19"..., 112 <unfinished ...>
[pid  8567] 11:12:30.635364 <... write resumed> ) = 25 <4.080091>
[pid  8856] 11:12:30.635553 <... write resumed> ) = 25 <4.042268>
[pid  8771] 11:12:30.635799 <... write resumed> ) = 4096 <4.095350>
[pid  8771] 11:12:30.636182 write(4, "hb"..., 4096 <unfinished ...>
[pid  8567] 11:12:30.649904 write(1, " 3"..., 25 <unfinished ...>
[pid  8403] 11:12:30.651452 <... write resumed> ) = 112 <0.015921>
[pid  8567] 11:12:30.651595 <... write resumed> ) = 25 <0.001640>
[pid  8403] 11:12:30.651787 write(4, "@d"..., 1024 <unfinished ...>
[pid  8771] 11:12:30.651865 <... write resumed> ) = 4096 <0.015638>
[pid  8771] 11:12:30.652281 write(4, "hb"..., 4096 <unfinished ...>
[pid  8856] 11:12:30.657579 write(1, " 6"..., 25 <unfinished ...>
[pid  8567] 11:12:30.691113 write(1, " 3"..., 25 <unfinished ...>
[pid  8403] 11:12:34.747526 <... write resumed> ) = 1024 <4.095421>
[pid  8403] 11:12:34.747585 write(4, "oo"..., 112 <unfinished ...>
[pid  8856] 11:12:34.747640 <... write resumed> ) = 25 <4.090004>
[pid  8567] 11:12:34.747662 <... write resumed> ) = 25 <4.056499>
[pid  8771] 11:12:34.747698 <... write resumed> ) = 4096 <4.095385>
[pid  8771] 11:12:34.748054 write(4, "hb"..., 4096 <unfinished ...>
[pid  8567] 11:12:34.765660 write(1, " 3"..., 25 <unfinished ...>
[pid  8856] 11:12:34.823355 write(1, " 6"..., 25 <unfinished ...>
[pid  8771] 11:12:38.843269 <... write resumed> ) = 4096 <4.095149>
[pid  8403] 11:12:38.843315 <... write resumed> ) = 112 <4.095707>
[pid  8403] 11:12:38.843358 write(4, "mE"..., 1024 <unfinished ...>
[pid  8856] 11:12:38.843502 <... write resumed> ) = 25 <4.020080>
[pid  8567] 11:12:38.843529 <... write resumed> ) = 25 <4.077820>
[pid  8771] 11:12:38.843759 write(4, "hb"..., 4096 <unfinished ...>
[pid  8856] 11:12:38.883637 write(1, " 6"..., 25 <unfinished ...>
[pid  8567] 11:12:38.917770 write(1, " 3"..., 25 <unfinished ...>

When write(2) takes more than a few milliseconds, it's always just
about 4 seconds.

Just as I'm writing this, it recovered again, again after one of the
processes finished (this time I can tell it was an ntfsclone; last
time it looked as if it was after an rsync finished, but I can't be
sure).

The ntfsclones do lseek and read on an nbd device, check a checksum
against an mmapped file (on btrfs) and, if the checksum doesn't match,
lseek and write to a file on btrfs; standard output and error go to a
file on btrfs as well (different subvolume, same as the sum file).
(Yes, I know it's not network-bandwidth efficient.)

Any idea how I could investigate this further? Of all the btrfs issues
I've had with that backup system, this is the worst, as it renders the
whole system unusable: nightly backups last over 24 hours.

Cheers,
Stephane
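For reference, the "only write modified clusters" scheme described
above can be sketched as follows. This is an illustration of the idea,
not the actual ntfsclone patch: the 4KiB cluster size, the sha1
checksums, and the flat sum-file layout (one digest per cluster) are
all assumptions made for the example:

```python
#!/usr/bin/env python3
"""Sketch of a 'write only modified clusters' incremental clone.

Illustrative only: the real tool is a patched ntfsclone. The cluster
size, checksum algorithm (sha1) and sum-file layout are assumptions.
"""
import hashlib

CLUSTER = 4096  # assumed cluster size


def clone_modified(src_path, dst_path, sums_path):
    """Copy only the clusters of src whose checksum differs from the sums.

    The sums file holds one sha1 digest per cluster and is updated in
    place. Returns the number of clusters rewritten; skipped clusters
    produce no write at all, which keeps dst traffic minimal.
    """
    digest_len = hashlib.sha1().digest_size
    written = 0
    with open(src_path, "rb") as src, \
         open(dst_path, "r+b") as dst, \
         open(sums_path, "r+b") as sums:
        idx = 0
        while True:
            block = src.read(CLUSTER)
            if not block:
                break
            new_sum = hashlib.sha1(block).digest()
            sums.seek(idx * digest_len)
            old_sum = sums.read(digest_len)
            if old_sum != new_sum:
                dst.seek(idx * CLUSTER)   # lseek + write, as described above
                dst.write(block)
                sums.seek(idx * digest_len)
                sums.write(new_sum)
                written += 1
            idx += 1
    return written
```

The interesting property for this thread is that the destination sees a
sparse pattern of small seek-then-write operations rather than a
sequential stream, which is the workload the 4-second stalls hit.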
2011-07-18 11:39:12 +0100, Stephane Chazelas:
> 2011-07-17 10:17:37 +0100, Stephane Chazelas:
> > 2011-07-16 13:12:10 +0100, Stephane Chazelas:
> > > Still on my btrfs-based backup system. I still see one BUG()
> > > reached in btrfs-fixup per boot time, no memory exhaustion
> > > anymore. There is now however something new: write performance
> > > is down to a few bytes per second.
> > [...]
> >
> > The condition that was causing that seems to have cleared by
> > itself this morning before 4am.
> >
> > flush-btrfs-1 and sync are still in D state.
> >
> > Can't really tell what cleared it. Could be when the first of
> > the rsyncs ended as all the other ones (and ntfsclones from nbd
> > devices) ended soon after
> [...]
>
> New nightly backup, and it's happening again. Started about 40
> minutes after the start of the backup.
[...]
> Actively running at the moment are 1 rsync and 3 ntfsclone.
[...]

And then again today.

Interestingly, I "killall -STOP"ed all the ntfsclone and rsync
processes, and:

# strace -tt -Te write yes > a-file-on-the-btrfs-fs
20:23:26.635848 write(1, "y\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\n"..., 4096) = 4096 <4.095223>
20:23:30.731391 write(1, "y\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\n"..., 4096) = 4096 <4.095769>
20:23:34.827390 write(1, "y\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\n"..., 4096) = 4096 <4.095788>
20:23:38.923388 write(1, "y\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\n"..., 4096) = 4096 <4.095771>

Now 95% of the write(2)s take 4 seconds (while it was about 15% before
I stopped the processes).

[304257.760119] yes             S ffff88001e8e3780     0 13179  13178 0x00000001
[304257.760119]  ffff88001e8e3780 0000000000000086 0000000000000000 ffffffff8160b020
[304257.760119]  00000000000127c0 ffff880074543fd8 ffff880074543fd8 00000000000127c0
[304257.760119]  ffff88001e8e3780 ffff880074542010 0000000000000286 0000000100000286
[304257.760119] Call Trace:
[304257.760119]  [<ffffffff81339912>] ? schedule_timeout+0xa0/0xd7
[304257.760119]  [<ffffffff8105238c>] ? lock_timer_base+0x49/0x49
[304257.760119]  [<ffffffffa0162e7d>] ? shrink_delalloc+0x100/0x14e [btrfs]
[304257.760119]  [<ffffffffa0163d66>] ? btrfs_delalloc_reserve_metadata+0xf9/0x10b [btrfs]
[304257.760119]  [<ffffffffa0167aa8>] ? btrfs_delalloc_reserve_space+0x20/0x3e [btrfs]
[304257.760119]  [<ffffffffa0182540>] ? __btrfs_buffered_write+0x137/0x2dc [btrfs]
[304257.760119]  [<ffffffffa017ad0e>] ? btrfs_dirty_inode+0x119/0x139 [btrfs]
[304257.760119]  [<ffffffffa0182a7a>] ? btrfs_file_aio_write+0x395/0x42b [btrfs]
[304257.760119]  [<ffffffff8100866d>] ? __switch_to+0x19c/0x288
[304257.760119]  [<ffffffff810fee6d>] ? do_sync_write+0xb1/0xea
[304257.760119]  [<ffffffff81056522>] ? ptrace_notify+0x7f/0x9d
[304257.760119]  [<ffffffff811691ce>] ? security_file_permission+0x18/0x2d
[304257.760119]  [<ffffffff810ff7c4>] ? vfs_write+0xa4/0xff
[304257.760119]  [<ffffffff810116a7>] ? syscall_trace_enter+0xb6/0x15b
[304257.760119]  [<ffffffff810ff8d5>] ? sys_write+0x45/0x6e
[304257.760119]  [<ffffffff81340027>] ? tracesys+0xd9/0xde

After killall -CONT, it's back to 15% of write(2)s delayed.

What's going on?

--
Stephane
2011-07-18 20:37:25 +0100, Stephane Chazelas:
> 2011-07-18 11:39:12 +0100, Stephane Chazelas:
> > 2011-07-17 10:17:37 +0100, Stephane Chazelas:
> > > 2011-07-16 13:12:10 +0100, Stephane Chazelas:
> > > > Still on my btrfs-based backup system. I still see one BUG()
> > > > reached in btrfs-fixup per boot time, no memory exhaustion
> > > > anymore. There is now however something new: write performance
> > > > is down to a few bytes per second.
> > > [...]
> > >
> > > The condition that was causing that seems to have cleared by
> > > itself this morning before 4am.
> > >
> > > flush-btrfs-1 and sync are still in D state.
> > >
> > > Can't really tell what cleared it. Could be when the first of
> > > the rsyncs ended as all the other ones (and ntfsclones from nbd
> > > devices) ended soon after
> > [...]
> >
> > New nightly backup, and it's happening again. Started about 40
> > minutes after the start of the backup.
> [...]
> > Actively running at the moment are 1 rsync and 3 ntfsclone.
> [...]
>
> And then again today.
>
> Interestingly, I "killall -STOP"ed all the ntfsclone and rsync
> processes and:
[...]
> Now 95% of the write(2)s take 4 seconds (while it was about 15%
> before I stopped the processes).
[...]

And this morning, after killing everything so that nothing was writing
to the FS anymore, 95% of write(2)s were still delayed (according to
strace -Te write yes > file-on-btrfs).

Then I rebooted (sysrq-b) and am trying btrfsck (from
integration-20110705) on it, but btrfsck is using 8G of memory on a
system that has only 5G, so it's constantly swapping in and out and
getting nowhere (and it makes the system hardly usable).

I found
http://thread.gmane.org/gmane.comp.file-systems.btrfs/5716/focus=5728
from last year. Is that still the case?
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1950 root      20   0 7684m 4.4g  232 R    4 91.1   4:22.87 btrfsck

(and still growing)

vmstat 1:

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd    free   buff  cache   si   so    bi   bo   in   cs us sy id wa
 2  2 3232016 115232   4524   3520  698  708  3305  716  991  570  3  1 56 39
 0  2 3231816 111536   5976   3428 2964  532  4912  532 1569  683  1  0 46 53
 0  2 3231144 105832   8144   3536 3140   24  5324   24 1612  392  1  1 38 60
 0  2 3231532 104964   8180   3684 2672  900  2708  900 1017  324  1  1 34 64

--
Stephane