Andrew Guertin
2011-Aug-09 21:29 UTC
Re: Applications using fsync cause hangs for several seconds every few minutes
On 06/21/2011 01:15 PM, Jan Stilow wrote:
> Hello,
>
> Nirbheek Chauhan <nirbheek <at> gentoo.org> writes:
>> [...]
>>
>> Every few minutes, (I guess) when applications do fsync (firefox,
>> xchat, vim, etc), all applications that use fsync() hang for several
>> seconds, and applications that use general IO suffer extreme
>> slowdowns. iotop shows various combinations of the processes listed
>> below doing writes, and the total write as 2-3MB/s.
>>
>> [btrfs-dealloc-]
>> [btrfs-submit-0]
>> [btrfs-transacti]
>> [btrfs-endio-wri]
>> [flush-btrfs-1]
>
> I'm using btrfs under a 2.6.39-ARCH kernel and run into the same issue.
>
> In my case the [btrfs-submit-0] and [btrfs-transacti] shows up in iotop
> and produce 99% of IO at the time a application is frozen. For something
> like 10 to 30 seconds.
>
> [...]

I see the same issue. I have bisected it to

4e69b598f6cfb0940b75abf7e179d6020e94ad1e is the first bad commit
commit 4e69b598f6cfb0940b75abf7e179d6020e94ad1e
Author: Josef Bacik <josef@redhat.com>
Date:   Mon Mar 21 10:11:24 2011 -0400

    Btrfs: cleanup how we setup free space clusters

...which came in between 2.6.38 and 2.6.39. The newest kernel I have tried was 3.0-rc7, which still had the bug. I have not tried 3.1-rc1, but plan to soon.

--Andrew

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Andrew Guertin
2011-Aug-12 01:13 UTC
Re: Applications using fsync cause hangs for several seconds every few minutes
On 08/09/2011 05:29 PM, Andrew Guertin wrote:
> I have not tried 3.1-rc1, but plan to soon.

I've tested now, this does still occur in 3.1-rc1.

--Andrew
Andrew Guertin
2011-Aug-17 14:24 UTC
Re: Applications using fsync cause hangs for several seconds every few minutes
On 08/09/2011 05:29 PM, Andrew Guertin wrote:
> On 06/21/2011 01:15 PM, Jan Stilow wrote:
>> Hello,
>>
>> Nirbheek Chauhan <nirbheek <at> gentoo.org> writes:
>>> [...]
>>>
>>> Every few minutes, (I guess) when applications do fsync (firefox,
>>> xchat, vim, etc), all applications that use fsync() hang for several
>>> seconds, and applications that use general IO suffer extreme
>>> slowdowns. iotop shows various combinations of the processes listed
>>> below doing writes, and the total write as 2-3MB/s.
>>>
>>> [btrfs-dealloc-]
>>> [btrfs-submit-0]
>>> [btrfs-transacti]
>>> [btrfs-endio-wri]
>>> [flush-btrfs-1]
>>
>> I'm using btrfs under a 2.6.39-ARCH kernel and run into the same issue.
>>
>> In my case the [btrfs-submit-0] and [btrfs-transacti] shows up in iotop
>> and produce 99% of IO at the time a application is frozen. For something
>> like 10 to 30 seconds.
>>
>> [...]
>
> I see the same issue. I have bisected it to
>
> 4e69b598f6cfb0940b75abf7e179d6020e94ad1e is the first bad commit
> commit 4e69b598f6cfb0940b75abf7e179d6020e94ad1e
> Author: Josef Bacik <josef@redhat.com>
> Date:   Mon Mar 21 10:11:24 2011 -0400
>
>     Btrfs: cleanup how we setup free space clusters
>
> ...which came in between 2.6.38 and 2.6.39.

Any chance of someone looking at this? I (and presumably others) haven't been able to upgrade my kernel past 2.6.38 because of this.

--Andrew
Michael Cronenworth
2011-Aug-17 14:29 UTC
Re: Applications using fsync cause hangs for several seconds every few minutes
Andrew Guertin on 08/17/2011 09:24 AM wrote:
> I (and presumably others) haven't
> been able to upgrade my kernel past 2.6.38 because of this.

I'm running kernel 3.0 (Fedora 15's 2.6.40) on two boxes and I have not seen slowdowns or hangs. I use Firefox.
Andrew Guertin
2011-Aug-17 14:38 UTC
Re: Applications using fsync cause hangs for several seconds every few minutes
On 08/17/2011 10:29 AM, Michael Cronenworth wrote:
> Andrew Guertin on 08/17/2011 09:24 AM wrote:
>> I (and presumably others) haven't
>> been able to upgrade my kernel past 2.6.38 because of this.
>
> I'm running kernel 3.0 (Fedora 15's 2.6.40) on two boxes and I have not
> seen slow downs or hangs. I use Firefox.

Well I'd expect it to be somewhat uncommon, or it wouldn't survive 3 kernel versions :) But at least 3 people have reported it, and for me at least it's reliably reproducible enough to bisect, so I'm quite certain there's something going on.

--Andrew
Dave
2011-Aug-17 14:55 UTC
Re: Applications using fsync cause hangs for several seconds every few minutes
On Wed, Aug 17, 2011 at 10:38:42AM -0400, Andrew Guertin wrote:
> Well I'd expect it to be somewhat uncommon, or it wouldn't survive 3
> kernel versions :) But at least 3 people have reported it, and for me at
> least it's reliably reproducible enough to bisect, so I'm quite certain
> there's something going on.

I've been simply living with this issue. I can reproduce it by rsyncing very large files to a btrfs volume. My entire desktop will freeze for up to three minutes and no amount of nice/ionice can temper this.

Once I've finished the rsync certain apps will periodically hang (Firefox in particular). This behavior goes away after a reboot.

I'm running kernel version 3.0.

--
-=[dave]=-

Entropy isn't what it used to be.
Anand Jain
2011-Aug-18 02:41 UTC
Re: Applications using fsync cause hangs for several seconds every few minutes
Dave,

Good to have a test case on the 3.0 kernel. Do you have btrfs as root fs? And can you show how you are using btrfs? Mainly I would need the 'btrfs fi show' output. Let me see if I can reproduce it.

Thanks, Anand

> I've been simply living with this issue. I can reproduce it by rsyncing very
> large files to a btrfs volume. My entire desktop will freeze for up to three
> minutes and no amount of nice/ionice can temper this.
>
> Once I've finished the rsync certain apps will periodically hang (Firefox in
> particular). This behavior goes away after a reboot.
>
> I'm running kernel version 3.0.
youagree
2011-Aug-18 06:44 UTC
Re: Applications using fsync cause hangs for several seconds every few minutes
This is most probably related to the same regression seen after 2.6.38. My blocked comment of 3 August indicated that the behavior was present in my distro's 2.6.38 kernel too; it just appeared after a considerably longer uptime (on my desktop system using btrfs as rootfs on an Intel ICH10 driven SATA HDD). I have since reverted my / to ext4, and I'm okay with it, although I would be very happy to see some improvement on this serious-for-me issue.

Btrfs slowdown
news://news.gmane.org:119/CAO47_-9BLKWUGDEuzaLqHSq9tZkAUaO8FMQEy1pPk9A2Hb+5AQ@mail.gmail.com

Also, a patch by Josef Bacik was an attempt at fixing this, but no one reported testing it on an affected system. It did not eliminate the slowdowns for me:

PLEASE TEST: Everybody who is seeing weird and long hangs
news://news.gmane.org:119/4E36C47E.70309@redhat.com

My comment was sent as an answer to Mck's post in the "Btrfs slowdown" thread, where I reported this in a little more detail, but it never appeared on the list. I'll try including it now:

________________________________________________________________________

I'm confirming this too. Following advice given on #btrfs irc, I have applied Josef's second patch for fs/btrfs/extent_io.c and I'm reporting that it did NOT make the slowdowns disappear on 3.0 kernels (even with some rather different configs).

The HDD thrashing appeared on all other kernel versions I tried, higher than 2.6.37.

Initially, I had been looking for the latest known good kernel (to prepare a proper git bisect as cwillu advised) and at first I also felt that 2.6.38 did not show this miserable behaviour. But later it turned out this was only true for approximately 2 days of uptime. Given enough time, the lock-ups appeared on 2.6.38 too, although they were not as apparent as on later kernel versions, and the individual lockups took much less time with 2.6.38 running for 2 days (binary Sabayon Linux repository kernel).

My HDD, with btrfs as / on it, emits very distinct (and loud enough) noises with a slightly different character for reads and writes, and I can actually hear the disk's repetitive seek pattern during such a thrashing period. Based on that, I guess it must be the exact same thing happening with 2.6.38 as with later kernels, because they sound very similar. The periods are much shorter, but they have a similarly repetitive seeking nature with other I/O severely throttled, and I believe it is mostly writes that are happening during a lockup.

So I concluded that I have failed to identify a known good version so far. I didn't have time to get into kernels earlier than .38. (I tried .37, but for too brief an uptime to claim the lockups did not appear there.)

It is similar with my current kernel. It started happening after about 12 hours of running the machine using

# uname -a
Linux insula 3.0.0-git15genseed #2 SMP PREEMPT Tue Aug 2 20:10:05 CEST 2011 x86_64 Intel(R) Core(TM)2 Duo CPU E4500 @ 2.20GHz GenuineIntel GNU/Linux

As the appended version string reflects, it is a custom kernel with Josef's patch applied, built with the attached config. I also tried patching my distro's 3.0 kernel; there was no change with regard to the issue (iirc it was even a lot worse).

Let me know if I can contribute anything that would be valuable for the developers towards eliminating this very nasty bug.

Now, after 23 hours of uptime, my PC has become almost unusable. Currently there are about 8 seconds of thrashing, then 10 seconds without, and during thrashing all other (disk) I/O is practically blocked.

SysRq+W under thrashing (dunno how informative it is, but here's one):

[62279.779382] SysRq : Show Blocked State
[62279.779389] task PC stack pid father
[62279.779404] btrfs-submit-0 D 0000000000000000 5616 4678 2 0x00000000
[62279.779413] ffff88012b1370d0 0000000000000046 ffff880100000000 ffffffff8182c020
[62279.779422] ffff880128d39fd8 0000000000010480 0000000000004000 ffff880128d38000
[62279.779429] ffff880128d39fd8 0000000000010480 ffff88012b1370d0 0000000000010480
[62279.779437] Call Trace:
[62279.779449] [<ffffffff812779c6>] ? cfq_set_request+0x33e/0x37e
[62279.779456] [<ffffffff81277063>] ? cfq_cic_lookup+0x35/0x139
[62279.779462] [<ffffffff812773a2>] ? cfq_may_queue+0x51/0x6e
[62279.779470] [<ffffffff8143ed81>] ? io_schedule+0x4e/0x63
[62279.779477] [<ffffffff8126b276>] ? get_request_wait+0xaa/0x10e
[62279.779484] [<ffffffff8104f2ad>] ? wake_up_bit+0x23/0x23
[62279.779490] [<ffffffff8126c2a6>] ? __make_request+0x175/0x26b
[62279.779496] [<ffffffff8126a267>] ? generic_make_request+0x224/0x289
[62279.779502] [<ffffffff8126a37f>] ? submit_bio+0xb3/0xbc
[62279.779509] [<ffffffff81372238>] ? dm_any_congested+0x4f/0x57
[62279.779516] [<ffffffff81206de6>] ? run_scheduled_bios+0x246/0x3b1
[62279.779523] [<ffffffff8120c791>] ? worker_loop+0x180/0x4bb
[62279.779529] [<ffffffff8120c611>] ? btrfs_queue_worker+0x24e/0x24e
[62279.779535] [<ffffffff8104eee7>] ? kthread+0x7a/0x82
[62279.779542] [<ffffffff81442554>] ? kernel_thread_helper+0x4/0x10
[62279.779548] [<ffffffff8104ee6d>] ? kthread_worker_fn+0x149/0x149
[62279.779554] [<ffffffff81442550>] ? gs_change+0xb/0xb
[62279.779560] btrfs-transacti D 0000000000000001 3856 4689 2 0x00000000
[62279.779568] ffff88012b205320 0000000000000046 0000000000000000 ffff88012b06d320
[62279.779576] ffff880128d97fd8 0000000000010480 0000000000004000 ffff880128d96000
[62279.779583] ffff880128d97fd8 0000000000010480 ffff88012b205320 0000000000010480
[62279.779591] Call Trace:
[62279.779597] [<ffffffff8120152f>] ? alloc_extent_state+0x12/0x55
[62279.779605] [<ffffffff810aefbe>] ? kmem_cache_free+0x87/0x8e
[62279.779611] [<ffffffff8127e2ab>] ? rb_erase+0x134/0x26f
[62279.779617] [<ffffffff81081326>] ? __lock_page+0x63/0x63
[62279.779622] [<ffffffff8143ed81>] ? io_schedule+0x4e/0x63
[62279.779628] [<ffffffff8108132f>] ? sleep_on_page+0x9/0x10
[62279.779633] [<ffffffff81081326>] ? __lock_page+0x63/0x63
[62279.779638] [<ffffffff8143f36c>] ? __wait_on_bit+0x3e/0x71
[62279.779644] [<ffffffff810814c9>] ? wait_on_page_bit+0x6a/0x70
[62279.779650] [<ffffffff8104f2d7>] ? autoremove_wake_function+0x2a/0x2a
[62279.779657] [<ffffffff811ebb53>] ? btrfs_wait_marked_extents+0xf5/0x12f
[62279.779664] [<ffffffff811ebbb6>] ? btrfs_write_and_wait_marked_extents+0x29/0x3d
[62279.779670] [<ffffffff811ec2b0>] ? btrfs_commit_transaction+0x5c7/0x6e8
[62279.779677] [<ffffffff810433c4>] ? del_timer_sync+0x34/0x3e
[62279.779682] [<ffffffff8143f1bd>] ? schedule_timeout+0x182/0x1a0
[62279.779688] [<ffffffff8104f2ad>] ? wake_up_bit+0x23/0x23
[62279.779694] [<ffffffff811ec801>] ? start_transaction+0x1e0/0x21a
[62279.779700] [<ffffffff811e66c4>] ? transaction_kthread+0x180/0x238
[62279.779706] [<ffffffff811e6544>] ? btrfs_congested_fn+0x87/0x87
[62279.779712] [<ffffffff811e6544>] ? btrfs_congested_fn+0x87/0x87
[62279.779718] [<ffffffff8104eee7>] ? kthread+0x7a/0x82
[62279.779724] [<ffffffff81442554>] ? kernel_thread_helper+0x4/0x10
[62279.779730] [<ffffffff8104ee6d>] ? kthread_worker_fn+0x149/0x149
[62279.779736] [<ffffffff81442550>] ? gs_change+0xb/0xb
[62279.779759] btrfs-endio-wri D 0000000000000000 4208 11320 2 0x00000000
[62279.779767] ffff88012b173570 0000000000000046 0000000000000000 ffffffff8182c020
[62279.779775] ffff88011afa9fd8 0000000000010480 0000000000004000 ffff88011afa8000
[62279.779782] ffff88011afa9fd8 0000000000010480 ffff88012b173570 0000000000010480
[62279.779789] Call Trace:
[62279.779796] [<ffffffff8126a267>] ? generic_make_request+0x224/0x289
[62279.779802] [<ffffffff811faaeb>] ? lookup_extent_mapping+0x37/0xb3
[62279.779808] [<ffffffff81081326>] ? __lock_page+0x63/0x63
[62279.779813] [<ffffffff8143ed81>] ? io_schedule+0x4e/0x63
[62279.779818] [<ffffffff8108132f>] ? sleep_on_page+0x9/0x10
[62279.779823] [<ffffffff81081326>] ? __lock_page+0x63/0x63
[62279.779828] [<ffffffff8143f36c>] ? __wait_on_bit+0x3e/0x71
[62279.779834] [<ffffffff810814c9>] ? wait_on_page_bit+0x6a/0x70
[62279.779840] [<ffffffff8104f2d7>] ? autoremove_wake_function+0x2a/0x2a
[62279.779846] [<ffffffff81205835>] ? read_extent_buffer_pages+0x318/0x39b
[62279.779852] [<ffffffff811e5a9e>] ? verify_parent_transid+0x1d9/0x1d9
[62279.779859] [<ffffffff811e6c95>] ? btree_read_extent_buffer_pages.clone.66+0x58/0xb2
[62279.779865] [<ffffffff811e78b7>] ? read_tree_block+0x31/0x44
[62279.779871] [<ffffffff811d1a8a>] ? read_block_for_search.clone.41+0x309/0x33f
[62279.779878] [<ffffffff812115fa>] ? btrfs_tree_read_unlock+0x9/0x33
[62279.779884] [<ffffffff811cd235>] ? unlock_up+0x114/0x140
[62279.779890] [<ffffffff811d4203>] ? btrfs_search_slot+0x7e7/0xa5e
[62279.779897] [<ffffffff811d54fc>] ? btrfs_insert_empty_items+0x62/0xb3
[62279.779904] [<ffffffff811da616>] ? alloc_reserved_file_extent.clone.68+0x9b/0x213
[62279.779911] [<ffffffff811dd08c>] ? run_clustered_refs+0x61f/0x70b
[62279.779918] [<ffffffff811dd241>] ? btrfs_run_delayed_refs+0xc9/0x1cd
[62279.779924] [<ffffffff811ec46f>] ? __btrfs_end_transaction+0x83/0x1e2
[62279.779931] [<ffffffff811f171d>] ? btrfs_finish_ordered_io+0x280/0x2a5
[62279.779937] [<ffffffff81202316>] ? end_bio_extent_writepage+0xa0/0x14a
[62279.779943] [<ffffffff8120c791>] ? worker_loop+0x180/0x4bb
[62279.779949] [<ffffffff8120c611>] ? btrfs_queue_worker+0x24e/0x24e
[62279.779955] [<ffffffff8104eee7>] ? kthread+0x7a/0x82
[62279.779962] [<ffffffff81442554>] ? kernel_thread_helper+0x4/0x10
[62279.779968] [<ffffffff8104ee6d>] ? kthread_worker_fn+0x149/0x149
[62279.779974] [<ffffffff81442550>] ? gs_change+0xb/0xb

# mount | grep btrfs
/dev/mapper/vg0-rootvol on / type btrfs (rw,relatime)

Thanks for all efforts.
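A SysRq+W dump like the one above is easier to compare across reports when reduced to the list of blocked tasks. The following is only a minimal sketch, assuming the kernel log layout shown in this message (a task line of the form "name D <pc> <stack> <pid> <father> <flags>"); the names BLOCKED_TASK and blocked_tasks are made up for illustration, and other kernel versions may print the header differently.

```python
import re

# Matches task header lines such as
# "[62279.779404] btrfs-submit-0 D 0000000000000000 5616 4678 2 0x00000000".
# Assumption: the layout is the one seen in this thread's dump.
BLOCKED_TASK = re.compile(
    r"^\[\s*(?P<time>\d+\.\d+)\]\s+(?P<task>\S+)\s+D\s+[0-9a-f]{16}\s+"
    r"(?P<stack>\d+)\s+(?P<pid>\d+)\s+(?P<father>\d+)\s+0x[0-9a-f]+\s*$",
    re.MULTILINE,
)


def blocked_tasks(log_text):
    """Return (task name, pid) for every task reported in D state."""
    return [
        (match.group("task"), int(match.group("pid")))
        for match in BLOCKED_TASK.finditer(log_text)
    ]
```

Fed the dump above, this should list btrfs-submit-0, btrfs-transacti and btrfs-endio-wri; call-trace and register lines are ignored because they do not carry the "D" state column.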
Chris Samuel
2011-Aug-18 06:47 UTC
Re: Applications using fsync cause hangs for several seconds every few minutes
On 18/08/11 00:29, Michael Cronenworth wrote:
> I'm running kernel 3.0 (Fedora 15's 2.6.40) on two boxes
> and I have not seen slow downs or hangs. I use Firefox.

I've got btrfs on an external USB drive with the 3.0.1 kernel and I see that sync seems to take an age, according to iotop it seems that the btrfs processes are hitting it quite hard, IIRC.

cheers,
Chris

--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
youagree
2011-Aug-18 06:58 UTC
Re: Applications using fsync cause hangs for several seconds every few minutes
Are these processes principally btrfs-submit and btrfs-transacti in particular?

Then it may be related to my very similar issue reported earlier.

On 08/18/2011 08:47 AM, Chris Samuel wrote:
> On 18/08/11 00:29, Michael Cronenworth wrote:
>
>> I'm running kernel 3.0 (Fedora 15's 2.6.40) on two boxes
>> and I have not seen slow downs or hangs. I use Firefox.
>
> I've got btrfs on an external USB drive with the 3.0.1 kernel and
> I see that sync seems to take an age, according to iotop it seems
> that the btrfs processes are hitting it quite hard, IIRC.
>
> cheers,
> Chris
Andrew Guertin
2011-Aug-18 07:29 UTC
Re: Applications using fsync cause hangs for several seconds every few minutes
On 08/18/2011 02:44 AM, youagree wrote:
> Also, a patch by Josef Bacik was an attempt for fixing this, but no one
> reported about testing it on an affected system, it did not eliminate
> the slowdowns for me:
>
> PLEASE TEST: Everybody who is seeing weird and long hangs
> news://news.gmane.org:119/4E36C47E.70309@redhat.com

I had not seen this (actually, I had skimmed it but not thought it was relevant). I will try it as soon as I get a chance.

> The HDD thrashing appeared on all other kernel versions I tried, higher
> than 2.6.37.
> Initially, I had been into looking for a latest known good kernel (to
> prepare a proper git bisect as cwillu advised) and at first I also felt
> like 2.6.38 does not show this miserable behaviour. But later it turned
> out this was only for approximately 2 days of uptime. Given enough time,
> the lock-ups appeared on 2.6.38 too. Although they were not that
> apparent than on later kernel versions, and the individual lockups took
> much less time with 2.6.38 running for 2 days (binary Sabayon Linux
> repository kernel).

I have not seen slowdowns on 2.6.38. More specifically, I observe the following behaviors after commit 4e69b59:

* Many processes occasionally hang for a short time
* When this happens, my cpu monitor shows a short burst of cpu activity
  (100% of 1 core) followed by a longer period of IO
* When this happens, iotop shows [btrfs-submit-0] and [btrfs-transacti]
  at the top of the list
* Behavior slowly increases in duration (and frequency?) over time, and
  goes away with a reboot
* Heavy IO makes behavior appear faster

... and the following behaviors before commit 4e69b59:

* Occasional spikes of IO on cpu monitor concurrent with
  [btrfs-submit-0] and [btrfs-transacti] at top of iotop
* No hangs, even when that occurs

I wasn't taking notes or anything though, so I'm not 100% certain I was observing or interpreting or remembering everything correctly.

--Andrew
Andrew Guertin
2011-Aug-18 07:41 UTC
Re: Applications using fsync cause hangs for several seconds every few minutes
On 08/17/2011 10:41 PM, Anand Jain wrote:
> Dave,
>
> good to have a test case on the 3.0 kernel. do you have btrfs as
> root fs ? and
> can you show how are you using the btrfs mainly I would need
> 'btrfs fi show' let me try if I can reproduce.
>
> Thanks, Anand

Personally, I find that large compiles are very "useful" in making the issue occur sooner. I'm on gentoo, so when I was bisecting, I'd often just emerge openoffice and let it run for a while.

For observing, the best way I found was to run JOSM (Java OpenStreetMap editor). Browsing around a map is very interactive, so it's immediately noticeable when it hangs, and downloading map tiles all the time uses a lot of IO. In-browser map applications would probably work too.

My filesystem is partitioned with a small ext2 /boot as sda1, a 2GB swap as sda2, and the remaining space as btrfs / on sda3.

btrfs fi show gives:

Label: none uuid: 28559ad8-7db8-402b-a93d-27ec9c5e943b
        Total devices 1 FS bytes used 102.83GB
        devid 1 size 144.90GB used 144.90GB path /dev/sda3

Btrfs v0.19-35-g1b444cd-dirty

--Andrew
youagree
2011-Aug-18 07:55 UTC
Re: Applications using fsync cause hangs for several seconds every few minutes
On 08/18/2011 09:29 AM, Andrew Guertin wrote:
> * Many processes occasionally hang for a short time
> * When this happens, my cpu monitor shows a short burst of cpu activity
> (100% of 1 core) followed by a longer period of IO
> * When this happens, iotop shows [btrfs-submit-0] and [btrfs-transacti]
> at the top of the list
> * Behavior slowly increases in duration (and frequency?) over time, and
> goes away with a reboot
> * Heavy IO makes behavior appear faster
>
> ... and the following behaviors before commit 4e69b59:
>
> * Occasional spikes of IO on cpu monitor concurrent with
> [btrfs-submit-0] and [btrfs-transacti] at top of iotop
> * No hangs, even when that occurs

Yes, exactly that happened in my case too. Yours is a much more precise description!

I did not diagnose 2.6.38 further because I just wanted to establish a known-good version, and at first sight (2 days uptime) my HDD behavior showed that it cannot be good if _any_ HDD thrashing appears at all in the first place...

I was able to work with the computer during those IO spikes on 2.6.38 too, although it was observable that the HDD was being thrashed (meanwhile, the LED was almost constantly lit). But it didn't cause other programs to be unresponsive, I confirm...

> I wasn't taking notes or anything though, so I'm not 100% certain I was
> observing or interpreting or remembering everything correctly.
>
> --Andrew
Andrew Guertin
2011-Aug-18 11:45 UTC
Re: Applications using fsync cause hangs for several seconds every few minutes
On 08/18/2011 03:29 AM, Andrew Guertin wrote:
> I have not seen slowdowns on 2.6.38. More specifically, I observe the
> following behaviors after commit 4e69b59:
>
> * Many processes occasionally hang for a short time
> * When this happens, my cpu monitor shows a short burst of cpu activity
> (100% of 1 core) followed by a longer period of IO
> * When this happens, iotop shows [btrfs-submit-0] and [btrfs-transacti]
> at the top of the list
> * Behavior slowly increases in duration (and frequency?) over time, and
> goes away with a reboot
> * Heavy IO makes behavior appear faster
>
> ... and the following behaviors before commit 4e69b59:
>
> * Occasional spikes of IO on cpu monitor concurrent with
> [btrfs-submit-0] and [btrfs-transacti] at top of iotop
> * No hangs, even when that occurs
>
> I wasn't taking notes or anything though, so I'm not 100% certain I was
> observing or interpreting or remembering everything correctly.

I've investigated a little more, and have a few things to add:

Before commit 4e69b59:

* In the IO spikes where [btrfs-submit-0] and [btrfs-transacti] are at
  the top of iotop, there is no short burst of cpu activity preceding them
* When running gentoo's emerge --sync (which IIRC is mainly an rsync of
  ~200MB of small files), output appears to pause during these spikes. I
  wasn't able to tell if output stopped entirely or just slowed down.

--Andrew
Chris Mason
2011-Aug-18 14:38 UTC
Re: Applications using fsync cause hangs for several seconds every few minutes
Excerpts from Andrew Guertin's message of 2011-08-11 21:13:18 -0400:
> On 08/09/2011 05:29 PM, Andrew Guertin wrote:
> > I have not tried 3.1-rc1, but plan to soon.
>
> I've tested now, this does still occur in 3.1-rc1.

Ok, I had high hopes that the btrfs changes in rc1 would fix this.

Could you please try with the deadline elevator instead of the cfq default?

-chris
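For readers who want to check which elevator a disk is using before and after switching, a minimal sketch follows. It assumes the usual sysfs layout, where /sys/block/<device>/queue/scheduler lists the available schedulers and marks the active one in square brackets (for example "noop deadline [cfq]"); the function names parse_scheduler and read_scheduler are illustrative only, and "sda" is simply the device mentioned elsewhere in this thread.

```python
from pathlib import Path


def parse_scheduler(text):
    """Return (active, available) from text such as 'noop deadline [cfq]'."""
    active = None
    available = []
    for name in text.split():
        # The active scheduler is the entry wrapped in square brackets.
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def read_scheduler(device="sda"):
    """Read and parse the scheduler file for one block device."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_scheduler(path.read_text())
```

Switching is normally done by writing the scheduler name (here "deadline") into that same file as root; the sketch deliberately only reads, so it can be run without privileges.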
Chris Samuel
2011-Aug-19 07:34 UTC
Re: Applications using fsync cause hangs for several seconds every few minutes
On 18/08/11 16:58, youagree wrote:
> Are these processes principally btrfs-submit and btrfs-transacti
> in particular?
>
> Then it may be related to my very similar issue reported earlier.

I spent a little bit of time last night looking at it and it seems that what I'm seeing also affects ext4 on my local SATA mirror too, so whatever is going on doesn't appear to be related to btrfs.

So ignore my comment.. ;-)

cheers,
Chris

--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
Anand Jain
2011-Aug-19 09:58 UTC
Re: Applications using fsync cause hangs for several seconds every few minutes
Andrew,

Facing some challenges to test this. If you have a chance to test it again, the following output will be interesting to observe.

iostat -ctx -p sda 3 > /tmp/iostat.out

Also note your system time when this problem occurs (iostat has a time stamp; I wish to see the waitQ and activeQ at that time, hopefully captured in the file /tmp/iostat.out above).

We need more clarity on the test case which can reproduce this issue. As I understand it, you are writing into the btrfs. However, is that a large number of small files (you are creating new files), or are you writing a few large new files? If there is any script to test this, that will make understanding a lot easier.

PS: Does anybody know of a Solaris lockstat(1M) equivalent in Linux?

Thanks, Anand
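On the request for a test script: nobody in the thread posted one, so the following is only a sketch of what a probe could look like, not a confirmed reproducer. It appends a small payload to a scratch file, calls fsync, and records the wall-clock time and duration of each call so that slow calls can be lined up with the iostat time stamps. The function names, the 4 KiB payload and the 1 second threshold are all assumptions chosen for illustration.

```python
import os
import tempfile
import time


def measure_fsync_latency(directory, rounds=5, payload=b"x" * 4096, pause=0.0):
    """Write a payload and fsync it `rounds` times; return (wall time, seconds)."""
    samples = []
    fd, path = tempfile.mkstemp(dir=directory, prefix="fsync-probe-")
    try:
        for _ in range(rounds):
            os.write(fd, payload)
            start = time.monotonic()
            os.fsync(fd)  # the call that reportedly hangs on affected kernels
            elapsed = time.monotonic() - start
            samples.append((time.time(), elapsed))
            if pause:
                time.sleep(pause)
    finally:
        os.close(fd)
        os.unlink(path)  # remove the scratch file again
    return samples


def report(samples, threshold=1.0):
    """Format samples, flagging any fsync that took at least `threshold` seconds."""
    lines = []
    for wall, elapsed in samples:
        stamp = time.strftime("%H:%M:%S", time.localtime(wall))
        flag = " SLOW" if elapsed >= threshold else ""
        lines.append(f"{stamp} fsync {elapsed:.3f}s{flag}")
    return lines
```

Running measure_fsync_latency on a directory inside the btrfs mount with a pause of a second or so, while a heavy writer such as a large rsync or compile runs alongside, would be one way to try it; whether this load actually triggers the hangs described here is untested.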
Andrew Guertin
2011-Aug-20 17:18 UTC
Re: Applications using fsync cause hangs for several seconds every few minutes
On 08/18/2011 10:38 AM, Chris Mason wrote:
> Excerpts from Andrew Guertin's message of 2011-08-11 21:13:18 -0400:
>> On 08/09/2011 05:29 PM, Andrew Guertin wrote:
>>> I have not tried 3.1-rc1, but plan to soon.
>>
>> I've tested now, this does still occur in 3.1-rc1.
>
> Ok, I had high hopes that the btrfs changes in rc1 would fix this.
>
> Could you please try with the deadline elevator instead of the cfq
> default?

The deadline elevator does not fix it (tested with 3.1-rc2).

Sorry for taking a long time with this.

--Andrew