I'm running into a significant slowdown in Btrfs (> 10x slower than normal) that appears to be due to some issue between how Btrfs is allocating memory, and how the kernel is expecting Btrfs to allocate memory.

The problem does seem to be somewhat hardware specific. I can reproduce it on two of my computers (an older AMD Athlon(tm) XP 2600+ with PATA, and a newer ACER Aspire netbook with an Atom CPU). My Core2Duo computer with SATA seems unaffected by this slowdown.

I've replicated this on 2.6.38, 2.6.39, and 3.0 kernels. The following information was all obtained running on a 3.0 kernel merged with the latest 'for-linus' branch of Chris' git repo. I've also tested on ext4 (no slowdown encountered) to make sure the issue wasn't completely unrelated to Btrfs.

The steps to reproduce are as follows:

Prerequisite: Have a btrfs partition with a copy of a linux kernel git repository stored.

(1) Boot with 768 MB RAM (using 'mem=768M' in the grub command line).
(2) From a second machine, run a git clone of the kernel git repository (such as 'git clone ssh://<user>@<address>/path/to/linux-git-repo').

The clone process slows down when it reaches the 'remote: Compressing objects:' step.

Looking at the Alt-SysRq-W output and Latencytop output (see attached), I get a steady stream of memory page faults, and other memory issues.

The git clone is definitely causing memory pressure when booted with only 768MB of RAM. However, I still see plenty of cached RAM available, and there is little or no activity on my swap partition.

The dmesg output is otherwise silent except for the Alt-SysRq-W output. No OOM errors.

A typical 'top' snapshot during the affected period looks like this:

top - 08:53:08 up 32 min,  3 users,  load average: 1.06, 1.01, 0.84
Tasks: 104 total,   1 running, 103 sleeping,   0 stopped,   0 zombie
Cpu(s):  2.3%us, 12.3%sy,  0.0%ni,  0.0%id, 85.1%wa,  0.0%hi,  0.3%si,  0.0%st
Mem:    768452k total,   760248k used,     8204k free,     4396k buffers
Swap:  1004056k total,    13824k used,   990232k free,   352596k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2876 root      20   0     0    0    0 S 11.0  0.0   1:26.62 btrfs-endio-1
 3117 dontpani  20   0  720m 386m  52m D  4.0 51.5   2:38.78 git
  526 root      20   0     0    0    0 S  0.3  0.0   0:06.42 kswapd0
 2576 root      20   0     0    0    0 S  0.3  0.0   0:44.09 btrfs-endio-0
    1 root      20   0  1844  568  540 S  0.0  0.1   0:00.32 init
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd
    3 root      20   0     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/0
    5 root      20   0     0    0    0 S  0.0  0.0   0:00.01 kworker/u:0
    6 root      -2   0     0    0    0 S  0.0  0.0   0:04.17 rcu_kthread
    7 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 cpuset
    8 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 khelper

So, while I may truly be running out of RAM, the kernel doesn't seem to be handling the issue normally (i.e., pushing more off to swap or giving OOM errors).

Let me know if you have some feedback on how to track this issue down.
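
To put rough numbers on the "plenty of cached RAM, little swap activity" observation, here is a minimal Python sketch that parses the Mem:/Swap: summary lines from the snapshot above. The function names (`parse_top_summary`, `summarize`) are made up for this illustration, and the parser assumes the old procps layout shown here, where every value carries a "k" suffix.

```python
import re

# Summary lines copied from the top snapshot in the report.
TOP_SUMMARY = """\
Mem:    768452k total,   760248k used,     8204k free,     4396k buffers
Swap:  1004056k total,    13824k used,   990232k free,   352596k cached
"""

def parse_top_summary(text):
    """Return {field: KiB} for the Mem:/Swap: lines of old-style top output."""
    values = {}
    for line in text.splitlines():
        label, _, rest = line.partition(":")
        label = label.strip().lower()
        if label not in ("mem", "swap"):
            continue
        for amount, name in re.findall(r"(\d+)k\s+(\w+)", rest):
            # "buffers" and "cached" describe RAM, even though this top
            # layout prints "cached" at the end of the Swap line.
            key = name if name in ("buffers", "cached") else f"{label}_{name}"
            values[key] = int(amount)
    return values

def summarize(values):
    """Return (page cache as % of RAM, swap in use as % of swap)."""
    cached_pct = 100.0 * values["cached"] / values["mem_total"]
    swap_used_pct = 100.0 * values["swap_used"] / values["swap_total"]
    return cached_pct, swap_used_pct

if __name__ == "__main__":
    cached_pct, swap_used_pct = summarize(parse_top_summary(TOP_SUMMARY))
    print(f"page cache: {cached_pct:.1f}% of RAM, swap in use: {swap_used_pct:.1f}%")
```

For this snapshot the page cache works out to roughly 46% of RAM while only about 1.4% of swap is in use, which matches the description in the report.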

Excerpts from Mitch Harder's message of 2011-08-02 10:35:54 -0400:
> I'm running into a significant slowdown in Btrfs (> 10x slower than
> normal) that appears to be due to some issue between how Btrfs is
> allocating memory, and how the kernel is expecting Btrfs to allocate
> memory.
>
> The problem does seem to be somewhat hardware specific. I can
> reproduce on two of my computers (an older AMD Athlon(tm) XP 2600+
> with PATA, and a newer ACER Aspire netbook with an Atom CPU). My
> Core2Duo computer with SATA seems unaffected by this slowdown.
>
> I've replicated this on 2.6.38, 2.6.39, and 3.0 kernels. The
> following information was all obtained running on a 3.0 kernel merged
> with the latest 'for-linus' branch of Chris' git repo. I've also
> tested on ext4 (no slow down encountered) to make sure the issue
> wasn't completely unrelated to Btrfs.

Just to double check, what was the top commit of for-linus when you did this?

The tracing shows that you're spending your time in mmap'd readahead. So one of three things is happening:

1) The VM is favoring our metadata over data pages for the git packed file

2) We're reading ahead too aggressively, or not aggressively enough

3) The git pack file is somehow more fragmented, and this is making the read ahead much less effective.

The very first thing I'd check is to make sure the .git repo between the slow machines and the fast machines are identical. Git does a lot of packing behind the scenes, and so an older repo that isn't freshly cloned is going to be slower than a new repo.

-chris

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On Thu, Aug 4, 2011 at 10:05 AM, Chris Mason <chris.mason@oracle.com> wrote:
> Excerpts from Chris Mason's message of 2011-08-04 11:04:54 -0400:
>> Excerpts from Mitch Harder's message of 2011-08-04 10:45:51 -0400:
>> > On Thu, Aug 4, 2011 at 9:22 AM, Chris Mason <chris.mason@oracle.com> wrote:
>> > > Just to double check, what was the top commit of for-linus when you did
>> > > this?
>> >
>> > The top commit merged for the kernel used to generate the information
>> > in this post was:
>> >
>> > Btrfs: make sure reserve_metadata_bytes doesn't leak out strange errors
>> > 75c195a2cac2c3c8366c0b87de2d6814c4f4d638
>> >
>> > I have since replicated the slowdown with a kernel merged with the
>> > latest 'for-linus' branch, whose top commit was:
>> > Btrfs: don't call writepages from within write_full_page
>> > 0d10ee2e6deb5c8409ae65b970846344897d5e4e
>>
>> Ok, so I'm going to guess that your problem is really with either file
>> layout or just us using more metadata pages than the others. The file
>> layout part is easy to test, just replace your git repo with a fresh
>> clone (or completely repack it).
>
> Sorry, I should have said replace your git repo with a fresh,
> non-hardlinked clone. git clone by default will just make hardlinks if
> it can, so it has to be a fresh clone.
>
> -chris
>

Oops, sorry, I let my responses slip off the list.

You are right about there being a potentially huge difference between a cloned git repo and its parent. I didn't realize it could make such a difference.

This problem now appears to have nothing to do with btrfs. I can replicate the problem on an ext4 partition also if I use a copy of the parent git repository instead of a clone. The problem seems to lie in the fragmentation of the git repository.

If I work with a clone of my linux-btrfs repository, subsequent clones are much faster. Cloning my parent linux-btrfs repo takes about 90 minutes (when I have restricted free RAM). Cloning a clone of the parent drops down to less than 10 minutes.

With there being several other threads relating to btrfs 'slow downs', I thought this issue might be related.
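
The quoted advice about replacing the repository with a fresh, non-hardlinked clone could be scripted roughly as below. This is only a sketch under stated assumptions: the helper names (`fresh_clone_command`, `timed_clone`) are invented for illustration, and it uses git's `--no-local` option, which makes a clone from a local path go through the regular git transport so the destination receives newly written pack data rather than hardlinks to the existing object files. The thread itself does not say which option was used.

```python
import subprocess
import sys
import time

def fresh_clone_command(source, destination):
    """Build a git clone command that avoids the local hardlink shortcut."""
    # --no-local forces the regular git transport even for a local path,
    # so objects arrive as a newly generated pack instead of hardlinks.
    return ["git", "clone", "--no-local", str(source), str(destination)]

def timed_clone(source, destination):
    """Run the clone and return the elapsed wall-clock time in seconds."""
    start = time.monotonic()
    subprocess.run(fresh_clone_command(source, destination), check=True)
    return time.monotonic() - start

if __name__ == "__main__":
    # Usage (hypothetical script name): python fresh_clone.py <source> <destination>
    if len(sys.argv) == 3:
        elapsed = timed_clone(sys.argv[1], sys.argv[2])
        print(f"clone took {elapsed:.1f} s")
```

Timing a clone of the parent repository and then a clone of the fresh copy would be one way to reproduce the 90 minute versus 10 minute comparison described above.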

Excerpts from Mitch Harder's message of 2011-08-04 14:40:20 -0400:
> On Thu, Aug 4, 2011 at 10:05 AM, Chris Mason <chris.mason@oracle.com> wrote:
> >>
> >> Ok, so I'm going to guess that your problem is really with either file
> >> layout or just us using more metadata pages than the others. The file
> >> layout part is easy to test, just replace your git repo with a fresh
> >> clone (or completely repack it).
> >
> > Sorry, I should have said replace your git repo with a fresh,
> > non-hardlinked clone. git clone by default will just make hardlinks if
> > it can, so it has to be a fresh clone.
> >
> > -chris
> >
>
> Oops, sorry, I let my responses slip off the list.
>
> You are right about there being a potentially huge difference between
> a cloned git repo and its parent. I didn't realize it could make
> such a difference.
>
> This problem now appears to have nothing to do with btrfs. I can
> replicate the problem on an ext4 partition also if I use a copy of the
> parent git repository instead of a clone. The problem seems to lie in
> the fragmentation of the git repository.
>
> If I work with a clone of my linux-btrfs repository, subsequent clones
> are much faster. Cloning my parent linux-btrfs repo takes about 90
> minutes (when I have restricted free RAM). Cloning a clone of the
> parent drops down to less than 10 minutes.
>
> With there being several other threads relating to btrfs 'slow downs',
> I thought this issue might be related.

Great, glad to hear it turned out to be filesystem agnostic. The original git file format was basically very filesystem unfriendly and it tends to fragment very badly.

Linus' solution to this is the pack file format, which is space efficient and very fast to access. The only downside is that you need to repack the repo from time to time or performance tends to fall off a cliff.

There is a git repack command and a git gc command that you can use to restructure things, making the repo both smaller and much faster.

-chris
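
As a rough way to tell whether a repository is due for the repacking described above, the following Python sketch counts loose objects and pack files directly under a repository's objects directory. It relies on the standard on-disk layout (loose objects in two-hex-digit fan-out directories, packs in objects/pack). The function names and the thresholds in `needs_repack` are arbitrary assumptions for illustration, not git's own heuristics; `git count-objects -v` reports similar figures natively.

```python
import os
import re

# Loose objects live in fan-out directories named 00..ff.
HEX_DIR = re.compile(r"^[0-9a-f]{2}$")

def count_objects(git_dir):
    """Return (loose_objects, pack_files) found under git_dir/objects."""
    objects_dir = os.path.join(git_dir, "objects")
    loose = 0
    packs = 0
    for entry in os.listdir(objects_dir):
        path = os.path.join(objects_dir, entry)
        if not os.path.isdir(path):
            continue
        if HEX_DIR.match(entry):
            loose += len(os.listdir(path))
        elif entry == "pack":
            # Count only the .pack files, not their .idx companions.
            packs += sum(1 for name in os.listdir(path) if name.endswith(".pack"))
    return loose, packs

def needs_repack(git_dir, max_loose=1000, max_packs=10):
    """Hypothetical rule of thumb: many loose objects or many packs."""
    loose, packs = count_objects(git_dir)
    return loose > max_loose or packs > max_packs

if __name__ == "__main__":
    import sys
    if len(sys.argv) == 2:
        loose, packs = count_objects(sys.argv[1])
        print(f"loose objects: {loose}, pack files: {packs}")
```

If the counts are high, running `git gc` (or `git repack -a -d`) in the repository consolidates the objects into a single pack.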