thr3ads.net - Btrfs devel - New experimental btrfs branch ready for testing [Jun 2009]


Chris Mason

2009-Jun-01 21:04 UTC

New experimental btrfs branch ready for testing

Hello everyone,

Yan Zheng has been doing some major surgery to the back references and
extent allocation code, tackling bottlenecks in the code that tracks
extents.  It scales better with many snapshots and performs better in
the common case of no snapshots at all.

THE NEW CODE IS A FORWARD ROLLING DISK FORMAT CHANGE.  This means it is
compatible with the current btrfs disk format, but once you mount a
filesystem with the new code, it WILL NO LONGER BE MOUNTABLE FROM OLD
KERNELS.  Old kernels spit out an error message when you try them on new
format filesystems.
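
A one-way upgrade like this is often modelled with feature bits stored in the superblock. The Python sketch below is a toy illustration of that idea only: the flag name, the dict-based superblock and both functions are hypothetical, and nothing in this message says how the kernel actually detects the new format.

```python
# Toy model of a one-way ("forward rolling") on-disk format upgrade.
# NEW_BACKREF_FORMAT and the dict-based superblock are hypothetical.

NEW_BACKREF_FORMAT = 1 << 0  # assumed feature bit, for illustration only


class UnsupportedFormat(Exception):
    """Raised when a filesystem uses features this kernel does not know."""


def check_mountable(superblock, supported_features):
    """Refuse the mount if the superblock carries any unknown feature bit."""
    unknown = superblock["incompat_flags"] & ~supported_features
    if unknown:
        raise UnsupportedFormat(f"unknown incompat features: {unknown:#x}")


def mount(superblock, supported_features):
    """Mount a filesystem; a new kernel stamps the new format on first mount."""
    check_mountable(superblock, supported_features)
    # The upgrade is one-way: flags are only ever added, never cleared.
    superblock["incompat_flags"] |= supported_features
    return superblock


OLD_KERNEL = 0                   # knows no new features
NEW_KERNEL = NEW_BACKREF_FORMAT  # knows the new backref format

fs = {"incompat_flags": 0}       # filesystem created in the old format
mount(fs, OLD_KERNEL)            # old kernel, old format: fine
mount(fs, NEW_KERNEL)            # new kernel reads old format, then upgrades it

try:
    mount(fs, OLD_KERNEL)        # old kernel, new format: refused
    refused = False
except UnsupportedFormat:
    refused = True
```

In this model the old format stays readable by the new code, but the first mount with the new code sets a bit that older code rejects, which is why a backup beforehand matters.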

This is a large change, and I'm hoping to have it stable in time for the
2.6.31 merge window.  I've been testing it for about a week now, and
haven't been able to cause major problems yet.  But testing the
compatibility with old format filesystems is the hard part, and
everyone who pulls the new code should back up their data first.

I've set up git branches called newformat where you can pull the new
code.

For the kernel (based on 2.6.30-rc7):

git pull git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable.git newformat

For the progs:

git pull git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs-unstable.git newformat

The main benefit of the new code is that backrefs on the extent
allocation tree use a fuzzier format.  It basically means that we search
for the key in the extent allocation tree instead of providing an exact
backref to the parent block.

This means we can predict how many blocks will be changed when changing
the extent allocation tree, and it makes enospc much less complex.  It
is also significantly faster.
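
The difference between an exact backref and a key-based one can be sketched roughly as follows. This is a toy Python model under stated assumptions, not the btrfs data structures: the `Node` class, the dict-shaped backrefs, the block numbers and the idea that the parent block moves when it is copied are all hypothetical and chosen only for illustration.

```python
# Toy comparison of an exact backref and a key-based ("fuzzy") backref.
# All structures here are hypothetical; they are not btrfs on-disk formats.
from bisect import bisect_right


class Node:
    """A parent tree block: sorted (first_key, child_bytenr) pointers."""

    def __init__(self, bytenr, pointers):
        self.bytenr = bytenr
        self.pointers = sorted(pointers)

    def child_for(self, key):
        """Return the child block whose key range covers `key`."""
        keys = [first_key for first_key, _ in self.pointers]
        idx = bisect_right(keys, key) - 1
        if idx < 0:
            raise KeyError(key)
        return self.pointers[idx][1]


def cow(node, new_bytenr):
    """Copy-on-write: the same pointers land in a block at a new location."""
    return Node(new_bytenr, node.pointers)


def find_parent(roots, backref):
    """Resolve a key-based backref by searching the owning tree for the key."""
    node = roots[backref["root"]]
    if node.child_for(backref["first_key"]) != backref["child"]:
        raise LookupError("backref does not match the tree")
    return node.bytenr


# One tree (root id 5) whose top block sits at bytenr 1000.
roots = {5: Node(1000, [((256, 1, 0), 4096), ((300, 1, 0), 8192)])}

# An exact backref names the parent block; a fuzzy one names only a key.
exact = {"child": 8192, "parent_bytenr": 1000}
fuzzy = {"child": 8192, "root": 5, "first_key": (300, 1, 0)}

parent_before = find_parent(roots, fuzzy)

# COW the parent: it moves to bytenr 2000.
roots[5] = cow(roots[5], 2000)

parent_after = find_parent(roots, fuzzy)  # still resolves, record unchanged
exact_is_stale = exact["parent_bytenr"] != roots[5].bytenr
```

In this model the key-based record survives the parent moving without being rewritten, at the cost of a search when the parent is needed, while the exact record has to be updated on every move.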

For regular subvolume trees, a similar change is made as long as there
are no snapshots against a given block.  This is the common case, and it
makes COW less expensive overall.

Yan Zheng also worked out a way to free blocks during the transaction
without needing to do an explicit snapshot deletion on the old root when
the transaction was done.  This gets rid of some complex caching code,
and fixes worst-case problems where btrfs could take a very very long
time to unmount.

btrfs-vol -b is faster with the new code as well; he added caching of
high levels in the tree to speed things up.

(Many kudos to Yan Zheng for all of this work!)

-chris

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Chris Mason

2009-Jun-02 13:28 UTC

Re: New experimental btrfs branch ready for testing

On Mon, Jun 01, 2009 at 05:04:47PM -0400, Chris Mason wrote:
> Hello everyone,
> 
> Yan Zheng has been doing some major surgery to the back references and
> extent allocation code, tackling bottlenecks in the code that tracks
> extents.  It scales better with many snapshots and performs better in
> the common case of no snapshots at all.
> 
> THE NEW CODE IS A FORWARD ROLLING DISK FORMAT CHANGE.  This means it is
> compatible with the current btrfs disk format, but once you mount a
> filesystem with the new code, it WILL NO LONGER BE MOUNTABLE FROM OLD
> KERNELS.  Old kernels spit out an error message when you try them on new
> format filesystems.

Just a quick note that I'm having some issues with the backward
compatibility code on 32 bit kernels.  It can still read all the old
items but it is having problems with creating new backrefs.

32bit is working fine on an entirely new format FS, and my 64 bit box
can read and write the old format FS just fine.  I'm hoping to track
this one down today, but it would be a good idea to wait if you want to
try the new code on old filesystems on 32 bit machines.

If you do hit crashes, please don't immediately reformat your FS if you
can avoid it.  We should be able to fix most problems people hit.

-chris


Chris Mason

2009-Jun-03 17:08 UTC

Re: New experimental btrfs branch ready for testing

On Tue, Jun 02, 2009 at 09:28:30AM -0400, Chris Mason wrote:
> Just a quick note that I'm having some issues with the backward
> compatibility code on 32 bit kernels.  It can still read all the old
> items but it is having problems with creating new backrefs.
>
> 32bit is working fine on an entirely new format FS, and my 64 bit box
> can read and write the old format FS just fine.  I'm hoping to track
> this one down today, but it would be a good idea to wait if you want to
> try the new code on old filesystems on 32 bit machines.
>
> If you do hit crashes, please don't immediately reformat your FS if you
> can avoid it.  We should be able to fix most problems people hit.

Looks like Yan Zheng tracked this down yesterday, and Jens Axboe bravely
tested 32bit old format compat again with his laptop.  At this point
I think the new format code is looking pretty stable and it is generally
ready for more testing.

I've rebased the newformat kernel tree to fold in the corruption fixes.
This way if anyone does a git bisect they won't end up on a commit that
can corrupt their FS by accident.  If you've already pulled the
newformat tree, the new commits will conflict with the old.

So, something like this will fix things if you have already pulled the
newformat branch:

git reset --hard v2.6.30-rc7
git pull git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable.git newformat

If you've made your own commits or pulls other than just btrfs,
different steps will be required.

The btrfs-progs unstable tree was also rebased.

Use git reset --hard ed20f5fc905145a0673097b539442d2a59491e77 on the
progs tree if you've already pulled down the newformat branch.

Happy testing, everyone.

-chris


Steven Pratt

2009-Jun-04 19:02 UTC

Re: New experimental btrfs branch ready for testing

Chris Mason wrote:
> For the kernel (based on 2.6.30-rc7):
>
> git pull git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable.git newformat
>
So I started the performance runs on this. The base tests completed fine
on the raid system and I will post results as soon as I can finish
postprocessing, but when I tried to do nodatacow on that machine it crashed
pretty early. Here is the console log:

btrfs2 kernel: [82057.882255] ------------[ cut here ]------------
Message from syslogd@ at Thu Jun  4 08:02:47 2009 ...
btrfs2 kernel: [82057.882535] invalid opcode: 0000 [#1] SMP
Message from syslogd@ at Thu Jun  4 08:02:47 2009 ...
btrfs2 kernel: [82057.882535] last sysfs file: /sys/devices/system/cpu/cpu15/cache/index1/shared_cpu_map
Message from syslogd@ at Thu Jun  4 08:02:47 2009 ...
btrfs2 kernel: [82057.882535] Stack:
Message from syslogd@ at Thu Jun  4 08:02:47 2009 ...
btrfs2 kernel: [82057.882535]  ffff88011786d800 ffff8801259f6ea0 000000b21f256030 00000000000000e9
Message from syslogd@ at Thu Jun  4 08:02:47 2009 ...
btrfs2 kernel: [82057.882535]  000000352231b250 ffff880089abbf40 ffff88013d0e2440 0000000000000001
Message from syslogd@ at Thu Jun  4 08:02:47 2009 ...
btrfs2 kernel: [82057.882535] Call Trace:
Message from syslogd@ at Thu Jun  4 08:02:47 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa0445198>] run_one_delayed_ref+0x382/0x42f [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:47 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa0464bd1>] ? map_extent_buffer+0xab/0xbe [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:47 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa0445f75>] run_clustered_refs+0x237/0x2b4 [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:47 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa0478f85>] ? btrfs_find_ref_cluster+0xdc/0x115 [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:47 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa044609e>] btrfs_run_delayed_refs+0xac/0x195 [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa044e86e>] __btrfs_end_transaction+0x59/0xfe [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa044e92e>] btrfs_end_transaction+0xb/0xd [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa045418b>] btrfs_finish_ordered_io+0x224/0x24d [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa04541c4>] btrfs_writepage_end_io_hook+0x10/0x12 [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa0467599>] end_bio_extent_writepage+0xa3/0x18f [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffff8024276e>] ? del_timer_sync+0x14/0x20
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffff802cbbee>] bio_endio+0x26/0x28
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa044b5d6>] end_workqueue_fn+0x111/0x11e [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa046eff5>] worker_loop+0x67/0x1ee [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa046ef8e>] ? worker_loop+0x0/0x1ee [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffff8024c324>] kthread+0x56/0x86
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffff8020c9fa>] child_rip+0xa/0x20
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffff8024c2ce>] ? kthread+0x0/0x86
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffff8020c9f0>] ? child_rip+0x0/0x20
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535] Code: 08 4c 8d 45 d4 41 8d 44 24 18 48 8b 73 20 48 8b 4d 18 41 b9 01 00 00 00 48 8b 7d b8 4c 89 ea 89 45 d4 e8 df e3 ff ff 85 c0 74 04 <0f> 0b eb fe 49 63 75 40 4d 8b 65 00 49 83 cf 01 4c 89 e7 48 6b
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...


I also ran this on the single disk system and it did not make it through
the base tests.  The errors are different.

[101511.664497] Pid: 28597, comm: btrfs-transacti Tainted: G      D    2.6.30-rc7-autokern1 #1 IBM x3950-[88726RU]-
[101511.675497] RIP: 0010:[<ffffffff804cd70d>]  [<ffffffff804cd70d>] _spin_lock+0x14/0x1a
[101511.684494] RSP: 0018:ffff8801309bbb40  EFLAGS: 00000297
[101511.689494] RAX: 0000000000001514 RBX: ffff8801309bbb40 RCX: ffff8801309bbb40
[101511.697493] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800b7427d70
[101511.705491] RBP: ffffffff8020c50e R08: 0000000000000001 R09: ffff8801309bba68
[101511.713490] R10: ffff88012231b910 R11: ffff8800478ad5b0 R12: 0000001a00000032
[101511.721488] R13: ffffffffa04370b1 R14: ffff8801309bbb60 R15: 00000000000003bf
[101511.729486] FS:  0000000000000000(0000) GS:ffff88002bac0000(0000) knlGS:0000000000000000
[101511.738483] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[101511.744482] CR2: 00007fbcd3ff1b80 CR3: 0000000000201000 CR4: 00000000000006e0
[101511.752480] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[101511.760479] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[101511.768478] Call Trace:
[101511.771478]  [<ffffffffa0471187>] ? btrfs_try_spin_lock+0x1c/0x61 [btrfs]
[101511.778476]  [<ffffffffa043ea17>] ? btrfs_search_slot+0x619/0x73e [btrfs]
[101511.786474]  [<ffffffffa043f11d>] ? btrfs_insert_empty_items+0x5e/0xa9 [btrfs]
[101511.803472]  [<ffffffffa0440ce0>] ? alloc_reserved_file_extent+0x89/0x1c3 [btrfs]
[101511.811470]  [<ffffffffa04401d8>] ? update_reserved_extents+0x98/0xab [btrfs]
[101511.819468]  [<ffffffffa0445198>] ? run_one_delayed_ref+0x382/0x42f [btrfs]
[101511.827467]  [<ffffffff802a5387>] ? cache_flusharray+0xa2/0xae
[101511.833466]  [<ffffffffa0445f75>] ? run_clustered_refs+0x237/0x2b4 [btrfs]
[101511.840463]  [<ffffffffa0478f85>] ? btrfs_find_ref_cluster+0xdc/0x115 [btrfs]
[101511.848462]  [<ffffffff804cbdad>] ? thread_return+0x3e/0x91
[101511.854461]  [<ffffffffa044609e>] ? btrfs_run_delayed_refs+0xac/0x195 [btrfs]
[101511.862459]  [<ffffffffa044f59f>] ? btrfs_commit_transaction+0x7b/0x69c [btrfs]
[101511.870458]  [<ffffffff8024c460>] ? autoremove_wake_function+0x0/0x38
[101511.877458]  [<ffffffffa044ee87>] ? start_transaction+0x103/0x10f [btrfs]
[101511.885456]  [<ffffffffa044c2c6>] ? transaction_kthread+0x17f/0x20a [btrfs]
[101511.892453]  [<ffffffffa044c147>] ? transaction_kthread+0x0/0x20a [btrfs]
[101511.900453]  [<ffffffffa044c147>] ? transaction_kthread+0x0/0x20a [btrfs]
[101511.907452]  [<ffffffff8024c324>] ? kthread+0x56/0x86
[101511.912450]  [<ffffffff8020c9fa>] ? child_rip+0xa/0x20
[101511.918449]  [<ffffffff8024c2ce>] ? kthread+0x0/0x86
[101511.923449]  [<ffffffff8020c9f0>] ? child_rip+0x0/0

[101536.249729] Pid: 28594, comm: btrfs-endio-wri Tainted: G      D    2.6.30-rc7-autokern1 #1 IBM x3950-[88726RU]-
[101536.249729] RIP: 0010:[<ffffffff804cd70d>]  [<ffffffff804cd70d>] _spin_lock+0x14/0x1a
[101536.249729] RSP: 0018:ffff88011a80da80  EFLAGS: 00000297
[101536.249729] RAX: 000000000000c6c2 RBX: ffff88011a80da80 RCX: 0000000000000000
[101536.249729] RDX: 0000000000000000 RSI: ffff88013d080000 RDI: ffff8800478ad6b0
[101536.249729] RBP: ffffffff8020c50e R08: 000000000000004c R09: 0000000000000001
[101536.249729] R10: 0000000000000008 R11: 0000000000086000 R12: ffff88011a80da40
[101536.249729] R13: ffff8800aa254800 R14: 0000000b470c7fff R15: ffff88011f256030
[101536.249729] FS:  0000000000000000(0000) GS:ffff88002ba30000(0000) knlGS:0000000000000000
[101536.249729] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[101536.249729] CR2: 000000000065b078 CR3: 0000000000201000 CR4: 00000000000006e0
[101536.249729] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[101536.249729] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[101536.249729] Call Trace:
[101536.249729]  [<ffffffffa04710cf>] ? btrfs_tree_lock+0x54/0x9e [btrfs]
[101536.249729]  [<ffffffffa0471022>] ? btrfs_wake_function+0x0/0x10 [btrfs]
[101536.249729]  [<ffffffffa0438104>] ? btrfs_lock_root_node+0x1d/0x4b [btrfs]
[101536.249729]  [<ffffffffa043e4c5>] ? btrfs_search_slot+0xc7/0x73e [btrfs]
[101536.249729]  [<ffffffffa043f11d>] ? btrfs_insert_empty_items+0x5e/0xa9 [btrfs]
[101536.249729]  [<ffffffffa0444f7a>] ? run_one_delayed_ref+0x164/0x42f [btrfs]
[101536.249729]  [<ffffffffa0445f75>] ? run_clustered_refs+0x237/0x2b4 [btrfs]
[101536.249729]  [<ffffffffa0478f85>] ? btrfs_find_ref_cluster+0xdc/0x115 [btrfs]
[101536.249729]  [<ffffffffa044609e>] ? btrfs_run_delayed_refs+0xac/0x195 [btrfs]
[101536.249729]  [<ffffffffa044e86e>] ? __btrfs_end_transaction+0x59/0xfe [btrfs]
[101536.249729]  [<ffffffffa044e92e>] ? btrfs_end_transaction+0xb/0xd [btrfs]
[101536.249729]  [<ffffffffa045418b>] ? btrfs_finish_ordered_io+0x224/0x24d [btrfs]
[101536.249729]  [<ffffffffa04541c4>] ? btrfs_writepage_end_io_hook+0x10/0x12 [btrfs]
[101536.249729]  [<ffffffffa0467599>] ? end_bio_extent_writepage+0xa3/0x18f [btrfs]
[101536.249729]  [<ffffffff8024276e>] ? del_timer_sync+0x14/0x20
[101536.249729]  [<ffffffff802cbbee>] ? bio_endio+0x26/0x28
[101536.249729]  [<ffffffffa044b5d6>] ? end_workqueue_fn+0x111/0x11e [btrfs]
[101536.249729]  [<ffffffffa046eff5>] ? worker_loop+0x67/0x1ee [btrfs]
:


> For the progs:
>
> git pull git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs-unstable.git newformat
>
I should mention that I missed the part about the new user tools, so
while these were newly formatted filesystems, they were created with the
old tools.  These are both running 64bit. I plan to install the new
tools and re-run.

Steve


Chris Mason

2009-Jun-04 19:05 UTC

Re: New experimental btrfs branch ready for testing

On Thu, Jun 04, 2009 at 02:02:20PM -0500, Steven Pratt wrote:
> So I started the performance runs on this. The base tests completed fine
> on the raid system and I will post results as soon as I can finish
> postprocessing, but when I tried to do nodatacow that machine it crashed
> pretty early. Here is console log:

Thanks Steve.  Just to clarify, which commit was the head of your git
tree when you ran these tests?

-chris

Chris Mason

2009-Jun-05 14:20 UTC

Re: New experimental btrfs branch ready for testing

On Thu, Jun 04, 2009 at 02:02:20PM -0500, Steven Pratt wrote:
> So I started the performance runs on this. The base tests completed fine
> on the raid system and I will post results as soon as I can finish
> postprocessing, but when I tried to do nodatacow that machine it crashed
> pretty early. Here is console log:

Hi Steve,

Thanks again for hammering on these.  Yan Zheng and I have both been
trying to reproduce problems with nodatacow and with the database random
write run.

But so far we haven't been able to trigger any crashes.  Do you see
anything in your config or setup that is unusual?

-chris

Steven Pratt

2009-Jun-05 16:02 UTC

Re: New experimental btrfs branch ready for testing

Chris Mason wrote:
> Hi Steve,
>
> Thanks again for hammering on these.  Yan Zheng and I have both been
> trying to reproduce problems with nodatacow and with the database random
> write run.

So now that the raid machine is actually up, I discovered it got further
than I thought on nodatacow. It did all the read tests, but appeared to
die on 16 thread random write (not odirect). There were no messages
logged to /var/log/messages at all. The last I saw was:

Jun  4 03:14:24 btrfs1 kernel: [65856.065491] btrfs: setting nodatacow
Jun  4 15:24:45 btrfs1 syslogd 1.4.1: restart.

Just dead until we rebooted the machine later that day.

> But, so far we haven't been able to trigger any crashes.    Do you see
> anything in your config or setup that is unusual?

No, other than using the old mkfs with the new format.  I've kicked off
new runs to see if I hit the same issues.

Steve

Steven Pratt

2009-Jun-05 21:27 UTC

Re: New experimental btrfs branch ready for testing

Steven Pratt wrote:
> So now that the raid machine is actually up, I discovered it got
> further than I thought on nodatacow. It did all the read tests, but
> appeared to died on 16 thread random write(not odirect). There were no
> messages logged to var/log/messages at all. Last I saw was :
>
> Jun  4 03:14:24 btrfs1 kernel: [65856.065491] btrfs: setting nodatacow
> Jun  4 15:24:45 btrfs1 syslogd 1.4.1: restart.
>
> Just dead until we rebooted machine later that day.

So the raid system completed the re-run of the nodatacow runs without
error, so I still have no idea what happened on this box the first time
around.  As for the single disk system, it died during the random write
test again, but it now looks like we might have a real HW failure.  This
time we see SCSI error messages.  I have replaced the test disks and
will try one more time.

The net is, I would hold off digging too much into this as even I don't
have any repeatable errors.

Steve

Chris Mason

2009-Jun-06 00:20 UTC

Re: New experimental btrfs branch ready for testing

On Fri, Jun 05, 2009 at 04:27:55PM -0500, Steven Pratt wrote:
> So the raid system complete the re-run of the nodatacow runs without
> error.  So still no idea what happened on this box the first time
> around.  As for the single disk system, it died during the random write
> test again, but it now looks like we might have a real HW failure.  This
> time we see SCSI error messages.  I have replaced the test disks and
> will try one more time.
>
> The net is, I would hold off digging too much into this as even I don't
> have any repeatable errors.

Thanks for rerunning all of this, appreciate the update.

-chris


Steven Pratt

2009-Jun-06 16:38 UTC

Re: New experimental btrfs branch ready for testing

Chris Mason wrote:
> On Fri, Jun 05, 2009 at 04:27:55PM -0500, Steven Pratt wrote:
>> Steven Pratt wrote:
>>> Chris Mason wrote:
>>>> On Thu, Jun 04, 2009 at 02:02:20PM -0500, Steven Pratt wrote:
>>>>> Chris Mason wrote:
>>>>>> Hello everyone,
>>>>>>
>>>>>> Yan Zheng has been doing some major surgery to the back references and
>>>>>> extent allocation code, tackling bottlenecks in the code that tracks
>>>>>> extents.  It scales better with many snapshots and performs better in
>>>>>> the common case of no snapshots at all.
>>>>>>
>>>>>> THE NEW CODE IS A FORWARD ROLLING DISK FORMAT CHANGE.  This means it is
>>>>>> compatible with the current btrfs disk format, but once you mount a
>>>>>> filesystem with the new code, it WILL NO LONGER BE MOUNTABLE FROM OLD
>>>>>> KERNELS.  Old kernels spit out an error message when you try them on new
>>>>>> format filesystems.
>>>>>>
>>>>>> This is a large change, and I'm hoping to have it stable in time for the
>>>>>> 2.6.31 merge window.  I've been testing it for about a week now, and
>>>>>> haven't been able to cause major problems yet.  But, testing the
>>>>>> compatibility with old format filesystems is the hard part, and
>>>>>> everyone that pulls the new code should backup their data first.
>>>>>>
>>>>>> I've setup git branches called newformat where you can pull the new
>>>>>> code.
>>>>>>
>>>>>> For the kernel (based on 2.6.30-rc7):
>>>>>>
>>>>>> git pull git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable.git newformat
>>>>>
>>>>> So I started the performance runs on this. The base tests completed
>>>>> fine on the raid system and I will post results as soon as I can
>>>>> finish postprocessing, but when I tried to do nodatacow that
>>>>> machine it crashed pretty early. Here is console log:
>>>>
>>>> Hi Steve,
>>>>
>>>> Thanks again for hammering on these.  Yan Zheng and I have both been
>>>> trying to reproduce problems with nodatacow and with the database random
>>>> write run.
>>>
>>> So now that the raid machine is actually up, I discovered it got
>>> further than I thought on nodatacow. It did all the read tests, but
>>> appeared to died on 16 thread random write(not odirect). There were no
>>> messages logged to var/log/messages at all. Last I saw was :
>>>
>>> Jun  4 03:14:24 btrfs1 kernel: [65856.065491] btrfs: setting nodatacow
>>> Jun  4 15:24:45 btrfs1 syslogd 1.4.1: restart.
>>>
>>> Just dead until we rebooted machine later that day.
>>
>> So the raid system complete the re-run of the nodatacow runs without
>> error.  So still no idea what happened on this box the first time
>> around.  As for the single disk system, it died during the random write
>> test again, but it now looks like we might have a real HW failure.  This
>> time we see SCSI error messages.  I have replaced the test disks and
>> will try one more time.
>>
>> The net is, I would hold off digging too much into this as even I don't
>> have any repeatable errors.
>
> Thanks for rerunning all of this, appreciate the update.

No problem.  Raid results are uploading to
http://btrfs.boxacle.net/repository/raid/history/History.html now.
There were massive improvements in the random write workloads,
especially with cow enabled!!  MailServer had moderate perf gains, but
dramatic decrease in CPU utilization, so this is very good as well.

The only regression I see is on large file creates, CPU is up 200% or 
more while performance is fairly flat.  btrfs_tree_lock now dominates 
the profile.

I am still having issues on the single disk system, which I am still not 
sure if it is btrfs or HW, but I am off on a family vacation tomorrow so 
it will have to wait for a week or so.

Steve
> -chris

Roy Sigurd Karlsbakk

2009-Jun-07 11:50 UTC

Re: New experimental btrfs branch ready for testing

On 1 June 2009, at 23:04, Chris Mason wrote:
> I've setup git branches called newformat where you can pull the new
> code.
>
> For the kernel (based on 2.6.30-rc7):
>
> git pull git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable.git newformat

# git pull git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable.git newformat
fatal: Not a git repository

> For the progs:
>
> git pull git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs-unstable.git newformat

# git pull git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs-unstable.git newformat
fatal: Not a git repository

Has this code been removed, or is it me doing something funny?

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum is presented
intelligibly. It is an elementary imperative for all pedagogues to
avoid excessive use of idioms of foreign origin. In most cases,
adequate and relevant synonyms exist in Norwegian.


Daniel Cordero

2009-Jun-07 12:13 UTC

Re: New experimental btrfs branch ready for testing

On Sun, Jun 07, 2009 at 01:50:27PM +0200, Roy Sigurd Karlsbakk wrote:
> On 1 June 2009, at 23:04, Chris Mason wrote:
>
>> I've setup git branches called newformat where you can pull the new
>> code.
>>
>> For the kernel (based on 2.6.30-rc7):
>>
>> git pull git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable.git newformat
>
> # git pull git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable.git newformat
> fatal: Not a git repository
>
>> For the progs:
>>
>> git pull git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs-unstable.git newformat
>
> # git pull git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs-unstable.git newformat
> fatal: Not a git repository
>
> Has this code been removed, or is it me doing something funny?

You're doing something funny.

I'm guessing you don't already have a copy of the btrfs repositories, so
you should be using clone instead of pull.
If you do have them, cd into them.
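
A minimal sketch of the clone-versus-pull difference described here. It
uses a throwaway local repository as a stand-in for the kernel.org URLs,
so it runs without network access; the directory names (upstream, empty,
btrfs-unstable) and the stand-in repository are assumptions made for
illustration only.

```shell
#!/bin/sh
# Reproduce "fatal: Not a git repository" and the clone-based fix against
# a local stand-in repository (hypothetical, not the real btrfs tree).
set -eu

work=$(cd "$(mktemp -d)" && pwd -P)
cd "$work"

# Stop git from discovering an unrelated repository above the temp dir.
GIT_CEILING_DIRECTORIES=$work
export GIT_CEILING_DIRECTORIES

# Stand-in upstream with a "newformat" branch.
git init -q upstream
git -C upstream -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "initial commit"
git -C upstream branch newformat

# 1. git pull from a directory that is not a repository fails.
mkdir empty
if (cd empty && git pull "$work/upstream" newformat) >/dev/null 2>&1; then
    pull_outside=ok
else
    pull_outside=failed
fi

# 2. git clone creates the repository; -b checks out the named branch.
git clone -q -b newformat "$work/upstream" btrfs-unstable

# 3. Inside an existing repository, git pull works.
cd btrfs-unstable
git pull -q "$work/upstream" newformat
current_branch=$(git rev-parse --abbrev-ref HEAD)

echo "pull outside a repository: $pull_outside"
echo "branch after clone: $current_branch"
```

With the real trees, the same pattern would be a git clone of the
repository URL first, followed by git pull of newformat from inside the
resulting directory.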

Yan Zheng

2009-Jun-08 12:33 UTC

Re: New experimental btrfs branch ready for testing

2009/6/2 Chris Mason <chris.mason@oracle.com>
>
> Hello everyone,
>
> Yan Zheng has been doing some major surgery to the back references and
> extent allocation code, tackling bottlenecks in the code that tracks
> extents.  It scales better with many snapshots and performs better in
> the common case of no snapshots at all.
>
> THE NEW CODE IS A FORWARD ROLLING DISK FORMAT CHANGE.  This means it is
> compatible with the current btrfs disk format, but once you mount a
> filesystem with the new code, it WILL NO LONGER BE MOUNTABLE FROM OLD
> KERNELS.  Old kernels spit out an error message when you try them on new
> format filesystems.
>
Hello, everyone

I have a minor disk format change for the new format. The change makes
snapshot dropping more efficient, and it only affects filesystems that
have been balanced. If you are testing the new format, please don't use
btrfs-vol -b or btrfs-vol -r. If you have already used btrfs-vol -b or
btrfs-vol -r, please back up your data.

Regards
Yan Zheng

Chris Mason

2009-Jun-09 15:26 UTC

Re: New experimental btrfs branch ready for testing

On Sat, Jun 06, 2009 at 11:38:45AM -0500, Steven Pratt wrote:
>
> No problem.  Raid results are uploading to
> http://btrfs.boxacle.net/repository/raid/history/History.html now.
> There were massive improvements in the random write workloads,
> especially with cow enabled!!  MailServer had moderate perf gains, but
> dramatic decrease in CPU utilization, so this is very good as well.
>
> The only regression I see is on large file creates, CPU is up 200% or
> more while performance is fairly flat.  btrfs_tree_lock now dominates
> the profile.

I'm not able to reproduce the btrfs_tree_lock usage that you're seeing.
Could you please use the callgraph option to oprofile?
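
For readers unfamiliar with the callgraph option, here is a rough sketch
of a call-graph session with the legacy opcontrol-based OProfile tools.
The vmlinux path, the depth value, and the workload placeholder are
assumptions, not settings taken from these benchmark runs. The script
only prints the commands unless RUN=1 is set, since running them needs
root and an installed oprofile.

```shell
#!/bin/sh
# Dry-run sketch of collecting call-graph samples with legacy OProfile.
# Set RUN=1 to actually execute the commands (requires root).
set -eu

VMLINUX=${VMLINUX:-/boot/vmlinux}  # assumed path to an uncompressed kernel image
DEPTH=${DEPTH:-8}                  # assumed call-graph depth

plan=""
run() {
    # Record and print the command; execute it only when RUN=1.
    plan="${plan}+ $*
"
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

run opcontrol --vmlinux="$VMLINUX" --callgraph="$DEPTH"
run opcontrol --start
run echo "run the large file create workload here"  # placeholder step
run opcontrol --dump
run opcontrol --shutdown
run opreport --callgraph
```

The opreport --callgraph output lists callers and callees around each
symbol, which is what helps show where btrfs_tree_lock is being called
from.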

-chris


Steven Pratt

2009-Jun-15 15:46 UTC

Re: New experimental btrfs branch ready for testing

Chris Mason wrote:
> On Sat, Jun 06, 2009 at 11:38:45AM -0500, Steven Pratt wrote:
>> No problem.  Raid results are uploading to
>> http://btrfs.boxacle.net/repository/raid/history/History.html now.
>> There were massive improvements in the random write workloads,
>> especially with cow enabled!!  MailServer had moderate perf gains, but
>> dramatic decrease in CPU utilization, so this is very good as well.
>>
>> The only regression I see is on large file creates, CPU is up 200% or
>> more while performance is fairly flat.  btrfs_tree_lock now dominates
>> the profile.
>
> I'm not able to reproduce the btrfs_tree_lock usage that you're seeing.
> Could you please use the callgraph option to oprofile?

Ok, back from vacation and have re-engaged my brain :-)  Was thinking I
would have to re-run this for you, but we already have callgraph data
for all the runs.  For the 128 thread create workload it is here:
http://btrfs.boxacle.net/repository/raid//2-6-30-rc7-newformat/btrfs-6-2-newformat/btrfs1.ffsb.large_file_creates__threads_0128.09-06-04_01.23.30/analysis/oprofile.breakout.001/oprofile-callgraph

Steve

> -chris