Josef Bacik
2012-Apr-09 15:53 UTC
[PATCH] Btrfs: use i_version instead of our own sequence
We've been keeping around the inode sequence number in hopes that somebody
would use it, but nobody uses it and people actually use i_version which
serves the same purpose, so use i_version where we used the incore inode's
sequence number and that way the sequence is updated properly across the
board, and not just in file write.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
---
 fs/btrfs/btrfs_inode.h   |    3 ---
 fs/btrfs/delayed-inode.c |    4 ++--
 fs/btrfs/file.c          |    1 -
 fs/btrfs/inode.c         |    5 ++---
 fs/btrfs/super.c         |    2 +-
 5 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
index 9b9b15f..3771b85 100644
--- a/fs/btrfs/btrfs_inode.h
+++ b/fs/btrfs/btrfs_inode.h
@@ -83,9 +83,6 @@ struct btrfs_inode {
 	 */
 	u64 generation;
 
-	/* sequence number for NFS changes */
-	u64 sequence;
-
 	/*
 	 * transid of the trans_handle that last modified this inode
 	 */
diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
index 03e3748..bcd40c7 100644
--- a/fs/btrfs/delayed-inode.c
+++ b/fs/btrfs/delayed-inode.c
@@ -1706,7 +1706,7 @@ static void fill_stack_inode_item(struct btrfs_trans_handle *trans,
 	btrfs_set_stack_inode_nbytes(inode_item, inode_get_bytes(inode));
 	btrfs_set_stack_inode_generation(inode_item,
 					 BTRFS_I(inode)->generation);
-	btrfs_set_stack_inode_sequence(inode_item, BTRFS_I(inode)->sequence);
+	btrfs_set_stack_inode_sequence(inode_item, inode->i_version);
 	btrfs_set_stack_inode_transid(inode_item, trans->transid);
 	btrfs_set_stack_inode_rdev(inode_item, inode->i_rdev);
 	btrfs_set_stack_inode_flags(inode_item, BTRFS_I(inode)->flags);
@@ -1754,7 +1754,7 @@ int btrfs_fill_inode(struct inode *inode, u32 *rdev)
 	set_nlink(inode, btrfs_stack_inode_nlink(inode_item));
 	inode_set_bytes(inode, btrfs_stack_inode_nbytes(inode_item));
 	BTRFS_I(inode)->generation = btrfs_stack_inode_generation(inode_item);
-	BTRFS_I(inode)->sequence = btrfs_stack_inode_sequence(inode_item);
+	inode->i_version = btrfs_stack_inode_sequence(inode_item);
 	inode->i_rdev = 0;
 	*rdev = btrfs_stack_inode_rdev(inode_item);
 	BTRFS_I(inode)->flags = btrfs_stack_inode_flags(inode_item);
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 431b565..f0da02b 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1404,7 +1404,6 @@ static ssize_t btrfs_file_aio_write(struct kiocb *iocb,
 		mutex_unlock(&inode->i_mutex);
 		goto out;
 	}
-	BTRFS_I(inode)->sequence++;
 
 	start_pos = round_down(pos, root->sectorsize);
 	if (start_pos > i_size_read(inode)) {
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 7a084fb..7d3dd2f 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2510,7 +2510,7 @@ static void btrfs_read_locked_inode(struct inode *inode)
 
 	inode_set_bytes(inode, btrfs_inode_nbytes(leaf, inode_item));
 	BTRFS_I(inode)->generation = btrfs_inode_generation(leaf, inode_item);
-	BTRFS_I(inode)->sequence = btrfs_inode_sequence(leaf, inode_item);
+	inode->i_version = btrfs_inode_sequence(leaf, inode_item);
 	inode->i_generation = BTRFS_I(inode)->generation;
 	inode->i_rdev = 0;
 	rdev = btrfs_inode_rdev(leaf, inode_item);
@@ -2594,7 +2594,7 @@ static void fill_inode_item(struct btrfs_trans_handle *trans,
 
 	btrfs_set_inode_nbytes(leaf, item, inode_get_bytes(inode));
 	btrfs_set_inode_generation(leaf, item, BTRFS_I(inode)->generation);
-	btrfs_set_inode_sequence(leaf, item, BTRFS_I(inode)->sequence);
+	btrfs_set_inode_sequence(leaf, item, inode->i_version);
 	btrfs_set_inode_transid(leaf, item, trans->transid);
 	btrfs_set_inode_rdev(leaf, item, inode->i_rdev);
 	btrfs_set_inode_flags(leaf, item, BTRFS_I(inode)->flags);
@@ -6884,7 +6884,6 @@ struct inode *btrfs_alloc_inode(struct super_block *sb)
 	ei->root = NULL;
 	ei->space_info = NULL;
 	ei->generation = 0;
-	ei->sequence = 0;
 	ei->last_trans = 0;
 	ei->last_sub_trans = 0;
 	ei->logged_trans = 0;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 54e7ee9..ee1bb31 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -770,7 +770,7 @@ static int btrfs_fill_super(struct super_block *sb,
 #ifdef CONFIG_BTRFS_FS_POSIX_ACL
 	sb->s_flags |= MS_POSIXACL;
 #endif
-
+	sb->s_flags |= MS_I_VERSION;
 	err = open_ctree(sb, fs_devices, (char *)data);
 	if (err) {
 		printk("btrfs: open_ctree failed\n");
-- 
1.7.5.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Kasatkin, Dmitry
2012-Apr-12 13:22 UTC
Re: [PATCH] Btrfs: use i_version instead of our own sequence
On Mon, Apr 9, 2012 at 6:53 PM, Josef Bacik <josef@redhat.com> wrote:
> We've been keeping around the inode sequence number in hopes that somebody
> would use it, but nobody uses it and people actually use i_version which
> serves the same purpose, so use i_version where we used the incore inode's
> sequence number and that way the sequence is updated properly across the
> board, and not just in file write.  Thanks,
>
> Signed-off-by: Josef Bacik <josef@redhat.com>

[snip patch]

BTW.
1. where is BTRFS devel git tree?
2. when this is coming to mainline?

- Dmitry
Josef Bacik
2012-Apr-12 13:31 UTC
Re: [PATCH] Btrfs: use i_version instead of our own sequence
On Thu, Apr 12, 2012 at 04:22:26PM +0300, Kasatkin, Dmitry wrote:
> On Mon, Apr 9, 2012 at 6:53 PM, Josef Bacik <josef@redhat.com> wrote:
> > We've been keeping around the inode sequence number in hopes that somebody
> > would use it, but nobody uses it and people actually use i_version which
> > serves the same purpose, so use i_version where we used the incore inode's
> > sequence number and that way the sequence is updated properly across the
> > board, and not just in file write.  Thanks,

[snip patch]

> BTW.
> 1. where is BTRFS devel git tree?
> 2. when this is coming to mainline?
>

There's a bunch, my personal tree with just my patches is here

git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-work.git

a tree with all outstanding mailinglist patches is here

git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next.git

and Chris's tree which is where all things go through to get to mainline is
here

git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git

It will probably be in the next merge window.  Thanks,

Josef
Duncan
2012-Apr-12 21:41 UTC
Wiki update request: source repo page Was: [PATCH] Btrfs: use i_version instead of our own sequence
Josef Bacik posted on Thu, 12 Apr 2012 09:31:07 -0400 as excerpted:

>> BTW.
>> 1. where is BTRFS devel git tree?
>> 2. when this is coming to mainline?
>>
> There's a bunch, my personal tree with just my patches is here
>
> git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-work.git
>
> a tree with all outstanding mailinglist patches is here
>
> git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next.git
>
> and Chris's tree which is where all things go through to get to mainline
> is here
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git
>
> It will probably be in the next merge window.  Thanks,

Could this list be added to the btrfs wiki, source repositories page?

http://btrfs.ipv5.de/index.php?title=Btrfs_source_repositories

While there, please review the dkms information:

0) At least a paragraph actually describing what dkms is/does would be
extremely useful.  A link to another page on the topic or to an external
dkms resource for more information is probably in order as well.

1) Near the top of the dkms section, under "You have a very recent
kernel", the for instance says dkms doesn't work with recent kernels, but
then backporting is mentioned.  So you want to use it if you have a very
recent kernel, but it doesn't work with recent kernels and backporting is
needed?  WTF?

2) Is Chris's tree STILL based on old 2.6.32 without further updates
except to btrfs?  If so, the link to it in the earlier btrfs kernel
module git repository section should probably have a BIG WARNING TO THAT
EFFECT, instead of simply saying it downloads a complete Linux kernel
tree.

3) Further down there's a step that says Patch version script, noting
2.6.27, which is older still.  Has cmason merged that patch?

4) The instructions appear to assume a kernel module and an initr*-based
setup.  What about people who configure and build a custom monolithic
kernel, with module loading disabled?

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
Hugo Mills
2012-Apr-12 21:55 UTC
Re: Wiki update request: source repo page Was: [PATCH] Btrfs: use i_version instead of our own sequence
On Thu, Apr 12, 2012 at 09:41:17PM +0000, Duncan wrote:
> Josef Bacik posted on Thu, 12 Apr 2012 09:31:07 -0400 as excerpted:
>
> >> BTW.
> >> 1. where is BTRFS devel git tree?
> >> 2. when this is coming to mainline?
> >>
> > There's a bunch, my personal tree with just my patches is here
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-work.git
> >
> > a tree with all outstanding mailinglist patches is here
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next.git
> >
> > and Chris's tree which is where all things go through to get to mainline
> > is here
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git
> >
> > It will probably be in the next merge window.  Thanks,
>
> Could this list be added to the btrfs wiki, source repositories page?

   Well, it _is_ a wiki... Knock yourself out.

> http://btrfs.ipv5.de/index.php?title=Btrfs_source_repositories
>
> While there, please review the dkms information:

[snip dkms]

> 2) Is Chris's tree STILL based on old 2.6.32 without further updates
> except to btrfs?  If so, the link to it in the earlier btrfs kernel
> module git repository section should probably have a BIG WARNING TO THAT
> EFFECT, instead of simply saying it downloads a complete Linux kernel
> tree.

   No, it's generally based on some recent linux-kernel (usually not
more than one revision out). Possibly some instructions on using git
merge to combine Chris's tree with Linus's would be useful (although I
think it's generally assumed that if you're using git to pull some
arbitrary repo to build from that you know how to drive git to that
degree anyway).

> 3) Further down there's a step that says Patch version script, noting
> 2.6.27, which is older still.  Has cmason merged that patch?
>
> 4) The instructions appear to assume a kernel module and an initr*-based
> setup.  What about people who configure and build a custom monolithic
> kernel, with module loading disabled?

   Then in general, they're stuffed.

   If you want to mount a multi-device filesystem, you have to run
btrfs dev scan before it's mounted. If that filesystem is your root
filesystem, then you have to do it before root is mounted. This
requires an initramfs/initrd.

   It is possible to supply a full list of explicit device names for
the root FS to the kernel at boot time with the device= mount
parameter, but this is unreliable at best. We certainly had a very
hard time getting it to work last time the issue came up on IRC.

   The general advice is -- use a single-device root filesystem, or an
initramfs. These are simple, supported, and will generally get good
help. Any other configuration will cause you to be told to use an
initramfs. So far, I've not heard any concrete reason why one
shouldn't be used except "ooh, I don't understand them, and they're
scary!".

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
             --- This year, I'm giving up Lent. ---
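The contrast drawn above between finding devices by scanning and listing them by name can be illustrated with a toy model. This is a sketch under stated assumptions, not how btrfs or the kernel implement device discovery: model_dev, model_scan and model_by_name are hypothetical names, and the fsid string merely stands in for whatever identifier a device's superblock carries. The scan-style lookup matches on what is stored on the device, so it is unaffected when names shift between boots; the name-style lookup (in the spirit of device=) finds fewer members once the names move.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of a block device: a name that may differ between
 * boots, and a filesystem ID stored on the device that does not.
 * An empty fsid means "not part of any filesystem we care about". */
struct model_dev {
	const char *name;
	const char *fsid;
};

/* Scan-style assembly: look at every device and count those whose
 * stored ID matches, whatever they happen to be called this boot. */
int model_scan(const struct model_dev *devs, int ndevs, const char *fsid)
{
	int found = 0;

	assert(devs != NULL && fsid != NULL);
	for (int i = 0; i < ndevs; i++)
		if (strcmp(devs[i].fsid, fsid) == 0)
			found++;
	return found;
}

/* Name-style assembly: only consider the listed names, and count those
 * that exist and really belong to the filesystem. */
int model_by_name(const struct model_dev *devs, int ndevs, const char *fsid,
		  const char *const *names, int nnames)
{
	int found = 0;

	assert(devs != NULL && fsid != NULL && names != NULL);
	for (int n = 0; n < nnames; n++)
		for (int i = 0; i < ndevs; i++)
			if (strcmp(devs[i].name, names[n]) == 0 &&
			    strcmp(devs[i].fsid, fsid) == 0)
				found++;
	return found;
}
```

Feeding the model two boots of the same two-device filesystem, one where the members are named sda and sdb and one where an unrelated disk has taken the name sda, the scan finds both members each time, while a fixed name list of sda and sdb finds only one of them on the second boot.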
Duncan
2012-Apr-12 22:56 UTC
Re: Wiki update request: source repo page Was: [PATCH] Btrfs: use i_version instead of our own sequence
Hugo Mills posted on Thu, 12 Apr 2012 22:55:46 +0100 as excerpted:

> On Thu, Apr 12, 2012 at 09:41:17PM +0000, Duncan wrote:

>> 4) The instructions appear to assume a kernel module and an initr*-based
>> setup.  What about people who configure and build a custom monolithic
>> kernel, with module loading disabled?
>
> Then in general, they're stuffed.
>
> If you want to mount a multi-device filesystem, you have to run
> btrfs dev scan before it's mounted. If that filesystem is your root
> filesystem, then you have to do it before root is mounted. This requires
> an initramfs/initrd.
>
> It is possible to supply a full list of explicit device names for
> the root FS to the kernel at boot time with the device= mount parameter,
> but this is unreliable at best. We certainly had a very hard time
> getting it to work last time the issue came up on IRC.
>
> The general advice is -- use a single-device root filesystem, or an
> initramfs. These are simple, supported, and will generally get good
> help. Any other configuration will cause you to be told to use an
> initramfs. So far, I've not heard any concrete reason why one shouldn't
> be used except "ooh, I don't understand them, and they're scary!".

FWIW, device names appear to be reasonably stable, here.  Stable enough
that I currently have this built into the kernel as part of my kernel
command line:

md=3,/dev/sda6,/dev/sdb6,/dev/sdc6,/dev/sdd6 root=/dev/md3p1

When I need to override that to mount the primary backup/recovery root,
this as part of grub2's linux line extends/overrides the kernel builtin:

md=9,/dev/sda12,/dev/sdb12,/dev/sdc12,/dev/sdd12 root=/dev/md9p1

When I boot from thumbdrive or otherwise might trigger device reordering,
grub's interactivity allows me to find the correct mds and substitute
device names as appropriate.  And yes, if you're wondering,
init=/bin/bash is tested and known to work, too. =:^)

I don't see why btrfs would have additional kernel device naming or
finding problems that md doesn't already have.

So while I'd agree that multi-device noinitr* btrfs builtin might not be
appropriate as a general distro-wide solution, it does seem quite
reasonable (here) for sysadmins familiar enough with their own systems to
have custom-built no-module-loading kernels in general, to be able to do
the same with btrfs.

That's one of a couple reasons I don't use lvm2, as well.  Both lvm2 and
an initr* add complexity and thus recovery failure risk due to admin
fat-fingering or failure to anticipate and test all permutations of
failure mode, for little or no gain in my current deployments.  Because
lvm2 requires an initr* to handle root, it's TWO such layers of additional
complexity to test the failure modes for and be prepared to deal with at
recovery time.  The added complexity and risk is simply not a reasonable
tradeoff, for me, and I sleep better with a tested confidence in my
disaster recovery abilities. =:^)
Hugo Mills
2012-Apr-13 13:16 UTC
Re: Wiki update request: source repo page Was: [PATCH] Btrfs: use i_version instead of our own sequence
On Thu, Apr 12, 2012 at 10:56:00PM +0000, Duncan wrote:
> Hugo Mills posted on Thu, 12 Apr 2012 22:55:46 +0100 as excerpted:
> > The general advice is -- use a single-device root filesystem, or an
> > initramfs. These are simple, supported, and will generally get good
> > help. Any other configuration will cause you to be told to use an
> > initramfs. So far, I've not heard any concrete reason why one shouldn't
> > be used except "ooh, I don't understand them, and they're scary!".
>
> FWIW, device names appear to be reasonably stable, here.  Stable enough
> that I currently have this built into the kernel as part of my kernel
> command line:

   That's all well and good for you, but my point is that in the
general case, device names are *not* stable, and the kernel does *not*
claim to guarantee this. If you assume that they are stable, and your
system breaks as a result, then you get to keep both pieces. Your
choice, but your risk as well. The recommendation for a stable
reliable system is to run btrfs dev scan before mounting a btrfs
filesystem.

> md=3,/dev/sda6,/dev/sdb6,/dev/sdc6,/dev/sdd6 root=/dev/md3p1
>
> When I need to override that to mount the primary backup/recovery root,
> this as part of grub2's linux line extends/overrides the kernel builtin:
>
> md=9,/dev/sda12,/dev/sdb12,/dev/sdc12,/dev/sdd12 root=/dev/md9p1

   So you're doing the same thing as btrfs's "device=" mount option.
Again, if this works, all well and good, but it'll break if devices
move around, requiring the manual step. If you want to avoid that and
have it all Just Work, use an initrd (for both MD and btrfs).

> When I boot from thumbdrive or otherwise might trigger device reordering,
> grub's interactivity allows me to find the correct mds and substitute
> device names as appropriate.  And yes, if you're wondering,
> init=/bin/bash is tested and known to work, too. =:^)
>
> I don't see why btrfs would have additional kernel device naming or
> finding problems that md doesn't already have.

   As far as I recall, MD also requires userspace help in order to
reassemble a device correctly, except when v0.9 superblocks are used
(which have limitations on the set of other features they support).

> So while I'd agree that multi-device noinitr* btrfs builtin might not be
> appropriate as a general distro-wide solution, it does seem quite
> reasonable (here) for sysadmins familiar enough with their own systems to
> have custom-built no-module-loading kernels in general, to be able to do
> the same with btrfs.

   I disagree. You could conceivably get the kernel to scan every
single block device it knows about as the device is discovered, but
then the kernel would take unnecessarily long to boot (consider
something with a stack of DVD drives -- you'd have to wait for each
one to spin up before scanning it). If you were to restrict the set of
devices scanned, you're putting policy into the kernel, which would
get rejected pretty much instantly by anyone upstream.

> That's one of a couple reasons I don't use lvm2, as well.  Both lvm2 and
> an initr* add complexity and thus recovery failure risk due to admin
> fat-fingering or failure to anticipate and test all permutations of
> failure mode, for little or no gain in my current deployments.

   Again, that's good for you, but all such attempts should be viewed
as unreliable workarounds, not as recommended and supported
approaches.

> Because lvm2 requires an initr* to handle root, it's TWO such
> layers of additional complexity to test the failure modes for and be
> prepared to deal with at recovery time.  The added complexity and
> risk is simply not a reasonable tradeoff, for me, and I sleep better
> with a tested confidence in my disaster recovery abilities. =:^)

   I have had my initrd (for root on LVM on MD) fail precisely once in
8 years or so, and that was down to my upgrading some LVM tools
manually and not rebuilding the initrd. If you stick to your
distribution's packages (at least for boot-critical things), then I
would think it highly unlikely you'll end up with a system that fails
to boot. If you muck around with it manually and it breaks, then I
would suggest you treat it as a learning experience.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- Our so-called leaders speak/with words they try to jail ya/ ---
        They subjugate the meek/but it's the rhetoric of failure.
Duncan
2012-Apr-13 18:48 UTC
Re: Wiki update request: source repo page Was: [PATCH] Btrfs: use i_version instead of our own sequence
Hugo Mills posted on Fri, 13 Apr 2012 14:16:32 +0100 as excerpted:> On Thu, Apr 12, 2012 at 10:56:00PM +0000, Duncan wrote: >> Hugo Mills posted on Thu, 12 Apr 2012 22:55:46 +0100 as excerpted: >> > The general advice is -- use a single-device root filesystem, or an >> > initramfs. These are simple, supported, and will generally get good >> > help. Any other configuration will cause you to be told to use an >> > initramfs. So far, I''ve not heard any concrete reason why one >> > shouldn''t be used except "ooh, I don''t understand them, and they''re >> > scary!". >> >> FWIW, device names appear to be reasonably stable, here. Stable enough >> that I currently have this built into the kernel as part of my kernel >> command line: > > That''s all well and good for you, but my point is that in the > general case, device names are *not* stable, and the kernel does *not* > claim to guarantee this. If you assume that they are stable, and your > system breaks as a result, then you get to keep both pieces. Your > choice, but your risk as well. The recommendation for a stable reliable > system is to run btrfs dev scan before mounting a btrfs filesystem.BTW, I don''t believe I''ve thanked you for the replies, yet. Thanks, and I don''t disagree with the rest. I guess I generally agree here, too... with /generally/ stressed. But the wiki should cover more than the default, "general" case. Meanwhile [kernel command line] ...>> md=3,/dev/sda6,/dev/sdb6,/dev/sdc6,/dev/sdd6 root=/dev/md3p1> So you''re doing the same thing as btrfs''s "device=" mount option.Same general thing, yes.> Again, if this works, all well and good, but it''ll break if devices move > around, requiring the manual step. If you want to avoid that and have it > all Just Work, use an initrd (for both MD and btrfs).Obviously, for systems where it all moves around at every boot, this isn''t going to be a very useful alternative, I''ll agree. But that''s not the case on a lot of hardware. 
And the trouble with "just works" isn''t when it does INDEED "just work", but when it doesn''t. In that case, the admin needs to be comfortable enough with the system and his understanding of it to get things back into operational condition. The more layers of unnecessary complexity (like a shouldn''t be necessary initr*) there are, the more difficult that "grok the system well enough to be reasonably confident in one''s ability to restore it" status is to achieve, and the higher the risk of failure, should the admin''s disaster recovery skills actually be needed.> As far as I recall, MD also requires userspace help in order to > reassemble a device correctly, except when v0.9 superblocks are used > (which have limitations on the set of other features they support).I''ve read that is true, altho I''ve not tested it, I''ve simply gone with 0.90 superblocks, because here, the cost of 0.90 is rather low compared to the benefits of no-userspace-required assembly from kernel commandline and detected data. If btrfs (or md) should lose that ability, it''d be a shame.>> So while I''d agree that multi-device noinitr* btrfs builtin might not >> be appropriate as a general distro-wide solution, it does seem quite >> reasonable (here) for sysadmins familiar enough with their own systems >> to have custom-built no-module-loading kernels in general, to be able >> to do the same with btrfs. > > I disagree. You could conceivably get the kernel to scan every > single block device it knows about as the device is discovered, but then > the kernel would take unnecessarily long to boot (consider something > with a stack of DVD drives -- you''d have to wait for each one to spin up > before scanning it). If you were to restrict the set of devices scanned, > you''re putting policy into the kernel, which would get rejected pretty > much instantly by anyone upstream.No need to have the kernel scan everything (certainly not by default) OR limit policy... while still retaining flexibility. 
Simply keep the ability to have it specified on the kernel commandline, for systems with a stable enough device layout that it works. Actually, with a flexible enough bootloader, and grub2 certainly seems at least close to that already, as with its modules it already supports both mdraid and lvm2 as well as btrfs (and apparently zfs as well), any scanning ability as well as site configuration and policy, could be and already is to some degree, built into the bootloader's modules, as long as the kernel commandline (or other similar mechanism) remains available to handle the necessary data passed to it from the bootloader.

> I have had my initrd (for root on LVM on MD) fail precisely once in
> 8 years or so, and that was down to my upgrading some LVM tools manually
> and not rebuilding the initrd. If you stick to your distribution's
> packages (at least for boot-critical things), then I would think it
> highly unlikely you'll end up with a system that fails to boot. If you
> muck around with it manually and it breaks, then I would suggest you
> treat it as a learning experience.

Stick to a distro's packages?

How are people only running distro packages supposed to run the patches they're asked to test on the list, talk the distro package maintainer into including possibly one-shot debugging patches into a new general rollout?

True, some distros are packaging and even supporting btrfs already, and some users, their own admins, are reckless enough to dive right in, without checking project status, or the wiki, or the list... but that can hardly be the /target/ audience for btrfs at this point, right?

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
cwillu
2012-Apr-13 20:03 UTC
Re: Wiki update request: source repo page Was: [PATCH] Btrfs: use i_version instead of our own sequence
On Fri, Apr 13, 2012 at 12:48 PM, Duncan <1i5t5.duncan@cox.net> wrote:
> Hugo Mills posted on Fri, 13 Apr 2012 14:16:32 +0100 as excerpted:
>
>> On Thu, Apr 12, 2012 at 10:56:00PM +0000, Duncan wrote:
>>> Hugo Mills posted on Thu, 12 Apr 2012 22:55:46 +0100 as excerpted:
>>> > The general advice is -- use a single-device root filesystem, or an
>>> > initramfs. These are simple, supported, and will generally get good
>>> > help. Any other configuration will cause you to be told to use an
>>> > initramfs. So far, I've not heard any concrete reason why one
>>> > shouldn't be used except "ooh, I don't understand them, and they're
>>> > scary!".
>>>
>>> FWIW, device names appear to be reasonably stable, here. Stable enough
>>> that I currently have this built into the kernel as part of my kernel
>>> command line:
>>
>> That's all well and good for you, but my point is that in the
>> general case, device names are *not* stable, and the kernel does *not*
>> claim to guarantee this. If you assume that they are stable, and your
>> system breaks as a result, then you get to keep both pieces. Your
>> choice, but your risk as well. The recommendation for a stable reliable
>> system is to run btrfs dev scan before mounting a btrfs filesystem.
>
> BTW, I don't believe I've thanked you for the replies, yet. Thanks, and
> I don't disagree with the rest.
>
> I guess I generally agree here, too... with /generally/ stressed. But
> the wiki should cover more than the default, "general" case.
>
> Meanwhile [kernel command line] ...
>
>>> md=3,/dev/sda6,/dev/sdb6,/dev/sdc6,/dev/sdd6 root=/dev/md3p1
>
>> So you're doing the same thing as btrfs's "device=" mount option.
>
> Same general thing, yes.
>
>> Again, if this works, all well and good, but it'll break if devices move
>> around, requiring the manual step. If you want to avoid that and have it
>> all Just Work, use an initrd (for both MD and btrfs).
>
> Obviously, for systems where it all moves around at every boot, this
> isn't going to be a very useful alternative, I'll agree. But that's not
> the case on a lot of hardware.
>
> And the trouble with "just works" isn't when it does INDEED "just work",
> but when it doesn't. In that case, the admin needs to be comfortable
> enough with the system and his understanding of it to get things back
> into operational condition. The more layers of unnecessary complexity
> (like a shouldn't be necessary initr*) there are, the more difficult that
> "grok the system well enough to be reasonably confident in one's ability
> to restore it" status is to achieve, and the higher the risk of failure,
> should the admin's disaster recovery skills actually be needed.

The fact is that you _don't_ know that your device names aren't going to change one day, any more than a generation of developers who only worked with ext3 knew that "fsync is expensive and all you really need is the atomic guarantee of mv".

[snip]

> Stick to a distro's packages?
>
> How are people only running distro packages supposed to run the patches
> they're asked to test on the list, talk the distro package maintainer
> into including possibly one-shot debugging patches into a new general
> rollout?

On Debian derivatives at least, creating a distro-packaged kernel from a git checkout is a single, fairly short command. And the logic to create the initramfs is separate from that, triggered when a kernel is installed (and it works even if you just installed a kernel by copying the bzImage to /boot). I'd be surprised if it was much different on any other distro.

Everyone needs to start somewhere, but it's not unreasonable to expect an admin to understand how their distro does things, and where to find answers when they don't know, and to verify that their knowledge is correct. The middle of a disaster recovery should not be the first time you've tried a recovery.
Duncan
2012-Apr-13 21:55 UTC
Re: Wiki update request: source repo page Was: [PATCH] Btrfs: use i_version instead of our own sequence
cwillu posted on Fri, 13 Apr 2012 14:03:38 -0600 as excerpted:

> The fact is that you _don't_ know that your device names aren't going to
> change one day, any more than a generation of developers who only worked
> with ext3 knew that "fsync is expensive and all you really need is the
> atomic guarantee of mv".

What I /do/ know is that such changes will be due to either (1) kernel upgrades or (2) hardware changes (planned upgrades or unpredicted failure). Both factors can be mitigated with redundant layers of fallback and dynamic reconfiguration. Regardless of what brings those device-name changes, a fallback to an earlier kernel, or a different device, or a bit of manual detection of the new device location at the grub commandline, with the grub-passed kernel command line altered accordingly so I can boot and make the necessary permanent config changes, is all it'll take to fix that.

And as an admin working with stuff directly within my reach to fix when necessary, unlike that generation of developers on ext3 with expensive fsync, once it's out of my reach to directly manipulate for a fix, it's no longer something I need to worry about.

Granted, btrfs and distro devs have to worry about products out of their direct reach to fix, but that doesn't mean they have to take the tools away (or even simply hide them) from those that can and do put them to use.

> Everyone needs to start somewhere, but it's not unreasonable to expect
> an admin to understand how their distro does things, and where to find
> answers when they don't know, and to verify that their knowledge is
> correct. The middle of a disaster recovery should not be the first time
> you've tried a recovery.

Agreed. That's why a prudent admin continually scans the radar for incoming, as well as having tested fallbacks for the unexpected.
Either that, or as I was at one point, they don't have the knowledge or experience yet, but are willing to risk loss of data and a new install, in order to get that knowledge and experience. I remember being in that group myself, and to a certain extent, I'm still a part of it. After all, that level of risk and the knowledge and experience gained from it is part of the pull of pre-releases, live-git kernels, and experimental filesystems, in the first place. But that doesn't mean I don't try to control that risk by building on knowledge and experience I already have to limit the risk stack at other levels.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman