Joel Becker
2008-Dec-10 01:09 UTC
[Ocfs2-devel] [PATCH 0/18] ocfs2: Add metadata ECC - almost there.
These patches add the metadata ECC code to ocfs2. They have one limitation - they do not ECC directory blocks yet. However, that will only take a simple insertion of the dirblock trailer. With xattr ECC completed, the patches are approaching a final form and are ready for initial review. They are atop my xattr-buckets-2 branch. The first patch adds buffer triggers to JDB2. This allows us to only calculate the ECC code when JBD2 is about to write a block to disk. Next, we add the actual blockcheck functions. These are the functions that compute the checksum (using the kernel's crc32_le()) and the ECC bits (this is hamming code, implemented in blockcheck.c). Then we create metadata-type specific ocfs2_journal_access_*() functions to make sure each metadata type uses its own checksum field. These are hooked up to the regular metadata types. We add ecc calculation to xattr buckets. Because buckets can be multiple blocks, they cannot use journal triggers. However, they are rarely updated, so the loss of efficiency isn't very important. Last, we refactor the xattr value setting code to make sure we know which metadata type an xattr value belongs to. That way we can call the appropriate ocfs2_journal_access_*() function. The code is also available in my repository. [View] http://oss.oracle.com/git/?p=jlbec/linux-2.6.git;a=shortlog;h=metaecc [Pull] git://oss.oracle.com/git/jlbec/linux-2.6.git metaecc Joel Becker (18): jbd2: Add buffer triggers ocfs2: Add the on-disk structures for metadata checksums. ocfs2: Add the underlying blockcheck code. ocfs2: Add a validation hook for quota block reads. ocfs2: block read meta ecc. ocfs2: Add journal_access functions with jbd2 triggers. ocfs2: Wrap up the common use cases of ocfs2_new_path(). ocfs2: Use metadata-specific ocfs2_journal_access_*() functions. ocfs2: Add ecc and checksums to ocfs2 xattr buckets. ocfs2: Create ocfs2_xattr_value_buf. ocfs2: Pull ocfs2_xattr_value_buf up from __ocfs2_remove_xattr_range(). ocfs2: Pull ocfs2_xattr_value_buf up into ocfs2_xattr_value_truncate(). ocfs2: Pass ocfs2_xattr_value_buf into ocfs2_xattr_value_truncate(). ocfs2: Pass value buf to ocfs2_xattr_update_entry(). ocfs2: Use ocfs2_xattr_value_buf in ocfs2_xattr_set_entry(). ocfs2: Pass value buf to ocfs2_remove_value_outside(). ocfs2: Use proper journal_access function in xattr.c ocfs2: Enable metadata checksums.
Filesystems often to do compute intensive operation on some metadata. If this operation is repeated many times, it can be very expensive. It would be much nicer if the operation could be performed once before a buffer goes to disk. This adds triggers to jbd2 buffer heads. Just before writing a metadata buffer to the journal, jbd2 will optionally call a commit trigger associated with the buffer. If the journal is aborted, an abort trigger will be called on any dirty buffers as they are dropped from pending transactions. ocfs2 will use this feature. Initially I tried to come up with a more generic trigger that could be used for non-buffer-related events like transaction completion. It doesn't tie nicely, because the information a buffer trigger needs (specific to a journal_head) isn't the same as what a transaction trigger needs (specific to a tranaction_t or perhaps journal_t). So I implemented a buffer set, with the understanding that journal/transaction wide triggers should be implemented separately. There is only one trigger set allowed per buffer. I can't think of any reason to attach more than one set. Contrast this with a journal or transaction in which multiple places may want to watch the entire transaction separately. The trigger sets are considered static allocation from the jbd2 perspective. ocfs2 will just have one trigger set per block type, setting the same set on every bh of the same type. Signed-off-by: Joel Becker <joel.becker at oracle.com> Cc: "Theodore Ts'o" <tytso at mit.edu> Cc: <linux-ext4 at vger.kernel.org> --- fs/jbd2/commit.c | 1 + fs/jbd2/journal.c | 8 ++++++++ fs/jbd2/transaction.c | 42 ++++++++++++++++++++++++++++++++++++++++++ include/linux/jbd2.h | 29 +++++++++++++++++++++++++++++ include/linux/journal-head.h | 5 +++++ 5 files changed, 85 insertions(+), 0 deletions(-) diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c index ebc667b..2ee799a 100644 --- a/fs/jbd2/commit.c +++ b/fs/jbd2/commit.c @@ -509,6 +509,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) if (is_journal_aborted(journal)) { clear_buffer_jbddirty(jh2bh(jh)); JBUFFER_TRACE(jh, "journal is aborting: refile"); + jbd2_buffer_abort_trigger(jh); jbd2_journal_refile_buffer(journal, jh); /* If that was the last one, we need to clean up * any descriptor buffers which may have been diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c index e70d657..c5e8935 100644 --- a/fs/jbd2/journal.c +++ b/fs/jbd2/journal.c @@ -50,6 +50,7 @@ EXPORT_SYMBOL(jbd2_journal_unlock_updates); EXPORT_SYMBOL(jbd2_journal_get_write_access); EXPORT_SYMBOL(jbd2_journal_get_create_access); EXPORT_SYMBOL(jbd2_journal_get_undo_access); +EXPORT_SYMBOL(jbd2_journal_set_triggers); EXPORT_SYMBOL(jbd2_journal_dirty_metadata); EXPORT_SYMBOL(jbd2_journal_release_buffer); EXPORT_SYMBOL(jbd2_journal_forget); @@ -321,6 +322,13 @@ repeat: mapped_data = kmap_atomic(new_page, KM_USER0); /* + * Fire any commit trigger. Do this before checking for escaping, + * as the trigger may modify the magic offset. If a copy-out + * happens afterwards, it will have the correct data in the buffer. + */ + jbd2_buffer_commit_trigger(jh_in, mapped_data + new_offset); + + /* * Check for escaping */ if (*((__be32 *)(mapped_data + new_offset)) =diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c index 39b7805..648624c 100644 --- a/fs/jbd2/transaction.c +++ b/fs/jbd2/transaction.c @@ -944,6 +944,48 @@ out: } /** + * void jbd2_journal_set_triggers() - Add triggers for commit writeout + * @bh: buffer to trigger on + * @type: struct jbd2_buffer_trigger_type containing the trigger(s). + * + * Set any triggers on this journal_head. + * + * Call with NULL to clear the triggers. + */ +void jbd2_journal_set_triggers(struct buffer_head *bh, + struct jbd2_buffer_trigger_type *type) +{ + struct journal_head *jh = bh2jh(bh); + + if (jh->b_triggers && type) { + WARN_ON_ONCE(jh->b_triggers != type); + return; + } + + jh->b_triggers = type; +} + +void jbd2_buffer_commit_trigger(struct journal_head *jh, void *mapped_data) +{ + struct buffer_head *bh = jh2bh(jh); + + if (!jh->b_triggers || !jh->b_triggers->t_commit) + return; + + jh->b_triggers->t_commit(jh->b_triggers, bh, mapped_data, bh->b_size); +} + +void jbd2_buffer_abort_trigger(struct journal_head *jh) +{ + if (!jh->b_triggers || !jh->b_triggers->t_abort) + return; + + jh->b_triggers->t_abort(jh->b_triggers, jh2bh(jh)); +} + + + +/** * int jbd2_journal_dirty_metadata() - mark a buffer as containing dirty metadata * @handle: transaction to add buffer to. * @bh: buffer to mark diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h index f366457..493d67a 100644 --- a/include/linux/jbd2.h +++ b/include/linux/jbd2.h @@ -1008,6 +1008,33 @@ int __jbd2_journal_clean_checkpoint_list(journal_t *journal); int __jbd2_journal_remove_checkpoint(struct journal_head *); void __jbd2_journal_insert_checkpoint(struct journal_head *, transaction_t *); + +/* + * Triggers + */ + +struct jbd2_buffer_trigger_type { + /* + * Fired just before a buffer is written to the journal. + * mapped_data is a mapped buffer that is the frozen data for + * commit. + */ + void (*t_commit)(struct jbd2_buffer_trigger_type *type, + struct buffer_head *bh, void *mapped_data, + size_t size); + + /* + * Fired during journal abort for dirty buffers that will not be + * committed. + */ + void (*t_abort)(struct jbd2_buffer_trigger_type *type, + struct buffer_head *bh); +}; + +extern void jbd2_buffer_commit_trigger(struct journal_head *jh, + void *mapped_data); +extern void jbd2_buffer_abort_trigger(struct journal_head *jh); + /* Buffer IO */ extern int jbd2_journal_write_metadata_buffer(transaction_t *transaction, @@ -1046,6 +1073,8 @@ extern int jbd2_journal_extend (handle_t *, int nblocks); extern int jbd2_journal_get_write_access(handle_t *, struct buffer_head *); extern int jbd2_journal_get_create_access (handle_t *, struct buffer_head *); extern int jbd2_journal_get_undo_access(handle_t *, struct buffer_head *); +void jbd2_journal_set_triggers(struct buffer_head *, + struct jbd2_buffer_trigger_type *type); extern int jbd2_journal_dirty_metadata (handle_t *, struct buffer_head *); extern void jbd2_journal_release_buffer (handle_t *, struct buffer_head *); extern int jbd2_journal_forget (handle_t *, struct buffer_head *); diff --git a/include/linux/journal-head.h b/include/linux/journal-head.h index bb70ebb..6e762ad 100644 --- a/include/linux/journal-head.h +++ b/include/linux/journal-head.h @@ -12,6 +12,8 @@ typedef unsigned int tid_t; /* Unique transaction ID */ typedef struct transaction_s transaction_t; /* Compound transaction type */ + + struct buffer_head; struct journal_head { @@ -87,6 +89,9 @@ struct journal_head { * [j_list_lock] */ struct journal_head *b_cpnext, *b_cpprev; + + /* Trigger type */ + struct jbd2_buffer_trigger_type *b_triggers; }; #endif /* JOURNAL_HEAD_H_INCLUDED */ -- 1.5.6.5
Joel Becker
2008-Dec-10 01:09 UTC
[Ocfs2-devel] [PATCH 02/18] ocfs2: Add the on-disk structures for metadata checksums.
Define struct ocfs2_block_check, an 8-byte structure containing a 32bit crc32_le and a 16bit hamming code ecc. This will be used for metadata checksums. Add the structure to free spaces in the various metadata structures. Add the OCFS2_FEATURE_INCOMPAT_META_ECC bit. Signed-off-by: Joel Becker <joel.becker at oracle.com> --- fs/ocfs2/ocfs2_fs.h | 55 ++++++++++++++++++++++++++++++++++++++++++++++---- 1 files changed, 50 insertions(+), 5 deletions(-) diff --git a/fs/ocfs2/ocfs2_fs.h b/fs/ocfs2/ocfs2_fs.h index 4ae3984..4e20f50 100644 --- a/fs/ocfs2/ocfs2_fs.h +++ b/fs/ocfs2/ocfs2_fs.h @@ -148,6 +148,9 @@ /* Support for extended attributes */ #define OCFS2_FEATURE_INCOMPAT_XATTR 0x0200 +/* Metadata checksum and error correction */ +#define OCFS2_FEATURE_INCOMPAT_META_ECC 0x0800 + /* * backup superblock flag is used to indicate that this volume * has backup superblocks. @@ -421,6 +424,22 @@ static unsigned char ocfs2_type_by_mode[S_IFMT >> S_SHIFT] = { #define OCFS2_RAW_SB(dinode) (&((dinode)->id2.i_super)) /* + * Block checking structure. This is used in metadata to validate the + * contents. If OCFS2_FEATURE_INCOMPAT_META_ECC is not set, it is all + * zeros. + */ +struct ocfs2_block_check { +/*00*/ __le32 bc_crc32e; /* 802.3 Ethernet II CRC32 */ + __le16 bc_ecc; /* Single-error-correction parity vector. + This is a simple Hamming code dependant + on the blocksize. OCFS2's maximum + blocksize, 4K, requires 16 parity bits, + so we fit in __le16. */ + __le16 bc_reserved1; +/*08*/ +}; + +/* * On disk extent record for OCFS2 * It describes a range of clusters on disk. * @@ -507,7 +526,7 @@ struct ocfs2_truncate_log { struct ocfs2_extent_block { /*00*/ __u8 h_signature[8]; /* Signature for verification */ - __le64 h_reserved1; + struct ocfs2_block_check h_check; /* Error checking */ /*10*/ __le16 h_suballoc_slot; /* Slot suballocator this extent_header belongs to */ __le16 h_suballoc_bit; /* Bit offset in suballocator @@ -677,7 +696,8 @@ struct ocfs2_dinode { was set in i_flags */ __le16 i_dyn_features; __le64 i_xattr_loc; -/*80*/ __le64 i_reserved2[7]; +/*80*/ struct ocfs2_block_check i_check; /* Error checking */ +/*88*/ __le64 i_reserved2[6]; /*B8*/ union { __le64 i_pad1; /* Generic way to refer to this 64bit union */ @@ -744,7 +764,8 @@ struct ocfs2_group_desc /*20*/ __le64 bg_parent_dinode; /* dinode which owns me, in blocks */ __le64 bg_blkno; /* Offset on disk, in blocks */ -/*30*/ __le64 bg_reserved2[2]; +/*30*/ struct ocfs2_block_check bg_check; /* Error checking */ + __le64 bg_reserved2; /*40*/ __u8 bg_bitmap[0]; }; @@ -787,7 +808,12 @@ struct ocfs2_xattr_header { in this extent record, only valid in the first bucket. */ - __le64 xh_csum; + struct ocfs2_block_check xh_check; /* Error checking + (Note, this is only + used for xattr + buckets. A block uses + xb_check and sets + this field to zero.) */ struct ocfs2_xattr_entry xh_entries[0]; /* xattr entry list. */ }; @@ -838,7 +864,7 @@ struct ocfs2_xattr_block { block group */ __le32 xb_fs_generation; /* Must match super block */ /*10*/ __le64 xb_blkno; /* Offset on disk, in blocks */ - __le64 xb_csum; + struct ocfs2_block_check xb_check; /* Error checking */ /*20*/ __le16 xb_flags; /* Indicates whether this block contains real xattr or a xattr tree. */ __le16 xb_reserved0; @@ -982,6 +1008,25 @@ struct ocfs2_local_disk_dqblk { /*10*/ __le64 dqb_inodemod; /* Change in the amount of used inodes */ }; + +/* + * The quota trailer lives at the end of each quota block. + */ + +struct ocfs2_disk_dqtrailer { +/*00*/ struct ocfs2_block_check dq_check; /* Error checking */ +/*08*/ /* Cannot be larger than OCFS2_QBLK_RESERVED_SPACE */ +}; + +static inline struct ocfs2_disk_dqtrailer *ocfs2_block_dqtrailer(int blocksize, + void *buf) +{ + char *ptr = buf; + ptr += blocksize - OCFS2_QBLK_RESERVED_SPACE; + + return (struct ocfs2_disk_dqtrailer *)ptr; +} + #ifdef __KERNEL__ static inline int ocfs2_fast_symlink_chars(struct super_block *sb) { -- 1.5.6.5
Joel Becker
2008-Dec-10 01:09 UTC
[Ocfs2-devel] [PATCH 03/18] ocfs2: Add the underlying blockcheck code.
This is the code that computes crc32 and ecc for ocfs2 metadata blocks. There are high-level functions that check whether the filesystem has the ecc feature, mid-level functions that work on a single block or array of buffer_heads, and the low-level ecc hamming code that can handle multiple buffers like crc32_le(). It's not hooked up to the filesystem yet. Signed-off-by: Joel Becker <joel.becker at oracle.com> Cc: Christoph Hellwig <hch at lst.de> --- fs/ocfs2/Makefile | 1 + fs/ocfs2/blockcheck.c | 455 +++++++++++++++++++++++++++++++++++++++++++++++++ fs/ocfs2/blockcheck.h | 81 +++++++++ fs/ocfs2/ocfs2.h | 8 + 4 files changed, 545 insertions(+), 0 deletions(-) create mode 100644 fs/ocfs2/blockcheck.c create mode 100644 fs/ocfs2/blockcheck.h diff --git a/fs/ocfs2/Makefile b/fs/ocfs2/Makefile index 7e4b361..0159607 100644 --- a/fs/ocfs2/Makefile +++ b/fs/ocfs2/Makefile @@ -12,6 +12,7 @@ obj-$(CONFIG_OCFS2_FS_USERSPACE_CLUSTER) += ocfs2_stack_user.o ocfs2-objs := \ alloc.o \ aops.o \ + blockcheck.o \ buffer_head_io.o \ dcache.o \ dir.o \ diff --git a/fs/ocfs2/blockcheck.c b/fs/ocfs2/blockcheck.c new file mode 100644 index 0000000..53498e3 --- /dev/null +++ b/fs/ocfs2/blockcheck.c @@ -0,0 +1,455 @@ +/* -*- mode: c; c-basic-offset: 8; -*- + * vim: noexpandtab sw=8 ts=8 sts=0: + * + * blockcheck.c + * + * Checksum and ECC codes for the OCFS2 userspace library. + * + * Copyright (C) 2006, 2008 Oracle. All rights reserved. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public + * License, version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + */ + +#include <linux/kernel.h> +#include <linux/types.h> +#include <linux/crc32.h> +#include <linux/buffer_head.h> +#include <linux/bitops.h> +#include <asm/byteorder.h> + +#include "ocfs2.h" + +#include "blockcheck.h" + + + +/* + * We use the following conventions: + * + * d = # data bits + * p = # parity bits + * c = # total code bits (d + p) + */ +static int calc_parity_bits(unsigned int d) +{ + unsigned int p; + + /* + * Bits required for Single Error Correction is as follows: + * + * d + p + 1 <= 2^p + * + * We're restricting ourselves to 31 bits of parity, that should be + * sufficient. + */ + for (p = 1; p < 32; p++) + { + if ((d + p + 1) <= (1 << p)) + return p; + } + + return 0; +} + +/* + * Calculate the bit offset in the hamming code buffer based on the bit's + * offset in the data buffer. Since the hamming code reserves all + * power-of-two bits for parity, the data bit number and the code bit + * number are offest by all the parity bits beforehand. + * + * Recall that bit numbers in hamming code are 1-based. This function + * takes the 0-based data bit from the caller. + * + * An example. Take bit 1 of the data buffer. 1 is a power of two (2^0), + * so it's a parity bit. 2 is a power of two (2^1), so it's a parity bit. + * 3 is not a power of two. So bit 1 of the data buffer ends up as bit 3 + * in the code buffer. + */ +static unsigned int calc_code_bit(unsigned int i) +{ + unsigned int b, p; + + /* + * Data bits are 0-based, but we're talking code bits, which + * are 1-based. + */ + b = i + 1; + + /* + * For every power of two below our bit number, bump our bit. + * + * We compare with (b + 1) becuase we have to compare with what b + * would be _if_ it were bumped up by the parity bit. Capice? + */ + for (p = 0; (1 << p) < (b + 1); p++) + b++; + + return b; +} + +/* + * This is the low level encoder function. It can be called across + * multiple hunks just like the crc32 code. 'd' is the number of bits + * _in_this_hunk_. nr is the bit offset of this hunk. So, if you had + * two 512B buffers, you would do it like so: + * + * parity = ocfs2_hamming_encode(0, buf1, 512 * 8, 0); + * parity = ocfs2_hamming_encode(parity, buf2, 512 * 8, 512 * 8); + * + * If you just have one buffer, use ocfs2_hamming_encode_block(). + */ +u32 ocfs2_hamming_encode(u32 parity, void *data, unsigned int d, unsigned int nr) +{ + unsigned int p = calc_parity_bits(nr + d); + unsigned int i, j, b; + + BUG_ON(!p); + + /* + * b is the hamming code bit number. Hamming code specifies a + * 1-based array, but C uses 0-based. So 'i' is for C, and 'b' is + * for the algorithm. + * + * The i++ in the for loop is so that the start offset passed + * to ocfs2_find_next_bit_set() is one greater than the previously + * found bit. + */ + for (i = 0; (i = ocfs2_find_next_bit(data, d, i)) < d; i++) + { + /* + * i is the offset in this hunk, nr + i is the total bit + * offset. + */ + b = calc_code_bit(nr + i); + + for (j = 0; j < p; j++) + { + /* + * Data bits in the resultant code are checked by + * parity bits that are part of the bit number + * representation. Huh? + * + * <wikipedia href="http://en.wikipedia.org/wiki/Hamming_code"> + * In other words, the parity bit at position 2^k + * checks bits in positions having bit k set in + * their binary representation. Conversely, for + * instance, bit 13, i.e. 1101(2), is checked by + * bits 1000(2) = 8, 0100(2)=4 and 0001(2) = 1. + * </wikipedia> + * + * Note that 'k' is the _code_ bit number. 'b' in + * our loop. + */ + if (b & (1 << j)) + parity ^= (1 << j); + } + } + + /* While the data buffer was treated as little endian, the + * return value is in host endian. */ + return parity; +} + +u32 ocfs2_hamming_encode_block(void *data, unsigned int blocksize) +{ + return ocfs2_hamming_encode(0, data, blocksize * 8, 0); +} + +/* + * Like ocfs2_hamming_encode(), this can handle hunks. nr is the bit + * offset of the current hunk. If bit to be fixed is not part of the + * current hunk, this does nothing. + * + * If you only have one hunk, use ocfs2_hamming_fix_block(). + */ +void ocfs2_hamming_fix(void *data, unsigned int d, int nr, unsigned int fix) +{ + unsigned int p = calc_parity_bits(nr + d); + unsigned int i, b; + + BUG_ON(!p); + + /* + * If the bit to fix has an hweight of 1, it's a parity bit. One + * busted parity bit is its own error. Nothing to do here. + */ + if (hweight32(fix) == 1) + return; + + /* See hamming_encode() for a description of 'b' */ + for (i = 0, b = 1; i < d; i++, b++) + { + /* Skip past parity bits */ + while (hweight32(b) == 1) + b++; + + if (b == fix) + { + if (ocfs2_test_bit(nr + i, data)) + ocfs2_clear_bit(nr + i, data); + else + ocfs2_set_bit(nr + i, data); + break; + } + } +} + +void ocfs2_hamming_fix_block(void *data, unsigned int blocksize, + unsigned int fix) +{ + ocfs2_hamming_fix(data, blocksize * 8, 0, fix); +} + +/* + * This function generates check information for a block. + * data is the block to be checked. bc is a pointer to the + * ocfs2_block_check structure describing the crc32 and the ecc. + * + * bc should be a pointer inside data, as the function will + * take care of zeroing it before calculating the check information. If + * bc does not point inside data, the caller must make sure any inline + * ocfs2_block_check structures are zeroed. + * + * The data buffer must be in on-disk endian (little endian for ocfs2). + * bc will be filled with little-endian values and will be ready to go to + * disk. + */ +void ocfs2_block_check_compute(void *data, size_t blocksize, + struct ocfs2_block_check *bc) +{ + u32 crc; + u32 ecc; + + memset(bc, 0, sizeof(struct ocfs2_block_check)); + + crc = crc32_le(~0, data, blocksize); + ecc = ocfs2_hamming_encode_block(data, blocksize); + + /* + * No ecc'd ocfs2 structure is larger than 4K, so ecc will be no + * larger than 16 bits. + */ + BUG_ON(ecc > USHORT_MAX); + + bc->bc_crc32e = cpu_to_le32(crc); + bc->bc_ecc = cpu_to_le16((u16)ecc); +} + +/* + * This function validates existing check information. Like _compute, + * the function will take care of zeroing bc before calculating check codes. + * If bc is not a pointer inside data, the caller must have zeroed any + * inline ocfs2_block_check structures. + * + * Again, the data passed in should be the on-disk endian. + */ +int ocfs2_block_check_validate(void *data, size_t blocksize, + struct ocfs2_block_check *bc) +{ + int rc = 0; + struct ocfs2_block_check check; + u32 crc, ecc; + + check.bc_crc32e = le32_to_cpu(bc->bc_crc32e); + check.bc_ecc = le16_to_cpu(bc->bc_ecc); + + memset(bc, 0, sizeof(struct ocfs2_block_check)); + + /* Fast path - if the crc32 validates, we're good to go */ + crc = crc32_le(~0, data, blocksize); + if (crc == check.bc_crc32e) + goto out; + + /* Ok, try ECC fixups */ + ecc = ocfs2_hamming_encode_block(data, blocksize); + ocfs2_hamming_fix_block(data, blocksize, ecc ^ check.bc_ecc); + + /* And check the crc32 again */ + crc = crc32_le(~0, data, blocksize); + if (crc == check.bc_crc32e) + goto out; + + rc = -EIO; + +out: + bc->bc_crc32e = cpu_to_le32(check.bc_crc32e); + bc->bc_ecc = cpu_to_le16(check.bc_ecc); + + return rc; +} + +/* + * This function generates check information for a list of buffer_heads. + * bhs is the blocks to be checked. bc is a pointer to the + * ocfs2_block_check structure describing the crc32 and the ecc. + * + * bc should be a pointer inside data, as the function will + * take care of zeroing it before calculating the check information. If + * bc does not point inside data, the caller must make sure any inline + * ocfs2_block_check structures are zeroed. + * + * The data buffer must be in on-disk endian (little endian for ocfs2). + * bc will be filled with little-endian values and will be ready to go to + * disk. + */ +void ocfs2_block_check_compute_bhs(struct buffer_head **bhs, int nr, + struct ocfs2_block_check *bc) +{ + int i; + u32 crc, ecc; + + BUG_ON(nr < 0); + + if (!nr) + return; + + memset(bc, 0, sizeof(struct ocfs2_block_check)); + + for (i = 0, crc = ~0, ecc = 0; i < nr; i++) { + crc = crc32_le(crc, bhs[i]->b_data, bhs[i]->b_size); + /* + * The number of bits in a buffer is obviously b_size*8. + * The offset of this buffer is b_size*i, so the bit offset + * of this buffer is b_size*8*i. + */ + ecc = (u16)ocfs2_hamming_encode(ecc, bhs[i]->b_data, + bhs[i]->b_size * 8, + bhs[i]->b_size * 8 * i); + } + + /* + * No ecc'd ocfs2 structure is larger than 4K, so ecc will be no + * larger than 16 bits. + */ + BUG_ON(ecc > USHORT_MAX); + + bc->bc_crc32e = cpu_to_le32(crc); + bc->bc_ecc = cpu_to_le16((u16)ecc); +} + +/* + * This function validates existing check information on a list of + * buffer_heads. Like _compute_bhs, the function will take care of + * zeroing bc before calculating check codes. If bc is not a pointer + * inside data, the caller must have zeroed any inline + * ocfs2_block_check structures. + * + * Again, the data passed in should be the on-disk endian. + */ +int ocfs2_block_check_validate_bhs(struct buffer_head **bhs, int nr, + struct ocfs2_block_check *bc) +{ + int i, rc = 0; + struct ocfs2_block_check check; + u32 crc, ecc, fix; + + BUG_ON(nr < 0); + + if (!nr) + return 0; + + check.bc_crc32e = le32_to_cpu(bc->bc_crc32e); + check.bc_ecc = le16_to_cpu(bc->bc_ecc); + + memset(bc, 0, sizeof(struct ocfs2_block_check)); + + /* Fast path - if the crc32 validates, we're good to go */ + for (i = 0, crc = ~0; i < nr; i++) + crc = crc32_le(crc, bhs[i]->b_data, bhs[i]->b_size); + if (crc == check.bc_crc32e) + goto out; + + mlog(ML_ERROR, + "CRC32 failed: stored: %u, computed %u. Applying ECC.\n", + (unsigned int)check.bc_crc32e, (unsigned int)crc); + + /* Ok, try ECC fixups */ + for (i = 0, ecc = 0; i < nr; i++) { + /* + * The number of bits in a buffer is obviously b_size*8. + * The offset of this buffer is b_size*i, so the bit offset + * of this buffer is b_size*8*i. + */ + ecc = (u16)ocfs2_hamming_encode(ecc, bhs[i]->b_data, + bhs[i]->b_size * 8, + bhs[i]->b_size * 8 * i); + } + fix = ecc ^ check.bc_ecc; + for (i = 0; i < nr; i++) { + /* + * Try the fix against each buffer. It will only affect + * one of them. + */ + ocfs2_hamming_fix(bhs[i]->b_data, bhs[i]->b_size * 8, + bhs[i]->b_size * 8 * i, fix); + } + + /* And check the crc32 again */ + for (i = 0, crc = ~0; i < nr; i++) + crc = crc32_le(crc, bhs[i]->b_data, bhs[i]->b_size); + if (crc == check.bc_crc32e) + goto out; + + mlog(ML_ERROR, "Fixed CRC32 failed: stored: %u, computed %u\n", + (unsigned int)check.bc_crc32e, (unsigned int)crc); + + rc = -EIO; + +out: + bc->bc_crc32e = cpu_to_le32(check.bc_crc32e); + bc->bc_ecc = cpu_to_le16(check.bc_ecc); + + return rc; +} + +/* + * These are the main API. They check the superblock flag before + * calling the underlying operations. + * + * They expect the buffer(s) to be in disk format. + */ +void ocfs2_compute_meta_ecc(struct super_block *sb, void *data, + struct ocfs2_block_check *bc) +{ + if (ocfs2_meta_ecc(OCFS2_SB(sb))) + ocfs2_block_check_compute(data, sb->s_blocksize, bc); +} + +int ocfs2_validate_meta_ecc(struct super_block *sb, void *data, + struct ocfs2_block_check *bc) +{ + int rc = 0; + + if (ocfs2_meta_ecc(OCFS2_SB(sb))) + rc = ocfs2_block_check_validate(data, sb->s_blocksize, bc); + + return rc; +} + +void ocfs2_compute_meta_ecc_bhs(struct super_block *sb, + struct buffer_head **bhs, int nr, + struct ocfs2_block_check *bc) +{ + if (ocfs2_meta_ecc(OCFS2_SB(sb))) + ocfs2_block_check_compute_bhs(bhs, nr, bc); +} + +int ocfs2_validate_meta_ecc_bhs(struct super_block *sb, + struct buffer_head **bhs, int nr, + struct ocfs2_block_check *bc) +{ + int rc = 0; + + if (ocfs2_meta_ecc(OCFS2_SB(sb))) + rc = ocfs2_block_check_validate_bhs(bhs, nr, bc); + + return rc; +} + diff --git a/fs/ocfs2/blockcheck.h b/fs/ocfs2/blockcheck.h new file mode 100644 index 0000000..504b761 --- /dev/null +++ b/fs/ocfs2/blockcheck.h @@ -0,0 +1,81 @@ +/* -*- mode: c; c-basic-offset: 8; -*- + * vim: noexpandtab sw=8 ts=8 sts=0: + * + * blockcheck.h + * + * Checksum and ECC codes for the OCFS2 userspace library. + * + * Copyright (C) 2004, 2008 Oracle. All rights reserved. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public + * License, version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + */ + +#ifndef OCFS2_BLOCKCHECK_H +#define OCFS2_BLOCKCHECK_H + + +/* High level block API */ +void ocfs2_compute_meta_ecc(struct super_block *sb, void *data, + struct ocfs2_block_check *bc); +int ocfs2_validate_meta_ecc(struct super_block *sb, void *data, + struct ocfs2_block_check *bc); +void ocfs2_compute_meta_ecc_bhs(struct super_block *sb, + struct buffer_head **bhs, int nr, + struct ocfs2_block_check *bc); +int ocfs2_validate_meta_ecc_bhs(struct super_block *sb, + struct buffer_head **bhs, int nr, + struct ocfs2_block_check *bc); + +/* Lower level API */ +void ocfs2_block_check_compute(void *data, size_t blocksize, + struct ocfs2_block_check *bc); +int ocfs2_block_check_validate(void *data, size_t blocksize, + struct ocfs2_block_check *bc); +void ocfs2_block_check_compute_bhs(struct buffer_head **bhs, int nr, + struct ocfs2_block_check *bc); +int ocfs2_block_check_validate_bhs(struct buffer_head **bhs, int nr, + struct ocfs2_block_check *bc); + +/* + * Hamming code functions + */ + +/* + * Encoding hamming code parity bits for a buffer. + * + * This is the low level encoder function. It can be called across + * multiple hunks just like the crc32 code. 'd' is the number of bits + * _in_this_hunk_. nr is the bit offset of this hunk. So, if you had + * two 512B buffers, you would do it like so: + * + * parity = ocfs2_hamming_encode(0, buf1, 512 * 8, 0); + * parity = ocfs2_hamming_encode(parity, buf2, 512 * 8, 512 * 8); + * + * If you just have one buffer, use ocfs2_hamming_encode_block(). + */ +u32 ocfs2_hamming_encode(u32 parity, void *data, unsigned int d, + unsigned int nr); +/* + * Fix a buffer with a bit error. The 'fix' is the original parity + * xor'd with the parity calculated now. + * + * Like ocfs2_hamming_encode(), this can handle hunks. nr is the bit + * offset of the current hunk. If bit to be fixed is not part of the + * current hunk, this does nothing. + * + * If you only have one buffer, use ocfs2_hamming_fix_block(). + */ +void ocfs2_hamming_fix(void *data, unsigned int d, int nr, unsigned int fix); + +/* Convenience wrappers for a single buffer of data */ +extern u32 ocfs2_hamming_encode_block(void *data, unsigned int blocksize); +extern void ocfs2_hamming_fix_block(void *data, unsigned int blocksize, + unsigned int fix); +#endif diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h index 5c77798..2bb389f 100644 --- a/fs/ocfs2/ocfs2.h +++ b/fs/ocfs2/ocfs2.h @@ -382,6 +382,13 @@ static inline int ocfs2_supports_xattr(struct ocfs2_super *osb) return 0; } +static inline int ocfs2_meta_ecc(struct ocfs2_super *osb) +{ + if (osb->s_feature_incompat & OCFS2_FEATURE_INCOMPAT_META_ECC) + return 1; + return 0; +} + /* set / clear functions because cluster events can make these happen * in parallel so we want the transitions to be atomic. this also * means that any future flags osb_flags must be protected by spinlock @@ -615,5 +622,6 @@ static inline s16 ocfs2_get_inode_steal_slot(struct ocfs2_super *osb) #define ocfs2_clear_bit ext2_clear_bit #define ocfs2_test_bit ext2_test_bit #define ocfs2_find_next_zero_bit ext2_find_next_zero_bit +#define ocfs2_find_next_bit ext2_find_next_bit #endif /* OCFS2_H */ -- 1.5.6.5
Joel Becker
2008-Dec-10 01:09 UTC
[Ocfs2-devel] [PATCH 04/18] ocfs2: Add a validation hook for quota block reads.
Add a currently-returns-success hook for quota block reads. We'll be adding checks to this. Signed-off-by: Joel Becker <joel.becker at oracle.com> --- fs/ocfs2/quota_global.c | 14 +++++++++++++- 1 files changed, 13 insertions(+), 1 deletions(-) diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c index 4b38a2e..5b05f94 100644 --- a/fs/ocfs2/quota_global.c +++ b/fs/ocfs2/quota_global.c @@ -85,13 +85,25 @@ struct qtree_fmt_operations ocfs2_global_ops = { .is_id = ocfs2_global_is_id, }; +static int ocfs2_validate_quota_block(struct super_block *sb, + struct buffer_head *bh) +{ + struct ocfs2_disk_dqtrailer *dqt = ocfs2_dq_trailer(sb, bh->b_data); + + mlog(0, "Validating quota block %llu\n", + (unsigned long long)bh->b_blocknr); + + return 0; +} + int ocfs2_read_quota_block(struct inode *inode, u64 v_block, struct buffer_head **bh) { int rc = 0; struct buffer_head *tmp = *bh; - rc = ocfs2_read_virt_blocks(inode, v_block, 1, &tmp, 0, NULL); + rc = ocfs2_read_virt_blocks(inode, v_block, 1, &tmp, 0, + ocfs2_validate_quota_block); if (rc) mlog_errno(rc); -- 1.5.6.5
Add block check calls to the read_block validate functions. This is the almost all of the read-side checking of metaecc. xattr buckets are not checked yet. Writes are also unchecked, and so a read-write mount will quickly fail. Signed-off-by: Joel Becker <joel.becker at oracle.com> --- fs/ocfs2/alloc.c | 17 +++++++++++++++++ fs/ocfs2/blockcheck.c | 9 +++++++++ fs/ocfs2/inode.c | 18 +++++++++++++++++- fs/ocfs2/quota_global.c | 13 +++++++++++-- fs/ocfs2/suballoc.c | 31 ++++++++++++++++++++++++++++++- fs/ocfs2/xattr.c | 17 +++++++++++++++++ 6 files changed, 101 insertions(+), 4 deletions(-) diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c index 84a7bd4..6b27f74 100644 --- a/fs/ocfs2/alloc.c +++ b/fs/ocfs2/alloc.c @@ -37,6 +37,7 @@ #include "alloc.h" #include "aops.h" +#include "blockcheck.h" #include "dlmglue.h" #include "extent_map.h" #include "inode.h" @@ -682,12 +683,28 @@ struct ocfs2_merge_ctxt { static int ocfs2_validate_extent_block(struct super_block *sb, struct buffer_head *bh) { + int rc; struct ocfs2_extent_block *eb (struct ocfs2_extent_block *)bh->b_data; mlog(0, "Validating extent block %llu\n", (unsigned long long)bh->b_blocknr); + BUG_ON(!buffer_uptodate(bh)); + + /* + * If the ecc fails, we return the error but otherwise + * leave the filesystem running. We know any error is + * local to this block. + */ + rc = ocfs2_validate_meta_ecc(sb, bh->b_data, &eb->h_check); + if (rc) + return rc; + + /* + * Errors after here are fatal. + */ + if (!OCFS2_IS_VALID_EXTENT_BLOCK(eb)) { ocfs2_error(sb, "Extent block #%llu has bad signature %.*s", diff --git a/fs/ocfs2/blockcheck.c b/fs/ocfs2/blockcheck.c index 53498e3..48284e4 100644 --- a/fs/ocfs2/blockcheck.c +++ b/fs/ocfs2/blockcheck.c @@ -24,6 +24,8 @@ #include <linux/bitops.h> #include <asm/byteorder.h> +#include <cluster/masklog.h> + #include "ocfs2.h" #include "blockcheck.h" @@ -267,6 +269,10 @@ int ocfs2_block_check_validate(void *data, size_t blocksize, if (crc == check.bc_crc32e) goto out; + mlog(ML_ERROR, + "CRC32 failed: stored: %u, computed %u. Applying ECC.\n", + (unsigned int)check.bc_crc32e, (unsigned int)crc); + /* Ok, try ECC fixups */ ecc = ocfs2_hamming_encode_block(data, blocksize); ocfs2_hamming_fix_block(data, blocksize, ecc ^ check.bc_ecc); @@ -276,6 +282,9 @@ int ocfs2_block_check_validate(void *data, size_t blocksize, if (crc == check.bc_crc32e) goto out; + mlog(ML_ERROR, "Fixed CRC32 failed: stored: %u, computed %u\n", + (unsigned int)check.bc_crc32e, (unsigned int)crc); + rc = -EIO; out: diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c index 288512c..9370b65 100644 --- a/fs/ocfs2/inode.c +++ b/fs/ocfs2/inode.c @@ -38,6 +38,7 @@ #include "ocfs2.h" #include "alloc.h" +#include "blockcheck.h" #include "dlmglue.h" #include "extent_map.h" #include "file.h" @@ -1262,7 +1263,7 @@ void ocfs2_refresh_inode(struct inode *inode, int ocfs2_validate_inode_block(struct super_block *sb, struct buffer_head *bh) { - int rc = -EINVAL; + int rc; struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data; mlog(0, "Validating dinode %llu\n", @@ -1270,6 +1271,21 @@ int ocfs2_validate_inode_block(struct super_block *sb, BUG_ON(!buffer_uptodate(bh)); + /* + * If the ecc fails, we return the error but otherwise + * leave the filesystem running. We know any error is + * local to this block. + */ + rc = ocfs2_validate_meta_ecc(sb, bh->b_data, &di->i_check); + if (rc) + goto bail; + + /* + * Errors after here are fatal. + */ + + rc = -EINVAL; + if (!OCFS2_IS_VALID_DINODE(di)) { ocfs2_error(sb, "Invalid dinode #%llu: signature = %.*s\n", (unsigned long long)bh->b_blocknr, 7, diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c index 5b05f94..d338438 100644 --- a/fs/ocfs2/quota_global.c +++ b/fs/ocfs2/quota_global.c @@ -16,6 +16,7 @@ #include "ocfs2_fs.h" #include "ocfs2.h" #include "alloc.h" +#include "blockcheck.h" #include "inode.h" #include "journal.h" #include "file.h" @@ -88,12 +89,20 @@ struct qtree_fmt_operations ocfs2_global_ops = { static int ocfs2_validate_quota_block(struct super_block *sb, struct buffer_head *bh) { - struct ocfs2_disk_dqtrailer *dqt = ocfs2_dq_trailer(sb, bh->b_data); + struct ocfs2_disk_dqtrailer *dqt + ocfs2_block_dqtrailer(sb->s_blocksize, bh->b_data); mlog(0, "Validating quota block %llu\n", (unsigned long long)bh->b_blocknr); - return 0; + BUG_ON(!buffer_uptodate(bh)); + + /* + * If the ecc fails, we return the error but otherwise + * leave the filesystem running. We know any error is + * local to this block. + */ + return ocfs2_validate_meta_ecc(sb, bh->b_data, &dqt->dq_check); } int ocfs2_read_quota_block(struct inode *inode, u64 v_block, diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c index 226fe21..7875576 100644 --- a/fs/ocfs2/suballoc.c +++ b/fs/ocfs2/suballoc.c @@ -35,6 +35,7 @@ #include "ocfs2.h" #include "alloc.h" +#include "blockcheck.h" #include "dlmglue.h" #include "inode.h" #include "journal.h" @@ -250,8 +251,18 @@ int ocfs2_check_group_descriptor(struct super_block *sb, struct buffer_head *bh) { int rc; + struct ocfs2_group_desc *gd = (struct ocfs2_group_desc *)bh->b_data; + + BUG_ON(!buffer_uptodate(bh)); - rc = ocfs2_validate_gd_self(sb, bh, 1); + /* + * If the ecc fails, we return the error but otherwise + * leave the filesystem running. We know any error is + * local to this block. + */ + rc = ocfs2_validate_meta_ecc(sb, bh->b_data, &gd->bg_check); + if (!rc) + rc = ocfs2_validate_gd_self(sb, bh, 1); if (!rc) rc = ocfs2_validate_gd_parent(sb, di, bh, 1); @@ -261,9 +272,27 @@ int ocfs2_check_group_descriptor(struct super_block *sb, static int ocfs2_validate_group_descriptor(struct super_block *sb, struct buffer_head *bh) { + int rc; + struct ocfs2_group_desc *gd = (struct ocfs2_group_desc *)bh->b_data; + mlog(0, "Validating group descriptor %llu\n", (unsigned long long)bh->b_blocknr); + BUG_ON(!buffer_uptodate(bh)); + + /* + * If the ecc fails, we return the error but otherwise + * leave the filesystem running. We know any error is + * local to this block. + */ + rc = ocfs2_validate_meta_ecc(sb, bh->b_data, &gd->bg_check); + if (rc) + return rc; + + /* + * Errors after here are fatal. + */ + return ocfs2_validate_gd_self(sb, bh, 0); } diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c index 94cadbe..dcf26ad 100644 --- a/fs/ocfs2/xattr.c +++ b/fs/ocfs2/xattr.c @@ -42,6 +42,7 @@ #include "ocfs2.h" #include "alloc.h" +#include "blockcheck.h" #include "dlmglue.h" #include "file.h" #include "symlink.h" @@ -322,12 +323,28 @@ static void ocfs2_xattr_bucket_copy_data(struct ocfs2_xattr_bucket *dest, static int ocfs2_validate_xattr_block(struct super_block *sb, struct buffer_head *bh) { + int rc; struct ocfs2_xattr_block *xb (struct ocfs2_xattr_block *)bh->b_data; mlog(0, "Validating xattr block %llu\n", (unsigned long long)bh->b_blocknr); + BUG_ON(!buffer_uptodate(bh)); + + /* + * If the ecc fails, we return the error but otherwise + * leave the filesystem running. We know any error is + * local to this block. + */ + rc = ocfs2_validate_meta_ecc(sb, bh->b_data, &xb->xb_check); + if (rc) + return rc; + + /* + * Errors after here are fatal + */ + if (!OCFS2_IS_VALID_XATTR_BLOCK(xb)) { ocfs2_error(sb, "Extended attribute block #%llu has bad " -- 1.5.6.5
Joel Becker
2008-Dec-10 01:09 UTC
[Ocfs2-devel] [PATCH 06/18] ocfs2: Add journal_access functions with jbd2 triggers.
We create wrappers for ocfs2_journal_access() that are specific to the type of metadata block. This allows us to associate jbd2 commit triggers with the block. The triggers will compute metadata ecc in a future commit. Signed-off-by: Joel Becker <joel.becker at oracle.com> --- fs/ocfs2/journal.c | 159 ++++++++++++++++++++++++++++++++++++++++++++++++++- fs/ocfs2/journal.h | 31 +++++++++-- 2 files changed, 181 insertions(+), 9 deletions(-) diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c index 302f114..2daa584 100644 --- a/fs/ocfs2/journal.c +++ b/fs/ocfs2/journal.c @@ -35,6 +35,7 @@ #include "ocfs2.h" #include "alloc.h" +#include "blockcheck.h" #include "dir.h" #include "dlmglue.h" #include "extent_map.h" @@ -369,10 +370,110 @@ bail: return status; } -int ocfs2_journal_access(handle_t *handle, - struct inode *inode, - struct buffer_head *bh, - int type) +struct ocfs2_triggers { + struct jbd2_buffer_trigger_type ot_triggers; + int ot_offset; +}; + +static inline struct ocfs2_triggers *to_ocfs2_trigger(struct jbd2_buffer_trigger_type *triggers) +{ + return container_of(triggers, struct ocfs2_triggers, ot_triggers); +} + +static void ocfs2_commit_trigger(struct jbd2_buffer_trigger_type *triggers, + struct buffer_head *bh, + void *data, size_t size) +{ + struct ocfs2_triggers *ot = to_ocfs2_trigger(triggers); + + /* + * We aren't guaranteed to have the superblock here, so we + * must unconditionally compute the ecc data. + * __ocfs2_journal_access() will only set the triggers if + * metaecc is enabled. + */ + ocfs2_block_check_compute(data, size, data + ot->ot_offset); +} + +/* + * Quota blocks have their own trigger because the struct ocfs2_block_check + * offset depends on the blocksize. + */ +static void ocfs2_dq_commit_trigger(struct jbd2_buffer_trigger_type *triggers, + struct buffer_head *bh, + void *data, size_t size) +{ + struct ocfs2_disk_dqtrailer *dqt + ocfs2_block_dqtrailer(size, data); + + /* + * We aren't guaranteed to have the superblock here, so we + * must unconditionally compute the ecc data. + * __ocfs2_journal_access() will only set the triggers if + * metaecc is enabled. + */ + ocfs2_block_check_compute(data, size, &dqt->dq_check); +} + +static void ocfs2_abort_trigger(struct jbd2_buffer_trigger_type *triggers, + struct buffer_head *bh) +{ + mlog(ML_ERROR, + "ocfs2_abort_trigger called by JBD2. bh = 0x%lx, " + "bh->b_blocknr = %llu\n", + (unsigned long)bh, + (unsigned long long)bh->b_blocknr); + + /* We aren't guaranteed to have the superblock here - but if we + * don't, it'll just crash. */ + ocfs2_error(bh->b_assoc_map->host->i_sb, + "JBD2 has aborted our journal, ocfs2 cannot continue\n"); +} + +static struct ocfs2_triggers di_triggers = { + .ot_triggers = { + .t_commit = ocfs2_commit_trigger, + .t_abort = ocfs2_abort_trigger, + }, + .ot_offset = offsetof(struct ocfs2_dinode, i_check), +}; + +static struct ocfs2_triggers eb_triggers = { + .ot_triggers = { + .t_commit = ocfs2_commit_trigger, + .t_abort = ocfs2_abort_trigger, + }, + .ot_offset = offsetof(struct ocfs2_extent_block, h_check), +}; + +static struct ocfs2_triggers gd_triggers = { + .ot_triggers = { + .t_commit = ocfs2_commit_trigger, + .t_abort = ocfs2_abort_trigger, + }, + .ot_offset = offsetof(struct ocfs2_group_desc, bg_check), +}; + +static struct ocfs2_triggers xb_triggers = { + .ot_triggers = { + .t_commit = ocfs2_commit_trigger, + .t_abort = ocfs2_abort_trigger, + }, + .ot_offset = offsetof(struct ocfs2_xattr_block, xb_check), +}; + +static struct ocfs2_triggers dq_triggers = { + .ot_triggers = { + .t_commit = ocfs2_dq_commit_trigger, + .t_abort = ocfs2_abort_trigger, + }, +}; + +static int __ocfs2_journal_access(handle_t *handle, + struct inode *inode, + struct buffer_head *bh, + struct ocfs2_triggers *triggers, + int type) { int status; @@ -418,6 +519,8 @@ int ocfs2_journal_access(handle_t *handle, status = -EINVAL; mlog(ML_ERROR, "Uknown access type!\n"); } + if (!status && ocfs2_meta_ecc(OCFS2_SB(inode->i_sb)) && triggers) + jbd2_journal_set_triggers(bh, &triggers->ot_triggers); mutex_unlock(&OCFS2_I(inode)->ip_io_mutex); if (status < 0) @@ -428,6 +531,54 @@ int ocfs2_journal_access(handle_t *handle, return status; } +int ocfs2_journal_access_di(handle_t *handle, struct inode *inode, + struct buffer_head *bh, int type) +{ + return __ocfs2_journal_access(handle, inode, bh, &di_triggers, + type); +} + +int ocfs2_journal_access_eb(handle_t *handle, struct inode *inode, + struct buffer_head *bh, int type) +{ + return __ocfs2_journal_access(handle, inode, bh, &eb_triggers, + type); +} + +int ocfs2_journal_access_gd(handle_t *handle, struct inode *inode, + struct buffer_head *bh, int type) +{ + return __ocfs2_journal_access(handle, inode, bh, &gd_triggers, + type); +} + +int ocfs2_journal_access_db(handle_t *handle, struct inode *inode, + struct buffer_head *bh, int type) +{ + /* Right now, nothing for dirblocks */ + return __ocfs2_journal_access(handle, inode, bh, NULL, type); +} + +int ocfs2_journal_access_xb(handle_t *handle, struct inode *inode, + struct buffer_head *bh, int type) +{ + return __ocfs2_journal_access(handle, inode, bh, &xb_triggers, + type); +} + +int ocfs2_journal_access_dq(handle_t *handle, struct inode *inode, + struct buffer_head *bh, int type) +{ + return __ocfs2_journal_access(handle, inode, bh, &dq_triggers, + type); +} + +int ocfs2_journal_access(handle_t *handle, struct inode *inode, + struct buffer_head *bh, int type) +{ + return __ocfs2_journal_access(handle, inode, bh, NULL, type); +} + int ocfs2_journal_dirty(handle_t *handle, struct buffer_head *bh) { diff --git a/fs/ocfs2/journal.h b/fs/ocfs2/journal.h index 37013bf..bca370d 100644 --- a/fs/ocfs2/journal.h +++ b/fs/ocfs2/journal.h @@ -212,9 +212,12 @@ static inline void ocfs2_checkpoint_inode(struct inode *inode) * ocfs2_extend_trans - Extend a handle by nblocks credits. This may * commit the handle to disk in the process, but will * not release any locks taken during the transaction. - * ocfs2_journal_access - Notify the handle that we want to journal this + * ocfs2_journal_access* - Notify the handle that we want to journal this * buffer. Will have to call ocfs2_journal_dirty once * we've actually dirtied it. Type is one of . or . + * Always call the specific flavor of + * ocfs2_journal_access_*() unless you intend to + * manage the checksum by hand. * ocfs2_journal_dirty - Mark a journalled buffer as having dirty data. * ocfs2_jbd2_file_inode - Mark an inode so that its data goes out before * the current handle commits. @@ -244,10 +247,28 @@ int ocfs2_extend_trans(handle_t *handle, int nblocks); #define OCFS2_JOURNAL_ACCESS_WRITE 1 #define OCFS2_JOURNAL_ACCESS_UNDO 2 -int ocfs2_journal_access(handle_t *handle, - struct inode *inode, - struct buffer_head *bh, - int type); +/* ocfs2_inode */ +int ocfs2_journal_access_di(handle_t *handle, struct inode *inode, + struct buffer_head *bh, int type); +/* ocfs2_extent_block */ +int ocfs2_journal_access_eb(handle_t *handle, struct inode *inode, + struct buffer_head *bh, int type); +/* ocfs2_group_desc */ +int ocfs2_journal_access_gd(handle_t *handle, struct inode *inode, + struct buffer_head *bh, int type); +/* ocfs2_xattr_block */ +int ocfs2_journal_access_xb(handle_t *handle, struct inode *inode, + struct buffer_head *bh, int type); +/* quota blocks */ +int ocfs2_journal_access_dq(handle_t *handle, struct inode *inode, + struct buffer_head *bh, int type); +/* dirblock */ +int ocfs2_journal_access_db(handle_t *handle, struct inode *inode, + struct buffer_head *bh, int type); +/* Anything that has no ecc */ +int ocfs2_journal_access(handle_t *handle, struct inode *inode, + struct buffer_head *bh, int type); + /* * A word about the journal_access/journal_dirty "dance". It is * entirely legal to journal_access a buffer more than once (as long -- 1.5.6.5
Joel Becker
2008-Dec-10 01:09 UTC
[Ocfs2-devel] [PATCH 07/18] ocfs2: Wrap up the common use cases of ocfs2_new_path().
The majority of ocfs2_new_path() calls are: ocfs2_new_path(path_root_bh(otherpath), path_root_el(otherpath)); Let's call that ocfs2_new_path_from_path(). The rest do similar things from struct ocfs2_extent_tree. Let's call those ocfs2_new_path_from_et(). This will make the next change easier. Signed-off-by: Joel Becker <joel.becker at oracle.com> --- fs/ocfs2/alloc.c | 48 ++++++++++++++++++++++++------------------------ 1 files changed, 24 insertions(+), 24 deletions(-) diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c index 6b27f74..c22ff49 100644 --- a/fs/ocfs2/alloc.c +++ b/fs/ocfs2/alloc.c @@ -532,6 +532,16 @@ static struct ocfs2_path *ocfs2_new_path(struct buffer_head *root_bh, return path; } +static struct ocfs2_path *ocfs2_new_path_from_path(struct ocfs2_path *path) +{ + return ocfs2_new_path(path_root_bh(path), path_root_el(path)); +} + +static struct ocfs2_path *ocfs2_new_path_from_et(struct ocfs2_extent_tree *et) +{ + return ocfs2_new_path(et->et_root_bh, et->et_root_el); +} + /* * Convenience function to journal all components in a path. */ @@ -2150,8 +2160,7 @@ static int ocfs2_rotate_tree_right(struct inode *inode, *ret_left_path = NULL; - left_path = ocfs2_new_path(path_root_bh(right_path), - path_root_el(right_path)); + left_path = ocfs2_new_path_from_path(right_path); if (!left_path) { ret = -ENOMEM; mlog_errno(ret); @@ -2692,8 +2701,7 @@ static int __ocfs2_rotate_tree_left(struct inode *inode, goto out; } - left_path = ocfs2_new_path(path_root_bh(path), - path_root_el(path)); + left_path = ocfs2_new_path_from_path(path); if (!left_path) { ret = -ENOMEM; mlog_errno(ret); @@ -2702,8 +2710,7 @@ static int __ocfs2_rotate_tree_left(struct inode *inode, ocfs2_cp_path(left_path, path); - right_path = ocfs2_new_path(path_root_bh(path), - path_root_el(path)); + right_path = ocfs2_new_path_from_path(path); if (!right_path) { ret = -ENOMEM; mlog_errno(ret); @@ -2833,8 +2840,7 @@ static int ocfs2_remove_rightmost_path(struct inode *inode, handle_t *handle, * We have a path to the left of this one - it needs * an update too. */ - left_path = ocfs2_new_path(path_root_bh(path), - path_root_el(path)); + left_path = ocfs2_new_path_from_path(path); if (!left_path) { ret = -ENOMEM; mlog_errno(ret); @@ -3075,8 +3081,7 @@ static int ocfs2_get_right_path(struct inode *inode, /* This function shouldn't be called for the rightmost leaf. */ BUG_ON(right_cpos == 0); - right_path = ocfs2_new_path(path_root_bh(left_path), - path_root_el(left_path)); + right_path = ocfs2_new_path_from_path(left_path); if (!right_path) { ret = -ENOMEM; mlog_errno(ret); @@ -3247,8 +3252,7 @@ static int ocfs2_get_left_path(struct inode *inode, /* This function shouldn't be called for the leftmost leaf. */ BUG_ON(left_cpos == 0); - left_path = ocfs2_new_path(path_root_bh(right_path), - path_root_el(right_path)); + left_path = ocfs2_new_path_from_path(right_path); if (!left_path) { ret = -ENOMEM; mlog_errno(ret); @@ -3780,8 +3784,7 @@ static int ocfs2_append_rec_to_path(struct inode *inode, handle_t *handle, * leftmost leaf. */ if (left_cpos) { - left_path = ocfs2_new_path(path_root_bh(right_path), - path_root_el(right_path)); + left_path = ocfs2_new_path_from_path(right_path); if (!left_path) { ret = -ENOMEM; mlog_errno(ret); @@ -4018,7 +4021,7 @@ static int ocfs2_do_insert_extent(struct inode *inode, goto out_update_clusters; } - right_path = ocfs2_new_path(et->et_root_bh, et->et_root_el); + right_path = ocfs2_new_path_from_et(et); if (!right_path) { ret = -ENOMEM; mlog_errno(ret); @@ -4130,8 +4133,7 @@ ocfs2_figure_merge_contig_type(struct inode *inode, struct ocfs2_path *path, goto out; if (left_cpos != 0) { - left_path = ocfs2_new_path(path_root_bh(path), - path_root_el(path)); + left_path = ocfs2_new_path_from_path(path); if (!left_path) goto out; @@ -4187,8 +4189,7 @@ ocfs2_figure_merge_contig_type(struct inode *inode, struct ocfs2_path *path, if (right_cpos == 0) goto out; - right_path = ocfs2_new_path(path_root_bh(path), - path_root_el(path)); + right_path = ocfs2_new_path_from_path(path); if (!right_path) goto out; @@ -4381,7 +4382,7 @@ static int ocfs2_figure_insert_type(struct inode *inode, return 0; } - path = ocfs2_new_path(et->et_root_bh, et->et_root_el); + path = ocfs2_new_path_from_et(et); if (!path) { ret = -ENOMEM; mlog_errno(ret); @@ -4910,7 +4911,7 @@ int ocfs2_mark_extent_written(struct inode *inode, if (et->et_ops == &ocfs2_dinode_et_ops) ocfs2_extent_map_trunc(inode, 0); - left_path = ocfs2_new_path(et->et_root_bh, et->et_root_el); + left_path = ocfs2_new_path_from_et(et); if (!left_path) { ret = -ENOMEM; mlog_errno(ret); @@ -5082,8 +5083,7 @@ static int ocfs2_truncate_rec(struct inode *inode, handle_t *handle, } if (left_cpos && le16_to_cpu(el->l_next_free_rec) > 1) { - left_path = ocfs2_new_path(path_root_bh(path), - path_root_el(path)); + left_path = ocfs2_new_path_from_path(path); if (!left_path) { ret = -ENOMEM; mlog_errno(ret); @@ -5192,7 +5192,7 @@ int ocfs2_remove_extent(struct inode *inode, ocfs2_extent_map_trunc(inode, 0); - path = ocfs2_new_path(et->et_root_bh, et->et_root_el); + path = ocfs2_new_path_from_et(et); if (!path) { ret = -ENOMEM; mlog_errno(ret); -- 1.5.6.5
Joel Becker
2008-Dec-10 01:09 UTC
[Ocfs2-devel] [PATCH 08/18] ocfs2: Use metadata-specific ocfs2_journal_access_*() functions.
The per-metadata-type ocfs2_journal_access_*() functions hook up jbd2 commit triggers and allow us to compute metadata ecc right before the buffers are written out. This commit provides ecc for inodes, extent blocks, group descriptors, and quota blocks. It is not safe to use extened attributes and metaecc at the same time yet. The ocfs2_extent_tree and ocfs2_path abstractions in alloc.c both hide the type of block at their root. Before, it didn't matter, but now the root block must use the appropriate ocfs2_journal_access_*() function. To keep this abstract, the structures now have a pointer to the matching journal_access function and a wrapper call to call it. Since we pass around the journal_access functions. Let's typedef them in ocfs2.h. Signed-off-by: Joel Becker <joel.becker at oracle.com> --- fs/ocfs2/alloc.c | 230 +++++++++++++++++++++++++++++-------------------- fs/ocfs2/alloc.h | 5 +- fs/ocfs2/aops.c | 8 +- fs/ocfs2/dir.c | 38 +++++---- fs/ocfs2/file.c | 16 ++-- fs/ocfs2/inode.c | 17 ++-- fs/ocfs2/journal.c | 2 + fs/ocfs2/journal.h | 3 +- fs/ocfs2/localalloc.c | 16 ++-- fs/ocfs2/namei.c | 38 ++++---- fs/ocfs2/ocfs2.h | 4 + fs/ocfs2/resize.c | 16 ++-- fs/ocfs2/suballoc.c | 58 +++++++------ 13 files changed, 257 insertions(+), 194 deletions(-) diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c index c22ff49..3c8d866 100644 --- a/fs/ocfs2/alloc.c +++ b/fs/ocfs2/alloc.c @@ -298,11 +298,13 @@ static struct ocfs2_extent_tree_operations ocfs2_xattr_tree_et_ops = { static void __ocfs2_init_extent_tree(struct ocfs2_extent_tree *et, struct inode *inode, struct buffer_head *bh, + ocfs2_journal_access_func access, void *obj, struct ocfs2_extent_tree_operations *ops) { et->et_ops = ops; et->et_root_bh = bh; + et->et_root_journal_access = access; if (!obj) obj = (void *)bh->b_data; et->et_object = obj; @@ -318,15 +320,16 @@ void ocfs2_init_dinode_extent_tree(struct ocfs2_extent_tree *et, struct inode *inode, struct buffer_head *bh) { - __ocfs2_init_extent_tree(et, inode, bh, NULL, &ocfs2_dinode_et_ops); + __ocfs2_init_extent_tree(et, inode, bh, ocfs2_journal_access_di, + NULL, &ocfs2_dinode_et_ops); } void ocfs2_init_xattr_tree_extent_tree(struct ocfs2_extent_tree *et, struct inode *inode, struct buffer_head *bh) { - __ocfs2_init_extent_tree(et, inode, bh, NULL, - &ocfs2_xattr_tree_et_ops); + __ocfs2_init_extent_tree(et, inode, bh, ocfs2_journal_access_xb, + NULL, &ocfs2_xattr_tree_et_ops); } void ocfs2_init_xattr_value_extent_tree(struct ocfs2_extent_tree *et, @@ -334,7 +337,7 @@ void ocfs2_init_xattr_value_extent_tree(struct ocfs2_extent_tree *et, struct buffer_head *bh, struct ocfs2_xattr_value_root *xv) { - __ocfs2_init_extent_tree(et, inode, bh, xv, + __ocfs2_init_extent_tree(et, inode, bh, ocfs2_journal_access, xv, &ocfs2_xattr_value_et_ops); } @@ -356,6 +359,15 @@ static inline void ocfs2_et_update_clusters(struct inode *inode, et->et_ops->eo_update_clusters(inode, et, clusters); } +static inline int ocfs2_et_root_journal_access(handle_t *handle, + struct inode *inode, + struct ocfs2_extent_tree *et, + int type) +{ + return et->et_root_journal_access(handle, inode, et->et_root_bh, + type); +} + static inline int ocfs2_et_insert_check(struct inode *inode, struct ocfs2_extent_tree *et, struct ocfs2_extent_rec *rec) @@ -396,12 +408,14 @@ struct ocfs2_path_item { #define OCFS2_MAX_PATH_DEPTH 5 struct ocfs2_path { - int p_tree_depth; - struct ocfs2_path_item p_node[OCFS2_MAX_PATH_DEPTH]; + int p_tree_depth; + ocfs2_journal_access_func p_root_access; + struct ocfs2_path_item p_node[OCFS2_MAX_PATH_DEPTH]; }; #define path_root_bh(_path) ((_path)->p_node[0].bh) #define path_root_el(_path) ((_path)->p_node[0].el) +#define path_root_access(_path)((_path)->p_root_access) #define path_leaf_bh(_path) ((_path)->p_node[(_path)->p_tree_depth].bh) #define path_leaf_el(_path) ((_path)->p_node[(_path)->p_tree_depth].el) #define path_num_items(_path) ((_path)->p_tree_depth + 1) @@ -434,6 +448,8 @@ static void ocfs2_reinit_path(struct ocfs2_path *path, int keep_root) */ if (keep_root) depth = le16_to_cpu(path_root_el(path)->l_tree_depth); + else + path_root_access(path) = NULL; path->p_tree_depth = depth; } @@ -459,6 +475,7 @@ static void ocfs2_cp_path(struct ocfs2_path *dest, struct ocfs2_path *src) BUG_ON(path_root_bh(dest) != path_root_bh(src)); BUG_ON(path_root_el(dest) != path_root_el(src)); + BUG_ON(path_root_access(dest) != path_root_access(src)); ocfs2_reinit_path(dest, 1); @@ -480,6 +497,7 @@ static void ocfs2_mv_path(struct ocfs2_path *dest, struct ocfs2_path *src) int i; BUG_ON(path_root_bh(dest) != path_root_bh(src)); + BUG_ON(path_root_access(dest) != path_root_access(src)); for(i = 1; i < OCFS2_MAX_PATH_DEPTH; i++) { brelse(dest->p_node[i].bh); @@ -515,7 +533,8 @@ static inline void ocfs2_path_insert_eb(struct ocfs2_path *path, int index, } static struct ocfs2_path *ocfs2_new_path(struct buffer_head *root_bh, - struct ocfs2_extent_list *root_el) + struct ocfs2_extent_list *root_el, + ocfs2_journal_access_func access) { struct ocfs2_path *path; @@ -527,6 +546,7 @@ static struct ocfs2_path *ocfs2_new_path(struct buffer_head *root_bh, get_bh(root_bh); path_root_bh(path) = root_bh; path_root_el(path) = root_el; + path_root_access(path) = access; } return path; @@ -534,12 +554,38 @@ static struct ocfs2_path *ocfs2_new_path(struct buffer_head *root_bh, static struct ocfs2_path *ocfs2_new_path_from_path(struct ocfs2_path *path) { - return ocfs2_new_path(path_root_bh(path), path_root_el(path)); + return ocfs2_new_path(path_root_bh(path), path_root_el(path), + path_root_access(path)); } static struct ocfs2_path *ocfs2_new_path_from_et(struct ocfs2_extent_tree *et) { - return ocfs2_new_path(et->et_root_bh, et->et_root_el); + return ocfs2_new_path(et->et_root_bh, et->et_root_el, + et->et_root_journal_access); +} + +/* + * Journal the buffer at depth idx. All idx>0 are extent_blocks, + * otherwise it's the root_access function. + * + * I don't like the way this function's name looks next to + * ocfs2_journal_access_path(), but I don't have a better one. + */ +static int ocfs2_path_bh_journal_access(handle_t *handle, + struct inode *inode, + struct ocfs2_path *path, + int idx) +{ + ocfs2_journal_access_func access = path_root_access(path); + + if (!access) + access = ocfs2_journal_access; + + if (idx) + access = ocfs2_journal_access_eb; + + return access(handle, inode, path->p_node[idx].bh, + OCFS2_JOURNAL_ACCESS_WRITE); } /* @@ -554,8 +600,7 @@ static int ocfs2_journal_access_path(struct inode *inode, handle_t *handle, goto out; for(i = 0; i < path_num_items(path); i++) { - ret = ocfs2_journal_access(handle, inode, path->p_node[i].bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_path_bh_journal_access(handle, inode, path, i); if (ret < 0) { mlog_errno(ret); goto out; @@ -708,8 +753,11 @@ static int ocfs2_validate_extent_block(struct super_block *sb, * local to this block. */ rc = ocfs2_validate_meta_ecc(sb, bh->b_data, &eb->h_check); - if (rc) + if (rc) { + mlog(ML_ERROR, "Checksum failed for extent block %llu\n", + (unsigned long long)bh->b_blocknr); return rc; + } /* * Errors after here are fatal. @@ -842,8 +890,8 @@ static int ocfs2_create_new_meta_bhs(struct ocfs2_super *osb, } ocfs2_set_new_buffer_uptodate(inode, bhs[i]); - status = ocfs2_journal_access(handle, inode, bhs[i], - OCFS2_JOURNAL_ACCESS_CREATE); + status = ocfs2_journal_access_eb(handle, inode, bhs[i], + OCFS2_JOURNAL_ACCESS_CREATE); if (status < 0) { mlog_errno(status); goto bail; @@ -986,8 +1034,8 @@ static int ocfs2_add_branch(struct ocfs2_super *osb, BUG_ON(!OCFS2_IS_VALID_EXTENT_BLOCK(eb)); eb_el = &eb->h_list; - status = ocfs2_journal_access(handle, inode, bh, - OCFS2_JOURNAL_ACCESS_CREATE); + status = ocfs2_journal_access_eb(handle, inode, bh, + OCFS2_JOURNAL_ACCESS_CREATE); if (status < 0) { mlog_errno(status); goto bail; @@ -1026,21 +1074,21 @@ static int ocfs2_add_branch(struct ocfs2_super *osb, * journal_dirty erroring as it won't unless we've aborted the * handle (in which case we would never be here) so reserving * the write with journal_access is all we need to do. */ - status = ocfs2_journal_access(handle, inode, *last_eb_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_eb(handle, inode, *last_eb_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto bail; } - status = ocfs2_journal_access(handle, inode, et->et_root_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_et_root_journal_access(handle, inode, et, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto bail; } if (eb_bh) { - status = ocfs2_journal_access(handle, inode, eb_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_eb(handle, inode, eb_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto bail; @@ -1129,8 +1177,8 @@ static int ocfs2_shift_tree_depth(struct ocfs2_super *osb, eb_el = &eb->h_list; root_el = et->et_root_el; - status = ocfs2_journal_access(handle, inode, new_eb_bh, - OCFS2_JOURNAL_ACCESS_CREATE); + status = ocfs2_journal_access_eb(handle, inode, new_eb_bh, + OCFS2_JOURNAL_ACCESS_CREATE); if (status < 0) { mlog_errno(status); goto bail; @@ -1148,8 +1196,8 @@ static int ocfs2_shift_tree_depth(struct ocfs2_super *osb, goto bail; } - status = ocfs2_journal_access(handle, inode, et->et_root_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_et_root_journal_access(handle, inode, et, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto bail; @@ -1918,25 +1966,23 @@ static int ocfs2_rotate_subtree_right(struct inode *inode, root_bh = left_path->p_node[subtree_index].bh; BUG_ON(root_bh != right_path->p_node[subtree_index].bh); - ret = ocfs2_journal_access(handle, inode, root_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_path_bh_journal_access(handle, inode, right_path, + subtree_index); if (ret) { mlog_errno(ret); goto out; } for(i = subtree_index + 1; i < path_num_items(right_path); i++) { - ret = ocfs2_journal_access(handle, inode, - right_path->p_node[i].bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_path_bh_journal_access(handle, inode, + right_path, i); if (ret) { mlog_errno(ret); goto out; } - ret = ocfs2_journal_access(handle, inode, - left_path->p_node[i].bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_path_bh_journal_access(handle, inode, + left_path, i); if (ret) { mlog_errno(ret); goto out; @@ -2455,9 +2501,9 @@ static int ocfs2_rotate_subtree_left(struct inode *inode, handle_t *handle, return -EAGAIN; if (le16_to_cpu(right_leaf_el->l_next_free_rec) > 1) { - ret = ocfs2_journal_access(handle, inode, - path_leaf_bh(right_path), - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_journal_access_eb(handle, inode, + path_leaf_bh(right_path), + OCFS2_JOURNAL_ACCESS_WRITE); if (ret) { mlog_errno(ret); goto out; @@ -2474,8 +2520,8 @@ static int ocfs2_rotate_subtree_left(struct inode *inode, handle_t *handle, * We have to update i_last_eb_blk during the meta * data delete. */ - ret = ocfs2_journal_access(handle, inode, et_root_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_et_root_journal_access(handle, inode, et, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret) { mlog_errno(ret); goto out; @@ -2490,25 +2536,23 @@ static int ocfs2_rotate_subtree_left(struct inode *inode, handle_t *handle, */ BUG_ON(right_has_empty && !del_right_subtree); - ret = ocfs2_journal_access(handle, inode, root_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_path_bh_journal_access(handle, inode, right_path, + subtree_index); if (ret) { mlog_errno(ret); goto out; } for(i = subtree_index + 1; i < path_num_items(right_path); i++) { - ret = ocfs2_journal_access(handle, inode, - right_path->p_node[i].bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_path_bh_journal_access(handle, inode, + right_path, i); if (ret) { mlog_errno(ret); goto out; } - ret = ocfs2_journal_access(handle, inode, - left_path->p_node[i].bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_path_bh_journal_access(handle, inode, + left_path, i); if (ret) { mlog_errno(ret); goto out; @@ -2653,16 +2697,17 @@ out: static int ocfs2_rotate_rightmost_leaf_left(struct inode *inode, handle_t *handle, - struct buffer_head *bh, - struct ocfs2_extent_list *el) + struct ocfs2_path *path) { int ret; + struct buffer_head *bh = path_leaf_bh(path); + struct ocfs2_extent_list *el = path_leaf_el(path); if (!ocfs2_is_empty_extent(&el->l_recs[0])) return 0; - ret = ocfs2_journal_access(handle, inode, bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_path_bh_journal_access(handle, inode, path, + path_num_items(path) - 1); if (ret) { mlog_errno(ret); goto out; @@ -2744,9 +2789,8 @@ static int __ocfs2_rotate_tree_left(struct inode *inode, * Caller might still want to make changes to the * tree root, so re-add it to the journal here. */ - ret = ocfs2_journal_access(handle, inode, - path_root_bh(left_path), - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_path_bh_journal_access(handle, inode, + left_path, 0); if (ret) { mlog_errno(ret); goto out; @@ -2929,8 +2973,7 @@ rightmost_no_delete: * it up front. */ ret = ocfs2_rotate_rightmost_leaf_left(inode, handle, - path_leaf_bh(path), - path_leaf_el(path)); + path); if (ret) mlog_errno(ret); goto out; @@ -3164,8 +3207,8 @@ static int ocfs2_merge_rec_right(struct inode *inode, root_bh = left_path->p_node[subtree_index].bh; BUG_ON(root_bh != right_path->p_node[subtree_index].bh); - ret = ocfs2_journal_access(handle, inode, root_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_path_bh_journal_access(handle, inode, right_path, + subtree_index); if (ret) { mlog_errno(ret); goto out; @@ -3173,17 +3216,15 @@ static int ocfs2_merge_rec_right(struct inode *inode, for (i = subtree_index + 1; i < path_num_items(right_path); i++) { - ret = ocfs2_journal_access(handle, inode, - right_path->p_node[i].bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_path_bh_journal_access(handle, inode, + right_path, i); if (ret) { mlog_errno(ret); goto out; } - ret = ocfs2_journal_access(handle, inode, - left_path->p_node[i].bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_path_bh_journal_access(handle, inode, + left_path, i); if (ret) { mlog_errno(ret); goto out; @@ -3195,8 +3236,8 @@ static int ocfs2_merge_rec_right(struct inode *inode, right_rec = &el->l_recs[index + 1]; } - ret = ocfs2_journal_access(handle, inode, bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_path_bh_journal_access(handle, inode, left_path, + path_num_items(left_path) - 1); if (ret) { mlog_errno(ret); goto out; @@ -3335,8 +3376,8 @@ static int ocfs2_merge_rec_left(struct inode *inode, root_bh = left_path->p_node[subtree_index].bh; BUG_ON(root_bh != right_path->p_node[subtree_index].bh); - ret = ocfs2_journal_access(handle, inode, root_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_path_bh_journal_access(handle, inode, right_path, + subtree_index); if (ret) { mlog_errno(ret); goto out; @@ -3344,17 +3385,15 @@ static int ocfs2_merge_rec_left(struct inode *inode, for (i = subtree_index + 1; i < path_num_items(right_path); i++) { - ret = ocfs2_journal_access(handle, inode, - right_path->p_node[i].bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_path_bh_journal_access(handle, inode, + right_path, i); if (ret) { mlog_errno(ret); goto out; } - ret = ocfs2_journal_access(handle, inode, - left_path->p_node[i].bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_path_bh_journal_access(handle, inode, + left_path, i); if (ret) { mlog_errno(ret); goto out; @@ -3366,8 +3405,8 @@ static int ocfs2_merge_rec_left(struct inode *inode, has_empty_extent = 1; } - ret = ocfs2_journal_access(handle, inode, bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_path_bh_journal_access(handle, inode, left_path, + path_num_items(left_path) - 1); if (ret) { mlog_errno(ret); goto out; @@ -4009,8 +4048,8 @@ static int ocfs2_do_insert_extent(struct inode *inode, el = et->et_root_el; - ret = ocfs2_journal_access(handle, inode, et->et_root_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_et_root_journal_access(handle, inode, et, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret) { mlog_errno(ret); goto out; @@ -4071,8 +4110,8 @@ static int ocfs2_do_insert_extent(struct inode *inode, * ocfs2_rotate_tree_right() might have extended the * transaction without re-journaling our tree root. */ - ret = ocfs2_journal_access(handle, inode, et->et_root_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_et_root_journal_access(handle, inode, et, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret) { mlog_errno(ret); goto out; @@ -4594,8 +4633,8 @@ int ocfs2_add_clusters_in_btree(struct ocfs2_super *osb, BUG_ON(num_bits > clusters_to_add); /* reserve our write early -- insert_extent may update the inode */ - status = ocfs2_journal_access(handle, inode, et->et_root_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_et_root_journal_access(handle, inode, et, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto leave; @@ -5347,8 +5386,8 @@ int ocfs2_remove_btree_range(struct inode *inode, goto out; } - ret = ocfs2_journal_access(handle, inode, et->et_root_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_et_root_journal_access(handle, inode, et, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret) { mlog_errno(ret); goto out; @@ -5461,8 +5500,8 @@ int ocfs2_truncate_log_append(struct ocfs2_super *osb, goto bail; } - status = ocfs2_journal_access(handle, tl_inode, tl_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_di(handle, tl_inode, tl_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto bail; @@ -5523,8 +5562,8 @@ static int ocfs2_replay_truncate_records(struct ocfs2_super *osb, while (i >= 0) { /* Caller has given us at least enough credits to * update the truncate log dinode */ - status = ocfs2_journal_access(handle, tl_inode, tl_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_di(handle, tl_inode, tl_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto bail; @@ -6546,8 +6585,8 @@ static int ocfs2_do_truncate(struct ocfs2_super *osb, } if (last_eb_bh) { - status = ocfs2_journal_access(handle, inode, last_eb_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_eb(handle, inode, last_eb_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto bail; @@ -6908,8 +6947,8 @@ int ocfs2_convert_inline_data_to_extents(struct inode *inode, goto out_unlock; } - ret = ocfs2_journal_access(handle, inode, di_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_journal_access_di(handle, inode, di_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret) { mlog_errno(ret); goto out_commit; @@ -7043,7 +7082,8 @@ int ocfs2_commit_truncate(struct ocfs2_super *osb, new_highest_cpos = ocfs2_clusters_for_bytes(osb->sb, i_size_read(inode)); - path = ocfs2_new_path(fe_bh, &di->id2.i_list); + path = ocfs2_new_path(fe_bh, &di->id2.i_list, + ocfs2_journal_access_di); if (!path) { status = -ENOMEM; mlog_errno(status); @@ -7276,8 +7316,8 @@ int ocfs2_truncate_inline(struct inode *inode, struct buffer_head *di_bh, goto out; } - ret = ocfs2_journal_access(handle, inode, di_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_journal_access_di(handle, inode, di_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret) { mlog_errno(ret); goto out_commit; diff --git a/fs/ocfs2/alloc.h b/fs/ocfs2/alloc.h index 59d37d1..4b6fea2 100644 --- a/fs/ocfs2/alloc.h +++ b/fs/ocfs2/alloc.h @@ -45,7 +45,9 @@ * * ocfs2_extent_tree contains info for the root of the b-tree, it must have a * root ocfs2_extent_list and a root_bh so that they can be used in the b-tree - * functions. + * functions. With metadata ecc, we now call different journal_access + * functions for each type of metadata, so it must have the + * root_journal_access function. * ocfs2_extent_tree_operations abstract the normal operations we do for * the root of extent b-tree. */ @@ -54,6 +56,7 @@ struct ocfs2_extent_tree { struct ocfs2_extent_tree_operations *et_ops; struct buffer_head *et_root_bh; struct ocfs2_extent_list *et_root_el; + ocfs2_journal_access_func et_root_journal_access; void *et_object; unsigned int et_max_leaf_clusters; }; diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c index 6b647ec..a067a6c 100644 --- a/fs/ocfs2/aops.c +++ b/fs/ocfs2/aops.c @@ -1512,8 +1512,8 @@ static int ocfs2_write_begin_inline(struct address_space *mapping, goto out; } - ret = ocfs2_journal_access(handle, inode, wc->w_di_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_journal_access_di(handle, inode, wc->w_di_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret) { ocfs2_commit_trans(osb, handle); @@ -1740,8 +1740,8 @@ int ocfs2_write_begin_nolock(struct address_space *mapping, * We don't want this to fail in ocfs2_write_end(), so do it * here. */ - ret = ocfs2_journal_access(handle, inode, wc->w_di_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_journal_access_di(handle, inode, wc->w_di_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret) { mlog_errno(ret); goto out_quota; diff --git a/fs/ocfs2/dir.c b/fs/ocfs2/dir.c index 3708fe4..1ab2a3c 100644 --- a/fs/ocfs2/dir.c +++ b/fs/ocfs2/dir.c @@ -384,8 +384,8 @@ int ocfs2_update_entry(struct inode *dir, handle_t *handle, * based directories, so no need to split this up. */ - ret = ocfs2_journal_access(handle, dir, de_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_journal_access_db(handle, dir, de_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret) { mlog_errno(ret); goto out; @@ -420,8 +420,8 @@ static int __ocfs2_delete_entry(handle_t *handle, struct inode *dir, goto bail; } if (de == de_del) { - status = ocfs2_journal_access(handle, dir, bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_db(handle, dir, bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { status = -EIO; mlog_errno(status); @@ -581,8 +581,14 @@ int __ocfs2_add_entry(handle_t *handle, goto bail; } - status = ocfs2_journal_access(handle, dir, insert_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + if (insert_bh == parent_fe_bh) + status = ocfs2_journal_access_di(handle, dir, + insert_bh, + OCFS2_JOURNAL_ACCESS_WRITE); + else + status = ocfs2_journal_access_db(handle, dir, + insert_bh, + OCFS2_JOURNAL_ACCESS_WRITE); /* By now the buffer is marked for journaling */ offset += le16_to_cpu(de->rec_len); if (le64_to_cpu(de->inode)) { @@ -1081,8 +1087,8 @@ static int ocfs2_fill_new_dir_id(struct ocfs2_super *osb, struct ocfs2_inline_data *data = &di->id2.i_data; unsigned int size = le16_to_cpu(data->id_count); - ret = ocfs2_journal_access(handle, inode, di_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_journal_access_di(handle, inode, di_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret) { mlog_errno(ret); goto out; @@ -1129,8 +1135,8 @@ static int ocfs2_fill_new_dir_el(struct ocfs2_super *osb, ocfs2_set_new_buffer_uptodate(inode, new_bh); - status = ocfs2_journal_access(handle, inode, new_bh, - OCFS2_JOURNAL_ACCESS_CREATE); + status = ocfs2_journal_access_db(handle, inode, new_bh, + OCFS2_JOURNAL_ACCESS_CREATE); if (status < 0) { mlog_errno(status); goto bail; @@ -1292,8 +1298,8 @@ static int ocfs2_expand_inline_dir(struct inode *dir, struct buffer_head *di_bh, ocfs2_set_new_buffer_uptodate(dir, dirdata_bh); - ret = ocfs2_journal_access(handle, dir, dirdata_bh, - OCFS2_JOURNAL_ACCESS_CREATE); + ret = ocfs2_journal_access_db(handle, dir, dirdata_bh, + OCFS2_JOURNAL_ACCESS_CREATE); if (ret) { mlog_errno(ret); goto out_commit; @@ -1319,8 +1325,8 @@ static int ocfs2_expand_inline_dir(struct inode *dir, struct buffer_head *di_bh, * We let the later dirent insert modify c/mtime - to the user * the data hasn't changed. */ - ret = ocfs2_journal_access(handle, dir, di_bh, - OCFS2_JOURNAL_ACCESS_CREATE); + ret = ocfs2_journal_access_di(handle, dir, di_bh, + OCFS2_JOURNAL_ACCESS_CREATE); if (ret) { mlog_errno(ret); goto out_commit; @@ -1583,8 +1589,8 @@ do_extend: ocfs2_set_new_buffer_uptodate(dir, new_bh); - status = ocfs2_journal_access(handle, dir, new_bh, - OCFS2_JOURNAL_ACCESS_CREATE); + status = ocfs2_journal_access_db(handle, dir, new_bh, + OCFS2_JOURNAL_ACCESS_CREATE); if (status < 0) { mlog_errno(status); goto bail; diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c index 9374d37..e8f795f 100644 --- a/fs/ocfs2/file.c +++ b/fs/ocfs2/file.c @@ -256,8 +256,8 @@ int ocfs2_update_inode_atime(struct inode *inode, goto out; } - ret = ocfs2_journal_access(handle, inode, bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_journal_access_di(handle, inode, bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret) { mlog_errno(ret); goto out_commit; @@ -353,8 +353,8 @@ static int ocfs2_orphan_for_truncate(struct ocfs2_super *osb, goto out; } - status = ocfs2_journal_access(handle, inode, fe_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_di(handle, inode, fe_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto out_commit; @@ -590,8 +590,8 @@ restarted_transaction: /* reserve a write to the file entry early on - that we if we * run out of credits in the allocation path, we can still * update i_size. */ - status = ocfs2_journal_access(handle, inode, bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_di(handle, inode, bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto leave; @@ -1121,8 +1121,8 @@ static int __ocfs2_write_remove_suid(struct inode *inode, goto out; } - ret = ocfs2_journal_access(handle, inode, bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_journal_access_di(handle, inode, bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret < 0) { mlog_errno(ret); goto out_trans; diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c index 9370b65..229e707 100644 --- a/fs/ocfs2/inode.c +++ b/fs/ocfs2/inode.c @@ -537,8 +537,8 @@ static int ocfs2_truncate_for_delete(struct ocfs2_super *osb, goto out; } - status = ocfs2_journal_access(handle, inode, fe_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_di(handle, inode, fe_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto out; @@ -621,8 +621,8 @@ static int ocfs2_remove_inode(struct inode *inode, } /* set the inodes dtime */ - status = ocfs2_journal_access(handle, inode, di_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_di(handle, inode, di_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto bail_commit; @@ -1190,8 +1190,8 @@ int ocfs2_mark_inode_dirty(handle_t *handle, mlog_entry("(inode %llu)\n", (unsigned long long)OCFS2_I(inode)->ip_blkno); - status = ocfs2_journal_access(handle, inode, bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_di(handle, inode, bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto leave; @@ -1277,8 +1277,11 @@ int ocfs2_validate_inode_block(struct super_block *sb, * local to this block. */ rc = ocfs2_validate_meta_ecc(sb, bh->b_data, &di->i_check); - if (rc) + if (rc) { + mlog(ML_ERROR, "Checksum failed for dinode %llu\n", + (unsigned long long)bh->b_blocknr); goto bail; + } /* * Errors after here are fatal. diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c index 2daa584..3b54dba 100644 --- a/fs/ocfs2/journal.c +++ b/fs/ocfs2/journal.c @@ -752,6 +752,7 @@ static int ocfs2_journal_toggle_dirty(struct ocfs2_super *osb, if (replayed) ocfs2_bump_recovery_generation(fe); + ocfs2_compute_meta_ecc(osb->sb, bh->b_data, &fe->i_check); status = ocfs2_write_block(osb, bh, journal->j_inode); if (status < 0) mlog_errno(status); @@ -1486,6 +1487,7 @@ static int ocfs2_replay_journal(struct ocfs2_super *osb, osb->slot_recovery_generations[slot_num] ocfs2_get_recovery_generation(fe); + ocfs2_compute_meta_ecc(osb->sb, bh->b_data, &fe->i_check); status = ocfs2_write_block(osb, bh, inode); if (status < 0) mlog_errno(status); diff --git a/fs/ocfs2/journal.h b/fs/ocfs2/journal.h index bca370d..3c3532e 100644 --- a/fs/ocfs2/journal.h +++ b/fs/ocfs2/journal.h @@ -247,9 +247,10 @@ int ocfs2_extend_trans(handle_t *handle, int nblocks); #define OCFS2_JOURNAL_ACCESS_WRITE 1 #define OCFS2_JOURNAL_ACCESS_UNDO 2 + /* ocfs2_inode */ int ocfs2_journal_access_di(handle_t *handle, struct inode *inode, - struct buffer_head *bh, int type); + struct buffer_head *bh, int type); /* ocfs2_extent_block */ int ocfs2_journal_access_eb(handle_t *handle, struct inode *inode, struct buffer_head *bh, int type); diff --git a/fs/ocfs2/localalloc.c b/fs/ocfs2/localalloc.c index 19cfb1b..1368e75 100644 --- a/fs/ocfs2/localalloc.c +++ b/fs/ocfs2/localalloc.c @@ -382,8 +382,8 @@ void ocfs2_shutdown_local_alloc(struct ocfs2_super *osb) } memcpy(alloc_copy, alloc, bh->b_size); - status = ocfs2_journal_access(handle, local_alloc_inode, bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_di(handle, local_alloc_inode, bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto out_commit; @@ -762,9 +762,9 @@ int ocfs2_claim_local_alloc_bits(struct ocfs2_super *osb, * delete bits from it! */ *num_bits = bits_wanted; - status = ocfs2_journal_access(handle, local_alloc_inode, - osb->local_alloc_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_di(handle, local_alloc_inode, + osb->local_alloc_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto bail; @@ -1240,9 +1240,9 @@ static int ocfs2_local_alloc_slide_window(struct ocfs2_super *osb, } memcpy(alloc_copy, alloc, osb->local_alloc_bh->b_size); - status = ocfs2_journal_access(handle, local_alloc_inode, - osb->local_alloc_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_di(handle, local_alloc_inode, + osb->local_alloc_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto bail; diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c index 02c8026..7587d24 100644 --- a/fs/ocfs2/namei.c +++ b/fs/ocfs2/namei.c @@ -361,8 +361,8 @@ static int ocfs2_mknod(struct inode *dir, goto leave; } - status = ocfs2_journal_access(handle, dir, parent_fe_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_di(handle, dir, parent_fe_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto leave; @@ -493,8 +493,8 @@ static int ocfs2_mknod_locked(struct ocfs2_super *osb, } ocfs2_set_new_buffer_uptodate(inode, *new_fe_bh); - status = ocfs2_journal_access(handle, inode, *new_fe_bh, - OCFS2_JOURNAL_ACCESS_CREATE); + status = ocfs2_journal_access_di(handle, inode, *new_fe_bh, + OCFS2_JOURNAL_ACCESS_CREATE); if (status < 0) { mlog_errno(status); goto leave; @@ -664,8 +664,8 @@ static int ocfs2_link(struct dentry *old_dentry, goto out_unlock_inode; } - err = ocfs2_journal_access(handle, inode, fe_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + err = ocfs2_journal_access_di(handle, inode, fe_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (err < 0) { mlog_errno(err); goto out_commit; @@ -851,8 +851,8 @@ static int ocfs2_unlink(struct inode *dir, goto leave; } - status = ocfs2_journal_access(handle, inode, fe_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_di(handle, inode, fe_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto leave; @@ -1265,8 +1265,8 @@ static int ocfs2_rename(struct inode *old_dir, goto bail; } } - status = ocfs2_journal_access(handle, new_inode, newfe_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_di(handle, new_inode, newfe_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto bail; @@ -1312,8 +1312,8 @@ static int ocfs2_rename(struct inode *old_dir, old_inode->i_ctime = CURRENT_TIME; mark_inode_dirty(old_inode); - status = ocfs2_journal_access(handle, old_inode, old_inode_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_di(handle, old_inode, old_inode_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status >= 0) { old_di = (struct ocfs2_dinode *) old_inode_bh->b_data; @@ -1389,9 +1389,9 @@ static int ocfs2_rename(struct inode *old_dir, (int)old_dir_nlink, old_dir->i_nlink); } else { struct ocfs2_dinode *fe; - status = ocfs2_journal_access(handle, old_dir, - old_dir_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_di(handle, old_dir, + old_dir_bh, + OCFS2_JOURNAL_ACCESS_WRITE); fe = (struct ocfs2_dinode *) old_dir_bh->b_data; fe->i_links_count = cpu_to_le16(old_dir->i_nlink); status = ocfs2_journal_dirty(handle, old_dir_bh); @@ -1898,8 +1898,8 @@ static int ocfs2_orphan_add(struct ocfs2_super *osb, goto leave; } - status = ocfs2_journal_access(handle, orphan_dir_inode, orphan_dir_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_di(handle, orphan_dir_inode, orphan_dir_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto leave; @@ -1986,8 +1986,8 @@ int ocfs2_orphan_del(struct ocfs2_super *osb, goto leave; } - status = ocfs2_journal_access(handle,orphan_dir_inode, orphan_dir_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_di(handle,orphan_dir_inode, orphan_dir_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto leave; diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h index 2bb389f..bad87d0 100644 --- a/fs/ocfs2/ocfs2.h +++ b/fs/ocfs2/ocfs2.h @@ -339,6 +339,10 @@ struct ocfs2_super #define OCFS2_SB(sb) ((struct ocfs2_super *)(sb)->s_fs_info) +/* Useful typedef for passing around journal access functions */ +typedef int (*ocfs2_journal_access_func)(handle_t *handle, struct inode *inode, + struct buffer_head *bh, int type); + static inline int ocfs2_should_order_data(struct inode *inode) { if (!S_ISREG(inode->i_mode)) diff --git a/fs/ocfs2/resize.c b/fs/ocfs2/resize.c index 867de3e..424adaa 100644 --- a/fs/ocfs2/resize.c +++ b/fs/ocfs2/resize.c @@ -106,8 +106,8 @@ static int ocfs2_update_last_group_and_inode(handle_t *handle, mlog_entry("(new_clusters=%d, first_new_cluster = %u)\n", new_clusters, first_new_cluster); - ret = ocfs2_journal_access(handle, bm_inode, group_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_journal_access_gd(handle, bm_inode, group_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret < 0) { mlog_errno(ret); goto out; @@ -141,8 +141,8 @@ static int ocfs2_update_last_group_and_inode(handle_t *handle, } /* update the inode accordingly. */ - ret = ocfs2_journal_access(handle, bm_inode, bm_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_journal_access_di(handle, bm_inode, bm_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret < 0) { mlog_errno(ret); goto out_rollback; @@ -536,8 +536,8 @@ int ocfs2_group_add(struct inode *inode, struct ocfs2_new_group_input *input) cl = &fe->id2.i_chain; cr = &cl->cl_recs[input->chain]; - ret = ocfs2_journal_access(handle, main_bm_inode, group_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_journal_access_gd(handle, main_bm_inode, group_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret < 0) { mlog_errno(ret); goto out_commit; @@ -552,8 +552,8 @@ int ocfs2_group_add(struct inode *inode, struct ocfs2_new_group_input *input) goto out_commit; } - ret = ocfs2_journal_access(handle, main_bm_inode, main_bm_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_journal_access_di(handle, main_bm_inode, main_bm_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret < 0) { mlog_errno(ret); goto out_commit; diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c index 7875576..a696286 100644 --- a/fs/ocfs2/suballoc.c +++ b/fs/ocfs2/suballoc.c @@ -261,7 +261,11 @@ int ocfs2_check_group_descriptor(struct super_block *sb, * local to this block. */ rc = ocfs2_validate_meta_ecc(sb, bh->b_data, &gd->bg_check); - if (!rc) + if (rc) { + mlog(ML_ERROR, + "Checksum failed for group descriptor %llu\n", + (unsigned long long)bh->b_blocknr); + } else rc = ocfs2_validate_gd_self(sb, bh, 1); if (!rc) rc = ocfs2_validate_gd_parent(sb, di, bh, 1); @@ -343,10 +347,10 @@ static int ocfs2_block_group_fill(handle_t *handle, goto bail; } - status = ocfs2_journal_access(handle, - alloc_inode, - bg_bh, - OCFS2_JOURNAL_ACCESS_CREATE); + status = ocfs2_journal_access_gd(handle, + alloc_inode, + bg_bh, + OCFS2_JOURNAL_ACCESS_CREATE); if (status < 0) { mlog_errno(status); goto bail; @@ -476,8 +480,8 @@ static int ocfs2_block_group_alloc(struct ocfs2_super *osb, bg = (struct ocfs2_group_desc *) bg_bh->b_data; - status = ocfs2_journal_access(handle, alloc_inode, - bh, OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_di(handle, alloc_inode, + bh, OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto bail; @@ -986,10 +990,10 @@ static inline int ocfs2_block_group_set_bits(handle_t *handle, if (ocfs2_is_cluster_bitmap(alloc_inode)) journal_type = OCFS2_JOURNAL_ACCESS_UNDO; - status = ocfs2_journal_access(handle, - alloc_inode, - group_bh, - journal_type); + status = ocfs2_journal_access_gd(handle, + alloc_inode, + group_bh, + journal_type); if (status < 0) { mlog_errno(status); goto bail; @@ -1060,8 +1064,8 @@ static int ocfs2_relink_block_group(handle_t *handle, bg_ptr = le64_to_cpu(bg->bg_next_group); prev_bg_ptr = le64_to_cpu(prev_bg->bg_next_group); - status = ocfs2_journal_access(handle, alloc_inode, prev_bg_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_gd(handle, alloc_inode, prev_bg_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto out_rollback; @@ -1075,8 +1079,8 @@ static int ocfs2_relink_block_group(handle_t *handle, goto out_rollback; } - status = ocfs2_journal_access(handle, alloc_inode, bg_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_gd(handle, alloc_inode, bg_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto out_rollback; @@ -1090,8 +1094,8 @@ static int ocfs2_relink_block_group(handle_t *handle, goto out_rollback; } - status = ocfs2_journal_access(handle, alloc_inode, fe_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_di(handle, alloc_inode, fe_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto out_rollback; @@ -1242,8 +1246,8 @@ static int ocfs2_alloc_dinode_update_counts(struct inode *inode, struct ocfs2_dinode *di = (struct ocfs2_dinode *) di_bh->b_data; struct ocfs2_chain_list *cl = (struct ocfs2_chain_list *) &di->id2.i_chain; - ret = ocfs2_journal_access(handle, inode, di_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_journal_access_di(handle, inode, di_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret < 0) { mlog_errno(ret); goto out; @@ -1414,10 +1418,10 @@ static int ocfs2_search_chain(struct ocfs2_alloc_context *ac, /* Ok, claim our bits now: set the info on dinode, chainlist * and then the group */ - status = ocfs2_journal_access(handle, - alloc_inode, - ac->ac_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_di(handle, + alloc_inode, + ac->ac_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto bail; @@ -1824,8 +1828,8 @@ static inline int ocfs2_block_group_clear_bits(handle_t *handle, if (ocfs2_is_cluster_bitmap(alloc_inode)) journal_type = OCFS2_JOURNAL_ACCESS_UNDO; - status = ocfs2_journal_access(handle, alloc_inode, group_bh, - journal_type); + status = ocfs2_journal_access_gd(handle, alloc_inode, group_bh, + journal_type); if (status < 0) { mlog_errno(status); goto bail; @@ -1900,8 +1904,8 @@ int ocfs2_free_suballoc_bits(handle_t *handle, goto bail; } - status = ocfs2_journal_access(handle, alloc_inode, alloc_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = ocfs2_journal_access_di(handle, alloc_inode, alloc_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto bail; -- 1.5.6.5
Joel Becker
2008-Dec-10 01:09 UTC
[Ocfs2-devel] [PATCH 09/18] ocfs2: Add ecc and checksums to ocfs2 xattr buckets.
The xattr bucket can span multiple blocks on disk. We have wrappers for this structure in the code. We use the new multi-block ecc calls to calculate and validate the bucket. Signed-off-by: Joel Becker <joel.becker at oracle.com> --- fs/ocfs2/xattr.c | 15 ++++++++++++++- 1 files changed, 14 insertions(+), 1 deletions(-) diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c index dcf26ad..1540e4d 100644 --- a/fs/ocfs2/xattr.c +++ b/fs/ocfs2/xattr.c @@ -273,6 +273,15 @@ static int ocfs2_read_xattr_bucket(struct ocfs2_xattr_bucket *bucket, rc = ocfs2_read_blocks(bucket->bu_inode, xb_blkno, bucket->bu_blocks, bucket->bu_bhs, 0, NULL); + if (!rc) { + rc = ocfs2_validate_meta_ecc_bhs(bucket->bu_inode->i_sb, + bucket->bu_bhs, + bucket->bu_blocks, + &bucket_xh(bucket)->xh_check); + if (rc) + mlog_errno(rc); + } + if (rc) ocfs2_xattr_bucket_relse(bucket); return rc; @@ -301,6 +310,10 @@ static void ocfs2_xattr_bucket_journal_dirty(handle_t *handle, { int i; + ocfs2_compute_meta_ecc_bhs(bucket->bu_inode->i_sb, + bucket->bu_bhs, bucket->bu_blocks, + &bucket_xh(bucket)->xh_check); + for (i = 0; i < bucket->bu_blocks; i++) ocfs2_journal_dirty(handle, bucket->bu_bhs[i]); } @@ -3080,7 +3093,7 @@ static int ocfs2_iterate_xattr_buckets(struct inode *inode, if (func) { ret = func(inode, bucket, para); if (ret) - mlog_errno(ret); + mlog_errno(ret); /* Fall through to bucket_relse() */ } -- 1.5.6.5
Joel Becker
2008-Dec-10 01:09 UTC
[Ocfs2-devel] [PATCH 10/18] ocfs2: Create ocfs2_xattr_value_buf.
When an ocfs2 extended attribute is large enough to require its own allocation tree, we root it with an ocfs2_xattr_value_root. However, these roots can be a part of inodes, xattr blocks, or xattr buckets. Thus, they need a different journal access function for each container. We wrap the bh, its journal access function, and the value root (xv) in a structure called ocfs2_xattr_valu_buf. This is a package that can be passed around. In this first pass, we simply pass it to the extent tree code. Signed-off-by: Joel Becker <joel.becker at oracle.com> --- fs/ocfs2/alloc.c | 25 +++++++++++-------------- fs/ocfs2/alloc.h | 4 ++-- fs/ocfs2/xattr.c | 34 ++++++++++++++++++++++------------ fs/ocfs2/xattr.h | 14 ++++++++++++++ 4 files changed, 49 insertions(+), 28 deletions(-) diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c index 3c8d866..f5087c7 100644 --- a/fs/ocfs2/alloc.c +++ b/fs/ocfs2/alloc.c @@ -48,6 +48,7 @@ #include "file.h" #include "super.h" #include "uptodate.h" +#include "xattr.h" #include "buffer_head_io.h" @@ -207,36 +208,33 @@ static void ocfs2_dinode_fill_root_el(struct ocfs2_extent_tree *et) static void ocfs2_xattr_value_fill_root_el(struct ocfs2_extent_tree *et) { - struct ocfs2_xattr_value_root *xv = et->et_object; + struct ocfs2_xattr_value_buf *vb = et->et_object; - et->et_root_el = &xv->xr_list; + et->et_root_el = &vb->vb_xv->xr_list; } static void ocfs2_xattr_value_set_last_eb_blk(struct ocfs2_extent_tree *et, u64 blkno) { - struct ocfs2_xattr_value_root *xv - (struct ocfs2_xattr_value_root *)et->et_object; + struct ocfs2_xattr_value_buf *vb = et->et_object; - xv->xr_last_eb_blk = cpu_to_le64(blkno); + vb->vb_xv->xr_last_eb_blk = cpu_to_le64(blkno); } static u64 ocfs2_xattr_value_get_last_eb_blk(struct ocfs2_extent_tree *et) { - struct ocfs2_xattr_value_root *xv - (struct ocfs2_xattr_value_root *) et->et_object; + struct ocfs2_xattr_value_buf *vb = et->et_object; - return le64_to_cpu(xv->xr_last_eb_blk); + return le64_to_cpu(vb->vb_xv->xr_last_eb_blk); } static void ocfs2_xattr_value_update_clusters(struct inode *inode, struct ocfs2_extent_tree *et, u32 clusters) { - struct ocfs2_xattr_value_root *xv - (struct ocfs2_xattr_value_root *)et->et_object; + struct ocfs2_xattr_value_buf *vb = et->et_object; - le32_add_cpu(&xv->xr_clusters, clusters); + le32_add_cpu(&vb->vb_xv->xr_clusters, clusters); } static struct ocfs2_extent_tree_operations ocfs2_xattr_value_et_ops = { @@ -334,10 +332,9 @@ void ocfs2_init_xattr_tree_extent_tree(struct ocfs2_extent_tree *et, void ocfs2_init_xattr_value_extent_tree(struct ocfs2_extent_tree *et, struct inode *inode, - struct buffer_head *bh, - struct ocfs2_xattr_value_root *xv) + struct ocfs2_xattr_value_buf *vb) { - __ocfs2_init_extent_tree(et, inode, bh, ocfs2_journal_access, xv, + __ocfs2_init_extent_tree(et, inode, vb->vb_bh, vb->vb_access, vb, &ocfs2_xattr_value_et_ops); } diff --git a/fs/ocfs2/alloc.h b/fs/ocfs2/alloc.h index 4b6fea2..cceff5c 100644 --- a/fs/ocfs2/alloc.h +++ b/fs/ocfs2/alloc.h @@ -71,10 +71,10 @@ void ocfs2_init_dinode_extent_tree(struct ocfs2_extent_tree *et, void ocfs2_init_xattr_tree_extent_tree(struct ocfs2_extent_tree *et, struct inode *inode, struct buffer_head *bh); +struct ocfs2_xattr_value_buf; void ocfs2_init_xattr_value_extent_tree(struct ocfs2_extent_tree *et, struct inode *inode, - struct buffer_head *bh, - struct ocfs2_xattr_value_root *xv); + struct ocfs2_xattr_value_buf *vb); /* * Read an extent block into *bh. If *bh is NULL, a bh will be diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c index 1540e4d..df35b7d 100644 --- a/fs/ocfs2/xattr.c +++ b/fs/ocfs2/xattr.c @@ -581,21 +581,26 @@ static int ocfs2_xattr_extend_allocation(struct inode *inode, handle_t *handle = ctxt->handle; enum ocfs2_alloc_restarted why; struct ocfs2_super *osb = OCFS2_SB(inode->i_sb); - u32 prev_clusters, logical_start = le32_to_cpu(xv->xr_clusters); + struct ocfs2_xattr_value_buf vb = { + .vb_bh = xattr_bh, + .vb_xv = xv, + .vb_access = ocfs2_journal_access, + }; + u32 prev_clusters, logical_start = le32_to_cpu(vb.vb_xv->xr_clusters); struct ocfs2_extent_tree et; mlog(0, "(clusters_to_add for xattr= %u)\n", clusters_to_add); - ocfs2_init_xattr_value_extent_tree(&et, inode, xattr_bh, xv); + ocfs2_init_xattr_value_extent_tree(&et, inode, &vb); - status = ocfs2_journal_access(handle, inode, xattr_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + status = vb.vb_access(handle, inode, vb.vb_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto leave; } - prev_clusters = le32_to_cpu(xv->xr_clusters); + prev_clusters = le32_to_cpu(vb.vb_xv->xr_clusters); status = ocfs2_add_clusters_in_btree(osb, inode, &logical_start, @@ -611,13 +616,13 @@ static int ocfs2_xattr_extend_allocation(struct inode *inode, goto leave; } - status = ocfs2_journal_dirty(handle, xattr_bh); + status = ocfs2_journal_dirty(handle, vb.vb_bh); if (status < 0) { mlog_errno(status); goto leave; } - clusters_to_add -= le32_to_cpu(xv->xr_clusters) - prev_clusters; + clusters_to_add -= le32_to_cpu(vb.vb_xv->xr_clusters) - prev_clusters; /* * We should have already allocated enough space before the transaction, @@ -640,11 +645,16 @@ static int __ocfs2_remove_xattr_range(struct inode *inode, u64 phys_blkno = ocfs2_clusters_to_blocks(inode->i_sb, phys_cpos); handle_t *handle = ctxt->handle; struct ocfs2_extent_tree et; + struct ocfs2_xattr_value_buf vb = { + .vb_bh = root_bh, + .vb_xv = xv, + .vb_access = ocfs2_journal_access, + }; - ocfs2_init_xattr_value_extent_tree(&et, inode, root_bh, xv); + ocfs2_init_xattr_value_extent_tree(&et, inode, &vb); - ret = ocfs2_journal_access(handle, inode, root_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = vb.vb_access(handle, inode, vb.vb_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret) { mlog_errno(ret); goto out; @@ -657,9 +667,9 @@ static int __ocfs2_remove_xattr_range(struct inode *inode, goto out; } - le32_add_cpu(&xv->xr_clusters, -len); + le32_add_cpu(&vb.vb_xv->xr_clusters, -len); - ret = ocfs2_journal_dirty(handle, root_bh); + ret = ocfs2_journal_dirty(handle, vb.vb_bh); if (ret) { mlog_errno(ret); goto out; diff --git a/fs/ocfs2/xattr.h b/fs/ocfs2/xattr.h index 9a67e7d..5a1ebc7 100644 --- a/fs/ocfs2/xattr.h +++ b/fs/ocfs2/xattr.h @@ -70,4 +70,18 @@ int ocfs2_calc_xattr_init(struct inode *, struct buffer_head *, int, struct ocfs2_security_xattr_info *, int *, int *, struct ocfs2_alloc_context **); +/* + * xattrs can live inside an inode, as part of an external xattr block, + * or inside an xattr bucket, which is the leaf of a tree rooted in an + * xattr block. Some of the xattr calls, especially the value setting + * functions, want to treat each of these locations as equal. Let's wrap + * them in a structure that we can pass around instead of raw buffer_heads. + */ +struct ocfs2_xattr_value_buf { + struct buffer_head *vb_bh; + ocfs2_journal_access_func vb_access; + struct ocfs2_xattr_value_root *vb_xv; +}; + + #endif /* OCFS2_XATTR_H */ -- 1.5.6.5
Joel Becker
2008-Dec-10 01:09 UTC
[Ocfs2-devel] [PATCH 11/18] ocfs2: Pull ocfs2_xattr_value_buf up from __ocfs2_remove_xattr_range().
Place an ocfs2_xattr_value_buf in __ocfs2_xattr_shrink_size() and pass it down to __ocfs2_remove_xattr_range(). Signed-off-by: Joel Becker <joel.becker at oracle.com> --- fs/ocfs2/xattr.c | 28 ++++++++++++++-------------- 1 files changed, 14 insertions(+), 14 deletions(-) diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c index df35b7d..42e062c 100644 --- a/fs/ocfs2/xattr.c +++ b/fs/ocfs2/xattr.c @@ -636,8 +636,7 @@ leave: } static int __ocfs2_remove_xattr_range(struct inode *inode, - struct buffer_head *root_bh, - struct ocfs2_xattr_value_root *xv, + struct ocfs2_xattr_value_buf *vb, u32 cpos, u32 phys_cpos, u32 len, struct ocfs2_xattr_set_ctxt *ctxt) { @@ -645,16 +644,11 @@ static int __ocfs2_remove_xattr_range(struct inode *inode, u64 phys_blkno = ocfs2_clusters_to_blocks(inode->i_sb, phys_cpos); handle_t *handle = ctxt->handle; struct ocfs2_extent_tree et; - struct ocfs2_xattr_value_buf vb = { - .vb_bh = root_bh, - .vb_xv = xv, - .vb_access = ocfs2_journal_access, - }; - ocfs2_init_xattr_value_extent_tree(&et, inode, &vb); + ocfs2_init_xattr_value_extent_tree(&et, inode, vb); - ret = vb.vb_access(handle, inode, vb.vb_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = vb->vb_access(handle, inode, vb->vb_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret) { mlog_errno(ret); goto out; @@ -667,9 +661,9 @@ static int __ocfs2_remove_xattr_range(struct inode *inode, goto out; } - le32_add_cpu(&vb.vb_xv->xr_clusters, -len); + le32_add_cpu(&vb->vb_xv->xr_clusters, -len); - ret = ocfs2_journal_dirty(handle, vb.vb_bh); + ret = ocfs2_journal_dirty(handle, vb->vb_bh); if (ret) { mlog_errno(ret); goto out; @@ -693,6 +687,11 @@ static int ocfs2_xattr_shrink_size(struct inode *inode, int ret = 0; u32 trunc_len, cpos, phys_cpos, alloc_size; u64 block; + struct ocfs2_xattr_value_buf vb = { + .vb_bh = root_bh, + .vb_xv = xv, + .vb_access = ocfs2_journal_access, + }; if (old_clusters <= new_clusters) return 0; @@ -701,7 +700,8 @@ static int ocfs2_xattr_shrink_size(struct inode *inode, trunc_len = old_clusters - new_clusters; while (trunc_len) { ret = ocfs2_xattr_get_clusters(inode, cpos, &phys_cpos, - &alloc_size, &xv->xr_list); + &alloc_size, + &vb.vb_xv->xr_list); if (ret) { mlog_errno(ret); goto out; @@ -710,7 +710,7 @@ static int ocfs2_xattr_shrink_size(struct inode *inode, if (alloc_size > trunc_len) alloc_size = trunc_len; - ret = __ocfs2_remove_xattr_range(inode, root_bh, xv, cpos, + ret = __ocfs2_remove_xattr_range(inode, &vb, cpos, phys_cpos, alloc_size, ctxt); if (ret) { -- 1.5.6.5
Joel Becker
2008-Dec-10 01:09 UTC
[Ocfs2-devel] [PATCH 12/18] ocfs2: Pull ocfs2_xattr_value_buf up into ocfs2_xattr_value_truncate().
Place an ocfs2_xattr_value_buf in ocfs2_xattr_value_truncate() and pass it down to ocfs2_xattr_shrink_size(). We can also pass it into ocfs2_xattr_extend_allocation(), replacing its ocfs2_xattr_value_buf. Signed-off-by: Joel Becker <joel.becker at oracle.com> --- fs/ocfs2/xattr.c | 41 +++++++++++++++++------------------------ 1 files changed, 17 insertions(+), 24 deletions(-) diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c index 42e062c..a53f49d 100644 --- a/fs/ocfs2/xattr.c +++ b/fs/ocfs2/xattr.c @@ -573,34 +573,28 @@ int ocfs2_calc_xattr_init(struct inode *dir, static int ocfs2_xattr_extend_allocation(struct inode *inode, u32 clusters_to_add, - struct buffer_head *xattr_bh, - struct ocfs2_xattr_value_root *xv, + struct ocfs2_xattr_value_buf *vb, struct ocfs2_xattr_set_ctxt *ctxt) { int status = 0; handle_t *handle = ctxt->handle; enum ocfs2_alloc_restarted why; struct ocfs2_super *osb = OCFS2_SB(inode->i_sb); - struct ocfs2_xattr_value_buf vb = { - .vb_bh = xattr_bh, - .vb_xv = xv, - .vb_access = ocfs2_journal_access, - }; - u32 prev_clusters, logical_start = le32_to_cpu(vb.vb_xv->xr_clusters); + u32 prev_clusters, logical_start = le32_to_cpu(vb->vb_xv->xr_clusters); struct ocfs2_extent_tree et; mlog(0, "(clusters_to_add for xattr= %u)\n", clusters_to_add); - ocfs2_init_xattr_value_extent_tree(&et, inode, &vb); + ocfs2_init_xattr_value_extent_tree(&et, inode, vb); - status = vb.vb_access(handle, inode, vb.vb_bh, + status = vb->vb_access(handle, inode, vb->vb_bh, OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { mlog_errno(status); goto leave; } - prev_clusters = le32_to_cpu(vb.vb_xv->xr_clusters); + prev_clusters = le32_to_cpu(vb->vb_xv->xr_clusters); status = ocfs2_add_clusters_in_btree(osb, inode, &logical_start, @@ -616,13 +610,13 @@ static int ocfs2_xattr_extend_allocation(struct inode *inode, goto leave; } - status = ocfs2_journal_dirty(handle, vb.vb_bh); + status = ocfs2_journal_dirty(handle, vb->vb_bh); if (status < 0) { mlog_errno(status); goto leave; } - clusters_to_add -= le32_to_cpu(vb.vb_xv->xr_clusters) - prev_clusters; + clusters_to_add -= le32_to_cpu(vb->vb_xv->xr_clusters) - prev_clusters; /* * We should have already allocated enough space before the transaction, @@ -680,18 +674,12 @@ out: static int ocfs2_xattr_shrink_size(struct inode *inode, u32 old_clusters, u32 new_clusters, - struct buffer_head *root_bh, - struct ocfs2_xattr_value_root *xv, + struct ocfs2_xattr_value_buf *vb, struct ocfs2_xattr_set_ctxt *ctxt) { int ret = 0; u32 trunc_len, cpos, phys_cpos, alloc_size; u64 block; - struct ocfs2_xattr_value_buf vb = { - .vb_bh = root_bh, - .vb_xv = xv, - .vb_access = ocfs2_journal_access, - }; if (old_clusters <= new_clusters) return 0; @@ -701,7 +689,7 @@ static int ocfs2_xattr_shrink_size(struct inode *inode, while (trunc_len) { ret = ocfs2_xattr_get_clusters(inode, cpos, &phys_cpos, &alloc_size, - &vb.vb_xv->xr_list); + &vb->vb_xv->xr_list); if (ret) { mlog_errno(ret); goto out; @@ -710,7 +698,7 @@ static int ocfs2_xattr_shrink_size(struct inode *inode, if (alloc_size > trunc_len) alloc_size = trunc_len; - ret = __ocfs2_remove_xattr_range(inode, &vb, cpos, + ret = __ocfs2_remove_xattr_range(inode, vb, cpos, phys_cpos, alloc_size, ctxt); if (ret) { @@ -738,6 +726,11 @@ static int ocfs2_xattr_value_truncate(struct inode *inode, int ret; u32 new_clusters = ocfs2_clusters_for_bytes(inode->i_sb, len); u32 old_clusters = le32_to_cpu(xv->xr_clusters); + struct ocfs2_xattr_value_buf vb = { + .vb_bh = root_bh, + .vb_xv = xv, + .vb_access = ocfs2_journal_access, + }; if (new_clusters == old_clusters) return 0; @@ -745,11 +738,11 @@ static int ocfs2_xattr_value_truncate(struct inode *inode, if (new_clusters > old_clusters) ret = ocfs2_xattr_extend_allocation(inode, new_clusters - old_clusters, - root_bh, xv, ctxt); + &vb, ctxt); else ret = ocfs2_xattr_shrink_size(inode, old_clusters, new_clusters, - root_bh, xv, ctxt); + &vb, ctxt); return ret; } -- 1.5.6.5
Joel Becker
2008-Dec-10 01:09 UTC
[Ocfs2-devel] [PATCH 13/18] ocfs2: Pass ocfs2_xattr_value_buf into ocfs2_xattr_value_truncate().
The callers of ocfs2_xattr_value_truncate() now pass in ocfs2_xattr_value_bufs. These callers are the ones that calculated the xv location, so they are the right starting point. Signed-off-by: Joel Becker <joel.becker at oracle.com> --- fs/ocfs2/xattr.c | 66 +++++++++++++++++++++++++++-------------------------- 1 files changed, 34 insertions(+), 32 deletions(-) diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c index a53f49d..316bc37 100644 --- a/fs/ocfs2/xattr.c +++ b/fs/ocfs2/xattr.c @@ -718,19 +718,13 @@ out: } static int ocfs2_xattr_value_truncate(struct inode *inode, - struct buffer_head *root_bh, - struct ocfs2_xattr_value_root *xv, + struct ocfs2_xattr_value_buf *vb, int len, struct ocfs2_xattr_set_ctxt *ctxt) { int ret; u32 new_clusters = ocfs2_clusters_for_bytes(inode->i_sb, len); - u32 old_clusters = le32_to_cpu(xv->xr_clusters); - struct ocfs2_xattr_value_buf vb = { - .vb_bh = root_bh, - .vb_xv = xv, - .vb_access = ocfs2_journal_access, - }; + u32 old_clusters = le32_to_cpu(vb->vb_xv->xr_clusters); if (new_clusters == old_clusters) return 0; @@ -738,11 +732,11 @@ static int ocfs2_xattr_value_truncate(struct inode *inode, if (new_clusters > old_clusters) ret = ocfs2_xattr_extend_allocation(inode, new_clusters - old_clusters, - &vb, ctxt); + vb, ctxt); else ret = ocfs2_xattr_shrink_size(inode, old_clusters, new_clusters, - &vb, ctxt); + vb, ctxt); return ret; } @@ -1330,6 +1324,10 @@ static int ocfs2_xattr_set_value_outside(struct inode *inode, struct ocfs2_xattr_value_root *xv = NULL; size_t size = OCFS2_XATTR_SIZE(name_len) + OCFS2_XATTR_ROOT_SIZE; int ret = 0; + struct ocfs2_xattr_value_buf vb = { + .vb_bh = xs->xattr_bh, + .vb_access = ocfs2_journal_access + }; memset(val, 0, size); memcpy(val, xi->name, name_len); @@ -1340,9 +1338,9 @@ static int ocfs2_xattr_set_value_outside(struct inode *inode, xv->xr_list.l_tree_depth = 0; xv->xr_list.l_count = cpu_to_le16(1); xv->xr_list.l_next_free_rec = 0; + vb.vb_xv = xv; - ret = ocfs2_xattr_value_truncate(inode, xs->xattr_bh, xv, - xi->value_len, ctxt); + ret = ocfs2_xattr_value_truncate(inode, &vb, xi->value_len, ctxt); if (ret < 0) { mlog_errno(ret); return ret; @@ -1352,7 +1350,7 @@ static int ocfs2_xattr_set_value_outside(struct inode *inode, mlog_errno(ret); return ret; } - ret = __ocfs2_xattr_set_value_outside(inode, ctxt->handle, xv, + ret = __ocfs2_xattr_set_value_outside(inode, ctxt->handle, vb.vb_xv, xi->value, xi->value_len); if (ret < 0) mlog_errno(ret); @@ -1550,9 +1548,12 @@ static int ocfs2_xattr_set_entry(struct inode *inode, goto out; } else if (!ocfs2_xattr_is_local(xs->here)) { /* For existing xattr which has value outside */ - struct ocfs2_xattr_value_root *xv = NULL; - xv = (struct ocfs2_xattr_value_root *)(val + - OCFS2_XATTR_SIZE(name_len)); + struct ocfs2_xattr_value_buf vb = { + .vb_bh = xs->xattr_bh, + .vb_xv = (struct ocfs2_xattr_value_root *) + (val + OCFS2_XATTR_SIZE(name_len)), + .vb_access = ocfs2_journal_access, + }; if (xi->value_len > OCFS2_XATTR_INLINE_SIZE) { /* @@ -1561,8 +1562,7 @@ static int ocfs2_xattr_set_entry(struct inode *inode, * then set new value with set_value_outside(). */ ret = ocfs2_xattr_value_truncate(inode, - xs->xattr_bh, - xv, + &vb, xi->value_len, ctxt); if (ret < 0) { @@ -1582,7 +1582,7 @@ static int ocfs2_xattr_set_entry(struct inode *inode, ret = __ocfs2_xattr_set_value_outside(inode, handle, - xv, + vb.vb_xv, xi->value, xi->value_len); if (ret < 0) @@ -1594,8 +1594,7 @@ static int ocfs2_xattr_set_entry(struct inode *inode, * just trucate old value to zero. */ ret = ocfs2_xattr_value_truncate(inode, - xs->xattr_bh, - xv, + &vb, 0, ctxt); if (ret < 0) @@ -1714,15 +1713,17 @@ static int ocfs2_remove_value_outside(struct inode*inode, struct ocfs2_xattr_entry *entry = &header->xh_entries[i]; if (!ocfs2_xattr_is_local(entry)) { - struct ocfs2_xattr_value_root *xv; + struct ocfs2_xattr_value_buf vb = { + .vb_bh = bh, + .vb_access = ocfs2_journal_access, + }; void *val; val = (void *)header + le16_to_cpu(entry->xe_name_offset); - xv = (struct ocfs2_xattr_value_root *) + vb.vb_xv = (struct ocfs2_xattr_value_root *) (val + OCFS2_XATTR_SIZE(entry->xe_name_len)); - ret = ocfs2_xattr_value_truncate(inode, bh, xv, - 0, &ctxt); + ret = ocfs2_xattr_value_truncate(inode, &vb, 0, &ctxt); if (ret < 0) { mlog_errno(ret); break; @@ -4659,11 +4660,12 @@ static int ocfs2_xattr_bucket_value_truncate(struct inode *inode, { int ret, offset; u64 value_blk; - struct buffer_head *value_bh = NULL; - struct ocfs2_xattr_value_root *xv; struct ocfs2_xattr_entry *xe; struct ocfs2_xattr_header *xh = bucket_xh(bucket); size_t blocksize = inode->i_sb->s_blocksize; + struct ocfs2_xattr_value_buf vb = { + .vb_access = ocfs2_journal_access, + }; xe = &xh->xh_entries[xe_off]; @@ -4677,11 +4679,11 @@ static int ocfs2_xattr_bucket_value_truncate(struct inode *inode, /* We don't allow ocfs2_xattr_value to be stored in different block. */ BUG_ON(value_blk != (offset + OCFS2_XATTR_ROOT_SIZE - 1) / blocksize); - value_bh = bucket->bu_bhs[value_blk]; - BUG_ON(!value_bh); + vb.vb_bh = bucket->bu_bhs[value_blk]; + BUG_ON(!vb.vb_bh); - xv = (struct ocfs2_xattr_value_root *) - (value_bh->b_data + offset % blocksize); + vb.vb_xv = (struct ocfs2_xattr_value_root *) + (vb.vb_bh->b_data + offset % blocksize); ret = ocfs2_xattr_bucket_journal_access(ctxt->handle, bucket, OCFS2_JOURNAL_ACCESS_WRITE); @@ -4699,7 +4701,7 @@ static int ocfs2_xattr_bucket_value_truncate(struct inode *inode, */ mlog(0, "truncate %u in xattr bucket %llu to %d bytes.\n", xe_off, (unsigned long long)bucket_blkno(bucket), len); - ret = ocfs2_xattr_value_truncate(inode, value_bh, xv, len, ctxt); + ret = ocfs2_xattr_value_truncate(inode, &vb, len, ctxt); if (ret) { mlog_errno(ret); goto out_dirty; -- 1.5.6.5
Joel Becker
2008-Dec-10 01:09 UTC
[Ocfs2-devel] [PATCH 14/18] ocfs2: Pass value buf to ocfs2_xattr_update_entry().
ocfs2_xattr_update_entry() updates the entry portion of an xattr buffer. This can be part of multiple metadata block types, so pass the buffer in via an ocfs2_xattr_value_buf. Signed-off-by: Joel Becker <joel.becker at oracle.com> --- fs/ocfs2/xattr.c | 10 ++++++---- 1 files changed, 6 insertions(+), 4 deletions(-) diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c index 316bc37..f1a7042 100644 --- a/fs/ocfs2/xattr.c +++ b/fs/ocfs2/xattr.c @@ -1282,12 +1282,13 @@ static int ocfs2_xattr_update_entry(struct inode *inode, handle_t *handle, struct ocfs2_xattr_info *xi, struct ocfs2_xattr_search *xs, + struct ocfs2_xattr_value_buf *vb, size_t offs) { int ret; - ret = ocfs2_journal_access(handle, inode, xs->xattr_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = vb->vb_access(handle, inode, vb->vb_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret) { mlog_errno(ret); goto out; @@ -1301,7 +1302,7 @@ static int ocfs2_xattr_update_entry(struct inode *inode, ocfs2_xattr_set_local(xs->here, 0); ocfs2_xattr_hash_entry(inode, xs->header, xs->here); - ret = ocfs2_journal_dirty(handle, xs->xattr_bh); + ret = ocfs2_journal_dirty(handle, vb->vb_bh); if (ret < 0) mlog_errno(ret); out: @@ -1345,7 +1346,7 @@ static int ocfs2_xattr_set_value_outside(struct inode *inode, mlog_errno(ret); return ret; } - ret = ocfs2_xattr_update_entry(inode, ctxt->handle, xi, xs, offs); + ret = ocfs2_xattr_update_entry(inode, ctxt->handle, xi, xs, &vb, offs); if (ret < 0) { mlog_errno(ret); return ret; @@ -1574,6 +1575,7 @@ static int ocfs2_xattr_set_entry(struct inode *inode, handle, xi, xs, + &vb, offs); if (ret < 0) { mlog_errno(ret); -- 1.5.6.5
Joel Becker
2008-Dec-10 01:09 UTC
[Ocfs2-devel] [PATCH 15/18] ocfs2: Use ocfs2_xattr_value_buf in ocfs2_xattr_set_entry().
ocfs2_xattr_set_entry is the function that knows what type of block it is setting into. This is what we wanted from ocfs2_xattr_value_buf. Plus, moving the value buf up into ocfs2_xattr_set_entry() allows us to pass it into ocfs2_xattr_set_value_outside() and ocfs2_xattr_cleanup(). Signed-off-by: Joel Becker <joel.becker at oracle.com> --- fs/ocfs2/xattr.c | 53 +++++++++++++++++++++++++++++------------------------ 1 files changed, 29 insertions(+), 24 deletions(-) diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c index f1a7042..a33e2df 100644 --- a/fs/ocfs2/xattr.c +++ b/fs/ocfs2/xattr.c @@ -1252,6 +1252,7 @@ static int ocfs2_xattr_cleanup(struct inode *inode, handle_t *handle, struct ocfs2_xattr_info *xi, struct ocfs2_xattr_search *xs, + struct ocfs2_xattr_value_buf *vb, size_t offs) { int ret = 0; @@ -1259,8 +1260,8 @@ static int ocfs2_xattr_cleanup(struct inode *inode, void *val = xs->base + offs; size_t size = OCFS2_XATTR_SIZE(name_len) + OCFS2_XATTR_ROOT_SIZE; - ret = ocfs2_journal_access(handle, inode, xs->xattr_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = vb->vb_access(handle, inode, vb->vb_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret) { mlog_errno(ret); goto out; @@ -1271,7 +1272,7 @@ static int ocfs2_xattr_cleanup(struct inode *inode, memset((void *)xs->here, 0, sizeof(struct ocfs2_xattr_entry)); memset(val, 0, size); - ret = ocfs2_journal_dirty(handle, xs->xattr_bh); + ret = ocfs2_journal_dirty(handle, vb->vb_bh); if (ret < 0) mlog_errno(ret); out: @@ -1318,6 +1319,7 @@ static int ocfs2_xattr_set_value_outside(struct inode *inode, struct ocfs2_xattr_info *xi, struct ocfs2_xattr_search *xs, struct ocfs2_xattr_set_ctxt *ctxt, + struct ocfs2_xattr_value_buf *vb, size_t offs) { size_t name_len = strlen(xi->name); @@ -1325,10 +1327,6 @@ static int ocfs2_xattr_set_value_outside(struct inode *inode, struct ocfs2_xattr_value_root *xv = NULL; size_t size = OCFS2_XATTR_SIZE(name_len) + OCFS2_XATTR_ROOT_SIZE; int ret = 0; - struct ocfs2_xattr_value_buf vb = { - .vb_bh = xs->xattr_bh, - .vb_access = ocfs2_journal_access - }; memset(val, 0, size); memcpy(val, xi->name, name_len); @@ -1339,19 +1337,19 @@ static int ocfs2_xattr_set_value_outside(struct inode *inode, xv->xr_list.l_tree_depth = 0; xv->xr_list.l_count = cpu_to_le16(1); xv->xr_list.l_next_free_rec = 0; - vb.vb_xv = xv; + vb->vb_xv = xv; - ret = ocfs2_xattr_value_truncate(inode, &vb, xi->value_len, ctxt); + ret = ocfs2_xattr_value_truncate(inode, vb, xi->value_len, ctxt); if (ret < 0) { mlog_errno(ret); return ret; } - ret = ocfs2_xattr_update_entry(inode, ctxt->handle, xi, xs, &vb, offs); + ret = ocfs2_xattr_update_entry(inode, ctxt->handle, xi, xs, vb, offs); if (ret < 0) { mlog_errno(ret); return ret; } - ret = __ocfs2_xattr_set_value_outside(inode, ctxt->handle, vb.vb_xv, + ret = __ocfs2_xattr_set_value_outside(inode, ctxt->handle, vb->vb_xv, xi->value, xi->value_len); if (ret < 0) mlog_errno(ret); @@ -1488,6 +1486,16 @@ static int ocfs2_xattr_set_entry(struct inode *inode, .value = xi->value, .value_len = xi->value_len, }; + struct ocfs2_xattr_value_buf vb = { + .vb_bh = xs->xattr_bh, + .vb_access = ocfs2_journal_access_di, + }; + + if (!(flag & OCFS2_INLINE_XATTR_FL)) { + BUG_ON(xs->xattr_bh == xs->inode_bh); + vb.vb_access = ocfs2_journal_access_xb; + } else + BUG_ON(xs->xattr_bh != xs->inode_bh); /* Compute min_offs, last and free space. */ last = xs->header->xh_entries; @@ -1543,18 +1551,14 @@ static int ocfs2_xattr_set_entry(struct inode *inode, if (ocfs2_xattr_is_local(xs->here) && size == size_l) { /* Replace existing local xattr with tree root */ ret = ocfs2_xattr_set_value_outside(inode, xi, xs, - ctxt, offs); + ctxt, &vb, offs); if (ret < 0) mlog_errno(ret); goto out; } else if (!ocfs2_xattr_is_local(xs->here)) { /* For existing xattr which has value outside */ - struct ocfs2_xattr_value_buf vb = { - .vb_bh = xs->xattr_bh, - .vb_xv = (struct ocfs2_xattr_value_root *) - (val + OCFS2_XATTR_SIZE(name_len)), - .vb_access = ocfs2_journal_access, - }; + vb.vb_xv = (struct ocfs2_xattr_value_root *) + (val + OCFS2_XATTR_SIZE(name_len)); if (xi->value_len > OCFS2_XATTR_INLINE_SIZE) { /* @@ -1605,16 +1609,16 @@ static int ocfs2_xattr_set_entry(struct inode *inode, } } - ret = ocfs2_journal_access(handle, inode, xs->inode_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_journal_access_di(handle, inode, xs->inode_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret) { mlog_errno(ret); goto out; } if (!(flag & OCFS2_INLINE_XATTR_FL)) { - ret = ocfs2_journal_access(handle, inode, xs->xattr_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = vb.vb_access(handle, inode, vb.vb_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret) { mlog_errno(ret); goto out; @@ -1674,7 +1678,8 @@ static int ocfs2_xattr_set_entry(struct inode *inode, * This is the second step for value size > INLINE_SIZE. */ size_t offs = le16_to_cpu(xs->here->xe_name_offset); - ret = ocfs2_xattr_set_value_outside(inode, xi, xs, ctxt, offs); + ret = ocfs2_xattr_set_value_outside(inode, xi, xs, ctxt, + &vb, offs); if (ret < 0) { int ret2; @@ -1684,7 +1689,7 @@ static int ocfs2_xattr_set_entry(struct inode *inode, * the junk tree root we have already set in local. */ ret2 = ocfs2_xattr_cleanup(inode, ctxt->handle, - xi, xs, offs); + xi, xs, &vb, offs); if (ret2 < 0) mlog_errno(ret2); } -- 1.5.6.5
Joel Becker
2008-Dec-10 01:09 UTC
[Ocfs2-devel] [PATCH 16/18] ocfs2: Pass value buf to ocfs2_remove_value_outside().
ocfs2_remove_value_outside() needs to know the type of buffer it is looking at. Pass in an ocfs2_xattr_value_buf. Signed-off-by: Joel Becker <joel.becker at oracle.com> --- fs/ocfs2/xattr.c | 22 +++++++++++++--------- 1 files changed, 13 insertions(+), 9 deletions(-) diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c index a33e2df..7b61c1c 100644 --- a/fs/ocfs2/xattr.c +++ b/fs/ocfs2/xattr.c @@ -1699,7 +1699,7 @@ out: } static int ocfs2_remove_value_outside(struct inode*inode, - struct buffer_head *bh, + struct ocfs2_xattr_value_buf *vb, struct ocfs2_xattr_header *header) { int ret = 0, i; @@ -1720,17 +1720,13 @@ static int ocfs2_remove_value_outside(struct inode*inode, struct ocfs2_xattr_entry *entry = &header->xh_entries[i]; if (!ocfs2_xattr_is_local(entry)) { - struct ocfs2_xattr_value_buf vb = { - .vb_bh = bh, - .vb_access = ocfs2_journal_access, - }; void *val; val = (void *)header + le16_to_cpu(entry->xe_name_offset); - vb.vb_xv = (struct ocfs2_xattr_value_root *) + vb->vb_xv = (struct ocfs2_xattr_value_root *) (val + OCFS2_XATTR_SIZE(entry->xe_name_len)); - ret = ocfs2_xattr_value_truncate(inode, &vb, 0, &ctxt); + ret = ocfs2_xattr_value_truncate(inode, vb, 0, &ctxt); if (ret < 0) { mlog_errno(ret); break; @@ -1752,12 +1748,16 @@ static int ocfs2_xattr_ibody_remove(struct inode *inode, struct ocfs2_dinode *di = (struct ocfs2_dinode *)di_bh->b_data; struct ocfs2_xattr_header *header; int ret; + struct ocfs2_xattr_value_buf vb = { + .vb_bh = di_bh, + .vb_access = ocfs2_journal_access_di, + }; header = (struct ocfs2_xattr_header *) ((void *)di + inode->i_sb->s_blocksize - le16_to_cpu(di->i_xattr_inline_size)); - ret = ocfs2_remove_value_outside(inode, di_bh, header); + ret = ocfs2_remove_value_outside(inode, &vb, header); return ret; } @@ -1767,11 +1767,15 @@ static int ocfs2_xattr_block_remove(struct inode *inode, { struct ocfs2_xattr_block *xb; int ret = 0; + struct ocfs2_xattr_value_buf vb = { + .vb_bh = blk_bh, + .vb_access = ocfs2_journal_access_xb, + }; xb = (struct ocfs2_xattr_block *)blk_bh->b_data; if (!(le16_to_cpu(xb->xb_flags) & OCFS2_XATTR_INDEXED)) { struct ocfs2_xattr_header *header = &(xb->xb_attrs.xb_header); - ret = ocfs2_remove_value_outside(inode, blk_bh, header); + ret = ocfs2_remove_value_outside(inode, &vb, header); } else ret = ocfs2_delete_xattr_index_block(inode, blk_bh); -- 1.5.6.5
Joel Becker
2008-Dec-10 01:09 UTC
[Ocfs2-devel] [PATCH 17/18] ocfs2: Use proper journal_access function in xattr.c
Change the rest of the naked ocfs2_journal_access() calls in fs/ocfs2/xattr.c to use the appropriate ocfs2_journal_access_*() call for their metadata type. Signed-off-by: Joel Becker <joel.becker at oracle.com> --- fs/ocfs2/xattr.c | 24 ++++++++++++------------ 1 files changed, 12 insertions(+), 12 deletions(-) diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c index 7b61c1c..b75522e 100644 --- a/fs/ocfs2/xattr.c +++ b/fs/ocfs2/xattr.c @@ -1894,8 +1894,8 @@ int ocfs2_xattr_remove(struct inode *inode, struct buffer_head *di_bh) mlog_errno(ret); goto out; } - ret = ocfs2_journal_access(handle, inode, di_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_journal_access_di(handle, inode, di_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret) { mlog_errno(ret); goto out_commit; @@ -2103,8 +2103,8 @@ static int ocfs2_xattr_block_set(struct inode *inode, int ret; if (!xs->xattr_bh) { - ret = ocfs2_journal_access(handle, inode, xs->inode_bh, - OCFS2_JOURNAL_ACCESS_CREATE); + ret = ocfs2_journal_access_di(handle, inode, xs->inode_bh, + OCFS2_JOURNAL_ACCESS_CREATE); if (ret < 0) { mlog_errno(ret); goto end; @@ -2121,8 +2121,8 @@ static int ocfs2_xattr_block_set(struct inode *inode, new_bh = sb_getblk(inode->i_sb, first_blkno); ocfs2_set_new_buffer_uptodate(inode, new_bh); - ret = ocfs2_journal_access(handle, inode, new_bh, - OCFS2_JOURNAL_ACCESS_CREATE); + ret = ocfs2_journal_access_xb(handle, inode, new_bh, + OCFS2_JOURNAL_ACCESS_CREATE); if (ret < 0) { mlog_errno(ret); goto end; @@ -3383,8 +3383,8 @@ static int ocfs2_xattr_create_index_block(struct inode *inode, */ down_write(&oi->ip_alloc_sem); - ret = ocfs2_journal_access(handle, inode, xb_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_journal_access_xb(handle, inode, xb_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret) { mlog_errno(ret); goto out; @@ -4222,8 +4222,8 @@ static int ocfs2_add_new_xattr_cluster(struct inode *inode, ocfs2_init_xattr_tree_extent_tree(&et, inode, root_bh); - ret = ocfs2_journal_access(handle, inode, root_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_journal_access_xb(handle, inode, root_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret < 0) { mlog_errno(ret); goto leave; @@ -4816,8 +4816,8 @@ static int ocfs2_rm_xattr_cluster(struct inode *inode, goto out; } - ret = ocfs2_journal_access(handle, inode, root_bh, - OCFS2_JOURNAL_ACCESS_WRITE); + ret = ocfs2_journal_access_xb(handle, inode, root_bh, + OCFS2_JOURNAL_ACCESS_WRITE); if (ret) { mlog_errno(ret); goto out_commit; -- 1.5.6.5
Joel Becker
2008-Dec-10 01:09 UTC
[Ocfs2-devel] [PATCH 18/18] ocfs2: Enable metadata checksums.
Add OCFS2_FEATURE_INCOMPAT_META_ECC to the list of supported features. Signed-off-by: Joel Becker <joel.becker at oracle.com> --- fs/ocfs2/ocfs2_fs.h | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/fs/ocfs2/ocfs2_fs.h b/fs/ocfs2/ocfs2_fs.h index 4e20f50..95b231f 100644 --- a/fs/ocfs2/ocfs2_fs.h +++ b/fs/ocfs2/ocfs2_fs.h @@ -92,7 +92,8 @@ | OCFS2_FEATURE_INCOMPAT_INLINE_DATA \ | OCFS2_FEATURE_INCOMPAT_EXTENDED_SLOT_MAP \ | OCFS2_FEATURE_INCOMPAT_USERSPACE_STACK \ - | OCFS2_FEATURE_INCOMPAT_XATTR) + | OCFS2_FEATURE_INCOMPAT_XATTR \ + | OCFS2_FEATURE_INCOMPAT_META_ECC) #define OCFS2_FEATURE_RO_COMPAT_SUPP (OCFS2_FEATURE_RO_COMPAT_UNWRITTEN \ | OCFS2_FEATURE_RO_COMPAT_USRQUOTA \ | OCFS2_FEATURE_RO_COMPAT_GRPQUOTA) -- 1.5.6.5
Joel Becker
2008-Dec-10 01:52 UTC
[Ocfs2-devel] [PATCH 0/18] ocfs2: Add metadata ECC - almost there.
On Tue, Dec 09, 2008 at 05:09:37PM -0800, Joel Becker wrote:> The code is also available in my repository. > > [View] > http://oss.oracle.com/git/?p=jlbec/linux-2.6.git;a=shortlog;h=metaecc > [Pull] > git://oss.oracle.com/git/jlbec/linux-2.6.git metaeccTools are in ocfs2-tools.git [View] http://oss.oracle.com/git/?p=ocfs2-tools.git;a=shortlog;h=metaecc [Pull] git://oss.oracle.com/git/ocfs2-tools.git metaecc -- "I think it would be a good idea." - Mahatma Ghandi, when asked what he thought of Western civilization Joel Becker Principal Software Developer Oracle E-mail: joel.becker at oracle.com Phone: (650) 506-8127
Andi Kleen
2008-Dec-10 12:54 UTC
[Ocfs2-devel] [PATCH 0/18] ocfs2: Add metadata ECC - almost there.
Joel Becker <joel.becker at oracle.com> writes:> These patches add the metadata ECC code to ocfs2. They have one > limitation - they do not ECC directory blocks yet. However, that will > only take a simple insertion of the dirblock trailer. With xattr ECC > completed, the patches are approaching a final form and are ready for > initial review. They are atop my xattr-buckets-2 branch.Could you please expand a bit why you chose ECC over more standard CRCs? My assumption would be normally that if there is disk corruption it won't be single flipped bits, but more large scale corruption. Do you have data/experiences on the contrary? Thanks, -andi -- ak at linux.intel.com
Joel Becker
2008-Dec-10 18:15 UTC
[Ocfs2-devel] [PATCH 0/18] ocfs2: Add metadata ECC - almost there.
On Wed, Dec 10, 2008 at 01:54:51PM +0100, Andi Kleen wrote:> Joel Becker <joel.becker at oracle.com> writes: > > > These patches add the metadata ECC code to ocfs2. They have one > > limitation - they do not ECC directory blocks yet. However, that will > > only take a simple insertion of the dirblock trailer. With xattr ECC > > completed, the patches are approaching a final form and are ready for > > initial review. They are atop my xattr-buckets-2 branch. > > Could you please expand a bit why you chose ECC over more standard CRCs? > > My assumption would be normally that if there is disk corruption it won't > be single flipped bits, but more large scale corruption. Do you have > data/experiences on the contrary?If you look at our block_check structure, you'll see that it uses crc32c for the overall block, and then the ECC can attempt to correct a single bit if the crc fails. Joel -- "If the human brain were so simple we could understand it, we would be so simple that we could not." - W. A. Clouston Joel Becker Principal Software Developer Oracle E-mail: joel.becker at oracle.com Phone: (650) 506-8127
Joel Becker
2008-Dec-11 02:16 UTC
[Ocfs2-devel] [PATCH 0/18] ocfs2: Add metadata ECC - almost there.
On Tue, Dec 09, 2008 at 05:09:37PM -0800, Joel Becker wrote:> These patches add the metadata ECC code to ocfs2. They have one > limitation - they do not ECC directory blocks yet. However, that willFollowing this are the two patches for directory blocks. With these patches, the ECC code is complete (barring comments/testing/bugs). The metaecc branch will be updated with these patches shortly. I need to rebase it on a fixed xattr-buckets-2. Joel -- "In the arms of the angel, fly away from here, From this dark, cold hotel room and the endlessness that you fear. You are pulled from the wreckage of your silent reverie. In the arms of the angel, may you find some comfort here." Joel Becker Principal Software Developer Oracle E-mail: joel.becker at oracle.com Phone: (650) 506-8127
Tao Ma
2008-Dec-11 02:46 UTC
[Ocfs2-devel] [PATCH 0/18] ocfs2: Add metadata ECC - almost there.
Joel Becker wrote:> These patches add the metadata ECC code to ocfs2. They have one > limitation - they do not ECC directory blocks yet. However, that will > only take a simple insertion of the dirblock trailer. With xattr ECC > completed, the patches are approaching a final form and are ready for > initial review. They are atop my xattr-buckets-2 branch. > > The first patch adds buffer triggers to JDB2. This allows us to only > calculate the ECC code when JBD2 is about to write a block to disk. > > Next, we add the actual blockcheck functions. These are the functions > that compute the checksum (using the kernel's crc32_le()) and the ECC > bits (this is hamming code, implemented in blockcheck.c). > > Then we create metadata-type specific ocfs2_journal_access_*() functions > to make sure each metadata type uses its own checksum field. These are > hooked up to the regular metadata types. > > We add ecc calculation to xattr buckets. Because buckets can be > multiple blocks, they cannot use journal triggers. However, they are > rarely updated, so the loss of efficiency isn't very important. > > Last, we refactor the xattr value setting code to make sure we know > which metadata type an xattr value belongs to. That way we can call the > appropriate ocfs2_journal_access_*() function. > > The code is also available in my repository. > > [View] > http://oss.oracle.com/git/?p=jlbec/linux-2.6.git;a=shortlog;h=metaecc > [Pull] > git://oss.oracle.com/git/jlbec/linux-2.6.git metaecc > > Joel Becker (18): > jbd2: Add buffer triggers > ocfs2: Add the on-disk structures for metadata checksums. > ocfs2: Add the underlying blockcheck code. > ocfs2: Add a validation hook for quota block reads. > ocfs2: block read meta ecc. > ocfs2: Add journal_access functions with jbd2 triggers. > ocfs2: Wrap up the common use cases of ocfs2_new_path(). > ocfs2: Use metadata-specific ocfs2_journal_access_*() functions. > ocfs2: Add ecc and checksums to ocfs2 xattr buckets. > ocfs2: Create ocfs2_xattr_value_buf. > ocfs2: Pull ocfs2_xattr_value_buf up from __ocfs2_remove_xattr_range(). > ocfs2: Pull ocfs2_xattr_value_buf up into ocfs2_xattr_value_truncate(). > ocfs2: Pass ocfs2_xattr_value_buf into ocfs2_xattr_value_truncate(). > ocfs2: Pass value buf to ocfs2_xattr_update_entry(). > ocfs2: Use ocfs2_xattr_value_buf in ocfs2_xattr_set_entry(). > ocfs2: Pass value buf to ocfs2_remove_value_outside(). > ocfs2: Use proper journal_access function in xattr.c > ocfs2: Enable metadata checksums.Acked-by: Tao Ma <tao.ma at oracle.com>> >