I have just submitted the following bug report:

http://oss.oracle.com/bugzilla/show_bug.cgi?id=1266

This is formally reporting the issue originally identified (and fixed) by Robert Smith back in December:

http://www.mail-archive.com/ocfs2-devel@oss.oracle.com/msg04728.html

Specifically, even the latest OCFS2 produces an error when you attempt to mount a volume larger than 16 TiB:

"ocfs2_initialize_super:2157 ERROR: Volume might try to write to blocks beyond what jbd can address in 32 bits."

I would like to use large volumes in production later this year or early next, so I am interested in seeing this issue resolved so I can begin testing.

I believe this check in fs/ocfs2/super.c is the only known issue standing in the way of large volume support for OCFS2. I want to submit a patch to fix it.

The simplest approach is just to delete the check, like so:

diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 0eaa929..0ba41f3 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -2215,14 +2215,6 @@ static int ocfs2_initialize_super(struct super_block *sb,
 		goto bail;
 	}
 
-	if (ocfs2_clusters_to_blocks(osb->sb, le32_to_cpu(di->i_clusters) - 1)
-	    > (u32)~0UL) {
-		mlog(ML_ERROR, "Volume might try to write to blocks beyond "
-		     "what jbd can address in 32 bits.\n");
-		status = -EINVAL;
-		goto bail;
-	}
-
 	if (ocfs2_setup_osb_uuid(osb, di->id2.i_super.s_uuid,
 				 sizeof(di->id2.i_super.s_uuid))) {
 		mlog(ML_ERROR, "Out of memory trying to setup our uuid.\n");

Questions for the list:

1) Is this patch sufficient? Or should I try to modify the check to take into account the cluster size? Anything else I need to check here (e.g. inode64 mount option)?

2) Should mkfs.ocfs2 contain a similar check? (It may already; I have not looked yet...)

 - Pat
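For readers following the arithmetic, here is a minimal user-space sketch of where the 16 TiB figure comes from and how the removed check relates to cluster size. It is not kernel code: the helper names are hypothetical, the 4 KiB block size is an assumption, and the shift only mirrors the general shape of a cluster-to-block conversion rather than quoting the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: the number of bytes reachable when block numbers
 * are limited to 32 bits, for a given block size in bytes.
 */
static uint64_t max_bytes_with_32bit_blocks(uint32_t block_size)
{
	/* 2^32 distinct block numbers, each covering block_size bytes. */
	return ((uint64_t)UINT32_MAX + 1) * block_size;
}

/*
 * Hypothetical helper modeled on the shape of the removed check: convert
 * the last cluster index to a block number and test whether that block
 * number still fits in 32 bits. cluster_bits and block_bits are the log2
 * of the cluster and block sizes (assumed parameters).
 */
static bool last_block_exceeds_32_bits(uint32_t clusters,
				       unsigned int cluster_bits,
				       unsigned int block_bits)
{
	assert(clusters > 0 && cluster_bits >= block_bits);

	/* Widen before shifting so the result cannot wrap at 32 bits. */
	uint64_t last_block =
		(uint64_t)(clusters - 1) << (cluster_bits - block_bits);

	return last_block > UINT32_MAX;
}
```

With 4 KiB blocks, max_bytes_with_32bit_blocks(4096) is 2^44 bytes, i.e. 16 TiB. Because the comparison is made on block numbers after the conversion, a check of this shape already depends on the cluster size: with 1 MiB clusters and 4 KiB blocks, 2^24 clusters still pass, and one more cluster trips it.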
On Tue, Jun 22, 2010 at 05:11:50PM -0700, Patrick J. LoPresti wrote:
> The simplest approach is just to delete the check, like so:
>
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index 0eaa929..0ba41f3 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -2215,14 +2215,6 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  		goto bail;
>  	}
>
> -	if (ocfs2_clusters_to_blocks(osb->sb, le32_to_cpu(di->i_clusters) - 1)
> -	    > (u32)~0UL) {
> -		mlog(ML_ERROR, "Volume might try to write to blocks beyond "
> -		     "what jbd can address in 32 bits.\n");
> -		status = -EINVAL;
> -		goto bail;
> -	}
> -
>  	if (ocfs2_setup_osb_uuid(osb, di->id2.i_super.s_uuid,
>  				 sizeof(di->id2.i_super.s_uuid))) {
>  		mlog(ML_ERROR, "Out of memory trying to setup our uuid.\n");
>
>
> Questions for the list:
>
> 1) Is this patch sufficient? Or should I try to modify the check to
> take into account the cluster size? Anything else I need to check
> here (e.g. inode64 mount option)?

This patch is not sufficient. The journal options need to be checked, as does the block addressing space of the kernel.

Joel

--

"Gone to plant a weeping willow
 On the bank's green edge it will roll, roll, roll.
 Sing a lulaby beside the waters.
 Lovers come and go, the river roll, roll, rolls."

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127
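One way to read the two conditions named in this reply is sketched below as plain user-space C. This is an illustration under stated assumptions, not the check that belongs in super.c: the enum, the function name, and its parameters are all hypothetical. "Journal options" is taken to mean a journal capable of 64-bit block numbers, and "block addressing space of the kernel" is taken to mean the width in bytes of the kernel's block-index type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes for the sketch. */
enum volume_check {
	VOLUME_OK = 0,
	VOLUME_NEEDS_64BIT_JOURNAL,
	VOLUME_EXCEEDS_KERNEL_ADDRESSING,
};

/*
 * Hypothetical check: a volume whose last block number does not fit in
 * 32 bits is acceptable only if (a) the kernel's block-index type is
 * 8 bytes wide and (b) the journal can record 64-bit block numbers.
 * index_bytes stands in for the size of that kernel type (assumption).
 */
static enum volume_check check_volume_size(uint64_t last_block,
					    bool journal_is_64bit,
					    size_t index_bytes)
{
	assert(index_bytes == 4 || index_bytes == 8);

	/* Small volumes need neither condition. */
	if (last_block <= UINT32_MAX)
		return VOLUME_OK;

	/* The kernel must be able to address the block at all. */
	if (index_bytes < sizeof(uint64_t))
		return VOLUME_EXCEEDS_KERNEL_ADDRESSING;

	/* The journal must be able to store the block number. */
	if (!journal_is_64bit)
		return VOLUME_NEEDS_64BIT_JOURNAL;

	return VOLUME_OK;
}
```

The ordering is a design choice for the sketch: the addressing limit is reported first because no journal setting can work around it, while a missing 64-bit journal capability is something a reformat could change.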
On Tue, Jun 22, 2010 at 5:49 PM, Joel Becker <Joel.Becker at oracle.com> wrote:
> On Tue, Jun 22, 2010 at 05:11:50PM -0700, Patrick J. LoPresti wrote:
>
> This patch is not sufficient. The journal options need to be
> checked, as does the block addressing space of the kernel.

OK. But I am new to this code base and would appreciate some pointers... Specifically:

Which journal options? Are they visible at this point in super.c?

By "block addressing space of the kernel", are we just talking sizeof(some_kernel_type) or something more involved?

Thanks!

 - Pat
On 06/28/2010 06:15 PM, Patrick J. LoPresti wrote:
> On Mon, Jun 28, 2010 at 5:47 PM, Sunil Mushran <sunil.mushran at oracle.com> wrote:
>
>> BTW, have you tested with cleared INCOMPAT_64BIT journal flag?
>>
> Hm, easier said than done. "tunefs.ocfs2 -J noblock64 ..." bombs out with:
>
> tunefs.ocfs2: Unknown journal option: "block64"
> Valid journal options are:
>         size=<journal-size>
> Usage: tunefs.ocfs2 [options] <device> [new-size]
> [etc.]
>
> By the way, it complains similarly about "tunefs.ocfs2 -J block64",
> which puts the lie to the failure message in my patch...
>
> When I try to re-format the partition without "-J block64", I get:
>
> ERROR: jbd can only store block numbers in 32 bits. /dev/md0 can hold
> 5082795264 blocks which overflows this limit. If you have a new enough
> Ocfs2 with JBD2 support, you can try formatting with the "-Jblock64"
> option to turn on support for this size block device.
> Otherwise, consider increasing the block size or decreasing the device size.
>

Which version of tools are you running? block64 was added after ocfs2-tools 1.4.2.
On Mon, Jun 28, 2010 at 6:54 PM, Sunil Mushran <sunil.mushran at oracle.com> wrote:
>
> Which version of tools are you running? block64 was added after ocfs2-tools
> 1.4.2.

1.4.4 (from Debian experimental).

"-J block64" works fine for mkfs.ocfs2; just not for tunefs.ocfs2...
Yes, that needs to be fixed. Meanwhile you could hand edit the jsb to test your patch.

On Jun 28, 2010, at 7:50 PM, "Patrick J. LoPresti" <lopresti at gmail.com> wrote:
> On Mon, Jun 28, 2010 at 6:54 PM, Sunil Mushran <sunil.mushran at oracle.com> wrote:
>>
>> Which version of tools are you running? block64 was added after
>> ocfs2-tools 1.4.2.
>
> 1.4.4 (from Debian experimental).
>
> "-J block64" works fine for mkfs.ocfs2; just not for
> tunefs.ocfs2...
On Mon, Jun 28, 2010 at 9:21 PM, Sunil Mushran <sunil.mushran at oracle.com> wrote:
> Yes, that needs to be fixed. Meanwhile you could hand edit the jsb to test
> your patch.

Well, I do not understand the bits well enough to do that... But I did download the source for mkfs.ocfs2 and eliminated its check.

I created a large volume without "-J block64". With my patch applied, the kernel refuses to mount it and generates the appropriate error message.

OK to submit my patch to linux-kernel?

 - Pat
On 06/29/2010 12:43 PM, Patrick J. LoPresti wrote:
> On Mon, Jun 28, 2010 at 9:21 PM, Sunil Mushran <sunil.mushran at oracle.com> wrote:
>
>> Yes, that needs to be fixed. Meanwhile you could hand edit the jsb to test
>> your patch.
>>
> Well, I do not understand the bits well enough to do that... But I
> did download the source for mkfs.ocfs2 and eliminated its check.
>
> I created a large volume without "-J block64". With my patch applied,
> the kernel refuses to mount it and generates the appropriate error
> message.
>
> OK to submit my patch to linux-kernel?
>

Thanks. Mention that in the submission. Submit it to ocfs2-devel. Joel rounds up all the patches and sends them to Linus during the merge window.