.. evidently doesn't work. GRUB reboots the machine moments after loading stage2, and doesn't recognise the fstype when examining the disk loaded from an alternate source.

This is with SX-151. Here's hoping a future version (with grub2?) resolves this, as well as letting us boot from raidz.

Just a note for the archives in case it helps someone else get back the afternoon I just burnt.

--
Dan.
On Fri, Jul 29, 2011 at 01:04:49AM -0400, Daniel Carosone wrote:
> .. evidently doesn't work. GRUB reboots the machine moments after
> loading stage2, and doesn't recognise the fstype when examining the
> disk loaded from an alternate source.
>
> This is with SX-151. Here's hoping a future version (with grub2?)
> resolves this, as well as lets us boot from raidz.
>
> Just a note for the archives in case it helps someone else get back
> the afternoon I just burnt.

I noticed this behaviour this morning and have been debugging it since. I found out that, for some unknown reason, grub fails to get the disk geometry, assumes 0 sectors/track, and then does a divide-by-zero.

I don't think this is a zfs issue.

Hans

--
%SYSTEM-F-ANARCHISM, The operating system has been overthrown
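The failure mode Hans describes can be sketched in a few lines. This is not GRUB's code: the function names and the geometry values are illustrative assumptions. It shows a classic LBA-to-CHS conversion, where a sectors-per-track value of 0 leads straight to a division by zero, next to a guarded variant that rejects the bad geometry first.

```python
def lba_to_chs_unchecked(lba, heads, sectors_per_track):
    """Classic LBA -> (cylinder, head, sector) conversion, no sanity checks."""
    cylinder = lba // (heads * sectors_per_track)  # divides by 0 if spt == 0
    head = (lba // sectors_per_track) % heads
    sector = (lba % sectors_per_track) + 1         # CHS sectors are 1-based
    return cylinder, head, sector


def lba_to_chs(lba, heads, sectors_per_track):
    """Same conversion, but refuses a geometry that cannot be valid."""
    if heads <= 0 or sectors_per_track <= 0:
        raise ValueError("disk geometry unavailable or invalid")
    return lba_to_chs_unchecked(lba, heads, sectors_per_track)


if __name__ == "__main__":
    # Hypothetical 16-head, 63-sector geometry.
    print(lba_to_chs(1008, 16, 63))
    try:
        lba_to_chs_unchecked(1008, 16, 0)  # geometry query "returned" 0
    except ZeroDivisionError as exc:
        print("unchecked conversion failed:", exc)
```

In Python the unchecked version raises an exception. In a boot loader with no handler for the equivalent CPU fault, a machine reset like the one reported above is a plausible outcome, though the thread does not confirm the exact mechanism.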
On Fri, Jul 29, 2011 at 4:57 PM, Hans Rosenfeld <hans.rosenfeld at amd.com> wrote:
> On Fri, Jul 29, 2011 at 01:04:49AM -0400, Daniel Carosone wrote:
>> .. evidently doesn't work. GRUB reboots the machine moments after
>> loading stage2, and doesn't recognise the fstype when examining the
>> disk loaded from an alternate source.
>>
>> This is with SX-151. Here's hoping a future version (with grub2?)
>> resolves this, as well as lets us boot from raidz.
>>
>> Just a note for the archives in case it helps someone else get back
>> the afternoon I just burnt.
>
> I've noticed this behaviour this morning and have been debugging it
> since. I found out that, for some unknown reason, grub fails to get the
> disk geometry, assumes 0 sectors/track and then does a divide-by-zero.
>
> I don't think this is a zfs issue.

If the problem is in the zfs code in grub/grub2, then it should be a zfs issue, right?

Anyway, for comparison purposes: with ubuntu + grub2 + zfsonlinux (which can force ashift at pool creation time) + zfs root, grub2 won't even install on pools with ashift=12, while it works just fine with ashift=9. There were also booting problems if you've scrubbed rpool.

Does the zfs code for grub/grub2 also depend on Oracle releasing updates, or is it simply a matter of no one with enough skill having looked into it yet?

--
Fajar
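As background for the comparison above: ashift is the base-2 logarithm of the smallest block size a vdev will write, so ashift=9 means 512-byte sectors and ashift=12 means 4096-byte sectors. The sketch below is only arithmetic. The helper names are made up for illustration and do not correspond to any zfs or grub function.

```python
def sector_size(ashift):
    """Smallest allocation unit in bytes for a given ashift."""
    return 1 << ashift


def ashift_for(sector_bytes):
    """Inverse mapping; only defined for positive powers of two."""
    if sector_bytes <= 0 or sector_bytes & (sector_bytes - 1):
        raise ValueError("sector size must be a positive power of two")
    return sector_bytes.bit_length() - 1


def is_aligned(offset_bytes, ashift):
    """True if a byte offset falls on a sector boundary for this ashift."""
    return offset_bytes % sector_size(ashift) == 0


if __name__ == "__main__":
    for a in (9, 12):
        print(f"ashift={a}: {sector_size(a)}-byte sectors")
    # An offset that is fine on a 512-byte pool is misaligned on a 4K pool.
    print(is_aligned(512, 9), is_aligned(512, 12))
```

A reader that assumes 512-byte units everywhere would compute wrong offsets on an ashift=12 pool. That is one plausible way such a bug could arise; the thread does not say what the actual defect was.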
On Fri, Jul 29, 2011 at 10:22:27AM -0400, Fajar A. Nugraha wrote:
> On Fri, Jul 29, 2011 at 4:57 PM, Hans Rosenfeld <hans.rosenfeld at amd.com> wrote:
> > I've noticed this behaviour this morning and have been debugging it
> > since. I found out that, for some unknown reason, grub fails to get the
> > disk geometry, assumes 0 sectors/track and then does a divide-by-zero.
> >
> > I don't think this is a zfs issue.
>
> If the problem is in the zfs code in grub/grub2, then it should be a
> zfs issue, right?

I thought that, because of the geometry problem, the zfs code never ran, but after some more debugging I know that was wrong. These are in fact two unrelated problems.

> Anyway, for comparison purposes, with ubuntu + grub2 + zfsonlinux
> (which can force ashift at pool creation time) + zfs root, grub2
> won't even install on pools with ashift=12, while it works just fine
> with ashift=9. There were also booting problems if you've scrubbed
> rpool.
>
> Does the zfs code for grub/grub2 also depend on Oracle releasing
> updates, or is it simply a matter of no one with enough skill having
> looked into it yet?

I'm working on a patch for grub that fixes the ashift=12 issue. I'm probably not going to fix the div-by-zero reboot.

Hans

--
%SYSTEM-F-ANARCHISM, The operating system has been overthrown
On Fri, Jul 29, 2011 at 04:36:33PM +0200, Hans Rosenfeld wrote:
> > Does the zfs code for grub/grub2 also depend on Oracle releasing
> > updates, or is it simply a matter of no one with enough skill having
> > looked into it yet?
>
> I'm working on a patch for grub that fixes the ashift=12 issue. I'm
> probably not going to fix the div-by-zero reboot.

If you want to try it, the patch can be found at
http://cr.illumos.org/view/6qc99xkh/illumos-1303-webrev/illumos-1303-webrev.patch

Hans

--
%SYSTEM-F-ANARCHISM, The operating system has been overthrown
On Fri, Jul 29, 2011 at 05:58:49PM +0200, Hans Rosenfeld wrote:
> I'm working on a patch for grub that fixes the ashift=12 issue.

Oh, great - and from the looks of the patch, for other values of 12 as well :)

> I'm probably not going to fix the div-by-zero reboot.

Fair enough, if it's an existing unrelated error we no longer expose. Perhaps it's even fixed or irrelevant in grub2; can this be checked easily?

> If you want to try it, the patch can be found at
> http://cr.illumos.org/view/6qc99xkh/illumos-1303-webrev/illumos-1303-webrev.patch

Any chance of providing an alternate stage1/stage2 binary I can feed to installgrub? When you're ready..

--
Dan.
On Mon, Aug 01, 2011 at 11:22:36AM +1000, Daniel Carosone wrote:
> On Fri, Jul 29, 2011 at 05:58:49PM +0200, Hans Rosenfeld wrote:
>
> > I'm working on a patch for grub that fixes the ashift=12 issue.
>
> Oh, great - and from the looks of the patch, for other values of 12 as
> well :)
>
> > I'm probably not going to fix the div-by-zero reboot.
>
> Fair enough, if it's an existing unrelated error we no longer
> expose. Perhaps it's even fixed/irrelevant for grub2, can this be
> checked easily?

FWIW, this seems to be a live issue for the zfs-on-linux folks too; perhaps some coordination would be helpful? See, e.g.:
http://groups.google.com/a/zfsonlinux.org/group/zfs-discuss/browse_thread/thread/0c80103a8d5c0bb0#

> > If you want to try it, the patch can be found at
> > http://cr.illumos.org/view/6qc99xkh/illumos-1303-webrev/illumos-1303-webrev.patch
>
> Any chance of providing an alternate stage1/stage2 binary I can feed
> to installgrub? When you're ready..

To be clear, the system I was working on the other day is now running with a normal ashift=9 pool, on a mirror of WD 2TB EARX. Not quite what I was hoping for, but hopefully it will be OK; I won't have much chance to mess with it again for a little while.

I will be building something else useful for testing this, sometime in the next couple of weeks.

--
Dan.
On Mon, Aug 01, 2011 at 11:22:36AM +1000, Daniel Carosone wrote:
> > If you want to try it, the patch can be found at
> > http://cr.illumos.org/view/6qc99xkh/illumos-1303-webrev/illumos-1303-webrev.patch
>
> Any chance of providing an alternate stage1/stage2 binary I can feed
> to installgrub? When you're ready..

There is already an updated patch that also works with mirrored pools:
http://cr.illumos.org/view/77bat2me/illumos-1303-webrev-2/illumos-1303-webrev-2.patch

We track this issue at https://www.illumos.org/issues/1303, so if I have to update the patch again, you can find it there.

Hans

--
%SYSTEM-F-ANARCHISM, The operating system has been overthrown
On Mon, Aug 01, 2011 at 01:25:35PM +1000, Daniel Carosone wrote:
> To be clear, the system I was working on the other day is now running
> with a normal ashift=9 pool, on a mirror of WD 2TB EARX. Not quite
> what I was hoping for, but hopefully it will be OK; I won't have much
> chance to mess with it again for a little while.

That turned out to be a false hope. The system is almost unusable.

Soon after anything creates a lot of metadata updates, it grinds into the ground doing ~350 write iops, 0 read, forever trying to write them out. Processes start blocking on reads and/or txg closes and the system never comes back. rsync and atimes were the first and worst culprits, but atime=off wasn't enough to prevent it completely.

It can't just be that these are slow to write out, because I would expect it to finish eventually. I suspect something else is going on here.

Would zfs re-issue writes if they haven't gotten to the disk yet, somehow? I'm not talking about ata commands timing out, but something at a higher level making a long queue worse.

I'll have to find time to convert this back to ashift=12 and try your boot blocks soon.

--
Dan.
On Tue, Aug 09, 2011 at 10:51:37AM +1000, Daniel Carosone wrote:
> On Mon, Aug 01, 2011 at 01:25:35PM +1000, Daniel Carosone wrote:
> > To be clear, the system I was working on the other day is now running
> > with a normal ashift=9 pool, on a mirror of WD 2TB EARX. Not quite
> > what I was hoping for, but hopefully it will be OK; I won't have much
> > chance to mess with it again for a little while.
>
> That turned out to be a false hope. The system is almost unusable.
>
> Soon after anything creates a lot of metadata updates, it grinds into
> the ground doing ~350 write iops, 0 read, forever trying to write them
> out. Processes start blocking on reads and/or txg closes and the
> system never comes back. rsync and atimes were the first and worst
> culprits, but atime=off wasn't enough to prevent it completely.

With a bit of tweaking of the workload, I bought enough time to put this off.

> It can't just be that these are slow to write out, because I would
> expect it to eventually finish. I suspect something else is going on
> here.

That something, apparently, was thrashing on swap space. :-/

> I'll have to find time to convert this back to ashift=12 and try your
> boot blocks soon.

I see via the issue tracker that there have been several updates since, and an integration back into the main Illumos tree. How do I go about getting hold of current boot blocks?

--
Dan.
Hi,

On Mon, Sep 05, 2011 at 02:18:48AM -0400, Daniel Carosone wrote:
> I see via the issue tracker that there have been several updates
> since, and an integration back into the main Illumos tree. How do I
> go about getting hold of current boot blocks?

The OpenIndiana release that was announced earlier today has the fixed boot blocks.

Hans

--
%SYSTEM-F-ANARCHISM, The operating system has been overthrown
On Wed, Sep 14, 2011 at 04:08:19PM +0200, Hans Rosenfeld wrote:
> On Mon, Sep 05, 2011 at 02:18:48AM -0400, Daniel Carosone wrote:
> > I see via the issue tracker that there have been several updates
> > since, and an integration back into the main Illumos tree. How do I
> > go about getting hold of current boot blocks?
>
> The OpenIndiana release that was announced earlier today has the fixed
> boot blocks.

Yep, saw that and have it here ready to boot and install grub. I hope the fact that the pool itself is v31 for zfs crypto will not be a problem..

If the pool version does turn out to be an issue when running from the OI CD, can I take the updated stage* files and use them with the installgrub from solaris express b151?

I guess I'll find out in due course.

--
Dan.
On Wed, Sep 14, 2011 at 07:10:21PM -0400, Daniel Carosone wrote:
> Yep, saw that and have it here ready to boot and install grub. I hope
> the fact that the pool itself is v31 for zfs crypto will not be a
> problem..

I doubt that works. IIRC the highest supported zpool and zfs versions are hardcoded in the grub zfs code, probably to make sure that no garbage is read.

> If it should be the case that the pool version is an issue running
> from the OI CD, can I take the updated stage* files and use them with
> the installgrub from solaris express b151?

That would make no difference. You could try to get the grub source code from S11X and patch and rebuild it manually.

Hans

--
%SYSTEM-F-ANARCHISM, The operating system has been overthrown
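The hardcoded limit Hans mentions can be pictured as a simple gate in the reader. The following is a sketch under assumptions, not the grub source: the constant 28 is assumed here as the highest pool version the open-source code understands, and the function name is hypothetical.

```python
# Assumption for illustration: highest pool version this reader understands.
MAX_SUPPORTED_POOL_VERSION = 28


def can_read_pool(pool_version, max_supported=MAX_SUPPORTED_POOL_VERSION):
    """Refuse pools newer than the reader knows how to parse.

    Rejecting unknown versions up front avoids interpreting on-disk
    structures whose layout may have changed.
    """
    return 1 <= pool_version <= max_supported


if __name__ == "__main__":
    for version in (28, 31):
        verdict = "readable" if can_read_pool(version) else "rejected"
        print(f"pool version {version}: {verdict}")
```

Under that assumption a v31 pool is rejected regardless of which installgrub binary wrote the boot blocks, since the check would live in the stage files themselves. This matches Hans's point that swapping installgrub would make no difference.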