Juergen Keil
2007-Jul-23  13:03 UTC
GRUB, zfs-root + Xen: Error 16: Inconsistent filesystem structure
Hi Lin,
In addition to bug 6541114...
Bug ID    6541114
Synopsis  GRUB/ZFS fails to load files from a default compressed (lzjb) root
... I found yet another way to get the "Error 16: Inconsistent filesystem
structure" from GRUB.  This time when trying to boot a Xen Dom0 from a
zfs bootfs
Synopsis: grub/zfs-root: cannot boot xen from a zfs root
=======================================================================
I've tried to install snv66 + xen into an lzjb compressed zfs
root filesystem.
menu.lst entry for xen is:
# ------------------------------------------------------------
title Solaris Nevada snv_66 X86 (xen dom0)
root (,0,g)
bootfs files/s11-root-xen
kernel$ /boot/$ISADIR/xen.gz
module$ /platform/i86xpv/kernel/$ISADIR/unix /platform/i86xpv/kernel/$ISADIR/unix -B $ZFS-BOOTFS -vk
module$ /platform/i86pc/$ISADIR/boot_archive
# ------------------------------------------------------------
grub boot for xen crashes with the error message:
    Error 16: Inconsistent filesystem structure
GRUB uses fixed memory locations for MOS, DNODE, ZFS_SCRATCH...
MOS is at memory location 0x100000.
DNODE is at memory location 0x140000.
ZFS_SCRATCH is at memory location 0x180000.
Standard Solaris kernel /platform/i86pc/kernel/amd64/unix loads at
0x400000, 0x800000 and 0xC00000, and /platform/i86pc/amd64/boot_archive
is loaded at 0xd5d000 - all after grub's MOS / DNODE / ZFS_SCRATCH
location.
Xen hypervisor /boot/amd64/xen.gz is loaded at
<0x100000:0x9c878:0x58788>.
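The collision can be checked with a small range-overlap test. This is only a sketch: it assumes the GRUB triple reads as <base:filesz:bsssz>, so that the image occupies filesz + bsssz bytes starting at base, and the names range, ranges_overlap, zfs_dnode, xen_image and solaris_unix_start are made up for illustration, not taken from GRUB.

```c
#include <stdbool.h>
#include <stdint.h>

/* A half-open physical address range [start, start + len). */
struct range {
    uint32_t start;
    uint32_t len;
};

/* Two half-open ranges overlap when each starts before the other ends. */
bool ranges_overlap(struct range a, struct range b)
{
    uint64_t a_end = (uint64_t)a.start + a.len;
    uint64_t b_end = (uint64_t)b.start + b.len;
    return a.start < b_end && b.start < a_end;
}

/* DNODE area as currently placed by fsys_zfs.c (address from this message). */
const struct range zfs_dnode = { 0x140000, 512 };

/* Assumption: <base:filesz:bsssz>, so the image spans filesz + bsssz bytes. */
const struct range xen_image = { 0x100000, 0x9c878 + 0x58788 };

/* Only the start address of unix is given; the 1-byte length is a placeholder. */
const struct range solaris_unix_start = { 0x400000, 1 };
```

Under these assumptions ranges_overlap(xen_image, zfs_dnode) is true, while the standard unix load address at 0x400000 stays clear of the DNODE area.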
GRUB is able to read the first 128k of compressed data from the zfs
root, decompresses the data to address 0x100000, and the attempt to
read the next 128k block from xen.gz fails because the DNODE data is
overwritten.  Things start to fail when we find
"DNODE->dn_datablkszsec == 35656" (should be 256) in zfs_read(), that is,
a datablk size of ~18mbytes instead of the expected 128kbytes.
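The two sizes follow from dn_datablkszsec being counted in 512-byte sectors. A minimal sketch of that arithmetic (datablk_bytes and SECTOR_SHIFT are illustrative names, not identifiers from fsys_zfs.c):

```c
#include <stdint.h>

#define SECTOR_SHIFT 9  /* 512-byte sectors */

/* Convert a data block size given in 512-byte sectors to bytes. */
uint32_t datablk_bytes(uint16_t dn_datablkszsec)
{
    return (uint32_t)dn_datablkszsec << SECTOR_SHIFT;
}
```

With this, 256 sectors give 131072 bytes (128k) and the corrupted value 35656 gives 18255872 bytes (~18mbytes).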
Problem #1:
==========
fsys_zfs.c is supposed to use the following memory map:
 * (memory addr)   MOS      DNODE       ZFS_SCRATCH
 *                  |         |          |
 *          +-------V---------V----------V---------------+
 *   memory |       | dnode   | dnode    |  scratch      |
 *          |       | 512B    | 512B     |  area         |
 *          +--------------------------------------------+
Using these defines...
#define MOS                     ((dnode_phys_t *)(RAW_ADDR(0x100000)))
#define DNODE                   ((dnode_phys_t *)(MOS + DNODE_SIZE))
#define ZFS_SCRATCH             ((char *)(DNODE + DNODE_SIZE))
... the DNODE area is located "512*sizeof(dnode_phys_t)" bytes after
MOS, not 512 bytes!  Instead of 512 bytes for MOS, fsys_zfs is using
256 kbytes.   Same problem with the size for the DNODE area.
Apparently we want:
#define MOS                     ((dnode_phys_t *)(RAW_ADDR(0x100000)))
#define DNODE                   ((dnode_phys_t *)((char*)MOS + DNODE_SIZE))
#define ZFS_SCRATCH             ((char *)DNODE + DNODE_SIZE)
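The difference between the two sets of defines is plain C pointer arithmetic: adding N to a typed pointer advances it by N * sizeof(type) bytes, while adding N to a char pointer advances it by N bytes. A small sketch, using a 512-byte stand-in struct (fake_dnode_t and the static arena are assumptions for illustration, not the real dnode_phys_t or GRUB memory):

```c
#include <assert.h>
#include <stddef.h>

#define DNODE_SIZE 512

/* Stand-in for dnode_phys_t: only its 512-byte size matters here. */
typedef struct { unsigned char bytes[DNODE_SIZE]; } fake_dnode_t;
static_assert(sizeof(fake_dnode_t) == DNODE_SIZE, "stand-in must be 512 bytes");

/* Large enough that both pointer computations stay inside one array. */
static fake_dnode_t arena[DNODE_SIZE + 1];

/* Byte offset produced by the original define: (MOS + DNODE_SIZE). */
size_t offset_typed(void)
{
    fake_dnode_t *mos = arena;
    fake_dnode_t *dnode = mos + DNODE_SIZE;   /* steps 512 elements */
    return (size_t)((char *)dnode - (char *)mos);
}

/* Byte offset produced by the fixed define: ((char *)MOS + DNODE_SIZE). */
size_t offset_bytes(void)
{
    fake_dnode_t *mos = arena;
    fake_dnode_t *dnode = (fake_dnode_t *)((char *)mos + DNODE_SIZE);
    return (size_t)((char *)dnode - (char *)mos);
}
```

offset_typed() yields 0x40000 (256 kbytes), which matches DNODE ending up at 0x140000, and offset_bytes() yields the intended 512.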
Problem #2:
==========
We should find a better base address for MOS/DNODE/ZFS_SCRATCH
This seems to be the memory in use by GRUB:
 0x007be BOOT_PART_TABLE
 0x01000-0x1fff STAGE1_STACK / real mode stage2 STACKOFF (< 0x2000)
 0x02000 MB_CMDLINE_BUF
 0x07C00 BOOTSEC_LOCATION / MBR
 0x08000 stage1 / PBR (start.S)
 0x08200 stage2 (asm.S)
 0x10000 LINUX_ZIMAGE_ADDR
 0x60000-0x67fff protected mode stack
 0x68000 FSYS_BUF (filesystem (not raw device) buffer / 32k)
 0x70000 BUFFERADDR (raw device buffer / 31.5K)
 0x77e00 SCRATCHADDR (512-byte scratch area)
 0x78000 PASSWORD_BUF ... MENU_BUF
 0x80000 free?
 0x90000 LINUX_OLD_REAL_MODE_ADDR
 0xA0000 Video memory?
 0xB0000 HERCULES_VIDEO_ADDR
0x100000 LINUX_BZIMAGE_ADDR / XEN
Maybe reusing 0x90000 could work (because we don't want to boot old
linux stuff)?
Or the FSYS_BUF at 0x68000?  Other fsys_xxx modules use the 32k at
0x68000 FSYS_BUF.
Well, I experimented with these addresses, but the problem seems to be
that ZFS_SCRATCH needs *lots* of free space. All the areas below 0x100000
appear to be too small for fsys_zfs.c
I'm currently using 0x4000000 as MOS base address, as an ugly workaround,
to boot both standard Solaris kernels and the xen hypervisor:
#define MOS                     ((dnode_phys_t *)(RAW_ADDR(0x4000000)))
#define DNODE                   ((dnode_phys_t *)((char *)MOS + DNODE_SIZE))
#define ZFS_SCRATCH             ((char *)DNODE + DNODE_SIZE)
I guess another option would be to change the load address in the
xen hypervisor from 0x100000 to 0x400000 (just like
/platform/i86pc/kernel/unix)?  That'll leave ~3MB of free space for
ZFS_SCRATCH ...
Lin Ling
2007-Jul-24  02:12 UTC
Re: GRUB, zfs-root + Xen: Error 16: Inconsistent filesystem structure
Hi Juergen,

Thanks for the findings, see inline comments:

Juergen Keil wrote:
> Problem #1:
> ==========
[...]
> Apparently we want:
>
> #define MOS                     ((dnode_phys_t *)(RAW_ADDR(0x100000)))
> #define DNODE                   ((dnode_phys_t *)((char*)MOS + DNODE_SIZE))
> #define ZFS_SCRATCH             ((char *)DNODE + DNODE_SIZE)

I will putback this fix along with 6541114.

> Problem #2:
> ==========
[...]
> I guess another option would be to change the load address in the
> xen hypervisor from 0x100000 to 0x400000 (just like
> /platform/i86pc/kernel/unix)?  That'll leave ~3MB of free space for
> ZFS_SCRATCH ...

Changing the load address in the xen hypervisor from 0x100000 to 0x400000
makes sense to me.

Thanks,
Lin
Joe Bonasera
2007-Jul-24  16:05 UTC
Re: GRUB, zfs-root + Xen: Error 16: Inconsistent filesystem structure
Changing Xen's load address is something that we really
don't have as much control over as one would want.

We could customize the version of Xen that we ship with
Solaris, but that makes our Dom0 incompatible with
every other Xen being shipped in the open source world
out there -- which would make Solaris look rather bad and rather
defeats some of the purpose of open source.

Even if the Xen people are willing to take back the change to
the load address, it would throw a major monkey wrench into
our current schedule.

I'm wondering if it would be simpler to fix ZFS grub to not use hard
coded physical addresses. That seems to be a rather poor
design, especially since address space above 1Meg is what
GRUB traditionally leaves for any OS it's booting.

Did the ZFS module try to use the GRUB dynamic allocation
mechanism and find problems or did you just not even try
that direction?

Joe
Joe Bonasera
2007-Jul-24  16:36 UTC
Re: GRUB, zfs-root + Xen: Error 16: Inconsistent filesystem structure
Thinking about this, I think the simplest fix may be to
locate the memory via something like:

    uint64_t top;
    top = ... top of physical memory ...
    if (top > 4Gig)
        top = 4Gig
    zfs_addresses to use = top - AMOUNT needed

That's because GRUB and loaded kernel modules use
physical memory from the bottom up in order to boot
on smaller memory machines.

The top of physical memory should be available pretty
easily, as GRUB is passing that information on to the
booting OS.

Joe
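Joe's pseudocode can be turned into compilable C roughly as follows. This is a sketch under stated assumptions, not what was eventually put back: it assumes the top of memory is derived from a Multiboot-style mem_upper value (kilobytes of memory above 1 MiB), and zfs_region_base, ONE_MIB and FOUR_GIB are hypothetical names.

```c
#include <stdint.h>

#define ONE_MIB  (UINT64_C(1) << 20)
#define FOUR_GIB (UINT64_C(1) << 32)

/*
 * Pick a base address for the ZFS buffers just below the top of
 * physical memory, clamped to 4 GiB so it stays 32-bit addressable.
 * mem_upper_kb: assumed to be KiB of memory above 1 MiB.
 * Returns 0 if the memory above 1 MiB cannot hold amount_needed.
 */
uint64_t zfs_region_base(uint32_t mem_upper_kb, uint64_t amount_needed)
{
    uint64_t top = ONE_MIB + (uint64_t)mem_upper_kb * 1024;

    if (top > FOUR_GIB)
        top = FOUR_GIB;
    if (amount_needed > top - ONE_MIB)
        return 0;
    return top - amount_needed;
}
```

For a 128 MB machine (mem_upper_kb = 127 * 1024) and 1 MB needed, this gives 0x7F00000. Any other user of high memory, such as a decompressor's scratch area, would still have to be kept from overlapping this region.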
Lin Ling
2007-Jul-24  17:45 UTC
Re: GRUB, zfs-root + Xen: Error 16: Inconsistent filesystem structure
Joe,

This sounds like a possible solution.
I will file a development/zfs/boot bug to track this.

Thanks,
Lin
Joe Bonasera
2007-Jul-24  17:49 UTC
Re: GRUB, zfs-root + Xen: Error 16: Inconsistent filesystem structure
I've been looking through the grub source and it seems like
the only gotcha is that GRUB gunzip.c is using something like
this trick already to get scratch memory. It shouldn't be too
hard to make zfs and gunzip.c not step on each other.
>>>>> >>>>> >>>>> >>>> Changing the load address in the xen hypervisor from 0x100000 to >>>> 0x400000 >>>> makes sense to me. >>>> >>>> Thanks, >>>> Lin >>>> _______________________________________________ >>>> xen-discuss mailing list >>>> xen-discuss@opensolaris.org >>> >>> _______________________________________________ >>> xen-discuss mailing list >>> xen-discuss@opensolaris.org >>
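
The quoted proposal can be sketched in plain C roughly as follows. This is only an illustration, not GRUB code: the function name zfs_scratch_base(), the 4 MB AMOUNT_NEEDED figure and the parameter mem_upper_kb are assumptions. It assumes the top of memory is reported as a count of kilobytes above the first megabyte, as in the multiboot mem_upper field.

```c
#include <assert.h>
#include <stdint.h>

#define FOUR_GIG      0x100000000ULL  /* 4 GB addressability limit */
#define AMOUNT_NEEDED 0x400000ULL     /* assumed: 4 MB for the ZFS areas */

/*
 * Hypothetical helper: mem_upper_kb is the amount of memory above 1 MB,
 * in kilobytes. Returns the lowest address of the region set aside for
 * the ZFS areas, measured down from the (clamped) top of memory.
 */
static uint64_t
zfs_scratch_base(uint64_t mem_upper_kb)
{
    uint64_t top = (mem_upper_kb << 10) + 0x100000;

    if (top > FOUR_GIG)
        top = FOUR_GIG;

    /* Guard against machines too small to hold the region at all. */
    assert(top >= AMOUNT_NEEDED);
    return (top - AMOUNT_NEEDED);
}
```

Under these assumptions a 128 MB machine (127 MB above the first megabyte) gives a base of 0x7C00000, and a machine with 8 GB is clamped so that the region ends at 4 GB and starts at 0xFFC00000.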
Lin Ling
2007-Jul-24  18:13 UTC
Re: GRUB, zfs-root + Xen: Error 16: Inconsistent filesystem structure
>> I will file a development/zfs/boot bug to track this.

6584769 is it.

Lin
Juergen Keil
2007-Jul-24  18:19 UTC
Re: GRUB, zfs-root + Xen: Error 16: Inconsistent filesystem structure
It might not be that easy, because
usr/src/grub/grub-0.95/stage2/gunzip.c is already using the
top physical memory during gzip decompression, see functions
linalloc() and reset_linalloc().

> Thinking about this, I think the simplest fix may be to
> locate the memory via something like:
>
>     uint64_t top;
>     top = ... top of physical memory ...
>     if (top > 4Gig)
>         top = 4Gig
>     zfs_addresses to use = top - AMOUNT needed
>
> That's because GRUB and loaded kernel modules use
> physical memory from the bottom direction in order to boot
> on smaller memory machines.
>
> The top of physical memory should be available pretty
> easily, as GRUB is passing that information on to the
> booting OS.
>
> Joe
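
The conflict can be pictured with a small model of a top-down allocator. This is not the gunzip.c source: model_reset() and model_alloc() are hypothetical stand-ins for reset_linalloc() and linalloc(), and the 4 MB reservation is an assumed figure. The only point is that if the decompressor starts allocating below a region set aside for ZFS, the two users of high memory no longer overlap.

```c
#include <assert.h>
#include <stdint.h>

#define ZFS_RESERVED 0x400000u  /* assumed size of the region kept for ZFS */

/* The next allocation ends below this address. */
static uintptr_t model_topaddr;

/* Start allocating just below the region reserved for ZFS. */
static void
model_reset(uintptr_t top_of_memory)
{
    assert(top_of_memory > ZFS_RESERVED);
    model_topaddr = top_of_memory - ZFS_RESERVED;
}

/* Hand out 'size' bytes, growing downward, rounded down to 4 bytes. */
static uintptr_t
model_alloc(uintptr_t size)
{
    model_topaddr = (model_topaddr - size) & ~(uintptr_t)3;
    return (model_topaddr);
}
```

With a top of memory of 0x8000000, every block returned by model_alloc() ends at or below 0x7C00000, so the range 0x7C00000 to 0x8000000 stays free for the ZFS areas.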
Lin Ling
2007-Jul-24  18:21 UTC
Re: GRUB, zfs-root + Xen: Error 16: Inconsistent filesystem structure
Just noticed that Joe has filed one.
So it is 6584697 instead.

Lin

Lin Ling wrote:
>
>>> I will file a development/zfs/boot bug to track this.
>
> 6584769 is it.
>
> Lin
>
Juergen Keil
2007-Jul-24  18:23 UTC
Re: GRUB, zfs-root + Xen: Error 16: Inconsistent filesystem structure
> Did the ZFS module try to use the GRUB dynamic allocation
> mechanism and find problems or did you just not even try
> that direction?

GRUB dynamic allocation mechanism? What's that?
Joe Bonasera
2007-Jul-24  19:39 UTC
Re: GRUB, zfs-root + Xen: Error 16: Inconsistent filesystem structure
I filed one for development/kernel/xen, as we'll need to
apply a fix in matrix-gate before it gets to onnv-gate.
Here are the minimal grub diffs I'm thinking will work,
based on /ws/matrix-gate:
------- usr/src/grub/grub-0.95/stage2/shared.h -------
Index: usr/src/grub/grub-0.95/stage2/shared.h
*** /ws/matrix-gate/usr/src/grub/grub-0.95/stage2/shared.h      Wed Apr 18 13:34:31 2007
--- /export/build/josephb/ws.zgrub/usr/src/grub/grub-0.95/stage2/shared.h      Tue Jul 24 12:35:06 2007
***************
*** 44,49 ****
--- 44,52 ----
   # define RAW_SEG(x) (x)
   #endif
+ /* ZFS will use the top 4 Meg of physical memory (below 4Gig) for scratch */
+ #define ZFS_SCRATCH_SIZE 0x400000
+
   #define       MAXNAMELEN      256
   #define MIN(x, y) ((x) < (y) ? (x) : (y))
------- usr/src/grub/grub-0.95/stage2/gunzip.c -------
Index: usr/src/grub/grub-0.95/stage2/gunzip.c
*** /ws/matrix-gate/usr/src/grub/grub-0.95/stage2/gunzip.c      Fri Jan 12 14:50:58 2007
--- /export/build/josephb/ws.zgrub/usr/src/grub/grub-0.95/stage2/gunzip.c      Tue Jul 24 12:35:37 2007
***************
*** 174,179 ****
--- 174,180 ----
   reset_linalloc (void)
   {
     linalloc_topaddr = RAW_ADDR ((mbi.mem_upper << 10) + 0x100000);
+   linalloc_topaddr -= ZFS_SCRATCH_SIZE;
   }
------- usr/src/grub/grub-0.95/stage2/fsys_zfs.h -------
Index: usr/src/grub/grub-0.95/stage2/fsys_zfs.h
*** /ws/matrix-gate/usr/src/grub/grub-0.95/stage2/fsys_zfs.h    Wed Apr 18 13:35:25 2007
--- /export/build/josephb/ws.zgrub/usr/src/grub/grub-0.95/stage2/fsys_zfs.h    Tue Jul 24 12:37:24 2007
***************
*** 23,29 ****
   #ifndef _FSYS_ZFS_H
   #define       _FSYS_ZFS_H
! #pragma ident "@(#)fsys_zfs.h 1.1     07/03/27 SMI"
   #ifdef        FSYS_ZFS
--- 23,29 ----
   #ifndef _FSYS_ZFS_H
   #define       _FSYS_ZFS_H
! #pragma ident "%Z%%M% %I%     %E% SMI"
   #ifdef        FSYS_ZFS
***************
*** 59,65 ****
   /*
    * Global Memory addresses to store MOS and DNODE data
    */
! #define       MOS                     ((dnode_phys_t *)(RAW_ADDR(0x100000)))
   #define       DNODE                   ((dnode_phys_t *)(MOS + DNODE_SIZE))
   #define       ZFS_SCRATCH             ((char *)(DNODE + DNODE_SIZE))
--- 59,66 ----
   /*
    * Global Memory addresses to store MOS and DNODE data
    */
! #define       MOS                     ((dnode_phys_t *)\
!       (RAW_ADDR ((mbi.mem_upper << 10) + 0x100000) - ZFS_SCRATCH_SIZE))
   #define       DNODE                   ((dnode_phys_t *)(MOS + DNODE_SIZE))
   #define       ZFS_SCRATCH             ((char *)(DNODE + DNODE_SIZE))
Lin Ling wrote:
>
> Just noticed that Joe has filed one.
> So it is 6584697 instead.
> 
> Lin
> 
> Lin Ling wrote:
>>
>>>> I will file a development/zfs/boot bug to track this.
>>
>> 6584769 is it.
>>
>> Lin
>>
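
The DNODE and ZFS_SCRATCH lines appear only as unchanged context in the fsys_zfs.h diff, so they still add DNODE_SIZE to a dnode_phys_t pointer. A minimal sketch of how C scales that addition, assuming a 512-byte dnode and a DNODE_SIZE of 512 (fake_dnode_t, pool and both helper functions are made up for the illustration):

```c
#include <stddef.h>

#define DNODE_SIZE 512

/* Stand-in for dnode_phys_t; assumed to be 512 bytes. */
typedef struct {
    unsigned char bytes[512];
} fake_dnode_t;

/* One extra element so that pool + DNODE_SIZE stays within the array. */
static fake_dnode_t pool[DNODE_SIZE + 1];

/* Byte distance when DNODE_SIZE is added to a typed pointer. */
static size_t
typed_offset(void)
{
    fake_dnode_t *next = pool + DNODE_SIZE;

    return ((size_t)((char *)next - (char *)pool));
}

/* Byte distance when the pointer is cast to char * first. */
static size_t
byte_offset(void)
{
    char *next = (char *)pool + DNODE_SIZE;

    return ((size_t)(next - (char *)pool));
}
```

Under these assumptions typed_offset() yields 262144 (256 KB) and byte_offset() yields 512, so with the typed form the MOS and DNODE slots together would take 512 KB out of the 4 MB region before ZFS_SCRATCH begins.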
Juergen Keil
2007-Jul-25  09:22 UTC
Re: GRUB, zfs-root + Xen: Error 16: Inconsistent filesystem structure
A possible refinement is to put the MOS and DNODE at some low core
fixed address (that is: using the memory starting at FSYS_BUF; just
like other grub filesystems do), and only put the ZFS_SCRATCH area
at the top of available memory area. Something like this
(Note: untested)

#define MOS             ((dnode_phys_t *)FSYS_BUF)
#define DNODE           ((dnode_phys_t *)((char*)MOS + DNODE_SIZE))
#define ZFS_SCRATCH     (RAW_ADDR ((mbi.mem_upper << 10) + 0x100000) - \
                         ZFS_SCRATCH_SIZE)

This might produce a bit more compact fsys_zfs.o code.

> I filed one for development/kernel/xen, as we'll need to
> apply a fix in matrix-gate before it gets to onnv-gate.
>
> Here are the minimal grub diffs I'm thinking will work,
> based on /ws/matrix-gate:
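
The resulting layout can be checked with a few lines of arithmetic. This is a sketch under stated assumptions, not GRUB code: struct zfs_layout and zfs_layout_for() are hypothetical, FSYS_BUF is taken as 0x68000 with a 32k length as in the memory map listed earlier in the thread, DNODE_SIZE is assumed to be 512 bytes, and the top of memory is assumed to fit below 4 GB.

```c
#include <assert.h>
#include <stdint.h>

#define FSYS_BUF         0x68000u   /* filesystem buffer base */
#define FSYS_BUFLEN      0x8000u    /* 32k, per the memory map */
#define DNODE_SIZE       512u       /* assumed size of one dnode slot */
#define ZFS_SCRATCH_SIZE 0x400000u  /* 4 MB at the top of memory */

struct zfs_layout {
    uint32_t mos;      /* address of the MOS dnode slot */
    uint32_t dnode;    /* address of the DNODE slot */
    uint32_t scratch;  /* base of the scratch area */
};

/* mem_upper_kb: kilobytes of memory above the first megabyte. */
static struct zfs_layout
zfs_layout_for(uint32_t mem_upper_kb)
{
    struct zfs_layout l;

    l.mos = FSYS_BUF;
    l.dnode = l.mos + DNODE_SIZE;
    l.scratch = (mem_upper_kb << 10) + 0x100000u - ZFS_SCRATCH_SIZE;

    /* Both dnode slots must fit inside the 32k filesystem buffer. */
    assert(l.dnode + DNODE_SIZE <= FSYS_BUF + FSYS_BUFLEN);
    return (l);
}
```

For a 128 MB machine this puts MOS at 0x68000, DNODE at 0x68200 and the scratch area at 0x7C00000, so the two dnode slots use only 1 KB of the low buffer and the whole 4 MB at the top is left for scratch.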
Jürgen Keil
2007-Jul-26  09:23 UTC
Re: GRUB, zfs-root + Xen: Error 16: Inconsistent filesystem structure
The description text for bug 6584697 contains this:

> I'm going to investigate fixing this by locating the
> GRUB ZFS memory areas dynamically down from the top of
> physical memory or the 4 Gig addressability limit,
> whichever is lower.
>
> So I managed to get a lab system into this setup..
> By grub fix worked enough to get Xen to boot and
> dom0 to make it through startup..
>
> However...
>
> startup.c:1970: Enabling interrupts
> startup.c:1981: startup_end() done
> startup.c:2072: Unmapping lower boot pages
> startup.c:2089: Releasing boot pages
> startup.c:2103: Boot pages released
> WARNING: init(1M) exited on fatal signal 9: restarting automatically
> WARNING: init(1M) exited on fatal signal 9: restarting automatically
> WARNING: init(1M) exited on fatal signal 9: restarting automatically
> WARNING: init(1M) exited on fatal signal 9: restarting automatically
> WARNING: init(1M) exited on fatal signal 9: restarting automatically
> and so on forever..
>
> there are obviously more issues with ZFS root and Xen to explore.

That doesn't happen on my ZFS root + Xen setups.

Last time I had an "init(1M) exited on fatal signal 9" problem was
because of 6572151 / 6332924:

Bug ID    6572151
Synopsis  snv boot failure since snv_66

Bug ID    6332924
Synopsis  snv_24 /usr/ccs/bin/as adds new HWCAP tags to previously untagged objects

But that's an issue on old cpus without SSE support only. Not sure
if that bug could explain the init(1M) crash from 6584697.

Another bug is:

Bug ID    6423745
Synopsis  zfs root pool created while booted 64 bit can not be booted 32 bit

> Just noticed that Joe has filed one.
> So it is 6584697 instead.
>
> Lin
>
> Lin Ling wrote:
> >
> >>> I will file a development/zfs/boot bug to track this.
> >
> > 6584769 is it.