Artem Belevich
2013-Nov-03 03:22 UTC
Can't mount root from raidz2 after r255763 in stable/9
Hi,
I have a box with root mounted from an 8-disk raidz2 ZFS volume.
After a recent buildworld I've run into an issue where the kernel fails to
mount root with error 6.
r255763 on stable/9 is the first revision that fails to mount root on
my box. The preceding r255749 boots fine.
Commit r255763
(http://svnweb.freebsd.org/base?view=revision&revision=255763)
MFCs a bunch of changes from 10, but I don't see anything that obviously
impacts ZFS.
Attempting to boot with vfs.zfs.debug=1 shows that the order in which geom
providers are probed by ZFS has apparently changed. Kernels that boot
show "guid match for provider /dev/gpt/<valid pool slice>", while
failing kernels show "guid match for provider /dev/daX" -- the raw
disks, which are *not* the right geom providers for my pool slices. Beats
me why ZFS picks raw disks over the GPT partitions it should use.
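To compare the two cases side by side, the debug lines can be sorted by provider type. The following is only a minimal sketch: `classify_providers` is a hypothetical helper name, and it assumes the message has exactly the form quoted above ("guid match for provider <path>", possibly followed by a period), which may not match every kernel version.

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): reads kernel debug output on
# stdin and, for each "guid match for provider <path>" line, prints whether
# the provider is a GPT label (/dev/gpt/...) or anything else (e.g. a raw disk).
# Assumption: the message format is the one quoted in this post.
classify_providers() {
    # Extract the provider path, then drop a trailing period if present.
    sed -n 's/.*guid match for provider \([^ ]*\).*/\1/p' |
    sed 's/\.$//' |
    while read -r prov; do
        case "$prov" in
            /dev/gpt/*) echo "label $prov" ;;
            *)          echo "other $prov" ;;
        esac
    done
}
```

On a kernel that boots, something like `dmesg | classify_providers` should then list only `label` entries for this pool; any `other /dev/daX` entry would correspond to the raw-disk matches seen on the failing kernels.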
Pool configuration:
#zpool status z0
  pool: z0
 state: ONLINE
  scan: scrub repaired 0 in 8h57m with 0 errors on Sat Oct 19 20:23:52 2013
config:

    NAME                 STATE     READ WRITE CKSUM
    z0                   ONLINE       0     0     0
      raidz2-0           ONLINE       0     0     0
        gpt/da0p4-z0     ONLINE       0     0     0
        gpt/da1p4-z0     ONLINE       0     0     0
        gpt/da2p4-z0     ONLINE       0     0     0
        gpt/da3p4-z0     ONLINE       0     0     0
        gpt/da4p4-z0     ONLINE       0     0     0
        gpt/da5p4-z0     ONLINE       0     0     0
        gpt/da6p4-z0     ONLINE       0     0     0
        gpt/da7p4-z0     ONLINE       0     0     0
    logs
      mirror-1           ONLINE       0     0     0
        gpt/ssd-zil-z0   ONLINE       0     0     0
        gpt/ssd1-zil-z0  ONLINE       0     0     0
    cache
      gpt/ssd1-l2arc-z0  ONLINE       0     0     0

errors: No known data errors
Here are screen captures from a failed boot:
https://plus.google.com/photos/+ArtemBelevich/albums/5941857781891332785
And here's the boot log from a successful boot on the same system:
http://pastebin.com/XCwebsh7
Removing ZIL and L2ARC makes no difference -- r255763 still fails to mount root.
I'm thoroughly baffled. Is there something wrong with the pool --
some junk metadata somewhere on the disk that now screws with the root
mounting? Changed order in geom provider enumeration? Something else?
Any suggestions on what I can do to debug this further?
--Artem
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to
"freebsd-stable-unsubscribe@freebsd.org"
Andriy Gapon
2013-Nov-03 08:02 UTC
Re: Can't mount root from raidz2 after r255763 in stable/9
on 03/11/2013 05:22 Artem Belevich said the following:
> Hi,
>
> I have a box with root mounted from 8-disk raidz2 ZFS volume.
> After recent buildworld I've ran into an issue that kernel fails to
> mount root with error 6.
> r255763 on stable/9 is the first revision that fails to mount root on
> mybox. Preceding r255749 boots fine.
>
> Commit r255763 (http://svnweb.freebsd.org/base?view=revision&revision=255763)
> MFCs bunch of changes from 10 but I don't see anything that obviously
> impacts ZFS.

Indeed.

> Attempting to boot with vfs.zfs.debug=1 shows that order in which geom
> providers are probed by zfs has apparently changed. Kernels that boot,
> show "guid match for provider /dev/gpt/<valid pool slice>" while
> failing kernels show "guid match for provider /dev/daX" -- the raw
> disks that are *not* the right geom provider for my pool slices. Beats
> me why ZFS picks raw disks over GPT partitions it should have.

Perhaps the kernel gpart code fails to recognize the partitions and thus ZFS
can't see them?

> Pool configuration:
> #zpool status z0
>   pool: z0
>  state: ONLINE
>   scan: scrub repaired 0 in 8h57m with 0 errors on Sat Oct 19 20:23:52 2013
> config:
>
>     NAME                 STATE     READ WRITE CKSUM
>     z0                   ONLINE       0     0     0
>       raidz2-0           ONLINE       0     0     0
>         gpt/da0p4-z0     ONLINE       0     0     0
>         gpt/da1p4-z0     ONLINE       0     0     0
>         gpt/da2p4-z0     ONLINE       0     0     0
>         gpt/da3p4-z0     ONLINE       0     0     0
>         gpt/da4p4-z0     ONLINE       0     0     0
>         gpt/da5p4-z0     ONLINE       0     0     0
>         gpt/da6p4-z0     ONLINE       0     0     0
>         gpt/da7p4-z0     ONLINE       0     0     0
>     logs
>       mirror-1           ONLINE       0     0     0
>         gpt/ssd-zil-z0   ONLINE       0     0     0
>         gpt/ssd1-zil-z0  ONLINE       0     0     0
>     cache
>       gpt/ssd1-l2arc-z0  ONLINE       0     0     0
>
> errors: No known data errors
>
> Here are screen captures from a failed boot:
> https://plus.google.com/photos/+ArtemBelevich/albums/5941857781891332785

I don't have permission to view this album.

> And here's boot log from successful boot on the same system:
> http://pastebin.com/XCwebsh7
>
> Removing ZIL and L2ARC makes no difference -- r255763 still fails to mount root.
>
> I'm thoroughly baffled. Is there's something wrong with the pool --
> some junk metadata somewhere on the disk that now screws with the root
> mounting? Changed order in geom provider enumeration? Something else?
> Any suggestions on what I can do to debug this further?

gpart.

--
Andriy Gapon