Warner Losh
2018-Oct-22 16:26 UTC
head -r338804 boots threadripper 1950X fine; head -r338810+ do not; -r338807 seems implicated
On Mon, Oct 22, 2018 at 6:39 AM Mark Millard <marklmi at yahoo.com> wrote:

> On 2018-Oct-22, at 4:07 AM, Toomas Soome <tsoome at me.com> wrote:
>
> > On 22 Oct 2018, at 13:58, Mark Millard <marklmi at yahoo.com> wrote:
> >>
> >> On 2018-Oct-22, at 2:27 AM, Toomas Soome <tsoome at me.com> wrote:
> >>>
> >>>> On 22 Oct 2018, at 06:30, Warner Losh <imp at bsdimp.com> wrote:
> >>>>
> >>>> On Sun, Oct 21, 2018 at 9:28 PM Warner Losh <imp at bsdimp.com> wrote:
> >>>>
> >>>>> On Sun, Oct 21, 2018 at 8:57 PM Mark Millard via freebsd-stable <freebsd-stable at freebsd.org> wrote:
> >>>>>
> >>>>>> [I built based on WITHOUT_ZFS= for other reasons. But,
> >>>>>> after installing the build, Hyper-V based boots are
> >>>>>> working.]
> >>>>>>
> >>>>>> On 2018-Oct-20, at 2:09 AM, Mark Millard <marklmi at yahoo.com> wrote:
> >>>>>>
> >>>>>>> On 2018-Oct-20, at 1:39 AM, Mark Millard <marklmi at yahoo.com> wrote:
> >>>>>>> . . .
> >>>>
> >>>
> >>> It would help to get output from loader lsdev -v command.
> >>
> >> That turned out to be very interesting: The non-ZFS loader
> >> crashes during the listing, during disk8, which shows a
> >> x0 instead of a x512.
> >>
> >
> > Yes, thats the root cause there. The non-zfs loader does only *read* the boot disk, thats why the issue was not revealed there.
> >
> > It would help to identify the sector size for that disk, at least from OS, so we can compare with what we can get from INT13.
> >
> > I have pretty good idea what to look there, but I am afraid we need to run few tests with you to understand why that disk is reporting sector size 0 there.
> >
>
> Looks like I guessed wrong about the device
> for "drive8".
>
> So I unplugged the only other external
> storage device, so the original drives
> 0-13 become 0-11 overall.
>
> The machine has a multi-LUN media card reader with
> no cards plugged in. It is built-in rather than
> one that I plugged into a port. It has 4 LUN's.
>
> So 8+4=12 and drives 0-7 show up with media before
> it tries any of the 4 LUN's with no card in place.
>
> I conclude that "drive8" is an empty LUN in a media
> card reader.
>
> I conclude that there is no sector size available for
> any of the empty LUNs in the media reader.
>

I think you are probably right and we're hitting some divide by 0 error when we should just ignore the disk.

Warner

> >> Hand transcribed from pictures:
> >>
> >> OK lsdev -v
> >> disk devices
> >>     disk0: BIOS drive C (937703088 x 512):
> >>       disk0p1: FreeBSD boot 512K
> >>       disk0p2: FreeBSD UFS 356G
> >>       disk0p3: FreeBSD swap 15G
> >>       disp0p4: FreeBSD swap 76G
> >>     disk1: BIOS drive D (16514064 x 512):
> >>       disk1s1: Linux 2048KB
> >>       disk1s2: Unknown 952GB
> >>     disk2: BIOS drive E (16514064 x 512):
> >>       disk2p1: Unknown 128MB
> >>     disk3: BIOS drive F (16514064 x 512):
> >>       disk3p1: Unknown 128MB
> >>     disk4: BIOS drive G (16434495 x 512):
> >>       disk2p1: Unknown 128MB
> >>       disk4p2: DOS/Windwos 1716GB
> >>     disk5: BIOS drive H (16434495 x 512):
> >>       disk5p1: FreeBSD boot 512K
> >>       disk5p2: FreeBSD UFS 176G
> >>       disk5p3: FreeBSD swap 193G
> >>       disp5p4: FreeBSD swap 15G
> >>     disk6: BIOS drive I (16434495 x 512):
> >>       disk6p1: Unknown 499MB
> >>       disk6p2: EFI 99MB
> >>       disk6p3: Unknown 16MB
> >>       disp6p4: DOS/Windows 886G
> >>     dis7: BIOS drive H (16434495 x 512):
> >>       disk7p1: FreeBSD boot 512K
> >>       disk7p2: FreeBSD UFS 953G
> >>     disk8: BIOS drive K (262144 x 0):
> >>
> >> int=00000000 err=00000000 efl=00010246 eip=000286bd
> >> eax=00000000 ebx=72b50430 ecx=00000000 edx=00000000
> >> esi=00000000 edi=00092080 ebp=00091eec esp=00091ea8
> >> cs=002b ds=0033 es=0033 fs=0033 gs=0033 ss=0033
> >> cs:eip=f7 f1 89 c1 85 d2 0f 85-d8 01 00 00 6a 05 58 85
> >>        f6 0f 88 75 01 00 00 89-cb c1 fb 1f 89 ca 03 55
> >> ss:esp=09 00 00 00 00 00 00 00-0a 00 00 00 02 00 00 00
> >>        00 00 00 00 00 00 00 00-78 1f 09 00 33 45 04 00
> >> BTX halted
> >>
> >> I expect that "disk8" is what gpart show -p
> >> from a native boot showed as:
> >>
> >> =>       1  60062499    da1  MBR  (29G)
> >>          1        31         - free -  (16K)
> >>         32  60062468  da1s1  fat32lba  (29G)
> >>
> >> (That gpart show -p output is in another of the
> >> list messages.)
> >>
> >>> Also if you could test boot loader with UEFI - for example get to loader prompt via usb/cd boot and then get the same lsdev -v output.
> >>
> >> Still true given the above crash? Or, going the
> >> other way, should "drive8" be left as it is in
> >> order to be sure to do this test with the drive
> >> present?
> >>
> >> If I do this test later, it will take a bit to
> >> get media to do it with. (It is about 4AM in the
> >> morning and I've yet to get to sleep.)
> >>
> >> Note: I've never tried a UEFI based boot of FreeBSD
> >> on this machine (but the Windows 10 Pro x64 is EFI
> >> based). The only FreeBSD context using a EFI partition
> >> to boot that I have used is on an arm aarch64
> >> Cortex-A57 system.
> >>
> >>> I would be interested to see the sector size information and if the UEFI loader does also have issues.
> >>
> >> Understood.
> >>
> >>> If it does, I'd like to see the outputs from commands:
> >>
> >>> zpool status
> >>> zpool import
> >>
> >> Independent of the UEFI test . . .
> >>
> >> I do have a -r331924 head version on another one
> >> of the devices and can native-boot that. It still
> >> has its ZFS software (but a default loader without
> >> ZFS).
> >>
> >> Trying from that context, hand transcribed:
> >>
> >> # zpool status
> >> ZFS filesystem version: 5
> >> ZFS storage pool version: features support (5000)
> >> no pools available
> >> # zpool import
> >> #
> >>
> >> [That was based on the old (default) loader being
> >> a non-ZFS one.]
> >
>
> ===
> Mark Millard
> marklmi at yahoo.com
> ( dsl-only.net went
> away in early 2018-Mar)
Mark Millard
2018-Oct-22 17:01 UTC
head -r338804 boots threadripper 1950X fine; head -r338810+ do not; -r338807 seems implicated
[I will note that the loader problem has been shown to not be
involved in the kernel problem that this "Subject:" was
originally for.]

On 2018-Oct-22, at 9:26 AM, Warner Losh <imp at bsdimp.com> wrote:

> On Mon, Oct 22, 2018 at 6:39 AM Mark Millard <marklmi at yahoo.com> wrote:
>> On 2018-Oct-22, at 4:07 AM, Toomas Soome <tsoome at me.com> wrote:
>>
>> > On 22 Oct 2018, at 13:58, Mark Millard <marklmi at yahoo.com> wrote:
>> >>
>> >> On 2018-Oct-22, at 2:27 AM, Toomas Soome <tsoome at me.com> wrote:
>> >>>
>> >>>> On 22 Oct 2018, at 06:30, Warner Losh <imp at bsdimp.com> wrote:
>> >>>>
>> >>>> On Sun, Oct 21, 2018 at 9:28 PM Warner Losh <imp at bsdimp.com> wrote:
>> >>>>
>> >>>>> On Sun, Oct 21, 2018 at 8:57 PM Mark Millard via freebsd-stable <freebsd-stable at freebsd.org> wrote:
>> >>>>>
>> >>>>>> [I built based on WITHOUT_ZFS= for other reasons. But,
>> >>>>>> after installing the build, Hyper-V based boots are
>> >>>>>> working.]
>> >>>>>>
>> >>>>>> On 2018-Oct-20, at 2:09 AM, Mark Millard <marklmi at yahoo.com> wrote:
>> >>>>>>
>> >>>>>>> On 2018-Oct-20, at 1:39 AM, Mark Millard <marklmi at yahoo.com> wrote:
>> >>>>>>> . . .
>> >>>>
>> >>>
>> >>> It would help to get output from loader lsdev -v command.
>> >>
>> >> That turned out to be very interesting: The non-ZFS loader
>> >> crashes during the listing, during disk8, which shows a
>> >> x0 instead of a x512.
>> >>
>> >
>> > Yes, thats the root cause there. The non-zfs loader does only *read* the boot disk, thats why the issue was not revealed there.
>> >
>> > It would help to identify the sector size for that disk, at least from OS, so we can compare with what we can get from INT13.
>> >
>> > I have pretty good idea what to look there, but I am afraid we need to run few tests with you to understand why that disk is reporting sector size 0 there.
>> >
>>
>> Looks like I guessed wrong about the device
>> for "drive8".
>>
>> So I unplugged the only other external
>> storage device, so the original drives
>> 0-13 become 0-11 overall.
>>
>> The machine has a multi-LUN media card reader with
>> no cards plugged in. It is built-in rather than
>> one that I plugged into a port. It has 4 LUN's.
>>
>> So 8+4=12 and drives 0-7 show up with media before
>> it tries any of the 4 LUN's with no card in place.
>>
>> I conclude that "drive8" is an empty LUN in a media
>> card reader.
>>
>> I conclude that there is no sector size available for
>> any of the empty LUNs in the media reader.
>>
> I think you are probably right and we're hitting some divide by 0 error when we should just ignore the disk.

In the Hyper-V context, the loader and kernel do not see the
4-LUN media reader at all: only drives with normal freebsd-*
style partitions and free space. This explains why I did not
see a loader problem in that context.

So I conclude that the kernel crash under Hyper-V associated
with -r338807 is a separate issue even though WITHOUT_ZFS=
seems to have avoided the crash.

My plan is to continue with the -r338807 investigation after
the loader problem is fixed in my builds. Then I'll go back to
trying builds using WITH_ZFS= (implicit), both native boots
and Hyper-V based ones.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
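
[Illustration of the "just ignore the disk" idea discussed above. The register dump quoted earlier in the thread fits the divide-by-0 reading: int=00000000 is the divide-error vector, the bytes at cs:eip start with f7 f1 (div %ecx), and ecx=00000000. What follows is only a minimal sketch, assuming the fix amounts to checking the reported sector size before any geometry arithmetic. It is not the actual loader code: struct disk_info, disk_has_media(), offset_to_sector() and list_disks() are hypothetical names made up for this example.]

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified per-disk record; not the loader's real structure. */
struct disk_info {
	const char *name;        /* e.g. "disk8" */
	uint64_t    sectors;     /* sector count as reported for the drive */
	uint32_t    sector_size; /* bytes per sector; 0 when none was reported */
};

/* Geometry arithmetic is only safe when the sector size is nonzero. */
static int
disk_has_media(const struct disk_info *d)
{
	return (d->sector_size != 0);
}

/*
 * Convert a byte offset into a sector number.
 * Returns 0 on success, or -1 without dividing when the sector size is 0.
 */
static int
offset_to_sector(const struct disk_info *d, uint64_t offset, uint64_t *lba)
{
	if (!disk_has_media(d))
		return (-1);
	*lba = offset / d->sector_size;
	return (0);
}

/* List disks, skipping those without a sector size; returns the number skipped. */
static int
list_disks(const struct disk_info *disks, int count)
{
	int skipped = 0;

	for (int i = 0; i < count; i++) {
		if (!disk_has_media(&disks[i])) {
			printf("%s: no media, skipped\n", disks[i].name);
			skipped++;
			continue;
		}
		printf("%s: %llu x %u\n", disks[i].name,
		    (unsigned long long)disks[i].sectors,
		    (unsigned)disks[i].sector_size);
	}
	return (skipped);
}
```

[With entries modelled on the disk0 (937703088 x 512) and disk8 (262144 x 0) lines of the lsdev -v listing, list_disks() would print the first and skip the second, and offset_to_sector() would return -1 for disk8 instead of executing the division.]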