On Wed, Mar 30, 2011 at 9:24 AM, Alexander Pyhalov <alp@rsu.ru> wrote:
> Hello.
> I have IBM blade, which is connected to EMC Clarion disk storage (2 FC
> adapters connected to 2 FC switches, so system sees 4 paths to storage). One
> lun is provided to the system. The problem is that FreeBSD doesn't boot
> randomly (at least 1 attempt from 5 boots is unsuccessful). The blade stalls
> and I see only blank screen.
> I've tried other operating systems - OpenIndiana b148 and Ubuntu 10.10 -
> each of them has booted perfectly 10 times without any issues.
>
> I don't see any messages from boot1 stage and system is logged in to EMC
> storage with only one path. When the system boots successfully, I can see on
> EMC Clarion that it is connected with all paths. I've tried to use boot0
> from CURRENT - results are the same (boot fails randomly).
> How can I debug this issue?
>
> Additional info:
> # uname -a
> FreeBSD fbsdhost5.xx 8.2-RELEASE FreeBSD 8.2-RELEASE #0 r219027M: Wed Mar 9 15:12:21 MSK 2011 alp@xx:/usr/obj/usr/src-releng-8.2/sys/ibm-hs-21xm-vnet-amd64.releng-8.2 amd64
>
> # camcontrol devlist -v
> scbus0 on isp0 bus 0:
> <DGC RAID 5 0326>                  at scbus0 target 0 lun 0 (sg0,pass0,da0)
> <DGC RAID 5 0326>                  at scbus0 target 1 lun 0 (sg1,pass1,da1)
> <>                                 at scbus0 target -1 lun -1 ()
> scbus1 on isp1 bus 0:
> <DGC RAID 5 0326>                  at scbus1 target 0 lun 0 (sg2,pass2,da2)
> <DGC RAID 5 0326>                  at scbus1 target 1 lun 0 (sg3,pass3,da3)
> <>                                 at scbus1 target -1 lun -1 ()
> scbus-1 on xpt0 bus 0:
> <>                                 at scbus-1 target -1 lun -1 (xpt0)
>
> # gmultipath status
>                   Name  Status  Components
> multipath/fbsdhost5tst     N/A  da0
>                                 da1
>                                 da2
>                                 da3
> # gpart show
> =>      63  33554367  multipath/fbsdhost5tst  MBR  (16G)
>         63  33543657                       1  freebsd  [active]  (16G)
>   33543720     10710                          - free -  (5.2M)
>
> =>       0  33543657  multipath/fbsdhost5tsts1  BSD  (16G)
>          0        16                            - free -  (8.0K)
>         16  18863577                         1  freebsd-ufs  (9.0G)
>   18863593   4194304                         2  freebsd-swap  (2.0G)
>   23057897   2097152                         4  freebsd-ufs  (1.0G)
>   25155049   8388608                         5  freebsd-ufs  (4.0G)
>
> # boot0cfg -v /dev/multipath/fbsdhost5tst
> #   flag     start chs   type       end chs       offset         size
> 1   0x80      0:  1: 1   0xa5     39:254:63           63     33543657
>
> version=2.0 drive=0x80 mask=0xf ticks=182 bell=# (0x23)
> options=packet,update,nosetdrv
> volume serial ID 9090-9090
> default_selection=F1 (Slice 1)
>
> # df
> Filesystem                     1K-blocks    Used   Avail Capacity  Mounted on
> /dev/multipath/fbsdhost5tsts1a   9129786 4522594 3876810    54%    /
> devfs                                  1       1       0   100%    /dev
> /dev/multipath/fbsdhost5tsts1d   1012974      12  931926     0%    /tmp
> /dev/multipath/fbsdhost5tsts1e   4058062  141846 3591572     4%    /var
>
> --
> Best regards,
> Alexander Pyhalov,
> system administrator of Computer Center of Southern Federal University
>
I would like to mention a similar problem that I have observed myself.
It is not limited to FreeBSD: BSD-based operating systems in general (such
as PC-BSD, NetBSD, and DragonFlyBSD), regardless of version, show the same
behavior.
Suppose a non-BSD operating system has been booted on my computer (Intel
DG965WH board) and then shut down. When I next boot a BSD-based operating
system, it invariably crashes at some point, typically when it needs to
accept user input. At that point I cannot tell whether the keyboard is
locked or something else is wrong. Whatever the cause, a hard reset of the
computer is required. The second and subsequent boots are successful.
Once another operating system has been booted, the same cycle of one failed
boot followed by successful boots starts again.
None of the other operating systems (mostly Linux) shows this behavior,
whatever operating system was booted previously. As far as I can tell, the
issue is specific to BSD-based operating systems.
I do not know the cause, but I suspect that something is missing from the
boot code, most likely in the initialization code at the very beginning.
My guess is that the first, unsuccessful boot sets some value(s) before it
crashes; subsequent boots find those value(s) already set and succeed, until
another operating system sets them differently.
Thank you very much.
Mehmet Erol Sanliturk