Konstantin Belousov
2015-Aug-28 10:35 UTC
ia64 stable/10 r286316: hang at Entering /boot/kernel/kernel
On Fri, Aug 28, 2015 at 11:30:18AM +0100, Anton Shterenlikht wrote:
> >From kostikbel at gmail.com Thu Aug 27 18:22:37 2015
> >
> >On Thu, Aug 27, 2015 at 01:12:16PM +0100, Anton Shterenlikht wrote:
> >> ia64 stable/10 r286315 boots, but
> >> r286316 hangs at "Entering /boot/kernel/kernel".
> >>
> >> Please advise
> >
> >To state an obvious thing. The commit which you pointed to, changes
> >the code which is not executed at that early kernel boot stage. The
> >revision cannot cause the consequences you described.
>
> yes, I'm surprised too.
>
> >I think that you either have build-environment issue which randomly pops
> >up, or there is some other boot-time issue which is sporadic. The only
> >suggestion I have, try many boots with kernels which look either good
> >or bad, I would be not surprised if statistic would be completely
> >different from binary good/bad outcome.
> >
> >Otherwise, I do not have an idea.
> >
>
> I doubt it's a random or a sporadic issue.
> I did a bisection, as suggested, during which
> I built world/kernel on 7 revisions, and when I
> narrowed it down to <50, a further 4 kernels.
> All kernels <=286315 boot, all kernels >= 286316
> do not. I think if it were something random,
> it wouldn't be such a clear cut picture.
>
> What about my loader.conf:
>
> # cat /boot/loader.conf
> zfs_load="YES"
> # soft limits
> kern.dfldsiz=536748032 # default soft limit for process data
> kern.dflssiz=536748032 # default soft limit for stack
> # hard limits
> kern.maxdsiz=536748032 # hard limit for process data
> kern.maxssiz=536748032 # hard limit for stack
> kern.maxtsiz=536748032 # hard limit for text size
> # processes may not exceed these limits.
> #
>
> My memory:
>
> real memory = 8589934592 (8192 MB)
> avail memory = 8387649536 (7999 MB)
>
> I'll try disabling all these settings in loader.conf
> and see if makes a difference.
> But these settings have been there for a few years
> with no problems.

In the initial range you mentioned, there were some changes related to
the handling of the userspace stacks. But again, the problem occurs too
early for a userspace-related modification to affect the outcome.

Perhaps try the latest stable/10 kernel with the problematic revision
r286316 reverted? This might lend more weight to Marcel's note about
some static relocation table being processed early.
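The bisection Anton describes can be sketched roughly as below. This is only an illustration: `boots_ok` is a hypothetical stand-in for the manual build, install and reboot cycle, hard-wired here to the outcome reported in the thread, and the starting endpoints are made up. It also assumes the failure is deterministic, which is exactly the point under discussion.

```shell
#!/bin/sh
# Hypothetical predicate: succeeds if a kernel built at revision $1 boots.
# In reality this is a manual build/install/reboot test; here it simply
# encodes the result Anton reported (<= 286315 boots, >= 286316 hangs).
boots_ok() {
    [ "$1" -le 286315 ]
}

# Bisect between a known-good and a known-bad revision and print the
# first revision for which boots_ok fails.
first_bad() {
    good=$1
    bad=$2
    while [ $((bad - good)) -gt 1 ]; do
        mid=$(( (good + bad) / 2 ))
        if boots_ok "$mid"; then
            good=$mid
        else
            bad=$mid
        fi
    done
    echo "$bad"
}

# Illustrative endpoints, not taken from the thread.
first_bad 286000 286400   # prints 286316
```

If the hang were sporadic rather than deterministic, the predicate would give inconsistent answers and the search would converge on an arbitrary revision, which is why repeated boots of a "good" and a "bad" kernel are a useful cross-check.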
Marcel Moolenaar
2015-Aug-28 20:32 UTC
ia64 stable/10 r286316: hang at Entering /boot/kernel/kernel
> On Aug 28, 2015, at 3:35 AM, Konstantin Belousov <kostikbel at gmail.com> wrote:
>
> Might be, try the latest stable/10 kernel with the problematic revision
> r286316 reversed ? This might add more points to the Marcel' note about
> some static relocation table processed early.

I built a kernel off of revision 286315 and got this:

    eris% objdump -R kernel | grep FPTR64LSB | wc -l
        5377

We only reserve room for 4096 relocations, so we're over
as it is.

A kernel off of revision 286316 gave me this:

    eris% objdump -R kernel | grep FPTR64LSB | wc -l
        5377

Same. Odd, but ok. It's possible that the memory layout
changed such that we now scribble over something that's
important.

To be sure: Anton can you apply the following patch and
tell me if it makes a difference. It doubles the space
we set aside for relocations.

Index: sys/ia64/ia64/locore.S
===================================================================
--- sys/ia64/ia64/locore.S      (revision 286316)
+++ sys/ia64/ia64/locore.S      (working copy)
@@ -357,5 +357,5 @@
         .align  16
         .global fptr_storage
 fptr_storage:
-        .space  4096*16 // XXX
+        .space  8192*16 // XXX
 fptr_storage_end:

--
Marcel Moolenaar
marcel at xcllnt.net
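Marcel's comparison of the relocation count against the reserved space can be turned into a quick check. The sketch below assumes, as the `.space 4096*16` line in the patch suggests, that each FPTR64LSB relocation takes one 16-byte slot in fptr_storage; the helper name `fptr_overflow` is made up for illustration and is not part of the FreeBSD tree.

```shell
#!/bin/sh
# Hypothetical helper: print how many relocation entries do not fit
# in the reserved table (0 if they all fit).
# Assumption: one 16-byte slot per FPTR64LSB relocation.
fptr_overflow() {
    relocs=$1   # e.g. from: objdump -R kernel | grep -c FPTR64LSB
    slots=$2    # number of slots reserved in fptr_storage
    if [ "$relocs" -gt "$slots" ]; then
        echo $((relocs - slots))
    else
        echo 0
    fi
}

fptr_overflow 5377 4096   # before the patch: prints 1281
fptr_overflow 5377 8192   # after the patch:  prints 0
```

Under that assumption the unpatched kernel overruns the table by 1281 entries, or 1281*16 = 20496 bytes, and what lies in those bytes depends on the memory layout of the particular build.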
Anton Shterenlikht
2015-Sep-01 15:43 UTC
ia64 stable/10 r286316: hang at Entering /boot/kernel/kernel
>From marcel at xcllnt.net Fri Aug 28 23:15:06 2015
>
>> On Aug 28, 2015, at 3:35 AM, Konstantin Belousov <kostikbel at gmail.com> wrote:
>>
>> Might be, try the latest stable/10 kernel with the problematic revision
>> r286316 reversed ? This might add more points to the Marcel' note about
>> some static relocation table processed early.
>
>I built a kernel off of revision 286315 and got this:
>
>    eris% objdump -R kernel | grep FPTR64LSB | wc -l
>        5377
>
>We only reserve room for 4096 relocations, so we're over
>as it is.
>
>A kernel off of revision 286316 gave me this:
>
>    eris% objdump -R kernel | grep FPTR64LSB | wc -l
>        5377
>
>Same. Odd, but ok. It's possible that the memory layout
>changed such that we now scribble over something that's
>important.
>
>To be sure: Anton can you apply the following patch and
>tell me if it makes a difference. It doubles the space
>we set aside for relocations.
>
>Index: sys/ia64/ia64/locore.S
>===================================================================
>--- sys/ia64/ia64/locore.S      (revision 286316)
>+++ sys/ia64/ia64/locore.S      (working copy)
>@@ -357,5 +357,5 @@
>         .align  16
>         .global fptr_storage
> fptr_storage:
>-        .space  4096*16 // XXX
>+        .space  8192*16 // XXX
> fptr_storage_end:

So, 286316 boots ok without the patch if I remove
everything from /boot/loader.conf.

With the patch, and with

kern.dfldsiz=536748032 # default soft limit for process data
kern.dflssiz=536748032 # default soft limit for stack
# hard limits
kern.maxdsiz=536748032 # hard limit for process data
kern.maxssiz=536748032 # hard limit for stack
kern.maxtsiz=536748032 # hard limit for text size

First time round I got:

da1: Command Queueing enabled
da1: 17366MB (35566478 512 byte sectors: 255H 63S/T 2213C)

Loader variables:

Manual root filesystem specification:
  <fstype>:<device> [options]
      Mount <device> using filesystem <fstype>
      and with the specified (optional) option list.

    eg. ufs:/dev/da0s1a
        zfs:tank
        cd9660:/dev/acd0 ro
          (which is equivalent to: mount -t cd9660 -o ro /dev/acd0 /)

  ?               List valid disk boot devices
  .               Yield 1 second (for background tasks)
  <empty line>    Abort manual input

mountroot>

And following an auto-reboot:

OK boot -s
Booting...
Entering /boot/kernel/kernel at 0x9ffc000000010500...

I'll do a few more tries now.

Anton
Anton Shterenlikht
2015-Sep-02 17:30 UTC
ia64 stable/10 r286316: hang at Entering /boot/kernel/kernel
The kernel limits I have in /boot/loader.conf are following this PR:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=156900

#kern.dfldsiz=536748032 # default soft limit for process data
#kern.dflssiz=536748032 # default soft limit for stack
# hard limits
#kern.maxdsiz=536748032 # hard limit for process data
#kern.maxssiz=536748032 # hard limit for stack
#kern.maxtsiz=536748032 # hard limit for text size

If I leave these in, then with Marcel's patch I get to
mountroot> prompt, and then panic. If I remove these,
then I can boot.

If I try booting with -s, then I get to a hang at
"Entering /boot/kernel":

/boot/kernel.old/kernel text=0x1110710 data=0xdfce8+0xa54f8 syms=[0x8+0xc29e8+0x8+0xb78f6]
Booting...
Entering /boot/kernel.old/kernel at 0x9ffc000000010500...
***********************************************************
* ROM Version : 04.29
* ROM Date : 11/30/2007
* BMC Version : 04.04

Is it worth investigating what limit value will boot?

Anton