Milan Holzäpfel
2004-Dec-01 20:37 UTC
[Xen-devel] nForce SATA lockup - problem location tracked down
Hello,

I finally did some more tests on the system with the nForce3-250Gb SATA controller, whose driver locks up the system at boot time when running inside Xen.

The following was used:
- a mainboard with the nForce3-250Gb chipset, which has an S-ATA controller
- sata_nv from 2.6.10-rc2-bk11 pushed into vanilla 2.6.9, patched with Xen and reiser4
- some Xen stable bk snapshot from a few days ago

Files: <URL:http://mjh.name/misc-files/xen-caps-20041130/>
- cap.*: captures from the serial console
- cap.linux.*: native Linux bootup
- cap.xenolinux.*: bootup of Linux inside Xen
- cap.*.dbg.*: kernel command line option "debug" passed
- cap.*.nd-dbg: libata debug enabled (ATA_DEBUG, ATA_VERBOSE_DEBUG, ATA_IRQ_TRAP)
- also there are two kernel .config files, and lspci output (from native Linux)
- cap.xenolinux.extra-dbg: log output with some extra debug options added by me
- driver/scsi/libata-core.c, include/linux/libata.h, kernel/sched.c: files with my extra stuff added

The lockup takes place in drivers/scsi/libata-core.c, in the function ata_dev_set_xfermode(). The call which does not return is wait_for_completion(&wait) (line 1837). Inside wait_for_completion(), which is defined in kernel/sched.c, the last call reached is schedule() (line 2862).

I added some extra debug output to wait_for_completion() and schedule(), which shows that schedule() keeps running, checking some state every few moments. I'd assume that schedule() checks whether the thread blocked in wait_for_completion() should be woken up, but this condition never seems to be fulfilled, maybe because of some address gibberish or whatever.

At the site mentioned above you can also find my modified versions of libata-core.c, libata.h and sched.c, and the boot output when using these versions (cap.xenolinux.extra-dbg).

I hope this information helps you get a closer view of the problem, and maybe even an idea for a solution, since the deeper I dig into all this code, the more other code I have to read to understand what's actually going on. (And hey, I am by no means an experienced C programmer ;) )

I'd be happy to provide whatever other information might be useful, however.

TIA
Milan

-- 
Milan Holzäpfel alias jagdfalke alias jag

Replies sent directly to me should please go to an address that can be found here:
Contact info and GnuPG public key: <URL:http://con.mjh.name/>
GnuPG fingerprint: 4C8A 5FAF 5D32 6125 89D1 0CE5 DB0C AF4F 6583 7966

http://www.deppenleerzeichen.de/

-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now.
http://productguide.itmanagersjournal.com/
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xen-devel
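The symptom reported above, a wait_for_completion() call that never returns, is what one would expect if the event that should signal the completion is never delivered. The following is a user-space analogy in Python, not kernel code: the class Completion and the function issue_command_and_wait() are hypothetical names invented for this sketch, and the timeout exists only so that the demonstration terminates (the kernel's wait_for_completion() as used here has none, which is why the boot hangs instead of failing).

```python
import threading


class Completion:
    """User-space stand-in for a kernel completion (analogy only)."""

    def __init__(self):
        self._done = threading.Event()

    def complete(self):
        # In the kernel, this would be called from the interrupt handler.
        self._done.set()

    def wait(self, timeout=None):
        # Returns True once complete() has been called,
        # False if the timeout expires first.
        return self._done.wait(timeout)


def issue_command_and_wait(irq_delivered, timeout=1.0):
    """Pretend to issue a device command, then wait for its 'interrupt'."""
    done = Completion()
    if irq_delivered:
        # Simulate the controller raising its interrupt shortly afterwards.
        threading.Timer(0.01, done.complete).start()
    # If nothing ever calls complete(), only the timeout ends this wait.
    return done.wait(timeout)


if __name__ == "__main__":
    print("interrupt delivered:", issue_command_and_wait(True))
    print("interrupt lost:     ", issue_command_and_wait(False, timeout=0.2))
```

With the interrupt delivered, the waiter is woken almost immediately. With it lost, the waiter sleeps until the artificial timeout, while the scheduler keeps running other work in the meantime. This would match the observation that schedule() "runs on and on" without the blocked thread ever being woken.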
Keir Fraser
2004-Dec-02 09:30 UTC
Re: [Xen-devel] nForce SATA lockup - problem location tracked down
> I finally did some more tests with system with nForce3-250Gb SATA
> controller, whose driver locks the system at boot time when inside xen.

Looks like an interrupt problem. We plan to start using more of the Linux DOM0 platform code in our next release, which should avoid these problems. It may also be that you have some large-numbered IRQs and we can simply extend Xen to support those. Can you post the output of 'cat /proc/interrupts' from your working Linux installation?

 -- Keir
Milan Holzäpfel
2004-Dec-02 15:13 UTC
Re: [Xen-devel] nForce SATA lockup - problem location tracked down
On Thu, 02 Dec 2004 09:30:52 +0000
Keir Fraser <Keir.Fraser@cl.cam.ac.uk> wrote:

> > I finally did some more tests with system with nForce3-250Gb SATA
> > controller, whose driver locks the system at boot time when inside xen.
>
> Looks like an interrupt problem. We plan to start using more of the
> Linux DOM0 platform code in our next release which should avoid these
> problems. It also may be that you have some large-numbered IRQs and
> we can simply extend Xen to support those. Can you post the output of
> 'cat /proc/interrupts' from your working Linux installation?

/proc/interrupts on 2.6.9:

|            CPU0
|   0:    2002603          XT-PIC  timer
|   1:       5087    IO-APIC-edge  i8042
|   8:          2    IO-APIC-edge  rtc
|   9:          0   IO-APIC-level  acpi
|  12:         67    IO-APIC-edge  i8042
|  14:        467    IO-APIC-edge  ide0
|  15:       2331    IO-APIC-edge  ide1
|  17:         80   IO-APIC-level  EMU10K1
|  19:     261861   IO-APIC-level  fcdsl
|  20:          2   IO-APIC-level  ehci_hcd
|  21:      88475   IO-APIC-level  libata, ohci_hcd
|  22:          0   IO-APIC-level  ohci_hcd, NVidia CK8S
|  23:     198342   IO-APIC-level  eth0
| NMI:          0
| LOC:    2002471
| ERR:          1
| MIS:          0

What follows is from private mail, but I guess it may be useful for other people too...

On Thu, 02 Dec 2004 09:33:44 +0000
Keir Fraser <Keir.Fraser@cl.cam.ac.uk> wrote:

> Further to my previous mail, I actually suspect that your setup is
> doomed until we start using the ACPI code in DOM0 Linux. It looks like
> you need a pretty complete ACPI configuration in order to set up IRQ
> routing correctly. That is not getting done under Xen/XenLinux and so
> your sata interrupts are going nowhere. :-(
>
> Does your system work with 2.4 kernels? Does your system work if you
> compile a non-ACPI kernel?

When booting 2.6.9 without ACPI, it hangs at the same position as Linux does inside Xen (at least judging by the messages usually displayed, which I think should be a good enough indication).

Since I haven't run any 2.4 kernel on the installation I normally use, I built and tried to boot a 2.4.28 on a smaller "rescue" installation, which doesn't have up-to-date GCCs. I first tried with 3.4.1, then with 3.3.3 (the most current GCCs in portage are 3.3.4/3.4.3), but the result was the same: the last line I get is from grub saying "file ok, booting the kernel" or something like that, and then the system resets. (Giving panic=10 didn't change anything.) I could try compiling a 2.4 kernel with GCC 3.3.4 and with an updated bin86, but I doubt this would change anything (?)

Also, I'm using "unofficial" Gentoo profiles, which use gcc-kernel-headers from 2.6 instead of from 2.4, but well, the problem occurs far before glibc is even touched :))

I'd be happy to try and report back once the changes you mentioned are in place (unstable bk or whatever, I wouldn't mind...).

Once this is working, can you say whether it will be possible to give a non-privileged 2.4 kernel functional access to a PCI device? (The reason I'm asking is that I have a freakin' mostly-binary-only driver for my ADSL hardware, but some version of that driver for 2.4 is said to be stable, so my idea was to run a driver domain using 2.4 for this crappy piece of hardware...)

> -- Keir

Regards,
Milan
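To relate the /proc/interrupts listing in the message above to the "large-numbered IRQs" question, the listing can be filtered mechanically. The sketch below is only an illustration. The helper name parse_interrupts() is hypothetical, the parser assumes a single CPU column as in the posted output, and reading "large-numbered" as "above the 16 legacy PIC lines (0 to 15)" is an assumption rather than something stated in the thread.

```python
# Sample data copied from the posted /proc/interrupts output (2.6.9, one CPU).
SAMPLE = """\
|            CPU0
|   0:    2002603          XT-PIC  timer
|   1:       5087    IO-APIC-edge  i8042
|   8:          2    IO-APIC-edge  rtc
|   9:          0   IO-APIC-level  acpi
|  12:         67    IO-APIC-edge  i8042
|  14:        467    IO-APIC-edge  ide0
|  15:       2331    IO-APIC-edge  ide1
|  17:         80   IO-APIC-level  EMU10K1
|  19:     261861   IO-APIC-level  fcdsl
|  20:          2   IO-APIC-level  ehci_hcd
|  21:      88475   IO-APIC-level  libata, ohci_hcd
|  22:          0   IO-APIC-level  ohci_hcd, NVidia CK8S
|  23:     198342   IO-APIC-level  eth0
| NMI:          0
| LOC:    2002471
| ERR:          1
| MIS:          0
"""


def parse_interrupts(text):
    """Return (irq, count, controller, devices) for each numbered IRQ line."""
    rows = []
    for line in text.splitlines():
        # Drop the "| " quoting prefix used in the mail, if present.
        head, sep, rest = line.lstrip("| ").partition(":")
        if not sep or not head.strip().isdigit():
            continue  # header line, or a summary row such as NMI/LOC/ERR/MIS
        fields = rest.split()
        count, controller = int(fields[0]), fields[1]
        # Several drivers sharing one IRQ are separated by commas.
        devices = [d.strip() for d in " ".join(fields[2:]).split(",")]
        rows.append((int(head), count, controller, devices))
    return rows


if __name__ == "__main__":
    for irq, count, controller, devices in parse_interrupts(SAMPLE):
        if irq > 15:  # assumption: "large-numbered" means beyond the legacy PIC
            print(irq, controller, devices)
```

On the posted data, this selects IRQs 17, 19, 20, 21, 22 and 23. All of them are IO-APIC-level lines, and libata shares IRQ 21 with ohci_hcd.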