On 29 Dec 2014, at 17:04, Steven Hartland <killing at multiplay.co.uk> wrote:

> On 29/12/2014 16:02, Dr Josef Karthauser wrote:
>> On 29 Dec 2014, at 08:36, Steven Hartland <killing at multiplay.co.uk> wrote:
>>> That looks like a 10.0 boot, not a 10.1 boot. Could you confirm, and provide a 10.1 boot if that's the case, please Joe?
>> Whoops! Sorry.
>>
>> Attached is a verbose boot time dmesg for the 10.1 that causes the problem under load.
>> I immediately rebooted back onto 10.0, so the (un-verbose) dmesg for that follows.
>>
>> Joe
> Thanks Joe, actually a verbose boot from 10.0 for comparison would be good too.

Ok - I'm attaching a 10.0 and a 10.1 verbose boot, and a diff of the two.

> Also something to try on the 10.1 to see if it makes any difference, add the following to /boot/loader.conf or run from the loader prompt:
> hint.ahci.0.msi=1
>
> You can also try =0 as well if 1 makes no difference.

I'll try these later when the machine's less busy.

Joe

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 10.0-1.diff
Type: application/octet-stream
Size: 15431 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20141230/972429a6/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dmesg.boot-10.0
Type: application/octet-stream
Size: 42400 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20141230/972429a6/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dmesg.boot-10.1
Type: application/octet-stream
Size: 42789 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20141230/972429a6/attachment-0002.obj>

On 30/12/2014 14:08, Dr Josef Karthauser wrote:
> On 29 Dec 2014, at 17:04, Steven Hartland <killing at multiplay.co.uk> wrote:
>
>> On 29/12/2014 16:02, Dr Josef Karthauser wrote:
>>> On 29 Dec 2014, at 08:36, Steven Hartland <killing at multiplay.co.uk> wrote:
>>>> That looks like a 10.0 boot, not a 10.1 boot. Could you confirm, and provide a 10.1 boot if that's the case, please Joe?
>>> Whoops! Sorry.
>>>
>>> Attached is a verbose boot time dmesg for the 10.1 that causes the problem under load.
>>> I immediately rebooted back onto 10.0, so the (un-verbose) dmesg for that follows.
>>>
>>> Joe
>> Thanks Joe, actually a verbose boot from 10.0 for comparison would be good too.
> Ok - I'm attaching a 10.0 and a 10.1 verbose boot, and a diff of the two.
>
>> Also something to try on the 10.1 to see if it makes any difference, add the following to /boot/loader.conf or run from the loader prompt:
>> hint.ahci.0.msi=1
>>
>> You can also try =0 as well if 1 makes no difference.
> I'll try these later when the machine's less busy.

Ah, now that's very interesting! On 10.0 we're only allocating 1 of the possible 8 MSI vectors, whereas on 10.1 we're allocating all 8.

== 10.0 ==
ahci0: <ATI IXP700 AHCI SATA controller> port 0xc000-0xc007,0xb000-0xb003,0xa000-0xa007,0x9000-0x9003,0x8000-0x800f mem 0xfe5ffc00-0xfe5fffff irq 19 at device 17.0 on pci0
ahci0: attempting to allocate 1 MSI vectors (8 supported)
msi: routing MSI IRQ 263 to local APIC 0 vector 58
ahci0: using IRQ 263 for MSI

== 10.1 ==
ahci0: <AMD SB7x0/SB8x0/SB9x0 AHCI SATA controller> port 0xc000-0xc007,0xb000-0xb003,0xa000-0xa007,0x9000-0x9003,0x8000-0x800f mem 0xfe5ffc00-0xfe5fffff irq 19 at device 17.0 on pci0
ahci0: attempting to allocate 8 MSI vectors (8 supported)
msi: routing MSI IRQ 263 to local APIC 0 vector 64
msi: routing MSI IRQ 264 to local APIC 0 vector 65
msi: routing MSI IRQ 265 to local APIC 0 vector 66
msi: routing MSI IRQ 266 to local APIC 0 vector 67
msi: routing MSI IRQ 267 to local APIC 0 vector 68
msi: routing MSI IRQ 268 to local APIC 0 vector 69
msi: routing MSI IRQ 269 to local APIC 0 vector 70
msi: routing MSI IRQ 270 to local APIC 0 vector 71
ahci0: using IRQs 263-270 for MSI

This change was brought into stable/10 by r260387 and originally came from r256843.

I don't believe there's anything wrong with the change itself, but if this is indeed the cause, it could indicate a hardware bug that is triggered when throughput is increased by the use of multiple MSI vectors.

This theory is strengthened by the fact that ATI's previous generation HW (SB600) had MSI disabled by r245875 due to a very similar issue.

So given all the evidence so far, hint.ahci.0.msi=1 may well be the fix.

    Regards
    Steve

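The difference between the two boots can also be checked mechanically rather than by eye. What follows is a minimal sketch, assuming the verbose dmesg text is available as a string. The name `msi_summary` and both regular expressions are illustrative only and not part of any FreeBSD tool. It reads the requested and supported vector counts from the `attempting to allocate` line and collects the `msi: routing MSI IRQ` lines that directly follow it.

```python
import re

# Excerpts copied from the 10.0 and 10.1 verbose boots quoted in this thread.
DMESG_10_0 = """\
ahci0: attempting to allocate 1 MSI vectors (8 supported)
msi: routing MSI IRQ 263 to local APIC 0 vector 58
ahci0: using IRQ 263 for MSI
"""

DMESG_10_1 = """\
ahci0: attempting to allocate 8 MSI vectors (8 supported)
msi: routing MSI IRQ 263 to local APIC 0 vector 64
msi: routing MSI IRQ 264 to local APIC 0 vector 65
msi: routing MSI IRQ 265 to local APIC 0 vector 66
msi: routing MSI IRQ 266 to local APIC 0 vector 67
msi: routing MSI IRQ 267 to local APIC 0 vector 68
msi: routing MSI IRQ 268 to local APIC 0 vector 69
msi: routing MSI IRQ 269 to local APIC 0 vector 70
msi: routing MSI IRQ 270 to local APIC 0 vector 71
ahci0: using IRQs 263-270 for MSI
"""

# Illustrative patterns, written to match the lines shown above.
ALLOC_RE = re.compile(
    r"(ahci\d+): attempting to allocate (\d+) MSI vectors \((\d+) supported\)"
)
ROUTE_RE = re.compile(r"msi: routing MSI IRQ (\d+) to local APIC \d+ vector \d+")


def msi_summary(dmesg):
    """Summarise the first AHCI MSI allocation found in a dmesg string.

    Returns None when no 'attempting to allocate' line is present.
    Only routing lines immediately after the allocation line are counted,
    so MSI routes belonging to other devices are not picked up.
    """
    lines = dmesg.splitlines()
    for i, line in enumerate(lines):
        alloc = ALLOC_RE.match(line)
        if alloc is None:
            continue
        irqs = []
        for following in lines[i + 1:]:
            route = ROUTE_RE.match(following)
            if route is None:
                break
            irqs.append(int(route.group(1)))
        return {
            "device": alloc.group(1),
            "requested": int(alloc.group(2)),
            "supported": int(alloc.group(3)),
            "routed_irqs": irqs,
        }
    return None


if __name__ == "__main__":
    for label, text in (("10.0", DMESG_10_0), ("10.1", DMESG_10_1)):
        print(label, msi_summary(text))
```

Running the same function over a boot taken with hint.ahci.0.msi=1 set would be one way to confirm that 10.1 has dropped back to a single vector before repeating the load test.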
On 30 Dec 2014, at 16:34, Steven Hartland <killing at multiplay.co.uk> wrote:

> This is strengthened by the fact that ATI's previous generation HW (SB600) had MSI disabled by r245875 due to a very similar issue.
>
> So given all the evidence so far, hint.ahci.0.msi=1 may well be the fix.

Is there any benefit to also trying msi values greater than 1 and less than 8?

Joe
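
On the question of values between 1 and 8: going by ahci(4), the hint appears to be a mode selector rather than a vector count, with 0 disabling MSI, 1 using a single vector, and 2 using multiple vectors where supported (the default). That reading should be checked against the man page on the release in use. A minimal sketch that builds the loader.conf line for each mode follows. `MSI_MODES` and `loader_hint` are hypothetical names used for illustration only.

```python
# Modes for hint.ahci.X.msi as described in ahci(4).
# Assumption: verify these against the man page on your release.
MSI_MODES = {
    0: "MSI disabled",
    1: "single MSI vector, if supported",
    2: "multiple MSI vectors, if supported (default)",
}


def loader_hint(unit, mode):
    """Return the /boot/loader.conf line selecting an MSI mode for ahci<unit>."""
    if unit < 0:
        raise ValueError("unit must be non-negative")
    if mode not in MSI_MODES:
        raise ValueError(
            f"unsupported msi mode {mode!r}; expected one of {sorted(MSI_MODES)}"
        )
    return f"hint.ahci.{unit}.msi={mode}"


if __name__ == "__main__":
    # Print one candidate line per mode, with its meaning as a comment.
    for mode, meaning in MSI_MODES.items():
        print(f"{loader_hint(0, mode)}    # {meaning}")
```

Under that reading, the useful experiments on this controller are 1 and 0, as already suggested in the thread, since 2 matches what 10.1 does by default.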