I've recently acquired an AMD64 box (dual Opteron 242, SiS Master@-FAR
motherboard
(http://www.msi.com.tw/program/products/server/svr/pro_svr_detail.php?UID=484).
See below for more details).  I find it very unstable running with 8
GB memory, though 4 GB are not a problem.  At first I thought it was
the onboard peripherals, but after disabling them it still persisted.

What's unstable?  I only once got it through the boot process.
Running a 5.3-RELEASE i386 kernel it panics, though I haven't
investigated the panic (yet), since I'm not interested in the i386
kernel.  The amd64 5.4-PRERELEASE kernel just hangs/freezes.  When the
peripherals are enabled, it's after probing the onboard NIC (bge) and
before probing SATA (no drives present).  I've done a verbose boot, of
course, but no additional information is present.  The NIC is
recognized, and that's all.

Without the peripherals, but with a 3Com 3c905 PCI NIC, it continues
beyond this point, but doesn't enable the NIC.  I don't have dmesg
output for these attempts, so I can't produce the exact message, and I
suspect it's not important.  It continues until trying to mount NFS
file systems, where it hangs for obvious reasons.  Pressing ^C causes
the system to either panic (and be unable to dump because I don't have
that much swap) or just hang.

None of these problems occur when I use 4 GB memory.  About the only
strangeness, which seems to come from the BIOS, is that it recognizes
only 3.5 GB.  If I put all DIMMs in, it recognizes the full 8 GB
memory.

I realize that this isn't enough to diagnose the problem.  The reason
for this message now is to ask:

  1.  Has anybody else seen this problem?
  2.  Has anybody else used this hardware configuration and *not* seen
      this problem?
  3.  Where should I look next?

I'm attaching the (non-verbose) dmesg from a successful boot.

Greg
--
See complete headers for address and phone numbers.
Mar 30 14:17:16 obelix kernel: FreeBSD 5.4-PRERELEASE #0: Tue Mar 22 04:02:17 UTC 2005
Mar 30 14:17:16 obelix kernel: root@obelix:/usr/obj/src/FreeBSD/OBELIX/src/sys/OBELIX
Mar 30 14:17:16 obelix kernel: Timecounter "i8254" frequency 1193182 Hz quality 0
Mar 30 14:17:16 obelix kernel: CPU: AMD Opteron(tm) Processor 242 (1603.65-MHz K8-class CPU)
Mar 30 14:17:16 obelix kernel: Origin = "AuthenticAMD" Id = 0xf5a Stepping = 10
Mar 30 14:17:16 obelix kernel: Features=0x78bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2>
Mar 30 14:17:16 obelix kernel: AMD Features=0xe0500800<SYSCALL,NX,MMX+,LM,3DNow+,3DNow>
Mar 30 14:17:16 obelix kernel: real memory = 3756916736 (3582 MB)
Mar 30 14:17:16 obelix kernel: avail memory = 3623907328 (3456 MB)
Mar 30 14:17:16 obelix kernel: ACPI APIC Table: <VIAK8 AWRDACPI>
Mar 30 14:17:16 obelix kernel: FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
Mar 30 14:17:16 obelix kernel: cpu0 (BSP): APIC ID: 0
Mar 30 14:17:16 obelix kernel: cpu1 (AP): APIC ID: 1
Mar 30 14:17:16 obelix kernel: ioapic0: Changing APIC ID to 2
Mar 30 14:17:16 obelix kernel: ioapic0 <Version 0.3> irqs 0-23 on motherboard
Mar 30 14:17:16 obelix kernel: acpi0: <VIAK8 AWRDACPI> on motherboard
Mar 30 14:17:16 obelix kernel: acpi0: Power Button (fixed)
Mar 30 14:17:16 obelix kernel: Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
Mar 30 14:17:16 obelix kernel: acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
Mar 30 14:17:16 obelix kernel: cpu0: <ACPI CPU> on acpi0
Mar 30 14:17:16 obelix kernel: cpu1: <ACPI CPU> on acpi0
Mar 30 14:17:16 obelix kernel: acpi_button0: <Power Button> on acpi0
Mar 30 14:17:16 obelix kernel: pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
Mar 30 14:17:16 obelix kernel: pci0: <ACPI PCI bus> on pcib0
Mar 30 14:17:16 obelix kernel: pcib1: <PCI-PCI bridge> at device 1.0 on pci0
Mar 30 14:17:16 obelix kernel: pci1: <PCI bus> on pcib1
Mar 30 14:17:16 obelix kernel: pci1: <display, VGA> at device 0.0 (no driver attached)
Mar 30 14:17:16 obelix kernel: xl0: <3Com 3c905C-TX Fast Etherlink XL> port 0xd000-0xd07f mem 0xfb000000-0xfb00007f irq 18 at device 7.0 on pci0
Mar 30 14:17:16 obelix kernel: miibus0: <MII bus> on xl0
Mar 30 14:17:16 obelix kernel: xlphy0: <3c905C 10/100 internal PHY> on miibus0
Mar 30 14:17:16 obelix kernel: xlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
Mar 30 14:17:16 obelix kernel: xl0: Ethernet address: 00:50:da:cf:17:d3
Mar 30 14:17:16 obelix kernel: atapci0: <VIA 8237 UDMA133 controller> port 0xd400-0xd40f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 15.0 on pci0
Mar 30 14:17:16 obelix kernel: ata0: channel #0 on atapci0
Mar 30 14:17:16 obelix kernel: ata1: channel #1 on atapci0
Mar 30 14:17:16 obelix kernel: uhci0: <VIA 83C572 USB controller> port 0xd800-0xd81f irq 21 at device 16.0 on pci0
Mar 30 14:17:16 obelix kernel: usb0: <VIA 83C572 USB controller> on uhci0
Mar 30 14:17:16 obelix kernel: usb0: USB revision 1.0
Mar 30 14:17:16 obelix kernel: uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
Mar 30 14:17:16 obelix kernel: uhub0: 2 ports with 2 removable, self powered
Mar 30 14:17:16 obelix kernel: uhci1: <VIA 83C572 USB controller> port 0xdc00-0xdc1f irq 21 at device 16.1 on pci0
Mar 30 14:17:16 obelix kernel: usb1: <VIA 83C572 USB controller> on uhci1
Mar 30 14:17:16 obelix kernel: usb1: USB revision 1.0
Mar 30 14:17:16 obelix kernel: uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
Mar 30 14:17:16 obelix kernel: uhub1: 2 ports with 2 removable, self powered
Mar 30 14:17:16 obelix kernel: uhci2: <VIA 83C572 USB controller> port 0xe000-0xe01f irq 21 at device 16.2 on pci0
Mar 30 14:17:16 obelix kernel: usb2: <VIA 83C572 USB controller> on uhci2
Mar 30 14:17:16 obelix kernel: usb2: USB revision 1.0
Mar 30 14:17:16 obelix kernel: uhub2: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
Mar 30 14:17:16 obelix kernel: uhub2: 2 ports with 2 removable, self powered
Mar 30 14:17:16 obelix kernel: pci0: <serial bus, USB> at device 16.4 (no driver attached)
Mar 30 14:17:16 obelix kernel: isab0: <PCI-ISA bridge> at device 17.0 on pci0
Mar 30 14:17:16 obelix kernel: isa0: <ISA bus> on isab0
Mar 30 14:17:16 obelix kernel: pci0: <multimedia, audio> at device 17.5 (no driver attached)
Mar 30 14:17:16 obelix kernel: acpi_tz0: <Thermal Zone> on acpi0
Mar 30 14:17:16 obelix kernel: fdc0: <floppy drive controller> port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
Mar 30 14:17:16 obelix kernel: sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
Mar 30 14:17:16 obelix kernel: sio0: type 16550A
Mar 30 14:17:16 obelix kernel: sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
Mar 30 14:17:16 obelix kernel: sio1: type 16550A
Mar 30 14:17:16 obelix kernel: ppc0: <Standard parallel printer port> port 0x778-0x77b,0x378-0x37f irq 7 on acpi0
Mar 30 14:17:16 obelix kernel: ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
Mar 30 14:17:16 obelix kernel: ppbus0: <Parallel port bus> on ppc0
Mar 30 14:17:16 obelix kernel: plip0: <PLIP network interface> on ppbus0
Mar 30 14:17:16 obelix kernel: lpt0: <Printer> on ppbus0
Mar 30 14:17:16 obelix kernel: lpt0: Interrupt-driven port
Mar 30 14:17:16 obelix kernel: ppi0: <Parallel I/O> on ppbus0
Mar 30 14:17:16 obelix kernel: atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
Mar 30 14:17:16 obelix kernel: atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
Mar 30 14:17:16 obelix kernel: kbd0 at atkbd0
Mar 30 14:17:16 obelix kernel: orm0: <ISA Option ROMs> at iomem 0xd0000-0xd07ff,0xc0000-0xcffff on isa0
Mar 30 14:17:16 obelix kernel: sc0: <System console> at flags 0x100 on isa0
Mar 30 14:17:16 obelix kernel: sc0: VGA <16 virtual consoles, flags=0x300>
Mar 30 14:17:16 obelix kernel: vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Mar 30 14:17:16 obelix kernel: Timecounters tick every 1.000 msec
Mar 30 14:17:16 obelix kernel: ad0: 190782MB <ST3200826A/3.01> [387621/16/63] at ata0-master UDMA100
Mar 30 14:17:16 obelix kernel: ad1: 190782MB <ST3200826A/3.01> [387621/16/63] at ata0-slave UDMA100
Mar 30 14:17:16 obelix kernel: acd0: DVDR <PIONEER DVD-RW DVR-108/1.04> at ata1-master UDMA66
Mar 30 14:17:16 obelix kernel: SMP: AP CPU #1 Launched!
Mar 30 14:17:16 obelix kernel: Mounting root from ufs:/dev/ad0s1a
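As a rough cross-check of the 3.5 GB figure, the "real memory" line in the dmesg above can be compared with a full 4 GiB. The sketch below parses the two memory lines and prints the shortfall. The helper name parse_memory_lines is made up for illustration. Reading the missing amount as address space reserved below 4 GB for PCI devices is an assumption; the dmesg itself does not confirm it.

```python
import re

# Two lines copied from the dmesg above (syslog prefix removed).
DMESG = """\
real memory = 3756916736 (3582 MB)
avail memory = 3623907328 (3456 MB)
"""

MIB = 1024 * 1024
FOUR_GIB = 4 * 1024 * MIB


def parse_memory_lines(text):
    """Return {'real': bytes, 'avail': bytes} from dmesg-style memory lines."""
    found = {}
    for kind, value in re.findall(r"(real|avail) memory\s*=\s*(\d+)", text):
        found[kind] = int(value)
    return found


mem = parse_memory_lines(DMESG)
gap = FOUR_GIB - mem["real"]  # bytes below the 4 GiB line that are not reported as RAM

print(f"real memory : {mem['real'] // MIB} MB")
print(f"avail memory: {mem['avail'] // MIB} MB")
print(f"below 4 GiB but not reported: {gap} bytes ({gap / MIB:.3f} MiB)")
```

The gap comes out at 538050560 bytes, a little over 513 MB, which matches the "only 3.5 GB" seen with 4 GB installed.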
Greg 'groggy' Lehey wrote:
> I've recently acquired an AMD64 box (dual Opteron 242, SiS Master@-FAR
> motherboard
> (http://www.msi.com.tw/program/products/server/svr/pro_svr_detail.php?UID=484).
> See below for more details).  I find it very unstable running with 8
> GB memory, though 4 GB are not a problem.
> [...]
> Running a 5.3-RELEASE i386 kernel it panics, though I haven't
> investigated the panic (yet), since I'm not interested in the i386
> kernel.  The amd64 5.4-PRERELEASE kernel just hangs/freezes.
> [...]

5.3-RELEASE has a lot of problems with >4GB due to busdma issues.  Those
should no longer be an issue in RELENG_5, including 5.4-PRE.  You'll need
to dig in and provide some more details, I guess.  I have an HDAMA dual
Opteron system that behaves fine now with 8GB of RAM, so your problem
might lie with particular hardware and/or drivers.

Scott
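For background on why more than 4 GB matters to drivers: a device that can only generate 32-bit DMA addresses cannot reach physical memory above 4 GB, so the kernel has to copy such buffers through "bounce" pages lower down. The following is a minimal sketch of that decision only. It is not FreeBSD's busdma code, and the names (needs_bounce, DEVICE_MAX_ADDR) and the example buffers are hypothetical. Whether this is what affected any particular machine in this thread is not established here.

```python
# Illustration only: not FreeBSD's busdma implementation.
GIB = 1 << 30

# Assumption: the device can address only the low 4 GiB (32-bit DMA).
DEVICE_MAX_ADDR = 4 * GIB - 1


def needs_bounce(phys_addr, length, max_addr=DEVICE_MAX_ADDR):
    """True if any byte of the buffer lies above what the device can address."""
    if length <= 0:
        raise ValueError("length must be positive")
    last_byte = phys_addr + length - 1
    return last_byte > max_addr


# Hypothetical buffers: (physical address, length in bytes).
buffers = [
    (1 * GIB, 4096),          # entirely below 4 GiB
    (4 * GIB - 2048, 4096),   # straddles the 4 GiB boundary
    (6 * GIB, 4096),          # entirely above 4 GiB
]

for addr, length in buffers:
    verdict = "bounce" if needs_bounce(addr, length) else "direct DMA"
    print(f"{addr:#014x} +{length}: {verdict}")
```

On a machine where all memory sits below 4 GB, a check like this never asks for a bounce, which is one reason problems of this kind tend to show up only with more memory installed.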
On Thu, Mar 31, 2005 at 07:54:39AM +0930, Greg 'groggy' Lehey wrote:
> None of these problems occur when I use 4 GB memory.  About the only
> strangeness, which seems to come from the BIOS, is that it recognizes
> only 3.5 GB.  If I put all DIMMS in, it recognizes the full 8 GB
> memory.
>
> I realize that this isn't enough to diagnose the problem.  The reason
> for this message now is to ask:
>
>   1.  Has anybody else seen this problem?
>   2.  Has anybody else used this hardware configuration and *not* seen
>       this problem?
>   3.  Where should I look next?

Have you run sysutils/memtest86 with the 8 GB?  I had 4 bad out of 12
tested where the DIMMs were Crucial PC2700 2GB Reg. ECC DIMMs.

--
Steve
> Date: Thu, 31 Mar 2005 16:12:35 +0900
> From: Ganbold <ganbold@micom.mng.net>
> Subject: Re: Problems with AMD64 and 8 GB RAM?
>
> Hi,
>
> Since we are discussing AMD64 with 8GB RAM, I also would like to point
> out my problem.
>
> I'm still looking for possibility to run FreeBSD 5.3-STABLE with more
> than 4GB RAM on Dual amd64 2.2GHz machine (IBM @server 325) with
> ServeRAID 6M (ips driver).
> Right now I'm using only 4GB RAM and this server is in production.
>
> #uname -an
> FreeBSD publica.ub.mng.net 5.3-STABLE FreeBSD 5.3-STABLE #12: Mon Nov 22
> 12:04:57 ULAT 2004 tsgan@publicc.ub.mng.net:/usr/obj/usr/src/sys/AMD
> amd64
>
> As Scott said a few months ago, problem is below:
>
> "The ips driver looks like it will fail under heavy load when more than
> 4GB of RAM is present.  It tries to force busdma to not defer requests
> when the bounce page reserve is low, but that looks to be broken and
> will result in corrupted commands."

[Alan Jay] Since we are talking about FreeBSD on AMD64: I have already
reported issues on the AMD64 list.

I have a Tyan Thunder K8S Pro S2882 with twin Opterons and 8 GB of RAM.
Although I can get the machine to run reasonably stably with 8 GB of RAM
under limited load, when pushed it falls over unpredictably.

We did some tests with the latest 5.3-STABLE / 5.4-PRERELEASE and still
found the same issues when using a MySQL database heavily hit over the
Ethernet controller.

Our final tests limited the memory on boot-up to 4 GB and the bug is
still there, so we think it may well be some interaction with the
Ethernet controller.  The motherboard has a Broadcom BCM5704C
10/100/1000 based card on board.  Again, this works fine initially, but
then we get a very dramatic failure with no warning messages and the
system falls over.

There are still a few issues to be ironed out with FreeBSD 5.x on AMD64.
The latest STABLE/PRE-RELEASE is much improved, but be aware there may
be issues.

We will be waiting a few more weeks before re-trying these tests to see
if the latest fixes that have been discussed have solved our problems.
Greg 'groggy' Lehey wrote:
> I've recently acquired an AMD64 box (dual Opteron 242, SiS Master@-FAR
> motherboard
> (http://www.msi.com.tw/program/products/server/svr/pro_svr_detail.php?UID=484).
> See below for more details).  I find it very unstable running with 8
> GB memory, though 4 GB are not a problem.
> [...]
>   1.  Has anybody else seen this problem?

Hi Greg,

[Currently little time, so I'll dig through the archives later for more
details]

I'm sorry to come into this discussion after 58 messages, but this board
was discussed extensively about a year ago, because it gave me no end of
trouble (even with 2 GB).

One of the early amd64 developers (not David or Scott) had the same board
but could not get it stable under amd64 (i386 was fine with 2 GB).  He
tossed it and suggested I do the same, which I did, and went to a Tyan
board S278.  After that there were no more problems at all.

At the time I think things were at 5.1, so now with 5.3 some features
might have made the board act more stable.

--WjW
Greg 'groggy' Lehey wrote:
> I've recently acquired an AMD64 box (dual Opteron 242, SiS Master@-FAR
> motherboard
> (http://www.msi.com.tw/program/products/server/svr/pro_svr_detail.php?UID=484).
> See below for more details).  I find it very unstable running with 8
> GB memory, though 4 GB are not a problem.  At first I thought it was
> the onboard peripherals, but after disabling them it still persisted.
>
> What's unstable?  I only once got it through the boot process.
> Running a 5.3-RELEASE i386 kernel it panics, though I haven't
> [...]
>   1.  Has anybody else seen this problem?
>   2.  Has anybody else used this hardware configuration and *not* seen
>       this problem?

[Posted something like this earlier, but did not see it on the list]

A little late to the discussion, but nonetheless:

I bought this board over a year ago to run amd64 (you're running i386).
In the end I gave up on it for amd64, since one of the developers
involved also tried to use the board without much success.

It was running fine with amd64, 1 CPU and 2 GB, but as soon as I added
the 2nd CPU the slightest load crashed the system somewhere in the IPI
areas.

After long discussions I concluded that it was easier to get a new
motherboard, so I got a Tyan Tiger S275, which has not yet failed on me.

So if you ever want to run amd64 on this system, even now that you've
determined the problem to be the load on the memory bus, be warned that
odd things could happen.

--WjW