Scott Silva wrote:
> on 5-27-2008 10:16 AM Ross S. W. Walker spake the following:
> > sbeam wrote:
> > > On Tuesday 27 May 2008 11:39, Scott Silva wrote:
> > > >
> > > > Running memtest for 24 hours should be enough to test the ram.
> > > > A 3ware 7006 is a fairly old card. Does it have the latest bios available
> > > > from 3ware?
> > > >
> > > > You could always eliminate the 3ware controller by installing a drive on
> > > > whatever built in controller it has.
> > >
> > > this is a production server, so running an extended memtest is not going to
> > > happen. But I can swap it out and put it in a backup system to do the test.
> > > It's beginning to look a lot like a RAM issue as I have now seen a couple
> > > segfaults from programs that have always run fine. Every kernel panic message
> > > is different (crashed again 1 hour ago). Fans and case temp are nominal.
> > >
> > > the 3ware card was just purchased last month, it has the latest firmware and
> > > bios installed.
> > >
> > > the memory is from PQI - supposed to be an OK brand right? it has a lifetime
> > > warranty... heh
> > >
> > > next steps... HA and fault-tolerant clustering, per the adjacent thread...
> > > this is the cautionary tale come to life.
> >
> > It would be great if there were a simple machine that you could plug
> > a bunch of dimms of varying types into and it will perform high-speed
> > tests on them continuously and flag ones that show an error.
> >
> > Then you could test all memory modules thoroughly before putting them
> > into production servers (or any server for that matter).
>
> That is why a good long burn in test is a worthwhile thing to
> plan for. That is unless you need to rush a replacement
> server out quickly.
Yes, but even then, with, say, 16GB or 32GB of memory, some
errors can still fall through the cracks.
> I usually run memtest86 for 48 hours, and then run a burn in
> test with some load.
>
> There are simple machines for testing memory, but they tend
> to be very expensive and time consuming. Manufacturers can't
> take the time to do thorough memory tests before they ship,
> so they usually do some quick go-nogo tests and depend on
> their warranty dept. to do the hard tests.
>
> I don't think it would pay for anyone to buy one of these
> testers, unless you are a very large var like Dell or HP. It
> is easier (and probably cheaper) to just send new ram out and
> send the returns back to your supplier for them to check.
I actually found a memory testing system for around $4K. Yes,
it's about the cost of a well-equipped server, but if it
works well it should earn its keep pretty quickly.
It's called RAMCHECK. I priced out the DDR/DDR2 unit, but
there are add-ons for SODIMM, SDRAM, and EDO; if you got it
fully loaded I suspect it would be around $5K.
The company's called Innovations: http://www.memorytesters.com/
They're government registered and CDW seems to resell it,
so it isn't completely suspect.
-Ross