thr3ads.net - freebsd stable - 6.2-R on Dell Poweredge 2950 with Dell PERC 5/i [mfi(4)] [May 2007]

David Wolfskill

2007-May-10 20:16 UTC

6.2-R on Dell Poweredge 2950 with Dell PERC 5/i [mfi(4)]

From a quick look in the lists, I get the impression that the Dell PERC
5/i may be a bit problematic.  Since I hadn't any plans on using that
hardware, though, I've paid more attention to other things.

Well, now a colleague is trying to run 6.2-R on one of these 2950s; dmesg
says the controller is:

mfi0: <Dell PERC 5/i> mem 0xd80f0000-0xd80fffff,0xfc4e0000-0xfc4fffff irq
78 at device 14.0 on pci2
mfi0: 817 (224963336s/0x0020/0) - Shutdown command received from host
mfi0: 818 (4278190080s/0x0020/0) - PCI 0x041028 0x0415 0x041028 0x041f03:
Firmware initialization started (PCI ID 0015/1028/1f03/1028)
mfi0: 819 (4278190080s/0x0020/0) - Type 18: Firmware version 1.00.02-0157
mfi0: 820 (4278190096s/0x0008/0) - Battery Present
mfi0: 821 (4278190124s/0x0004/0) - PD 08(e1/s255) event: Enclosure (SES)
discovered on PD 08(e1/s255)
mfi0: 822 (4278190124s/0x0002/0) - PD 08(e1/s255) event: Inserted: PD
08(e1/s255)
mfi0: 823 (4278190124s/0x0002/0) - Type 29: Inserted: PD 08(e1/s255) Info:
enclPd=08, scsiType=d, portMap=00, sasAddr=500180b04413ce00,0000000000000000
mfi0: 824 (4278190124s/0x0002/0) - PD 00(e1/s0) event: Inserted: PD 00(e1/s0)
mfi0: 825 (4278190124s/0x0002/0) - Type 29: Inserted: PD 00(e1/s0) Info:
enclPd=08, scsiType=0, portMap=01, sasAddr=50010b900046038e,0000000000000000
mfi0: 826 (4278190124s/0x0002/0) - PD 01(e1/s1) event: Inserted: PD 01(e1/s1)
mfi0: 827 (4278190124s/0x0002/0) - Type 29: Inserted: PD 01(e1/s1) Info:
enclPd=08, scsiType=0, portMap=02, sasAddr=50010b9000460376,0000000000000000
mfi0: 828 (4278190124s/0x0002/0) - PD 02(e1/s2) event: Inserted: PD 02(e1/s2)
mfi0: 829 (4278190124s/0x0002/0) - Type 29: Inserted: PD 02(e1/s2) Info:
enclPd=08, scsiType=0, portMap=04, sasAddr=50010b900046035a,0000000000000000
mfi0: 830 (4278190124s/0x0002/0) - PD 03(e1/s3) event: Inserted: PD 03(e1/s3)
mfi0: 831 (4278190124s/0x0002/0) - Type 29: Inserted: PD 03(e1/s3) Info:
enclPd=08, scsiType=0, portMap=08, sasAddr=50010b90004603be,0000000000000000
mfi0: 832 (4278190124s/0x0002/0) - PD 04(e1/s4) event: Inserted: PD 04(e1/s4)
mfi0: 833 (4278190124s/0x0002/0) - Type 29: Inserted: PD 04(e1/s4) Info:
enclPd=08, scsiType=0, portMap=10, sasAddr=50010b900045f6d6,0000000000000000
mfi0: 834 (4278190124s/0x0002/0) - PD 05(e1/s5) event: Inserted: PD 05(e1/s5)
mfi0: 835 (4278190124s/0x0002/0) - Type 29: Inserted: PD 05(e1/s5) Info:
enclPd=08, scsiType=0, portMap=20, sasAddr=50010b9000460246,0000000000000000
mfi0: 836 (224964238s/0x0020/0) - Adapter ticks 224964238 elapsed 45s: Time
established as 02/16/07 18:03:58; (45 seconds since power on)

and the disk looks like:

mfid0: <MFI Logical Disk> on mfi0
mfid0: 418176MB (856424448 sectors) RAID volume '' is optimal


The intended production workload involves creation and deletion of
a large number of files rather rapidly.

I recalled that for the first year or two with Soft Updates, there
were problems with that kind of workload, such that there was enough
hysteresis in making free blocks actually available for subsequent
allocation that processes that were trying to write to new blocks
on such file systems would often fail, reporting ENOSPC.  Un-mounting
and re-mounting the file system would clean things up, but that
doesn't tend to be a viable approach for keeping a long-running
application happy.  :-}

I reminded my colleague of this, since she also reported that an
un-mount/re-mount sequence caused a lot of free space to show up
on the file system in question, and she responded that she had been
aware of this, and had been turning off Soft Updates on the file
systems for the application in question, but she had forgotten that
Soft Updates was on by default when she set up this (test) system.

She then turned off Soft Updates and started the test workload again.
And instead of failing with ENOSPC after 3 days, it only took 2.
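
One rough way to watch for this kind of lag between deleting files and seeing the space come back is to sample the free space before and after a burst of create/delete activity. The following is only a sketch: the function names `free_bytes` and `churn` and the file-naming scheme are made up for illustration and are not part of the workload described above.

```python
import os
import tempfile


def free_bytes(path):
    """Bytes available to unprivileged users on the file system holding path."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize


def churn(directory, count=100, size=4096):
    """Create and then delete `count` files; return free space (before, after)."""
    before = free_bytes(directory)
    payload = b"x" * size
    names = []
    for i in range(count):
        name = os.path.join(directory, f"churn-{i}.tmp")
        with open(name, "wb") as f:
            f.write(payload)
        names.append(name)
    for name in names:
        os.unlink(name)
    after = free_bytes(directory)
    return before, after


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        before, after = churn(d)
        print(f"free before: {before}  free after: {after}  delta: {before - after}")
```

If the "after" figure stays well below the "before" figure over repeated runs on the affected file system, that would be consistent with freed blocks not yet being available for allocation; on a quiet test directory the two numbers may simply match.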

Hmmm... well; that wasn't exactly what I had expected.

Any hints, here?  The machine is running the i386 arch, with a pair of
dual-core 2.33GHz Xeons.

I have a recent dmesg.boot, but I'd rather keep list messages fairly
short.

We have a local private mirror of the FreeBSD CVS repository, so we have
some flexibility in what we can do for testing, but the objective is to
put the box in production -- and I'd rather not run CURRENT as part of a
customer-visible production workload.  :-}  [My laptop is a different
matter, of course....]

Thanks!

Peace,
david
--
David H. Wolfskill				david@catwhisker.org
Believe SORBS at your own risk: 63.193.123.122 has been static since Aug 1999.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.
Scott Long

2007-May-10 23:15 UTC

David Wolfskill wrote:
> From a quick look in the lists, I get the impression that the Dell PERC
> 5/i may be a bit problematic.  Since I hadn't any plans on using that
> hardware, though, I've paid more attention to other things.
> 
Not sure that this impression is entirely accurate.  The biggest problem
with MFI machines is online RAID management.  The storage driver itself
matured very quickly and has been very reliable.
> Well, now a colleague is trying to run 6.2-R on one of these 2950s; dmesg
> says the controller is:
> 
> mfi0: <Dell PERC 5/i> mem 0xd80f0000-0xd80fffff,0xfc4e0000-0xfc4fffff irq
> 78 at device 14.0 on pci2
> mfi0: 817 (224963336s/0x0020/0) - Shutdown command received from host
> mfi0: 818 (4278190080s/0x0020/0) - PCI 0x041028 0x0415 0x041028 0x041f03:
> Firmware initialization started (PCI ID 0015/1028/1f03/1028)
> mfi0: 819 (4278190080s/0x0020/0) - Type 18: Firmware version 1.00.02-0157
> mfi0: 820 (4278190096s/0x0008/0) - Battery Present
> mfi0: 821 (4278190124s/0x0004/0) - PD 08(e1/s255) event: Enclosure (SES)
> discovered on PD 08(e1/s255)
> mfi0: 822 (4278190124s/0x0002/0) - PD 08(e1/s255) event: Inserted: PD
> 08(e1/s255)
> mfi0: 823 (4278190124s/0x0002/0) - Type 29: Inserted: PD 08(e1/s255) Info:
> enclPd=08, scsiType=d, portMap=00, sasAddr=500180b04413ce00,0000000000000000
> mfi0: 824 (4278190124s/0x0002/0) - PD 00(e1/s0) event: Inserted: PD 00(e1/s0)
> mfi0: 825 (4278190124s/0x0002/0) - Type 29: Inserted: PD 00(e1/s0) Info:
> enclPd=08, scsiType=0, portMap=01, sasAddr=50010b900046038e,0000000000000000
> mfi0: 826 (4278190124s/0x0002/0) - PD 01(e1/s1) event: Inserted: PD 01(e1/s1)
> mfi0: 827 (4278190124s/0x0002/0) - Type 29: Inserted: PD 01(e1/s1) Info:
> enclPd=08, scsiType=0, portMap=02, sasAddr=50010b9000460376,0000000000000000
> mfi0: 828 (4278190124s/0x0002/0) - PD 02(e1/s2) event: Inserted: PD 02(e1/s2)
> mfi0: 829 (4278190124s/0x0002/0) - Type 29: Inserted: PD 02(e1/s2) Info:
> enclPd=08, scsiType=0, portMap=04, sasAddr=50010b900046035a,0000000000000000
> mfi0: 830 (4278190124s/0x0002/0) - PD 03(e1/s3) event: Inserted: PD 03(e1/s3)
> mfi0: 831 (4278190124s/0x0002/0) - Type 29: Inserted: PD 03(e1/s3) Info:
> enclPd=08, scsiType=0, portMap=08, sasAddr=50010b90004603be,0000000000000000
> mfi0: 832 (4278190124s/0x0002/0) - PD 04(e1/s4) event: Inserted: PD 04(e1/s4)
> mfi0: 833 (4278190124s/0x0002/0) - Type 29: Inserted: PD 04(e1/s4) Info:
> enclPd=08, scsiType=0, portMap=10, sasAddr=50010b900045f6d6,0000000000000000
> mfi0: 834 (4278190124s/0x0002/0) - PD 05(e1/s5) event: Inserted: PD 05(e1/s5)
> mfi0: 835 (4278190124s/0x0002/0) - Type 29: Inserted: PD 05(e1/s5) Info:
> enclPd=08, scsiType=0, portMap=20, sasAddr=50010b9000460246,0000000000000000
> mfi0: 836 (224964238s/0x0020/0) - Adapter ticks 224964238 elapsed 45s: Time
> established as 02/16/07 18:03:58; (45 seconds since power on)
> 
> and the disk looks like:
> 
> mfid0: <MFI Logical Disk> on mfi0
> mfid0: 418176MB (856424448 sectors) RAID volume '' is optimal
> 
Looks A-OK to me.
> 
> The intended production workload involves creation and deletion of
> a large number of files rather rapidly.
> 
> I recalled that for the first year or two with Soft Updates, there
> were problems with that kind of workload, such that there was enough
> hysteresis in making free blocks actually available for subsequent
> allocation that processes that were trying to write to new blocks
> on such file systems would often fail, reporting ENOSPC.  Un-mounting
> and re-mounting the file system would clean things up, but that
> doesn't tend to be a viable approach for keeping a long-running
> application happy.  :-}
> 
sysctl vfs.ffs.doasyncfree=0 might help.  Running the syncer more 
frequently might also help, but I don't recall the sysctl node for
that.
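
For checking the current settings before changing anything, a small Python sketch along these lines could be used. It only reads values. The helper name `read_sysctl` is made up for illustration, and the three `kern.*delay` names are an assumption about which nodes control the syncer intervals on FreeBSD; they are not named in this message, so confirm them with `sysctl -d` first.

```python
import shutil
import subprocess


def read_sysctl(name):
    """Return the value of a sysctl node as a string, or None if unavailable."""
    if shutil.which("sysctl") is None:
        return None  # no sysctl(8) binary on this system
    result = subprocess.run(
        ["sysctl", "-n", name], capture_output=True, text=True
    )
    if result.returncode != 0:
        return None  # unknown node or not readable
    return result.stdout.strip()


# vfs.ffs.doasyncfree is the node suggested above; the kern.*delay
# names are assumed syncer tunables and may differ on your release.
NODES = (
    "vfs.ffs.doasyncfree",
    "kern.filedelay",
    "kern.dirdelay",
    "kern.metadelay",
)

for node in NODES:
    print(node, "=", read_sysctl(node))
```

Any node that prints `None` either does not exist on that system or could not be read.
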
> I reminded my colleague of this, since she also reported that an
> un-mount/re-mount sequence caused a lot of free space to show up
> on the file system in question, and she responded that she had been
> aware of this, and had been turning off Soft Updates on the file
> systems for the application in question, but she had forgotten that
> Soft Updates was on by default when she set up this (test) system.
> 
> She then turned off Soft Updates and started the test workload again.
> And instead of failing with ENOSPC after 3 days, it only took 2.
Very strange.  No chance that it was due to files that were deleted but
still referenced by open apps?
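
To illustrate what I mean: on a POSIX system, a file that has been unlinked while a process still holds it open has no name in any directory, but its blocks are not released until the last descriptor is closed. The sketch below demonstrates only that general behaviour; the function name `unlinked_but_open` is made up, and nothing here is specific to the application in question.

```python
import os
import tempfile


def unlinked_but_open(size=1 << 20):
    """Write a file, unlink it while open, and report what fstat still sees.

    Returns (link_count, size_in_bytes, path_still_visible).
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\0" * size)
        os.unlink(path)          # the name is gone from the directory
        st = os.fstat(fd)        # but the open descriptor still has the data
        return st.st_nlink, st.st_size, os.path.exists(path)
    finally:
        os.close(fd)             # only now can the space be reclaimed


links, size, visible = unlinked_but_open()
print("links:", links, "size:", size, "visible:", visible)
```

A long-running process that keeps such descriptors open would make df show less free space than the visible files account for, and the space would reappear once the process exits or the file system is unmounted.
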
> 
> Hmmm... well; that wasn't exactly what I had expected.
> 
> Any hints, here?  The machine is running the i386 arch, with a pair of
> dual-core 2.33GHz Xeons.
> 
> I have a recent dmesg.boot, but I'd rather keep list messages fairly
> short.
> 
> We have a local private mirror of the FreeBSD CVS repository, so we have
> some flexibility in what we can do for testing, but the objective is to
> put the box in production -- and I'd rather not run CURRENT as part of a
> customer-visible production workload.  :-}  [My laptop is a different
> matter, of course....]
> 
This sounds purely like a filesystem issue, not an MFI driver issue.

Scott
